diff a/doc/panama_ffi.html b/doc/panama_ffi.html --- /dev/null +++ b/doc/panama_ffi.html @@ -0,0 +1,693 @@ + + + + + +panama_ffi + +
+

State of foreign function support

January 2023

Maurizio Cimadamore

The Foreign Function & Memory API (FFM API for short) provides access to foreign functions through the Linker interface, which has been available as an incubating API since Java 16. A linker allows clients to construct downcall method handles, that is, method handles whose invocation targets a native function defined in some native library. In other words, the FFM API's foreign function support is expressed entirely in Java code, and no intermediate native code is required.

Zero-length memory segments

Before we dive into the specifics of the foreign function support, it is useful to briefly recap some of the main concepts we have learned when exploring the foreign memory access support. The Foreign Memory Access API allows clients to create and manipulate memory segments. A memory segment is a view over a memory source (either on- or off-heap) which is spatially bounded, temporally bounded and thread-confined. These guarantees ensure that dereferencing a segment that has been created by Java code is always safe, and can never result in a VM crash, or, worse, in silent memory corruption.

Now, in the case of memory segments, the above properties (spatial bounds, temporal bounds and confinement) can be known in full when the segment is created. But when we interact with native libraries we often receive raw pointers; such pointers have no spatial bounds (does a char* in C refer to one char, or a char array of a given size?), no notion of temporal bounds, nor thread-confinement. Raw addresses in the FFM API are modelled using zero-length memory segments.

If clients want to dereference a zero-length memory segment, they can do so unsafely in two ways. First, the client can create a new memory segment from the zero-length memory segment unsafely, using the MemorySegment::ofAddress factory. This method is restricted and will generate runtime warnings if called without specifying the --enable-native-access command-line flag. By calling MemorySegment::ofAddress a client injects extra knowledge about spatial bounds which might be available in the native library the client is interacting with:
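A minimal sketch of this resizing step. It is written against the finalized java.lang.foreign API (JDK 22 or later) rather than the preview API described here; that is an assumption made so the snippet compiles on a current JDK. In that release the unsafe resize is spelled MemorySegment::reinterpret and arenas are opened with Arena.ofConfined(). The class and method names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class ResizeZeroLengthSegment {

    // Stores `value` off-heap, reads its address back as a raw pointer,
    // then resizes that pointer unsafely so it can be dereferenced.
    static int readThroughPointer(int value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment target = arena.allocateFrom(ValueLayout.JAVA_INT, value);
            // A pointer-sized slot holding the address of `target`.
            MemorySegment slot = arena.allocateFrom(ValueLayout.ADDRESS, target);

            // Reading a plain ADDRESS yields a zero-length segment (size = 0).
            MemorySegment raw = slot.get(ValueLayout.ADDRESS, 0);

            // Restricted operation: we assert that 4 bytes are readable here.
            MemorySegment resized = raw.reinterpret(ValueLayout.JAVA_INT.byteSize());
            return resized.get(ValueLayout.JAVA_INT, 0);
        }
    }

    public static void main(String[] args) {
        System.out.println(readThroughPointer(42));
    }
}
```

Accessing `raw` directly, without the resize, would fail a bounds check, since the segment has no accessible bytes.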

Alternatively, clients can obtain an unbounded address value layout. This is done using the ValueLayout.OfAddress::asUnbounded method (which is also a restricted method). When an access operation uses an unbounded address value layout, the runtime will wrap any corresponding raw addresses with native segments with maximal size (i.e. Long.MAX_VALUE). As such, these segments can be accessed directly, as follows:
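The same idea can be sketched against the finalized API (JDK 22 or later), which is an assumption: that release has no asUnbounded method, and its closest counterpart appears to be AddressLayout::withTargetLayout, which sizes the resulting segment after a chosen layout instead of Long.MAX_VALUE. The class and method names below are hypothetical.

```java
import java.lang.foreign.AddressLayout;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class AddressTargetLayout {

    static int readThroughPointer(int value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment target = arena.allocateFrom(ValueLayout.JAVA_INT, value);
            MemorySegment slot = arena.allocateFrom(ValueLayout.ADDRESS, target);

            // Restricted operation: an address layout that says "this pointer
            // points at an int", so the segment read through it is 4 bytes long.
            AddressLayout intPointer = ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT);

            MemorySegment pointee = slot.get(intPointer, 0);
            return pointee.get(ValueLayout.JAVA_INT, 0); // no explicit resize needed
        }
    }

    public static void main(String[] args) {
        System.out.println(readThroughPointer(7));
    }
}
```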

Which approach is taken largely depends on the information that a client has available when obtaining a memory segment wrapping a native pointer. For instance, if such a pointer points to a C struct, the client might prefer to resize the segment unsafely, to match the size of the struct (so that out-of-bounds access will be detected by the API). In other instances, however, there will be little or no information as to what spatial and/or temporal bounds should be associated with a given native pointer. In these cases using an unbounded address layout might be preferable.

Segment allocators

Idiomatic C code implicitly relies on stack allocation to allow for concise variable declarations; consider this example:

A variable initializer such as the one above can be implemented as follows, using the Foreign Memory Access API:

There are a number of issues with the above code snippet:

To address these problems, the FFM API provides a SegmentAllocator abstraction, a functional interface which provides methods to allocate commonly used values. Since Arena implements the SegmentAllocator interface, the above code can be rewritten conveniently as follows:
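A minimal sketch of the allocator-based version, assuming the finalized API (JDK 22 or later), where the array-initializing method is named SegmentAllocator::allocateFrom and a confined arena is opened with Arena.ofConfined(). The class and method names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class ArenaArrayAllocation {

    // Allocates the native equivalent of `int arr[] = {0, 1, 2, 3, 4}` and
    // copies it back to the heap so it can outlive the arena.
    static int[] roundTrip() {
        try (Arena arena = Arena.ofConfined()) {
            // The Java array is copied in bulk into the new native segment.
            MemorySegment segment = arena.allocateFrom(ValueLayout.JAVA_INT, 0, 1, 2, 3, 4);
            return segment.toArray(ValueLayout.JAVA_INT);
        } // the native memory is released here
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(roundTrip()));
    }
}
```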

In the above code, the arena acts as a native allocator (that is, an allocator built on top of MemorySegment::allocateNative). The arena is then used to create a native array, initialized to the values 0, 1, 2, 3, 4. The array initialization is more efficient, compared to the previous snippet, as the Java array is copied in bulk into the memory region associated with the newly allocated memory segment. The returned segment is associated with the scope of the arena which performed the allocation, meaning that the segment will no longer be accessible after the try-with-resources construct.

Custom segment allocators are also critical to achieve optimal allocation performance; for this reason, a number of predefined allocators are available via factories in the SegmentAllocator interface. For example, the following code creates a slicing allocator and uses it to allocate a segment whose content is initialized from a Java int array:
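A sketch of such a slicing allocator, assuming the finalized API (JDK 22 or later); SegmentAllocator::slicingAllocator is the factory in that release, and the class and method names are hypothetical. The method returns the total number of bytes handed out, only to make the effect observable.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class SlicingAllocatorDemo {

    static long allocateMany(int count) {
        try (Arena arena = Arena.ofConfined()) {
            // One up-front native allocation of 1024 bytes.
            MemorySegment block = arena.allocate(1024);
            SegmentAllocator allocator = SegmentAllocator.slicingAllocator(block);

            long totalBytes = 0;
            for (int i = 0; i < count; i++) {
                // Each request is served by slicing `block`; no new malloc occurs.
                MemorySegment s = allocator.allocateFrom(ValueLayout.JAVA_INT, 1, 2, 3, 4, 5);
                totalBytes += s.byteSize();
            }
            return totalBytes;
        } // every slice becomes invalid at once when the arena closes
    }

    public static void main(String[] args) {
        System.out.println(allocateMany(10));
    }
}
```

Asking for more slices than the 1024-byte block can hold makes the allocator throw rather than grow.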

This code creates a native segment whose size is 1024 bytes. The segment is then used to create a slicing allocator, which responds to allocation requests by returning slices of that pre-allocated segment. If the current segment does not have sufficient space to accommodate an allocation request, an exception is thrown. All of the memory associated with the segments created by the allocator (i.e., in the body of the for loop) is deallocated atomically when the arena is closed. This technique combines the advantages of deterministic deallocation, provided by the Arena abstraction, with a more flexible and scalable allocation scheme. It can be very useful when writing code which manages a large number of off-heap segments.

All the methods in the FFM API which produce memory segments (see VaList::nextVarg and downcall method handles) allow for an allocator parameter to be provided. This is key in ensuring that an application using the FFM API achieves optimal allocation performance, especially in non-trivial use cases.

Symbol lookups

The first ingredient of any foreign function support is a mechanism to look up symbols in native libraries. In traditional Java/JNI, this is done via the System::loadLibrary and System::load methods. Unfortunately, these methods do not provide a way for clients to obtain the address associated with a given library symbol. For this reason, the Foreign Linker API introduces a new abstraction, namely SymbolLookup (similar in spirit to a method handle lookup), which provides capabilities to look up named symbols; we can obtain a symbol lookup in 3 different ways:

Once a lookup has been obtained, a client can use it to retrieve handles to library symbols (either global variables or functions) using the find(String) method, which returns an Optional<MemorySegment>. The memory segments returned by the lookup are zero-length segments, whose base address is the address of the function or variable in the library.
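As a small illustration, the sketch below looks up strlen through the native linker's default lookup. It assumes the finalized API (JDK 22 or later) and a platform whose default lookup exposes the C standard library; the class and method names are hypothetical.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.util.Optional;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class DefaultLookupDemo {

    static Optional<MemorySegment> findStrlen() {
        // The default lookup covers commonly used libraries such as libc.
        SymbolLookup stdlib = Linker.nativeLinker().defaultLookup();
        return stdlib.find("strlen");
    }

    public static void main(String[] args) {
        MemorySegment symbol = findStrlen().orElseThrow();
        // The segment only carries an address: it has no accessible bytes.
        System.out.println("size = " + symbol.byteSize());
    }
}
```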

For instance, the following code can be used to look up the clang_getClangVersion function provided by the clang library; it does so by creating a library lookup whose lifecycle is associated with that of a confined arena.

Linker

At the core of the FFM API's foreign function support we find the Linker abstraction. This abstraction plays a dual role: first, for downcalls, it allows modelling foreign function calls as plain MethodHandle calls (see Linker::downcallHandle); second, for upcalls, it allows converting an existing MethodHandle (which might point to some Java method) into a MemorySegment which can then be passed to foreign functions as a function pointer (see Linker::upcallStub):

Both functions take a FunctionDescriptor instance — essentially an aggregate of memory layouts which is used to describe the argument and return types of a foreign function in full. Supported layouts are value layouts (for scalars and pointers) and group layouts (for structs/unions). Each layout in a function descriptor is associated with a carrier Java type (see table below); together, all the carrier types associated with layouts in a function descriptor will determine a unique Java MethodType — that is, the Java signature that clients will be using when interacting with said downcall handles, or upcall stubs.

The Linker::nativeLinker factory is used to obtain a Linker implementation for the ABI associated with the OS and processor where the Java runtime is currently executing. As such, the native linker can be used to call C functions. The following table shows the mapping between C types, layouts and Java carriers under the Linux/macOS native linker implementation; note that the mappings can be platform dependent: on Windows/x64, the C type long is 32-bit, so the JAVA_INT layout (and the Java carrier int.class) would have to be used instead:

| C type | Layout | Java carrier |
|---|---|---|
| bool | JAVA_BOOLEAN | boolean |
| char | JAVA_BYTE | byte |
| short | JAVA_SHORT, JAVA_CHAR | short, char |
| int | JAVA_INT | int |
| long | JAVA_LONG | long |
| long long | JAVA_LONG | long |
| float | JAVA_FLOAT | float |
| double | JAVA_DOUBLE | double |
| char*, int**, ... | ADDRESS | MemorySegment |
| struct Point { int x; int y; }; union Choice { float a; int b; }; ... | MemoryLayout.structLayout(...), MemoryLayout.unionLayout(...) | MemorySegment |

Both C structs/unions and pointers are modelled using the MemorySegment carrier type. However, C structs/unions are modelled in function descriptors with memory layouts of type GroupLayout, whereas pointers are modelled using the ADDRESS value layout constant (whose size is platform-specific). Moreover, the behavior of a downcall method handle returning a struct/union type is radically different from that of a downcall method handle returning a C pointer:

A tool, such as jextract, will generate all the required C layouts (for scalars and structs/unions) automatically, so that clients do not have to worry about platform-dependent details such as sizes, alignment constraints and padding.

Downcalls

We will now look at how foreign functions can be called from Java using the native linker. Assume we wanted to call the following function from the standard C library:

In order to do that, we have to:

Here's an example of how we might want to do that (a full listing of all the examples in this and subsequent sections will be provided in the appendix):

Note that, since the function strlen is part of the standard C library, which is loaded with the VM, we can just use the default lookup of the native linker to look it up. The rest is pretty straightforward; the only tricky detail is how to model size_t: typically this type has the size of a pointer, so we can use JAVA_LONG on both Linux and Windows. On the Java side, we model the size_t using a long and the pointer is modelled using a MemorySegment parameter.

Once we have obtained the downcall method handle, we can just use it as any other method handle [1]:
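Putting the lookup, the function descriptor and the invocation together, a sketch of the strlen downcall might look as follows. It assumes the finalized API (JDK 22 or later), where a Java string is converted with Arena::allocateFrom, and a 64-bit platform where size_t fits JAVA_LONG. The class and method names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class StrlenDemo {

    static long strlen(String s) {
        Linker linker = Linker.nativeLinker();
        // size_t strlen(const char *s): one pointer in, one 64-bit integer out.
        MethodHandle strlen = linker.downcallHandle(
                linker.defaultLookup().find("strlen").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

        try (Arena arena = Arena.ofConfined()) {
            // Copies the string off-heap as a null-terminated UTF-8 C string.
            MemorySegment cString = arena.allocateFrom(s);
            return (long) strlen.invokeExact(cString);
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable; wrap for convenience.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("Hello"));
    }
}
```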

Here we are using a confined arena to convert a Java string into an off-heap memory segment which contains a null-terminated C string. We then pass that segment to the method handle and retrieve our result in a Java long. Note how all this is possible without any piece of intervening native code: all the interop code can be expressed in (low level) Java. Note also how we use an arena to control the lifecycle of the allocated C string, which ensures timely deallocation of the memory segment holding the native string.

The Linker interface also supports linking of native functions without an address known at link time; when that happens, an address (of type MemorySegment) must be provided when the method handle returned by the linker is invoked — this is very useful to support virtual calls. For instance, the above code can be rewritten as follows:
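A sketch of that address-less form, assuming the finalized API (JDK 22 or later): the handle is built from the function descriptor alone, and the target address becomes its leading argument. The class and method names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class VirtualStrlenDemo {

    static long strlen(String s) {
        Linker linker = Linker.nativeLinker();
        // No address is given at link time, only the shape of the call.
        MethodHandle virtualCall = linker.downcallHandle(
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

        // The address is resolved separately and could differ per invocation.
        MemorySegment target = linker.defaultLookup().find("strlen").orElseThrow();

        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(s);
            // The function address is passed first, then the real arguments.
            return (long) virtualCall.invokeExact(target, cString);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("Hello"));
    }
}
```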

It is important to note that, although the interop code is written in Java, the above code cannot be considered 100% safe. There are many arbitrary decisions to be made when setting up downcall method handles such as the one above, some of which might be obvious to us (e.g. how many parameters does the function take), but which cannot ultimately be verified by the Java runtime. After all, a symbol in a dynamic library is nothing but a numeric offset and, unless we are using a shared library with debugging information, no type information is attached to a given library symbol. This means that the Java runtime has to trust the function descriptor passed in [2]; for this reason, the Linker::nativeLinker factory is also a restricted method.

When working with shared arenas, it is always possible for the arena associated with a memory segment passed by reference to a native function to be closed (by another thread) while the native function is executing. When this happens, the native code is at risk of dereferencing already-freed memory, which might trigger a JVM crash, or even result in silent memory corruption. For this reason, the Linker API provides some basic temporal safety guarantees: any MemorySegment instance passed by reference to a downcall method handle will be kept alive for the entire duration of the call. In other words, it's as if the call to the downcall method handle occurred inside an invisible call to SegmentScope::whileAlive.

Performance-wise, the reader might ask how efficient calling a foreign function using a native method handle is; the answer is: very. The JVM comes with some special support for native method handles, so that, if a given method handle is invoked many times (e.g., inside a hot loop), the JIT compiler might decide to generate a snippet of assembly code required to call the native function, and execute that directly. In most cases, invoking a native function this way is as efficient as doing so through JNI.

Upcalls

Sometimes, it is useful to pass Java code as a function pointer to some native function; we can achieve that by using foreign linker support for upcalls. To demonstrate this, let's consider the following function from the C standard library:

The qsort function can be used to sort the contents of an array, using a custom comparator function (compar) which is passed as a function pointer. To be able to call the qsort function from Java we first have to create a downcall method handle for it:

As before, we use JAVA_LONG and long.class to map the C size_t type, and ADDRESS for both the first pointer parameter (the array pointer) and the last parameter (the function pointer).

This time, in order to invoke the qsort downcall handle, we need a function pointer to be passed as the last parameter; this is where the upcall support in foreign linker comes in handy, as it allows us to create a function pointer out of an existing method handle. First, let's write a function that can compare two int elements (passed as pointers):

Here we can see that the function is performing some unsafe dereference of the pointer contents.

Now let's create a method handle pointing to the comparator function above:

To do that, we first create a function descriptor for the function pointer type, and then we use FunctionDescriptor::toMethodType to turn that function descriptor into a suitable MethodType instance to be used in a method handle lookup. Now that we have a method handle for our Java comparator function, we finally have all the ingredients to create an upcall stub, and pass it to the qsort downcall handle:
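The whole qsort round trip can be sketched as below, assuming the finalized API (JDK 22 or later). In that release Linker::upcallStub takes an Arena, and the comparator's pointer parameters are given a target layout so that they can be dereferenced. The class and method names other than comparFunc are hypothetical.

```java
import java.lang.foreign.AddressLayout;
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class QsortDemo {

    // Java comparator invoked from native code; each argument points at one int.
    static int compareInts(MemorySegment a, MemorySegment b) {
        return Integer.compare(a.get(ValueLayout.JAVA_INT, 0), b.get(ValueLayout.JAVA_INT, 0));
    }

    static int[] sort(int... values) {
        Linker linker = Linker.nativeLinker();
        try (Arena arena = Arena.ofConfined()) {
            // void qsort(void *base, size_t nmemb, size_t size, int (*compar)(...))
            MethodHandle qsort = linker.downcallHandle(
                    linker.defaultLookup().find("qsort").orElseThrow(),
                    FunctionDescriptor.ofVoid(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG,
                            ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));

            // int compar(const int *, const int *): pointers to readable ints.
            AddressLayout intPointer = ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT);
            FunctionDescriptor comparDesc =
                    FunctionDescriptor.of(ValueLayout.JAVA_INT, intPointer, intPointer);
            MethodHandle comparHandle = MethodHandles.lookup()
                    .findStatic(QsortDemo.class, "compareInts", comparDesc.toMethodType());

            // The stub is a native function pointer, valid until the arena closes.
            MemorySegment comparFunc = linker.upcallStub(comparHandle, comparDesc, arena);

            MemorySegment array = arena.allocateFrom(ValueLayout.JAVA_INT, values);
            qsort.invokeExact(array, (long) values.length,
                    ValueLayout.JAVA_INT.byteSize(), comparFunc);
            return array.toArray(ValueLayout.JAVA_INT); // sorted copy on the heap
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sort(0, 9, 3, 4, 6, 5, 1, 8, 2, 7)));
    }
}
```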

The above code creates an upcall stub, comparFunc, of type MemorySegment: a function pointer that can be used to invoke our Java comparator function. The upcall stub is associated with the provided segment scope instance; this means that the stub will be uninstalled when the arena is closed.

The snippet then creates an off-heap array from a Java array, which is then passed to the qsort handle, along with the comparator function we obtained from the foreign linker. As a side effect, after the call, the contents of the off-heap array will be sorted (as instructed by our comparator function, written in Java). We can then extract a new Java array from the segment, which contains the sorted elements. This is a more advanced example, but one that shows how powerful the native interop support provided by the foreign linker abstraction is, allowing full bidirectional interop between Java and native code.

Varargs

Some C functions are variadic and can take an arbitrary number of arguments. Perhaps the most common example of this is the printf function, defined in the C standard library:

This function takes a format string, which features zero or more holes, and then can take a number of additional arguments that is identical to the number of holes in the format string.

The foreign function support handles variadic calls, but with a caveat: the client must provide a specialized Java signature, and a specialized description of the C signature. For instance, let's say we wanted to model the following C call:

To do this using the foreign function support provided by the FFM API we would have to build a specialized downcall handle for that call shape, using a linker option to specify the position of the first variadic layout, as follows:

Then we can call the specialized downcall handle as usual:
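A sketch of a specialized variadic downcall, assuming the finalized API (JDK 22 or later). The call shape used here, a format string followed by three int arguments, is an illustrative assumption and not necessarily the call the text has in mind. The class and method names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical demo class; assumes the finalized FFM API (JDK 22+).
final class PrintfDemo {

    // Models printf("%d plus %d equals %d\n", 2, 2, 4) and returns printf's
    // result, i.e. the number of characters written.
    static int printSum() {
        Linker linker = Linker.nativeLinker();
        MethodHandle printf = linker.downcallHandle(
                linker.defaultLookup().find("printf").orElseThrow(),
                // int printf(const char *format, int, int, int) for this call shape
                FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.ADDRESS,
                        ValueLayout.JAVA_INT, ValueLayout.JAVA_INT, ValueLayout.JAVA_INT),
                // Layouts from index 1 onwards describe variadic arguments.
                Linker.Option.firstVariadicArg(1));

        try (Arena arena = Arena.ofConfined()) {
            MemorySegment format = arena.allocateFrom("%d plus %d equals %d\n");
            return (int) printf.invokeExact(format, 2, 2, 4);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("characters written: " + printSum());
    }
}
```

A different call shape, for example one passing a double, needs its own function descriptor and therefore its own downcall handle.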

While this works, and provides optimal performance, it has some limitations [3]:

Appendix: full source code

The full source code containing most of the code shown throughout this document can be seen below:

 

+ +