Jextract

jextract is a tool that mechanically generates Java bindings from native library headers. The tool leverages the clang C API to parse the headers associated with a given native library, and the generated Java bindings build upon the Foreign Function & Memory API. The jextract tool was originally developed in the context of Project Panama (and then made available in the Project Panama Early Access binaries).

Getting started

jextract depends on the C libclang API. To build the jextract sources, the easiest option is to download LLVM binaries for your platform, which can be found here (a version >= 9 is required). Both the jextract tool and the bindings it generates depend heavily on the Foreign Function & Memory API, so a suitable JDK 18 distribution is also required.

jextract can be built using gradle, as follows (on Windows, gradlew.bat should be used instead):

$ sh ./gradlew -Pjdk18_home=<jdk18_home_dir> -Pllvm_home=<libclang_dir> clean verify

Using a local installation of LLVM

The recommended approach is to download a release from the LLVM project, extract it, and point llvm_home to the extracted directory. It may also be possible to use a local installation instead.

For example, on macOS, llvm_home can also be set to one of these locations:

  • /Library/Developer/CommandLineTools/usr/ if using Command Line Tools
  • /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/ if using Xcode
  • $(brew --prefix llvm) if using the LLVM install from Homebrew

After building, there should be a new jextract folder under build. To run the jextract tool, simply run the jextract command in the bin folder:

build/jextract/bin/jextract 
WARNING: Using incubator modules: jdk.incubator.foreign
Expected a header file

The repository also contains a comprehensive set of tests, written using the jtreg test framework, which can be run as follows (again, on Windows, gradlew.bat should be used instead):

$ sh ./gradlew -Pjdk18_home=<jdk18_home_dir> -Pllvm_home=<libclang_dir> -Pjtreg_home=<jtreg_home> jtreg

Note, however, that running the jtreg task requires cmake to be available on the PATH.

Using jextract

To understand how jextract works, consider the following C header file:

//point.h
struct Point2d {
    double x;
    double y;
};

double distance(struct Point2d);

We can run jextract, as follows:

jextract --source -t org.jextract point.h

We can then use the generated code as follows:

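import jdk.incubator.foreign.*;
import static org.jextract.point_h.*;
import org.jextract.Point2d;

class TestPoint {
    public static void main(String[] args) {
        // The confined scope frees the native memory when the try block exits.
        try (ResourceScope scope = ResourceScope.newConfinedScope()) {
            // Allocate a native struct Point2d and fill in its fields.
            MemorySegment point = MemorySegment.allocateNative(Point2d.$LAYOUT(), scope);
            Point2d.x$set(point, 3d);
            Point2d.y$set(point, 4d);
            // Call the native function through the generated wrapper.
            double result = distance(point);
            System.out.println(result);
        }
    }
}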

As we can see, the jextract tool generated a Point2d class, modelling the C struct, and a point_h class which contains static native function wrappers, such as distance. If we look inside the generated code for distance we can find the following:

static final FunctionDescriptor distance$FUNC = FunctionDescriptor.of(Constants$root.C_DOUBLE$LAYOUT,
    MemoryLayout.structLayout(
         Constants$root.C_DOUBLE$LAYOUT.withName("x"),
         Constants$root.C_DOUBLE$LAYOUT.withName("y")
    ).withName("Point2d")
);
static final MethodHandle distance$MH = RuntimeHelper.downcallHandle(
    "distance",
    constants$0.distance$FUNC
);

public static MethodHandle distance$MH() {
    return RuntimeHelper.requireNonNull(constants$0.distance$MH,"distance");
}
public static double distance ( MemorySegment x0) {
    var mh$ = distance$MH();
    try {
        return (double)mh$.invokeExact(x0);
    } catch (Throwable ex$) {
        throw new AssertionError("should not reach here", ex$);
    }
}

In other words, the jextract tool has generated all the supporting code (MemoryLayout, MethodHandle and FunctionDescriptor) needed to call the underlying distance native function. For more examples of how to use the jextract tool with real-world libraries, please refer to the samples folder (building or running a particular sample may require installing specific third-party software).

Command line options

The jextract tool includes several customization options. Users can select in which package the generated code should be emitted, and what the name of the main extracted class should be. A complete list of all the supported options is given below:

Option                            Meaning
-D <macro>                        define a C preprocessor macro
--header-class-name <name>        specify the name of the main header class
-t, --target-package <package>    specify target package for the generated bindings
-I <path>                         specify include files path for the clang parser
-l <library>                      specify a library that will be loaded by the generated bindings
--output <path>                   specify where to place generated files
--source                          generate Java sources instead of classfiles
--dump-includes <String>          dump included symbols into specified file (see below)
--include-[function,macro,struct,union,typedef,var] <String>
                                  include a symbol of the given name and kind in the generated bindings (see below). When one of these options is specified, any symbol that is not matched by any specified filters is omitted from the generated bindings.
--version                         print version information and exit
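
When jextract is driven from a build script rather than typed by hand, it can help to assemble the options listed above programmatically. The following is a minimal sketch, not part of jextract: the JextractCommand class, its build method and the "Point" class name are hypothetical, and only options from the table are used.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of jextract): assembles a jextract command line. */
final class JextractCommand {

    /** Builds the argument list using only options documented in the table above. */
    static List<String> build(String jextract, String targetPackage,
                              String headerClassName, List<String> includeDirs,
                              String header) {
        List<String> cmd = new ArrayList<>();
        cmd.add(jextract);
        cmd.add("--source");              // emit Java sources rather than classfiles
        cmd.add("-t");                    // package for the generated bindings
        cmd.add(targetPackage);
        cmd.add("--header-class-name");   // name of the main header class
        cmd.add(headerClassName);
        for (String dir : includeDirs) {
            cmd.add("-I");                // one -I option per include directory
            cmd.add(dir);
        }
        cmd.add(header);                  // header file last, as in the examples in this document
        return cmd;
    }

    public static void main(String[] args) {
        // "Point" and "include" are illustrative values, not taken from the project.
        List<String> cmd = build("build/jextract/bin/jextract", "org.jextract",
                                 "Point", List.of("include"), "point.h");
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder to launch the tool; keeping the arguments as a list avoids shell quoting problems with paths that contain spaces.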

Additional clang options

Users can specify additional clang compiler options, by creating a file named compile_flags.txt in the current folder, as described here.

Filtering symbols

To allow for symbol filtering, jextract can generate a dump of all the symbols encountered in a header file; this dump can be manipulated, and then used as an argument file (using the @argfile syntax also available in other JDK tools) to e.g. generate bindings only for a subset of symbols seen by jextract. For instance, if we run jextract as follows:

jextract --dump-includes=includes.txt point.h

We obtain the following file (includes.txt):

#### Extracted from: point.h

--include-struct Point2d    # header: point.h
--include-function distance # header: point.h

This file can be passed back to jextract, as follows:

jextract -t org.jextract --source @includes.txt point.h

It is easy to see how this mechanism allows developers to look into the set of symbols seen by jextract while parsing, and then process the generated include file, so as to prevent code generation for otherwise unused symbols.
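
For large headers, the dump can be trimmed with a small script instead of by hand. The sketch below is one possible approach, not something jextract provides: the IncludesFilter class and its filter method are hypothetical, the unused_helper symbol is invented for illustration, and the code assumes every include line has the shape shown in includes.txt above (option, symbol name, optional trailing comment).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of jextract): trims a --dump-includes file. */
final class IncludesFilter {

    /**
     * Keeps comment and blank lines unchanged, plus every "--include-<kind> <name>"
     * line whose symbol name is in wanted; all other include lines are dropped.
     */
    static List<String> filter(List<String> lines, Set<String> wanted) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.strip();
            if (!trimmed.startsWith("--include-")) {
                kept.add(line);           // header comments and blank lines
                continue;
            }
            // Assumed shape: --include-function distance # header: point.h
            String[] parts = trimmed.split("\\s+");
            if (parts.length >= 2 && wanted.contains(parts[1])) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> dump = List.of(
            "#### Extracted from: point.h",
            "",
            "--include-struct Point2d    # header: point.h",
            "--include-function distance # header: point.h",
            "--include-function unused_helper # header: point.h"); // invented symbol
        // Keep the struct and the function that uses it; drop everything else.
        filter(dump, Set.of("Point2d", "distance")).forEach(System.out::println);
    }
}
```

In practice the input lines would be read from includes.txt (for example with java.nio.file.Files.readAllLines) and the filtered lines written to a new file, which is then passed to jextract with the @argfile syntax.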
