# Heterogeneous Accelerator Toolkit (HAT)

[![repo](https://img.shields.io/badge/github-repo-blue?logo=github)](https://github.com/openjdk/babylon/tree/code-reflection/hat)

HAT is a toolkit that lets developers express data-parallel applications in Java and optimize, offload, and execute them on hardware accelerators.
- **Heterogeneous**: a variety of devices and their corresponding programming languages.
- **Accelerator**: GPUs, FPGAs, CPUs, etc.
- **Toolkit**: a set of libraries for Java developers.

HAT uses the code reflection API from [Project Babylon](https://github.com/openjdk/babylon).

The toolkit offers:
- An API for kernel programming on accelerators from Java.
- An API for combining multiple kernels into a compute graph.
- An API for mapping Java objects to hardware accelerators using the Panama FFM API.
- An extensible backend system for multiple accelerators:
  - OpenCL
  - CUDA
  - Java
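
To give an intuition for the off-heap object mapping, the sketch below uses the standard Panama FFM API (`java.lang.foreign`, final since JDK 22) directly. This is not HAT's API; HAT generates such mappings for you (e.g., for `S32Array`), but conceptually the data lives in native memory like this:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Conceptual sketch only: HAT buffers wrap native (off-heap) memory,
// which can be shared with accelerator runtimes without copying through
// the Java heap. The class and variable names here are illustrative.
public class OffHeapSketch {
    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // Allocate 8 ints (32 bytes) of native, off-heap memory.
            MemorySegment ints = arena.allocate(ValueLayout.JAVA_INT, 8);
            for (int i = 0; i < 8; i++) {
                ints.setAtIndex(ValueLayout.JAVA_INT, i, i * i);
            }
            System.out.println(ints.getAtIndex(ValueLayout.JAVA_INT, 3)); // prints 9
        } // memory is freed when the arena closes
    }
}
```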

## Prerequisites

- HAT currently requires the Babylon JDK, which contains the code reflection APIs.
- A base JDK >= 25. We currently use OpenJDK 26 for development.
- A GPU SDK (one or more of the SDKs below) to run on GPUs:
  - An OpenCL implementation (e.g., Intel, Apple Silicon, CUDA SDK)
    - OpenCL >= 1.2
  - CUDA SDK >= 12.9
- `cmake` >= `3.22.1`
- `gcc` >= 12.0, or `clang` >= 17.0

## Compatible systems

We actively develop and run tests on the following systems:

- Apple Silicon M1-M4
- Linux Fedora >= 42
- Oracle Linux 10
- Ubuntu >= 22.04

## Quick Start

### 1. Build Babylon JDK

```bash
git clone https://github.com/openjdk/babylon
cd babylon
bash configure --with-boot-jdk=${JAVA_HOME}
make clean
make images
```

### 2. Update JAVA_HOME and PATH

The path below is for macOS on Apple Silicon; the platform directory (e.g., `linux-x86_64-server-release` on Linux) varies by OS and architecture.

```bash
export JAVA_HOME=<BABYLON-DIR>/build/macosx-aarch64-server-release/images/jdk
export PATH=$JAVA_HOME/bin:$PATH
```

### 3. Build HAT

```bash
sdk install jextract # if needed
cd hat
java @.bld
```

Done!

## Run Examples

For instance, matrix-multiply:

```bash
java @.run ffi-opencl matmul --size=1024
```
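
For reference, the computation that the matmul example offloads is a standard dense matrix multiply. The plain-Java sketch below is illustrative only (the class and method names are not HAT's); the HAT example runs the equivalent inner loops as a GPU kernel:

```java
// Plain-Java reference for C = A * B on size x size square matrices,
// stored row-major in flattened 1D arrays.
public class MatMulReference {
    static float[] matmul(float[] a, float[] b, int size) {
        float[] c = new float[size * size];
        for (int row = 0; row < size; row++) {
            for (int col = 0; col < size; col++) {
                float acc = 0f;
                for (int k = 0; k < size; k++) {
                    acc += a[row * size + k] * b[k * size + col];
                }
                c[row * size + col] = acc;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // 2x2 identity times an arbitrary matrix returns the matrix unchanged.
        float[] a = {1, 0, 0, 1};
        float[] b = {1, 2, 3, 4};
        System.out.println(java.util.Arrays.toString(matmul(a, b, 2))); // [1.0, 2.0, 3.0, 4.0]
    }
}
```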

Some examples have a GUI implementation:

```bash
java @.run ffi-opencl mandel
```

Full list of examples: [hat/examples](https://github.com/openjdk/babylon/tree/code-reflection/hat/examples)

## Run Unit-Tests

OpenCL backend:

```bash
java @.test-suite ffi-opencl
```

CUDA backend:

```bash
java @.test-suite ffi-cuda
```

## Full Example Explained

The following example computes the square of each element of an input vector.
The example is self-contained and can be run directly with the `java` command.

Place the following code in the `hat` directory.

```java
import hat.*;
import hat.Accelerator.Compute;
import hat.backend.*;
import hat.buffer.*;
import optkl.ifacemapper.MappableIface.*;
import jdk.incubator.code.Reflect;
import java.lang.invoke.MethodHandles;

public class ExampleHAT {

    // Kernel code: this is the function to be offloaded to the accelerator (e.g.,
    // a GPU). The kernel will be executed by many GPU threads; in this case,
    // as many threads as there are elements in `array`.
    // The `kc` object can be used to obtain the thread identifier and map
    // the data element to process.
    // HAT kernels follow the SIMT (Single Instruction, Multiple Thread)
    // programming model.
    // Kernel code is reflectable. Thus, the HAT runtime and HAT compiler can build
    // and optimize the code model. Once the code model is optimized, HAT generates
    // OpenCL/CUDA C99 code.
    @Reflect
    public static void squareKernel(@RO KernelContext kc, @RW S32Array array) {
        // HAT kernels support a reduced subset of Java.
        // Kernels express the work to be done per thread (GPU/accelerator thread).
        if (kc.gix < array.length()) {
            int value = array.array(kc.gix);
            array.array(kc.gix, (value * value));
        }
    }

    // The following method represents the compute layer, in which we specify
    // the number of threads to be deployed on the accelerator. The number of threads
    // is specified in an ND-Range, which can be 1D, 2D, or 3D.
    // In this example, we launch a 1D range with the number of threads equal to
    // the input array size.
    @Reflect
    public static void square(@RO ComputeContext cc, @RW S32Array array) {
        var ndRange = NDRange.of1D(array.length());

        // Dispatch the kernel. The HAT runtime will offload the kernels
        // reached from this point and run the generated GPU kernels on the
        // target accelerator.
        // Furthermore, HAT automatically transfers data to the accelerator.
        // This is a blocking call; when it returns control to the main
        // Java thread, results (outputs) are available to be consumed.
        cc.dispatchKernel(ndRange, kc -> squareKernel(kc, array));
    }

    static void main(String[] args) {
        final int size = 4096;

        // Create a new accelerator object
        var accelerator = new Accelerator(MethodHandles.lookup(), Backend.FIRST);

        // Instantiate an array on the target accelerator.
        // Data is stored off-heap using the Panama FFM API.
        var array = S32Array.create(accelerator, size);

        // Data initialization
        for (int i = 0; i < array.length(); i++) {
            array.array(i, i);
        }

        // Offload and dispatch the compute graph on the target accelerator.
        // This is a blocking call. Once it returns, the results (outputs)
        // are available to be consumed by the current Java thread.
        accelerator.compute((@Reflect Compute) cc -> ExampleHAT.square(cc, array));

        // Test result
        boolean isCorrect = true;
        for (int i = 0; i < size; i++) {
            if (array.array(i) != i * i) {
                isCorrect = false;
            }
        }
        if (isCorrect) {
            IO.println("Result is correct");
        } else {
            IO.println("Result is NOT correct");
        }
    }
}
```
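
As a mental model for the kernel above: each GPU thread executes the kernel body once, for one index `kc.gix`. A rough CPU analogue using only plain Java (no HAT) applies the same per-index body across all indices in parallel:

```java
import java.util.stream.IntStream;

// CPU analogy of the 1D ND-Range dispatch: one logical "thread" per index,
// each squaring its own element. Illustrative only; HAT generates and runs
// real accelerator kernels instead.
public class SquareCpuAnalogy {
    public static void main(String[] args) {
        int size = 4096;
        int[] array = new int[size];
        for (int i = 0; i < size; i++) array[i] = i;

        // Each index plays the role of kc.gix in the HAT kernel.
        IntStream.range(0, size).parallel().forEach(gix -> {
            int value = array[gix];
            array[gix] = value * value;
        });

        System.out.println(array[7]); // prints 49
    }
}
```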

Run this example in the `babylon/hat` directory.
If you run from another directory, update the `--class-path` parameter accordingly.
Use the `java` binary from the Babylon JDK build.

```bash
java --enable-preview \
   --add-modules=jdk.incubator.code \
   --enable-native-access=ALL-UNNAMED \
   --class-path build/hat-optkl-1.0.jar:build/hat-core-1.0.jar:build/hat-backend-ffi-shared-1.0.jar:build/hat-backend-ffi-opencl-1.0.jar \
   -Djava.library.path=<BABYLON-DIR>/hat/build \
   ExampleHAT.java
```

If you run with the environment variable `HAT=INFO` set, you can see which accelerator was used:

```bash
$ HAT=INFO java --enable-preview ... ExampleHAT.java

[INFO] Config Bits = 8000
[INFO] Platform :"Apple"
[INFO]   Version      :"OpenCL 1.2 (Jan 16 2026 07:22:26)"
[INFO]   Name         :"Apple"
[INFO]   Device Type  : GPU  4
[INFO] OpenCLBackend::OpenCLQueue::dispatch
[INFO] numDimensions: 1
[INFO] GLOBAL [4096,1,1]
[INFO] LOCAL  [ nullptr ] // The driver will set a default value

Result is correct
```

## Documentation

Visit the [docs](docs/) folder.

## Contributing

Contributions are welcome. Please see the [OpenJDK Developers' Guide](https://openjdk.org/guide/).

## Development Workflow

1. Fork the repository
2. Create a feature branch: `git checkout -b <branch>`
3. Commit with clear messages
4. Run formatting and tests:
   1. For OpenCL: `java @.test-suite ffi-opencl`
   2. For CUDA: `java @.test-suite ffi-cuda`
5. Submit a pull request

## Contacts/Questions

You can interact, provide feedback, and ask questions using the [babylon-dev](https://mail.openjdk.org/pipermail/babylon-dev/) mailing list.