# Running Examples on the GPU

----
* [Contents](hat-00.md)
* Build Babylon and HAT
    * [Quick Install](hat-01-quick-install.md)
    * [Building Babylon with jtreg](hat-01-02-building-babylon.md)
    * [Building HAT with jtreg](hat-01-03-building-hat.md)
    * [Enabling the NVIDIA CUDA Backend](hat-01-05-building-hat-for-cuda.md)
* [Testing Framework](hat-02-testing-framework.md)
* [Running Examples](hat-03-examples.md)
* [HAT Programming Model](hat-03-programming-model.md)
* Interface Mapping
    * [Interface Mapping Overview](hat-04-01-interface-mapping.md)
    * [Cascade Interface Mapping](hat-04-02-cascade-interface-mapping.md)
* Development
    * [Project Layout](hat-01-01-project-layout.md)
    * Implementation Details
        * [Walkthrough Of Accelerator.compute()](hat-accelerator-compute.md)
        * [How we minimize buffer transfers](hat-minimizing-buffer-transfers.md)
* [Running HAT with Docker on NVIDIA GPUs](hat-07-docker-build-nvidia.md)
---

The [examples-package]() in HAT contains examples ranging from matrix operations, DFT, Flash Attention, and nbody to shaders and image detection.

To run an example:

```bash
java @hat/run ffi-<backend> <example>
```

For instance, to run the `flashattention` example with the OpenCL backend:

```bash
java @hat/run ffi-opencl flashattention
```

For the CUDA backend:

```bash
java @hat/run ffi-cuda flashattention
```

## Options for Examples

Some of the examples accept command-line options to specify the input size, kernel version, etc.

For example, `flashattention`:

```bash
java -cp hat/job.jar hat.java run ffi-opencl flashattention --size=2048 --iterations=10 --verbose
```

You can see the full list of options by passing `--help`:

```bash
--size=<size>                  Specify an input size
--iterations=<numIterations>   Specify the number of iterations to perform
--skip-sequential              Flag to bypass the sequential execution in Java
--check                        Flag to check the results. This implies the Java sequential version runs.
--verbose                      Flag to print information between runs (e.g., total time).
--help                         Print this help.
```
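
To give a feel for how such options could be wired up inside an example, here is a minimal sketch of an argument parser. The `ExampleOptions` class, its field names, and its defaults are hypothetical illustrations, not HAT's actual code:

```java
// Hypothetical sketch of option parsing for a HAT example.
// Defaults (1024, 100) are illustrative, not HAT's actual values.
public class ExampleOptions {
    int size = 1024;           // --size=<size>
    int iterations = 100;      // --iterations=<numIterations>
    boolean skipSequential;    // --skip-sequential
    boolean check;             // --check (implies running the Java sequential version)
    boolean verbose;           // --verbose

    static ExampleOptions parse(String[] args) {
        ExampleOptions o = new ExampleOptions();
        for (String arg : args) {
            if (arg.startsWith("--size=")) {
                o.size = Integer.parseInt(arg.substring("--size=".length()));
            } else if (arg.startsWith("--iterations=")) {
                o.iterations = Integer.parseInt(arg.substring("--iterations=".length()));
            } else switch (arg) {
                case "--skip-sequential" -> o.skipSequential = true;
                case "--check"           -> o.check = true;
                case "--verbose"         -> o.verbose = true;
                default -> throw new IllegalArgumentException("Unknown option: " + arg);
            }
        }
        return o;
    }

    public static void main(String[] args) {
        ExampleOptions o = parse(new String[]{"--size=2048", "--iterations=10", "--verbose"});
        System.out.println(o.size + " " + o.iterations + " " + o.verbose);
    }
}
```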

### Obtaining information about the accelerator used to launch the application

You can set the environment variable `HAT=INFO` to tell the HAT runtime to dump the name and version of the device used to launch the generated GPU kernel:

```bash
$ HAT=INFO java @hat/run ffi-cuda matmul

[INFO] Input Size : 1024x1024
[INFO] Check Result: : false
[INFO] Num Iterations : 100
[INFO] NDRangeConfiguration: 2DTILING

[INFO] Using NVIDIA GPU: NVIDIA GeForce RTX 5060 << an NVIDIA 5060 was used
[INFO] Dispatching the CUDA kernel
\_ BlocksPerGrid = [64,64,1] << Num blocks dispatched
\_ ThreadsPerBlock = [16,16,1] << threads-per-block dispatched
```
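
A runtime can gate this kind of diagnostic output on the environment variable with a simple check. The sketch below shows one way to do it; the class and method names are hypothetical and this is not HAT's actual runtime code:

```java
// Sketch: enable INFO-level logging when the HAT environment variable
// contains "INFO". Illustrative only; not HAT's actual implementation.
public class HatLogging {
    static boolean infoEnabled(String hatEnvValue) {
        return hatEnvValue != null && hatEnvValue.contains("INFO");
    }

    public static void main(String[] args) {
        // In a real runtime this value would come from System.getenv("HAT").
        if (infoEnabled(System.getenv("HAT"))) {
            System.out.println("[INFO] Using device: ...");
        }
    }
}
```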

## Running Headless Versions

GUI examples also include a `headless` version, which can be enabled by passing the `headless` argument:

```bash
java @hat/run headless ffi-opencl mandel
```

This sets `-Dheadless=true` and passes `--headless` to the example.
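An example can pick up either signal with a small check. The sketch below is an illustration of how that decision might look; it is not HAT's actual example code:

```java
// Sketch: decide between GUI and headless mode by checking both the
// -Dheadless=true system property and a --headless argument.
// Illustrative only; HAT's examples may structure this differently.
public class HeadlessCheck {
    static boolean isHeadless(String[] args) {
        if (Boolean.getBoolean("headless")) {  // true when -Dheadless=true is set
            return true;
        }
        for (String arg : args) {
            if (arg.equals("--headless")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isHeadless(args) ? "running headless" : "launching GUI");
    }
}
```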