# Running Examples on the GPU

----
* [Contents](hat-00.md)
* Build Babylon and HAT
  * [Quick Install](hat-01-quick-install.md)
  * [Building Babylon with jtreg](hat-01-02-building-babylon.md)
  * [Building HAT with jtreg](hat-01-03-building-hat.md)
  * [Enabling the NVIDIA CUDA Backend](hat-01-05-building-hat-for-cuda.md)
* [Testing Framework](hat-02-testing-framework.md)
* [Running Examples](hat-03-examples.md)
* [HAT Programming Model](hat-03-programming-model.md)
* Interface Mapping
  * [Interface Mapping Overview](hat-04-01-interface-mapping.md)
  * [Cascade Interface Mapping](hat-04-02-cascade-interface-mapping.md)
* Development
  * [Project Layout](hat-01-01-project-layout.md)
  * [IntelliJ Code Formatter](hat-development.md)
* Implementation Details
  * [Walkthrough Of Accelerator.compute()](hat-accelerator-compute.md)
  * [How we minimize buffer transfers](hat-minimizing-buffer-transfers.md)
  * [Running HAT with Docker on NVIDIA GPUs](hat-07-docker-build-nvidia.md)
---

The [examples-package]() in HAT contains a collection of examples, ranging from matrix operations, DFT, Flash-Attention, and n-body to shaders and image detection.

To run an example:

```bash
java @hat/run ffi-<backend> <example>
```

For instance, to run `flashattention` with the OpenCL backend:

```bash
java @hat/run ffi-opencl flashattention
```

For the CUDA backend:

```bash
java @hat/run ffi-cuda flashattention
```
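
Since the backend is just the first argument, it is easy to script runs across backends. A minimal sketch (the loop and the `echo` are our own additions, not part of HAT; remove the `echo` to actually execute each command):

```shell
# Run flashattention on each FFI backend in turn.
# The echo prints each command instead of executing it; drop it to run.
for backend in opencl cuda; do
  echo java @hat/run "ffi-${backend}" flashattention
done
```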

## Options for Examples

Some of the examples accept command-line options to specify the input size, kernel version, etc.

For example, `flashattention`:

```bash
java @hat/run ffi-opencl flashattention --size=2048 --iterations=10 --verbose
```

You can see the full list of options by using `--help`:

```bash
--size=<size>                  Specify an input size
--iterations=<numIterations>   Specify the number of iterations to perform
--skip-sequential              Flag to bypass the sequential execution in Java
--check                        Flag to check the results. This implies the Java sequential version runs.
--verbose                      Flag to print information between runs (e.g., total time).
--help                         Print this help.
```
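
These flags can be combined, for instance to verify correctness on a small input before a longer timed run. A sketch using a tiny helper of our own (`hat_run` is not part of HAT; it `echo`s the assembled command rather than running it, so remove the `echo` to execute):

```shell
# hat_run assembles (and here just prints) a HAT example invocation.
hat_run() {
  local backend="$1" example="$2"
  shift 2
  echo java @hat/run "ffi-${backend}" "${example}" "$@"
}

# Check results on a small input first, then time a larger run.
hat_run opencl flashattention --size=512 --check
hat_run opencl flashattention --size=2048 --iterations=100 --skip-sequential --verbose
```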

### Obtaining information about the accelerator used to launch the application

You can set the environment variable `HAT=INFO` to instruct the HAT runtime to dump the device name and device version used to launch the generated GPU kernel:

```bash
$ HAT=INFO java @hat/run ffi-cuda matmul

[INFO] Input Size : 1024x1024
[INFO] Check Result: : false
[INFO] Num Iterations : 100
[INFO] NDRangeConfiguration: 2DTILING

[INFO] Using NVIDIA GPU: NVIDIA GeForce RTX 5060   << an NVIDIA 5060 was used
[INFO] Dispatching the CUDA kernel
\_ BlocksPerGrid = [64,64,1]                       << Num blocks dispatched
\_ ThreadsPerBlock = [16,16,1]                     << threads-per-block dispatched
```
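
The dispatch figures in the log follow directly from the input size: with a 1024x1024 input and 16x16 threads per block, each grid dimension needs 1024/16 = 64 blocks. A quick sanity check of that arithmetic:

```shell
# Reproduce the BlocksPerGrid figure from the INFO log:
# blocks per dimension = input size / threads per block.
size=1024
tpb=16
blocks=$((size / tpb))
echo "BlocksPerGrid = [${blocks},${blocks},1]"   # prints BlocksPerGrid = [64,64,1]
```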

## Running Headless Versions

GUI examples include a `headless` version, which can be enabled by passing the following argument:

```bash
java @hat/run headless ffi-opencl mandel
```

This sets `-Dheadless=true` and passes `--headless` to the example.
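
Per the note above, the `headless` launcher argument amounts to setting the system property and flag yourself. A sketch of the hand-expanded form (the exact placement of `--headless` after the example name is our assumption; the command is stored and printed rather than executed):

```shell
# Hand-expanded equivalent of: java @hat/run headless ffi-opencl mandel
# Assumption: --headless is appended after the example name.
cmd="java -Dheadless=true @hat/run ffi-opencl mandel --headless"
echo "$cmd"
```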