
# Running Examples on the GPU

----
* [Contents](hat-00.md)
* Build Babylon and HAT
    * [Quick Install](hat-01-quick-install.md)
    * [Building Babylon with jtreg](hat-01-02-building-babylon.md)
    * [Building HAT with jtreg](hat-01-03-building-hat.md)
        * [Enabling the NVIDIA CUDA Backend](hat-01-05-building-hat-for-cuda.md)
* [Testing Framework](hat-02-testing-framework.md)
* [Running Examples](hat-03-examples.md)
* [HAT Programming Model](hat-03-programming-model.md)
* Interface Mapping
    * [Interface Mapping Overview](hat-04-01-interface-mapping.md)
    * [Cascade Interface Mapping](hat-04-02-cascade-interface-mapping.md)
* Development
    * [Project Layout](hat-01-01-project-layout.md)
* Implementation Details
    * [Walkthrough Of Accelerator.compute()](hat-accelerator-compute.md)
    * [How we minimize buffer transfers](hat-minimizing-buffer-transfers.md)
* [Running HAT with Docker on NVIDIA GPUs](hat-07-docker-build-nvidia.md)
---

The [examples-package]() in HAT contains a set of examples, including matrix operations, DFT, Flash-Attention, nbody, shaders, and image detection.

To run an example:

```bash
java @hat/run ffi-<backend> <example>
```

For instance, to run `flashattention` with the OpenCL backend:

```bash
java @hat/run ffi-opencl flashattention
```

For the CUDA backend:

```bash
java @hat/run ffi-cuda flashattention
```
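
The launch pattern above can be wrapped in a small shell function when the same example needs to run on more than one backend. This is only a sketch: the `hat_run` name and the `DRY_RUN` variable are assumptions made for illustration and are not part of HAT; only the `java @hat/run ffi-<backend> <example>` command shape comes from this page.

```bash
# Hypothetical helper (not part of HAT): builds the launch command shown above.
# With DRY_RUN=1 it prints the command instead of executing it.
hat_run() {
    backend="$1"
    example="$2"
    shift 2
    # Rebuild the positional parameters as the full command line,
    # keeping any extra arguments that were passed through.
    set -- java @hat/run "ffi-${backend}" "${example}" "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command for each backend mentioned above.
for backend in opencl cuda; do
    DRY_RUN=1 hat_run "$backend" flashattention
done
```

Dropping `DRY_RUN=1` from the loop would execute the commands instead of printing them.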

## Options for Examples

Some of the examples accept command-line options to specify the input size, the kernel version, and so on.

For example, `flashattention`:

```bash
java -cp hat/job.jar hat.java run ffi-opencl flashattention --size=2048 --iterations=10 --verbose
```

You can see the full list of options by using `--help`:

```bash
    --size=<size>                   Specify an input size
    --iterations=<numIterations>    Specify the number of iterations to perform
    --skip-sequential               Flag to bypass the sequential execution in Java
    --check                         Flag to check the results. This implies the Java sequential version runs.
    --verbose                       Flag to print information between runs (e.g., total time).
    --help                          Print this help.
```
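
As a rough illustration of combining these options, the sketch below prints one launch command per input size so that a sweep can be reviewed before it runs. The `sweep_cmds` name and the sizes are assumptions made for illustration; whether a given example accepts a particular size is not confirmed here. The invocation and flags are the ones listed above.

```bash
# Hypothetical helper (not part of HAT): prints one launch command per input size.
# The sizes passed in are illustrative only.
sweep_cmds() {
    for size in "$@"; do
        echo "java -cp hat/job.jar hat.java run ffi-opencl flashattention --size=${size} --iterations=10 --check"
    done
}

sweep_cmds 1024 2048 4096
```

Piping the output to `sh` would execute each printed command in turn.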

### Obtaining information about the accelerator used to launch the application

Set the environment variable `HAT=INFO` to have the HAT runtime print the device name and device version used to launch the generated GPU kernel:

```bash
$ HAT=INFO java @hat/run ffi-cuda matmul

[INFO] Input Size     : 1024x1024
[INFO] Check Result:  : false
[INFO] Num Iterations : 100
[INFO] NDRangeConfiguration: 2DTILING

[INFO] Using NVIDIA GPU: NVIDIA GeForce RTX 5060   << an NVIDIA 5060 was used
[INFO] Dispatching the CUDA kernel
        \_ BlocksPerGrid   = [64,64,1]    << Num blocks dispatched
        \_ ThreadsPerBlock = [16,16,1]    << threads-per-block dispatched
```
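
The dispatch lines in this log can be checked against the input size: in CUDA, the number of threads launched per dimension is the blocks per grid multiplied by the threads per block. The sketch below does that arithmetic on the two lines copied from the sample log. The `dims` and `global_size` function names are assumptions made for illustration, and the log is embedded as a string rather than captured from a live run.

```bash
# The two dispatch lines, copied from the sample log above (annotations removed).
log='        \_ BlocksPerGrid   = [64,64,1]
        \_ ThreadsPerBlock = [16,16,1]'

# Print the comma-separated numbers inside [...] for the given label.
dims() {
    printf '%s\n' "$log" | sed -n "s/.*$1 *= *\[\([0-9,]*\)\].*/\1/p"
}

# Multiply blocks by threads in each dimension to get the launched thread count.
global_size() {
    blocks=$(dims BlocksPerGrid)
    threads=$(dims ThreadsPerBlock)
    echo "${blocks},${threads}" | awk -F, '{ printf "%dx%dx%d\n", $1*$4, $2*$5, $3*$6 }'
}

global_size
```

For this sample, the result is 1024 threads in each of the first two dimensions, which matches the reported input size of 1024x1024.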

## Running Headless Versions

GUI applications include a `headless` version, which can be enabled by passing the `headless` argument:

```bash
java @hat/run headless ffi-opencl mandel
```

This sets `-Dheadless=true` and passes `--headless` to the example.