# Welcome to the Leyden Prototype Repository

The purpose of the Leyden repository is to prototype improvements to
the startup time, time to peak performance, and footprint of Java programs, as a part of
[Project Leyden](https://openjdk.org/projects/leyden). We solicit feedback from
the Java community, with the hope that some of these improvements can eventually be
incorporated into future JDK releases.

## 0. Disclaimers

- *This repository contains experimental and unstable code. It is not intended to be used
   in a production environment.*
- *This repository is intended for developers of the JDK, and advanced Java developers who
   are familiar with building the JDK.*
- *The experimental features in this repository may be changed or removed without notice.
   Command line flags and workflows will change.*
- *The benchmark results reported on this page are for illustrative purposes only. Your
   applications may get better or worse results.*

To try out the Leyden prototype without building it from source code, please download the
Leyden EA Release from [https://jdk.java.net/leyden/](https://jdk.java.net/leyden/).

## 1. Overview

As of JDK 25, Project Leyden has delivered the following ahead-of-time (AOT)
optimization JEPs:

- [JEP 483 - Ahead-of-Time Class Loading & Linking](https://openjdk.org/jeps/483)
- [JEP 514 - Ahead-of-Time Command-Line Ergonomics](https://openjdk.org/jeps/514)
- [JEP 515 - Ahead-of-Time Method Profiling](https://openjdk.org/jeps/515)

Please refer to the above JEPs for a detailed discussion of AOT optimizations.

The Leyden "[premain](https://github.com/openjdk/leyden/blob/premain/)" prototype
includes new experimental AOT optimizations that are not yet integrated into the JDK mainline:

- **[Ahead-of-Time Code Compilation (JEP draft 8335368)](https://openjdk.org/jeps/8335368)**: Methods that are frequently used during the training run can be
  compiled and stored along with the AOT cache. As a result, as soon as the application starts up
  in the production run, its methods can be executed as native code.
  - This feature is enabled by default when you create an AOT cache. It can be disabled with the diagnostic
    flag `-XX:-AOTCodeCaching`.

- **Ahead-of-time generation of [Dynamic Proxies](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/lang/reflect/Proxy.html)**:
  Dynamic proxies are frequently used by popular application frameworks. We can improve start-up time by generating these proxies ahead of time.
  - This feature is enabled by default when you create an AOT cache. It can be disabled with the diagnostic
    flag `-XX:-ArchiveDynamicProxies`.

- **Ahead-of-time generation of reflection data**: Reflection data (such as instances of
  `java.lang.reflect.Method`) are generated by the JVM to support `java.lang.reflect` operations. We can
  generate these ahead of time to improve start-up.
  - This feature is enabled by default when you create an AOT cache. It can be disabled with the diagnostic
    flag `-XX:-ArchiveReflectionData`.

- **Class Not Found Cache**: Sometimes application frameworks repeatedly try to load classes that do not exist. This optimization allows such failing lookups to be done quickly without repeatedly scanning the class path.
  - This feature is enabled by default when you create an AOT cache. It can be disabled with the diagnostic
    flag `-XX:-ArchiveLoaderLookupCache`.

## 2. Building the Leyden Repository

The Leyden Repository can be built in the same way as the main-line JDK repository.
Please use the "premain" branch, i.e., [https://github.com/openjdk/leyden/tree/premain](https://github.com/openjdk/leyden/tree/premain).

For build instructions please see the
[online documentation](https://openjdk.org/groups/build/doc/building.html),
or either of these files:

- [doc/building.html](doc/building.html) (html version)
- [doc/building.md](doc/building.md) (markdown version)

See <https://openjdk.org/> for more information about the OpenJDK
Community and the JDK and see <https://bugs.openjdk.org> for JDK issue
tracking.

## 3. Trying out Leyden Features

The easiest way to try out the Leyden optimizations is to build a JVM from the Leyden repository, and run your application on it with the `-XX:AOTCache` flag.

Here's a small benchmark that uses the JDK's built-in
[`JavaCompiler`](https://docs.oracle.com/en/java/javase/21/docs/api/java.compiler/javax/tools/JavaCompiler.html)
class to compile some Java source files. This benchmark spends a significant amount of start-up time
setting up the classes used by `JavaCompiler`, so it will benefit from the Leyden features.

First, download [JavacBenchApp.java](test/setup_aot/JavacBenchApp.java) and compile it into a JAR file.

(Remember to use the `java` program that you built from the Leyden repository.)

```
$ javac JavacBenchApp.java
$ jar cvf JavacBenchApp.jar JavacBenchApp*.class
added manifest
adding: JavacBenchApp$ClassFile.class(in = 1608) (out= 787)(deflated 51%)
adding: JavacBenchApp$FileManager.class(in = 2090) (out= 979)(deflated 53%)
adding: JavacBenchApp$SourceFile.class(in = 1351) (out= 671)(deflated 50%)
adding: JavacBenchApp.class(in = 7571) (out= 3302)(deflated 56%)
```

We can run this benchmark without any AOT optimizations.
It takes 893 ms:

```
$ java -cp JavacBenchApp.jar JavacBenchApp 50
Generated source code for 51 classes and compiled them in 893 ms
```

To use AOT optimizations for JavacBenchApp, we should first perform a _training run_ and
capture the profiling information into `JavacBenchApp.aotconfig`:

```
$ java -XX:AOTMode=record -XX:AOTConfiguration=JavacBenchApp.aotconfig \
       -cp JavacBenchApp.jar JavacBenchApp 50
$ ls -l JavacBenchApp.aotconfig
-rw-rw-r-- 1 iklam iklam 27652096 Mar 3 16:23 JavacBenchApp.aotconfig
```

With the `JavacBenchApp.aotconfig` file, we can create the AOT cache. This is called the _assembly phase_:

```
$ java -XX:AOTMode=create -XX:AOTConfiguration=JavacBenchApp.aotconfig \
       -cp JavacBenchApp.jar -XX:AOTCache=JavacBenchApp.aot
$ ls -l JavacBenchApp.aot
-r--r--r-- 1 iklam iklam 42332160 Mar 3 16:58 JavacBenchApp.aot
```

Alternatively, you can combine the training run and assembly phase in a single command:

```
$ java -XX:AOTCacheOutput=JavacBenchApp.aot \
       -cp JavacBenchApp.jar JavacBenchApp 50
$ ls -l JavacBenchApp.aot
-r--r--r-- 1 iklam iklam 42332160 Mar 3 16:58 JavacBenchApp.aot
```

Now, we can make a _production run_ of the program using the AOT cache `JavacBenchApp.aot`. It finishes in 423 ms, more than twice as fast as
before.

```
$ java -XX:AOTCache=JavacBenchApp.aot -cp JavacBenchApp.jar JavacBenchApp 50
Generated source code for 51 classes and compiled them in 423 ms
```

### Ending the Training Run Early

By default, training runs end when the application terminates. You have two other options to end training runs:

- `-XX:AOTEndTrainingOnMethodEntry=<method1,method2,...>[,count=100]`
- `jcmd <pid> AOT.end_training`

Note that `-XX:AOTEndTrainingOnMethodEntry` uses the same format as `-XX:CompileOnly` and the default count is 1.
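
As a rough sketch of how the first option might be used, consider a hypothetical application whose start-up phase ends when a marker method is entered. Everything in the class below (`WarmupDemo`, `warmUp`, `ready`) is invented for illustration; only the flag name comes from the text above. Assuming the method pattern takes the `Class::method` form that `-XX:CompileOnly` accepts, a training run started with `-XX:AOTEndTrainingOnMethodEntry=WarmupDemo::ready` would be expected to stop recording at the first entry into `ready()` rather than at JVM exit.

```java
// Hypothetical application, used only to illustrate where a training run could end.
// None of these names exist in the Leyden repository.
final class WarmupDemo {

    // Simulated start-up work: the part of execution the training run should cover.
    static int warmUp(int rounds) {
        int checksum = 0;
        for (int i = 0; i < rounds; i++) {
            checksum += Integer.toString(i).length();
        }
        return checksum;
    }

    // Marker method: entered once, right after start-up has finished.
    // This is the method the (assumed) flag value WarmupDemo::ready would name.
    static String ready(int checksum) {
        return "ready, checksum=" + checksum;
    }

    public static void main(String[] args) {
        int checksum = warmUp(1000);
        System.out.println(ready(checksum));
        // A real service would continue with its steady-state work here,
        // after the point at which training is meant to end.
    }
}
```

A marker method that is entered exactly once, immediately after start-up, fits the default count of 1. A method that is also called during start-up would probably need an explicit `count`.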

See [EndTrainingOnMethodEntry.java](test/hotspot/jtreg/runtime/cds/appcds/leyden/EndTrainingOnMethodEntry.java) for a test case.

### Diagnosing Potential Performance Issues

As mentioned below, part or all of the AOT cache may be disabled under certain circumstances. This may lead
to lower performance than expected. To diagnose potential performance issues, you can add `-Xlog:aot*` to the
command line to see detailed information about which parts of the AOT cache are being utilized. For example, if
the AOT-compiled code cannot be loaded, you will see a log message like this:

```
[0.008s][info][aot,codecache,init] Mapped 652184 bytes at address 0x00007f491005f028 from AOT Code Cache
[0.008s][info][aot,codecache,init] Loaded 439 AOT code entries from AOT Code Cache
[0.008s][info][aot,codecache,init] Unable to use AOT Code Cache.
```

### Diagnostic VM Flags

All of the optimizations described
in the [Overview](#1-overview) section above are enabled by default. This ensures that you can get all the optimizations
without specifying them individually.

For diagnostic purposes, you can selectively disable some of the options:

- The `-XX:+AOTCodeCaching` flag affects only the assembly phase and the production run.
- The `-XX:+AOTRecordTraining` flag affects only the training run and the assembly phase.
- The `-XX:+AOTReplayTraining` flag affects only the production run.
- All other options affect only the assembly phase.

For example, you can disable the loading of AOT-compiled methods during the production run. Notice that the benchmark now
starts more slowly than it did when AOT-compiled methods were loaded.

```
$ java -XX:AOTCache=JavacBenchApp.aot -Xlog:cds=error -XX:-AOTCodeCaching \
       -cp JavacBenchApp.jar JavacBenchApp 50
Generated source code for 51 classes and compiled them in 647 ms
```

You can also disable AOT compilation in the assembly phase. Note that the size of the AOT
cache is smaller because it no longer has AOT-compiled methods.

```
$ java -XX:AOTMode=create -XX:AOTConfiguration=JavacBenchApp.aotconfig \
       -cp JavacBenchApp.jar \
       -XX:AOTCache=JavacBenchApp.aot -XX:-AOTCodeCaching
$ ls -l JavacBenchApp.aot
-r--r--r-- 1 iklam iklam 29990912 Mar 3 16:34 JavacBenchApp.aot
```

## 4. Limitations of the Leyden Prototype

When trying out the Leyden prototype, please pay attention to the following limitations.

### The Same CPU Must be Used between Training and Production Runs

The AOT-compiled code will be used only if the production run is on a machine with the same type of CPU
as used in the training run and assembly phase. If this is not the case (for example, the production run is on
a machine that has different AVX capabilities), the AOT-compiled code will be ignored.

### The Same Garbage Collector Must be Used between Training and Production Runs

The AOT cache generated by the Leyden prototype includes machine instructions that are specific to
the garbage collector. We recommend that you explicitly specify the same collector during both
training and production runs. For example, if you prefer to use the SerialGC:

```
# Assembly phase
$ java -XX:AOTMode=create -XX:AOTConfiguration=JavacBenchApp.aotconfig \
       -cp JavacBenchApp.jar \
       -XX:AOTCache=JavacBenchApp.aot -XX:+UseSerialGC

# Production run
$ java -XX:AOTCache=JavacBenchApp.aot -XX:+UseSerialGC -cp JavacBenchApp.jar \
       JavacBenchApp 50
```

Otherwise, the AOT cache may not be usable for the production run, leading to suboptimal performance.
For example, sometimes you may perform the assembly phase on a large development host, and then use
a container to run the application on a small production node. In the following scenario, as the collector
is not explicitly specified, the VM will automatically pick G1 for the assembly phase, and SerialGC for the
production run (due to the container's limited amount of memory):

```
# Assembly phase (uses G1 by default)
$ java -XX:AOTMode=create -XX:AOTConfiguration=JavacBenchApp.aotconfig \
       -cp JavacBenchApp.jar -XX:AOTCache=JavacBenchApp.aot

# Production run (uses SerialGC)
$ docker run --rm -v /repos/leyden/build/linux-x64/images/jdk:/jdk -v $(pwd):/test \
    --memory=1024m \
    container-registry.oracle.com/java/openjdk \
    bash -c 'cd /test; \
             /jdk/bin/java -XX:AOTCache=JavacBenchApp.aot \
                 -cp JavacBenchApp.jar JavacBenchApp 50'
[0.001s][error][aot] AOT cache has aot-linked classes. It cannot be used because
                     GC used during dump time (G1) is not the same as runtime (Serial)
[0.001s][error][aot] An error has occurred while processing the AOT cache.
[0.001s][error][aot] Unable to map shared spaces
Error occurred during initialization of VM
Unable to use AOT cache.
```

### Only G1GC, SerialGC, ParallelGC, EpsilonGC, ShenandoahGC are Supported

Currently, if you use any other garbage collector in combination with `-XX:AOTMode` or `-XX:AOTCache`, the VM will
exit with an error.

```
$ java -XX:AOTMode=record -XX:AOTConfiguration=JavacBenchApp.aotconfig \
       -cp JavacBenchApp.jar -XX:+UseZGC JavacBenchApp 50
Error occurred during initialization of VM
Cannot create the AOT configuration file: UseCompressedClassPointers must be enabled,
and collector must be G1, Parallel, Serial, Epsilon, or Shenandoah
```

## 5. Benchmarking

We use a small set of benchmarks to demonstrate the performance of the optimizations in the Leyden repo.

| Benchmark | Source |
| ------------- | ------------- |
|[helidon-quickstart-se](test/hotspot/jtreg/premain/helidon-quickstart-se) | https://helidon.io/docs/v4/se/guides/quickstart|
|[javac-bench](test/hotspot/jtreg/premain/javac_bench) | Using Javac to compile 50 source files |
|[micronaut-first-app](test/hotspot/jtreg/premain/micronaut-first-app) | https://guides.micronaut.io/latest/creating-your-first-micronaut-app-maven-java.html|
|[quarkus-getting-started](test/hotspot/jtreg/premain/quarkus-getting-started) | https://quarkus.io/guides/getting-started|
|[spring-boot-getting-started](test/hotspot/jtreg/premain/spring-boot-getting-started) | https://spring.io/guides/gs/spring-boot|
|[spring-petclinic](test/hotspot/jtreg/premain/spring-petclinic) | https://github.com/spring-projects/spring-petclinic|

### Benchmarking Against JDK Main-line

To compare the performance of Leyden vs the main-line JDK, you need:

- An official build of JDK 21
- An up-to-date build of the JDK main-line
- The latest Leyden build
- Maven (ideally 3.8 or later, as required by some of the demos). Note: if you are behind
  a firewall, you may need to [set up proxies for Maven](https://maven.apache.org/guides/mini/guide-proxies.html)

The same steps are used for benchmarking all of the above demos.
For example:

```
$ cd test/hotspot/jtreg/premain/helidon-quickstart-se
$ make PREMAIN_HOME=/repos/leyden/build/linux-x64/images/jdk \
       MAINLINE_HOME=/repos/jdk/build/linux-x64/images/jdk \
       BLDJDK_HOME=/usr/local/jdk21 \
       bench
run,mainline default,mainline custom static cds,mainline aot cache,premain aot cache
1,456,229,156,117
2,453,227,157,117
3,455,232,155,116
4,448,230,154,114
5,440,228,156,114
6,446,228,156,114
7,448,232,156,114
8,465,261,159,114
9,448,226,157,113
10,442,233,154,114
Geomean,450.05,232.41,155.99,114.69
Stdev,6.98,9.72,1.41,1.35
Markdown snippets in mainline_vs_premain.md
```

The above command runs each configuration 10 times, in interleaved order. This way
the noise of the system (background processes, thermal throttling, etc.) is more likely to
be spread across the different runs.

As is typical for benchmarking start-up performance, the numbers are not very steady.
It is best to plot
the results (as saved in the file `mainline_vs_premain.csv`) in a spreadsheet to check for
noise and other artifacts.

The "make bench" target also generates GitHub markdown snippets (in the file `mainline_vs_premain.md`) for creating the
graphs below.

### Benchmarking Between Two Leyden Builds

This is useful for Leyden developers to measure the benefits of a particular optimization.
The steps are similar to above, but we use the "make compare_premain_builds" target:

```
$ cd helidon-quickstart-se
$ make PM_OLD=/repos/leyden_old/build/linux-x64/images/jdk \
       PM_NEW=/repos/leyden_new/build/linux-x64/images/jdk \
       BLDJDK_HOME=/usr/local/jdk21 \
       compare_premain_builds
Old build = /repos/leyden_old/build/linux-x64/images/jdk with options
New build = /repos/leyden_new/build/linux-x64/images/jdk with options
Run,Old CDS + AOT,New CDS + AOT
1,110,109
2,131,111
3,118,115
4,110,108
5,117,110
6,114,109
7,110,109
8,118,110
9,110,110
10,113,114
Geomean,114.94,110.48
Stdev,6.19,2.16
Markdown snippets in compare_premain_builds.md
```

Please see [test/hotspot/jtreg/premain/lib/Bench.gmk](test/hotspot/jtreg/premain/lib/Bench.gmk) for more details.

Note: due to the variability of start-up time, the benefit of minor improvements may
be difficult to measure.

### Preliminary Benchmark Results

The following charts show the relative start-up performance of the Leyden/Premain branch vs
the JDK main-line.

For example, a value of "premain aot cache: 255" indicates that if the application takes
1000 ms to start up with the JDK main-line, it takes only 255 ms to start up when the
current set of Leyden optimizations is enabled.
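
The normalization can be reproduced by hand. The sketch below is not the script behind the charts; it assumes that each configuration is summarized by the geometric mean of its runs and then scaled so that `mainline default` equals 1000. The class and method names are made up for this example, and the sample values are the `mainline default` and `premain aot cache` columns of the helidon-quickstart-se `make bench` output shown earlier.

```java
// Illustrative helper (hypothetical names): normalizes start-up times to a 1000-based scale.
final class BenchNormalize {

    // Geometric mean: exp of the average of the natural logarithms of the samples.
    static double geomean(double[] samples) {
        double logSum = 0.0;
        for (double s : samples) {
            logSum += Math.log(s);
        }
        return Math.exp(logSum / samples.length);
    }

    // Scales a variant's time so that the baseline corresponds to 1000.
    static long normalized(double baselineMs, double variantMs) {
        return Math.round(1000.0 * variantMs / baselineMs);
    }

    public static void main(String[] args) {
        // Elapsed times in ms, copied from the sample `make bench` output.
        double[] mainlineDefault = {456, 453, 455, 448, 440, 446, 448, 465, 448, 442};
        double[] premainAotCache = {117, 117, 116, 114, 114, 114, 114, 114, 113, 114};

        double base = geomean(mainlineDefault);
        double premain = geomean(premainAotCache);

        System.out.printf("mainline default: %d%n", normalized(base, base));
        System.out.printf("premain aot cache: %d%n", normalized(base, premain));
        System.out.printf("speed-up: %.2fx%n", base / premain);
    }
}
```

With these samples the normalized `premain aot cache` value rounds to 255, which matches the example in the text. The improvement factor in each chart heading would then be 1000 divided by that chart's normalized `premain aot cache` value.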

The benchmark results are collected with `make bench` in the following directories under [test/hotspot/jtreg/premain](test/hotspot/jtreg/premain):

- `helidon-quickstart-se`
- `javac-bench`
- `micronaut-first-app`
- `quarkus-getting-started`
- `spring-boot-getting-started`
- `spring-petclinic`

The meaning of the four rows in the following charts:

| Row | Meaning |
| ------------- | ------------- |
| **mainline default** |Run benchmark with no optimizations|
| **mainline custom static cds** |Run benchmark with a custom static CDS archive|
| **mainline aot cache** |Run benchmark with a custom AOT cache (JDK mainline)|
| **premain aot cache** |Run benchmark with a custom AOT cache (Leyden Premain Prototype)|

These JDK versions were used in the comparisons:

- JDK main-line: JDK 25, build 25+37-LTS-3491
- Leyden: https://github.com/openjdk/leyden/tree/8df3504f55cabe9ff8a1d239f469b18d00ff802b

For detailed information about the hardware and raw numbers, see [bench.20250912.txt](test/hotspot/jtreg/premain/bench_data/bench.20250912.txt)

### Helidon Quick Start (SE) Demo (3.52x improvement)

```mermaid
---
config:
    xyChart:
        chartOrientation: horizontal
        height: 300
---
xychart-beta
    x-axis "variant" ["mainline default", "mainline custom static cds", "mainline aot cache", "premain aot cache"]
    y-axis "Elapsed time (normalized, smaller is better)" 0 --> 1000
    bar [1000, 484, 398, 351]
```

### JavacBenchApp 50 source files (2.17x improvement)

```mermaid
---
config:
    xyChart:
        chartOrientation: horizontal
        height: 300
---
xychart-beta
    x-axis "variant" ["mainline default", "mainline custom static cds", "mainline aot cache", "premain aot cache"]
    y-axis "Elapsed time (normalized, smaller is better)" 0 --> 1000
    bar [1000, 779, 567, 460]
```

### Micronaut First App Demo (2.85x improvement)

```mermaid
---
config:
    xyChart:
        chartOrientation: horizontal
        height: 300
---
xychart-beta
    x-axis "variant" ["mainline default", "mainline custom static cds", "mainline aot cache", "premain aot cache"]
    y-axis "Elapsed time (normalized, smaller is better)" 0 --> 1000
    bar [1000, 475, 366, 321]
```

### Quarkus Getting Started Demo (2.73x improvement)

```mermaid
---
config:
    xyChart:
        chartOrientation: horizontal
        height: 300
---
xychart-beta
    x-axis "variant" ["mainline default", "mainline custom static cds", "mainline aot cache", "premain aot cache"]
    y-axis "Elapsed time (normalized, smaller is better)" 0 --> 1000
    bar [1000, 487, 412, 367]
```

### Spring-boot Getting Started Demo (3.96x improvement)

```mermaid
---
config:
    xyChart:
        chartOrientation: horizontal
        height: 300
---
xychart-beta
    x-axis "variant" ["mainline default", "mainline custom static cds", "mainline aot cache", "premain aot cache"]
    y-axis "Elapsed time (normalized, smaller is better)" 0 --> 1000
    bar [1000, 496, 334, 253]
```

### Spring PetClinic Demo (3.24x improvement)

```mermaid
---
config:
    xyChart:
        chartOrientation: horizontal
        height: 300
---
xychart-beta
    x-axis "variant" ["mainline default", "mainline custom static cds", "mainline aot cache", "premain aot cache"]
    y-axis "Elapsed time (normalized, smaller is better)" 0 --> 1000
    bar [1000, 598, 554, 308]
```

## 6. More Documentation

Please see [test/hotspot/jtreg/premain/](test/hotspot/jtreg/premain) for more information.