J-spot, a runtime compiler for a Virtual Machine called Wonka
$Id: j-spot.html 3944 2003-05-16 10:56:30Z dbuytaert $
Currently Wonka, an Open Source Virtual Machine (VM) developed at ACUNIA, runs in interpreter mode only, which imposes a performance penalty due to the run-time overhead of fetching, decoding and executing bytecode instructions. To improve performance, we are developing a generic runtime compiler and code generation framework.
Currently, ARM and x86, a RISC and a CISC processor respectively, are the target architectures being worked on. Note, however, that j-spot is a work in progress and is not yet competitive with other runtime compilers.
This section lists the more interesting changes and silently ignores bug fixes, clean-ups and smaller improvements.
May 2003
- Worked on elimination of runtime checks such as those that raise ArrayIndexOutOfBoundsExceptions and NullPointerExceptions.
April 2003
- Implemented SSA-based constant propagation (CP).
March 2003
- Implemented SSA-based dead code elimination (DCE).
- Added support for static single information (SSI), an extension to SSA.
February 2003
- Introduced stack map support to reduce GC overhead.
January 2003
- Prepared j-spot for use in real-life projects on both x86 and ARM architectures; it will soon be enabled by default.
- Paper reading and paper writing.
December 2002
- Basic support for elimination of runtime checks, dead code elimination, join point elimination (idioms), etc.
November 2002
- Massaged a lot of code: various improvements, most notably improvements to SSA construction as well as more aggressive constant/copy propagation.
- Attended the Java and embedded systems symposium (JAES'02) in Gent, Belgium.
October 2002
- We can run the Mauve tests; no unexpected failures compared to execution by the interpreter.
September 2002
- Refactored most of the IR construction algorithm; Java bytecode is now being translated to an IR that is in static single assignment (SSA) form. Currently writing a paper about it.
- Presented a poster and extended abstract called A selective runtime compiler for the Wonka Virtual Machine at the PACT '02 Symposium in Edegem. Download the abstract here.
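As a rough illustration of what translating stack-based bytecode to an SSA-form IR involves, here is a minimal sketch in Java. It is not j-spot's algorithm: it handles straight-line code only (so no phi functions are needed), and the textual mnemonics (`iconst 2`, `iload 0`), the class name and the `v<slot>_<version>` naming scheme are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: simulate the operand stack and give every
// definition a fresh name, which is SSA form for straight-line code.
final class StackToSsaSketch {
    static List<String> translate(String[] code, int numLocals) {
        Deque<String> stack = new ArrayDeque<>();  // abstract operand stack of names
        int[] version = new int[numLocals];        // current SSA version per local slot
        int temps = 0;                             // counter for fresh temporaries
        List<String> ir = new ArrayList<>();
        for (String insn : code) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "iconst":                     // simplified: push a literal
                    stack.push(parts[1]);
                    break;
                case "iload": {                    // push the current version of a local
                    int slot = Integer.parseInt(parts[1]);
                    stack.push("v" + slot + "_" + version[slot]);
                    break;
                }
                case "iadd": {                     // result goes to a fresh temporary
                    String right = stack.pop();
                    String left = stack.pop();
                    String t = "t" + temps++;
                    ir.add(t + " = " + left + " + " + right);
                    stack.push(t);
                    break;
                }
                case "istore": {                   // each store defines a new version
                    int slot = Integer.parseInt(parts[1]);
                    version[slot]++;
                    ir.add("v" + slot + "_" + version[slot] + " = " + stack.pop());
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return ir;
    }

    public static void main(String[] args) {
        String[] code = { "iload 0", "iconst 2", "iadd", "istore 0",
                          "iload 0", "iload 0", "iadd", "istore 1" };
        for (String line : translate(code, 2)) {
            System.out.println(line);
        }
    }
}
```

Every name on the left-hand side is assigned exactly once, which is the defining property of SSA. Code with branches additionally needs phi functions at join points, and this sketch does not attempt to place them.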
August 2002
- Added support for the ARM processor.
- Presented an extended abstract called A profiler and compiler for the Wonka Virtual Machine at the Java Virtual Machine '02 Symposium in San Francisco. Download the abstract here.
June 2002
- Added support for floats and regression tests to go with it.
- Implemented early optimizations such as constant propagation, copy propagation and copy elimination.
- Attended the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI'02) and co-located tutorials/workshops on Jun 16-19 in Berlin, Germany.
May 2002
- Wrote and integrated a code generator generator whose design is based on iburg and friends. The code generator generator emits a tree pattern matcher that selects the lowest-cost implementation of a given IR construct. While already fully working, the current x86 machine description doesn't take full advantage of this yet; for now, it is only a naive port of the old x86 backend.
- Changed the way multiple exit points are being handled: instead of generating an "epilogue" for every exit point, every exit point (but the last) now jumps to a single "epilogue".
- Refactored the code emitter (assembler). Instantiation and re-initialization of the assembler are now delayed as long as possible and, in contrast to the old assembler, the new assembler compiles only one method at a time, which makes it more efficient. As a direct result, the register allocation pass and code generation pass have been merged into a single pass.
- Updated the profiler and the hot-spot detection algorithm.
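To make "selects the lowest-cost implementation of a given IR construct" more concrete, here is a minimal sketch in Java of bottom-up, cost-driven tree matching. It is not generated by j-spot's code generator generator and does not reflect its machine descriptions: the four operators, the rule set and the unit costs are hypothetical, and there is only a single nonterminal ("value in a register").

```java
// Hypothetical sketch of lowest-cost instruction selection on an IR tree.
final class TreeMatchSketch {
    enum Op { REG, CONST, ADD, LOAD }

    static final class Node {
        final Op op;
        final Node left;
        final Node right;
        Node(Op op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    static Node reg() { return new Node(Op.REG, null, null); }
    static Node constant() { return new Node(Op.CONST, null, null); }
    static Node add(Node l, Node r) { return new Node(Op.ADD, l, r); }
    static Node load(Node address) { return new Node(Op.LOAD, address, null); }

    // Naive selection: one instruction per non-register node.
    static int naiveCost(Node n) {
        if (n == null) return 0;
        int own = (n.op == Op.REG) ? 0 : 1;
        return own + naiveCost(n.left) + naiveCost(n.right);
    }

    // Cheapest way to get the value of n into a register, trying
    // every rule that matches at n and keeping the minimum.
    static int bestCost(Node n) {
        switch (n.op) {
            case REG:
                return 0;                                   // already in a register
            case CONST:
                return 1;                                   // rule: move immediate
            case ADD: {
                int best = 1 + bestCost(n.left) + bestCost(n.right); // rule: add reg, reg
                if (n.right.op == Op.CONST) {
                    best = Math.min(best, 1 + bestCost(n.left));     // rule: add reg, imm
                }
                return best;
            }
            case LOAD: {
                int best = 1 + bestCost(n.left);                     // rule: load [reg]
                if (n.left.op == Op.ADD && n.left.right.op == Op.CONST) {
                    best = Math.min(best, 1 + bestCost(n.left.left)); // rule: load [reg+imm]
                }
                return best;
            }
            default:
                throw new IllegalStateException("unknown operator");
        }
    }

    public static void main(String[] args) {
        Node tree = load(add(reg(), constant()));
        System.out.println("naive cost: " + naiveCost(tree));
        System.out.println("best cost:  " + bestCost(tree));
    }
}
```

For the tree `LOAD(ADD(REG, CONST))` the naive selection costs 3 instructions, while the matcher finds the single "load with displacement" rule at cost 1. A machine description that lacks such combined rules, like the naive x86 port mentioned above, gains nothing from the matcher.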
J-spot is a Wonka component and is distributed as such. You can download the sources here or you can browse the CVS repository online. Information on how to build and install Wonka (including j-spot) can be found here.
The wonka-developers mailing list is the recommended forum to exchange ideas and discuss development. You can browse the wonka-developers archives here, and you can subscribe to the mailing list here.
This section lists the open issues and TODO items. Any help with these issues is highly welcome, as are suggestions. If you want to contribute, these would be good places to jump in. Contributions can be submitted as patches and must conform to the Wonka coding standards. Note, however, that contributing to j-spot requires the desire to explore: up-to-date documentation is not always readily available. The documentation that is available is listed below.
- Support for longs and doubles: currently, j-spot doesn't support any of the Java bytecode instructions that operate on the 64-bit primitives, longs and doubles. As such, we can't report accurate performance results, nor can j-spot's performance be compared or benchmarked against other compilers.
- Sun's JIT compiler interface: investigate Sun's JIT Compiler Interface and check how easy it would be to support, and whether it would actually make sense to support. (Update: after some investigation and having talked to some people, we believe that Sun's interface is no longer a useful candidate, for it was abandoned soon after their JDK 1.1.x release.)
- Method and/or object inlining.
- Loadable machine code: implement an import/export framework. Make j-spot emit machine code to a code or compiler cache with objects/libraries that can be dynamically loaded and linked.
- Support for more architectures: port j-spot to your preferred architecture by writing a new compiler backend.
- J-spot website: these web pages are also in the CVS repository and you can check them out, submit patches, just like you do for the compiler itself. Changes to the web site must validate as XHTML 1.0 Transitional.
We maintain two suites with regression tests for j-spot: one written in C to test the compiler internals, and one written in Java to validate the compiler's output. In addition to our own regression tests, we run compliance tests such as those provided by the Mauve project and various benchmark suites such as the SPEC JVM98 benchmarks.
Java-based regression tests
After having set up your build environment, you can compile the Java-based regression tests by typing:
$ cd open-wonka
$ jam -sDEBUG=true -sCOMPILER=j-spot j-spot.jar
To run the Java-based regression tests, type:
$ cd open-wonka/build-x86-linux/wonka
$ ./wonka -Woempa=n -Wcompiler:profile=n com.acunia.wonka.test.jspot.JTest
If you want j-spot to compile nothing but the test methods, type:
$ cd open-wonka/build-x86-linux/wonka
$ ./wonka -Woempa=n -Wcompiler:profile=n,name-range=com_ com.acunia.wonka.test.jspot.JTest
You know all tests passed when you see something like:
--> passed num tests
Note that either way you have to disable selective compilation by specifying the "-Wcompiler:profile=n" option on the command line. Unconditional compilation is required for the regression tests to work properly.
C-based regression tests
After having set up your build environment, you can compile the C-based regression tests by typing:
$ cd open-wonka
$ jam -sCOMPILER=j-spot jtest
To run the C-based regression tests, type:
$ cd open-wonka
$ ./build-x86-linux/compiler/j-spot/bin/jtest
You know all tests passed when you see something like:
--> passed num tests
If you found a bug or regression, write a small test case for inclusion in any of our test suites and post it on the mailing list.
Documentation
Further reading
- For practical papers on the process of translating Java bytecode to an intermediate representation, I recommend reading (1) LaTTe: a Java VM just-in-time compiler with fast and efficient register allocation by B. Yang, S. Moon, S. Park, J. Lee and S. Lee, (2) Efficient Java VM just-in-time compilation by A. Krall and (3) CACAO, a 64 bit Java VM just-in-time compiler by A. Krall and R. Grafl.
- For a greater understanding of the code generator generator, I recommend reading (1) Engineering a simple, efficient code generator generator by C. W. Fraser, D. R. Hanson and T. A. Proebsting and (2) Tree automata for code selection by C. Ferdinand, H. Seidl and R. Wilhelm.
- To compute dominators we use the algorithm developed by T. Lengauer and R. E. Tarjan, described in their paper A fast algorithm for finding dominators in a flowgraph and explained in other papers as well. Also, A. W. Appel explains the Lengauer-Tarjan algorithm and its variants/extensions rather well in his book "Modern compiler implementation in C", pages 444-450.
- For a paper on SSA construction, read Practical improvements to the construction and destruction of static single assignment form by P. Briggs, K. D. Cooper, T. J. Harvey and L. T. Simpson.
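For readers new to dominators, the following Java sketch shows what is being computed. Note that it is deliberately not the Lengauer-Tarjan algorithm used by j-spot: it is the much simpler (and slower) iterative data-flow formulation, in which the dominators of a node are the node itself plus the intersection of the dominators of its predecessors. The class name and the graph encoding (`preds[n]` lists the predecessors of node `n`, node 0 is the entry, every node is assumed reachable) are assumptions made for the example.

```java
import java.util.BitSet;

// Hypothetical sketch: iterative dominator computation, not Lengauer-Tarjan.
final class DominatorSketch {
    // Returns dom, where dom[n] has bit d set iff node d dominates node n.
    static BitSet[] dominators(int[][] preds) {
        int n = preds.length;
        BitSet[] dom = new BitSet[n];
        for (int i = 0; i < n; i++) {
            dom[i] = new BitSet(n);
            if (i == 0) {
                dom[i].set(0);        // the entry is dominated only by itself
            } else {
                dom[i].set(0, n);     // optimistic start: every node dominates i
            }
        }
        boolean changed = true;
        while (changed) {             // iterate until a fixed point is reached
            changed = false;
            for (int i = 1; i < n; i++) {
                BitSet next = new BitSet(n);
                next.set(0, n);
                for (int p : preds[i]) {
                    next.and(dom[p]); // intersect over all predecessors
                }
                next.set(i);          // every node dominates itself
                if (!next.equals(dom[i])) {
                    dom[i] = next;
                    changed = true;
                }
            }
        }
        return dom;
    }

    public static void main(String[] args) {
        // Diamond-shaped flowgraph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
        int[][] preds = { {}, {0}, {0}, {1, 2} };
        BitSet[] dom = dominators(preds);
        for (int i = 0; i < dom.length; i++) {
            System.out.println("dom(" + i + ") = " + dom[i]);
        }
    }
}
```

In the diamond example, node 3 is dominated only by the entry and itself, since neither branch lies on every path to it. Lengauer-Tarjan computes the same relation (as immediate dominators) far more efficiently on large flowgraphs, which matters for a runtime compiler.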
If you have questions or comments please subscribe to the mailing lists or drop me an e-mail at dries.buytaert at acunia.com.