Java Runtime Optimizations
IBM has supported Java since its beginnings in 1995, releasing its first Java virtual machine in 1996 as one of the first Java execution platforms in the industry. Since then, IBM Research has been deeply involved in developing many advanced techniques for Just-In-Time (JIT) compilation, garbage collection, and other key virtual machine components for high performance. In 2000, these technologies evolved into the IBM Sovereign virtual machine, which has achieved and maintained industry-leading performance and is currently used in many middleware products across all IBM platforms.
Starting in December 1997, the Jalapeño project at IBM Research developed virtual machine infrastructure to enable innovation in the efficient implementation of Java applications. The system was released as the Jikes RVM open source project in October 2001. Many researchers have used the system for a wide variety of research, including garbage collection, distributed systems, compiler optimizations, and aspect-oriented programming; over 100 publications use Jikes RVM as their underlying infrastructure, and more than a dozen courses have been taught using the system. The open source project has a steering committee and a core team of maintainers, half of whom are from outside IBM. Using these two execution platforms, the Sovereign JVM and Jikes RVM, researchers have developed many innovative technologies, some of which are described below:

Java Virtual Machine
In the area of garbage collection (GC), IBM Research has made notable technological advances on several fronts. For example, Real-time GC is the process of collecting data that is no longer in use by an application while the application continues to run, keeping the interruptions by the collector (the 'pause times') short and evenly spaced. The goal of this project is to produce a garbage collector with worst-case pause times below one millisecond while providing highly uniform CPU utilization, low memory overhead, and low garbage collection cost. To date, researchers have made progress towards these goals with a maximum pause time of two milliseconds. The results rely on 1) a high-performance read barrier with only 4 percent CPU overhead, an order of magnitude faster than previous software read barriers, achieved by tightly integrating the collector with the compiler; 2) time-based scheduling, which the researchers have shown to be fundamentally necessary for real-time response, in contrast to the work-based scheduling of most previous real-time collectors; and 3) arraylets, a technique for breaking large arrays up into smaller objects so that the real-time cost of scanning, updating, or moving a large object is bounded. Based on these techniques, researchers can provide guaranteed performance from a small number of input parameters supplied by the programmer.
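The arraylet idea can be made concrete with a short sketch. The following Java class is illustrative only, and is not the Metronome implementation: it stores a large logical array as a spine of fixed-size chunks, so a collector never has to scan, update, or move more than one bounded-size object at a time. The chunk size and all names are hypothetical.

// Illustrative arraylet sketch: a large logical array is stored as
// fixed-size chunks referenced from a spine, bounding the work a
// collector must do on any single object. Not the real Metronome code.
final class ArrayletArray {
    private static final int CHUNK_BITS = 10;              // 1,024 elements per chunk (arbitrary)
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final int[][] spine;                           // spine of small, bounded-size chunks
    private final int length;

    ArrayletArray(int length) {
        this.length = length;
        int chunks = (length + CHUNK_SIZE - 1) >>> CHUNK_BITS;
        spine = new int[chunks][];
        for (int i = 0; i < chunks; i++) {
            spine[i] = new int[CHUNK_SIZE];                // a real system would trim the last chunk
        }
    }

    int length() { return length; }

    int get(int index) {
        return spine[index >>> CHUNK_BITS][index & CHUNK_MASK];
    }

    void set(int index, int value) {
        spine[index >>> CHUNK_BITS][index & CHUNK_MASK] = value;
    }
}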
Another example of GC advances is in the domain of server-side, multi-threaded Java applications (such as Web application servers and portals) with multi-gigabyte heaps. Here, pause times over a second are intolerable, but pauses of up to several hundred milliseconds are acceptable, provided they come with minimal loss of throughput. These requirements pose new challenges for 'server-oriented' GC: ensuring short pause times on a multi-gigabyte heap without sacrificing high throughput; solving the memory fragmentation problems typical of server applications, which run for a very long time and allocate very large objects; and maintaining good scaling on multiprocessor hardware.
The Large heap GC project consists of two major parts: 1) a parallel, incremental, and mostly concurrent mark-sweep collector, which typically reduces pause times by 80 percent or more, bringing them down from seconds to a few hundred milliseconds without sacrificing much throughput, and 2) a parallel and incremental compactor, which not only demonstrates high scalability when running in parallel but is also extremely efficient when running single-threaded on a uniprocessor. The compactor achieves nearly perfect compaction toward the lower addresses of the heap. Running on a 24-way AIX machine, IBM's parallel compactor shows a 27-fold speedup over the previously used compactor.
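To give a flavor of the incremental part, here is a deliberately simplified, single-threaded Java sketch of budgeted tracing: each marking increment processes at most a fixed number of objects and then yields, which is how a pause is kept to a bounded length rather than proportional to heap size. The real collector is parallel and mostly concurrent and needs write barriers to remain correct while the application mutates the heap; the Node interface and all names here are hypothetical.

import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Simplified budgeted tracing: marking work is split into bounded
// increments so no single pause scans the whole multi-gigabyte heap.
final class IncrementalMarker {
    interface Node { Iterable<Node> references(); }   // stand-in for a heap object

    private final Deque<Node> markStack = new ArrayDeque<>();
    private final Set<Node> marked =
        Collections.newSetFromMap(new IdentityHashMap<>());

    void addRoot(Node root) {
        if (root != null && marked.add(root)) markStack.push(root);
    }

    // Marks at most `budget` objects; returns true once tracing is complete.
    boolean markIncrement(int budget) {
        for (int i = 0; i < budget && !markStack.isEmpty(); i++) {
            Node n = markStack.pop();
            for (Node ref : n.references()) {
                if (ref != null && marked.add(ref)) markStack.push(ref);
            }
        }
        return markStack.isEmpty();
    }

    boolean isMarked(Node n) { return marked.contains(n); }
}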
Reducing the high overhead of lock operations has been another research focus. Because of the language's built-in support for multi-threaded programming, Java applications tend to perform a large number of lock operations, so optimizing them is important for Java performance. IBM Research addressed this problem and developed several sophisticated algorithms for Java locks. In 1998, researchers proposed a very fast locking algorithm called the thin lock, based on the observation that locks are rarely contended in many Java applications. They then discovered that most of the contention that does occur is temporary, and enhanced the algorithm into the tasuki lock. With these two algorithms, Java locks can be acquired with only one atomic operation in most cases. The team made several further improvements and succeeded in eliminating the atomic operation altogether by exploiting the thread locality of Java locks.
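The thin-lock fast path is easy to sketch. The Java class below is a simplified illustration rather than IBM's actual lock-word layout: it packs the owning thread's id and a recursion count into one word, so an uncontended acquire costs exactly one compare-and-swap, and a recursive acquire by the owner costs no atomic operation at all. A real implementation 'inflates' to a heavyweight lock under contention (one refinement of the tasuki lock is making that transition cheap and reversible); this sketch merely spins.

import java.util.concurrent.atomic.AtomicLong;

// Simplified thin-lock sketch: lock word = 0 when free, otherwise
// (ownerThreadId << 8) | recursionCount. Layout and constants are
// illustrative; overflow of the 8-bit count is ignored here.
final class ThinLock {
    private final AtomicLong word = new AtomicLong(0);

    void lock() {
        long me = Thread.currentThread().getId();
        long owned = me << 8;                       // owned by me, count 0
        if (word.compareAndSet(0, owned)) return;   // fast path: one atomic op
        if ((word.get() >>> 8) == me) {             // recursive acquire by owner:
            word.set(word.get() + 1);               // no atomic op needed
            return;
        }
        // Contended: a real implementation would inflate to an OS lock.
        while (!word.compareAndSet(0, owned)) Thread.onSpinWait();
    }

    void unlock() {
        long w = word.get();
        if ((w & 0xFF) != 0) word.set(w - 1);       // pop one recursion level
        else word.set(0);                           // release (volatile write)
    }
}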
The dynamic nature of the Java language provides opportunities for better optimization based on runtime profile information; this is a significant advantage of a Java dynamic compiler over a traditional static compiler.
IBM researchers have made several technical advances in the area of adaptive and dynamic optimization, demonstrating 1) that a principled cost/benefit analysis can improve system performance for both short- and long-running applications by automatically determining which parts of an application should be optimized while it executes; 2) how to efficiently capture detailed profiling information with low overhead; 3) how to use that profiling information to improve performance by tailoring optimizations to the current profile; 4) that handling and optimizing exception-intensive programs by profiling hot exception paths is practical; and 5) that compiling regions formed by eliminating uncommon execution paths, rather than whole methods, has the potential to achieve better performance.
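The cost/benefit analysis in point 1 can be illustrated with a small sketch. The Java class below follows the published model only in outline, and all of its constants are invented: it assumes a method will run in the future for about as long as it has run so far, and recompiles at a higher optimization level only when the one-time compile cost plus the faster future execution time beats staying at the current level.

// Sketch of a cost/benefit recompilation test. Speedup factors and
// per-bytecode compile costs are made-up placeholders, not measured values.
final class RecompilationController {
    private static final double[] SPEEDUP           = {1.0, 1.8, 2.5, 3.0};
    private static final double[] COMPILE_MS_PER_BC = {0.0, 0.1, 0.4, 1.2};

    // Returns the best optimization level, or currentLevel if recompiling
    // is not expected to pay off.
    int chooseLevel(int currentLevel, double timeSoFarMs, int bytecodes) {
        double futureAtCurrent = timeSoFarMs;       // heuristic: future time roughly equals past time
        int best = currentLevel;
        double bestCost = futureAtCurrent;
        for (int j = currentLevel + 1; j < SPEEDUP.length; j++) {
            double compileCost = COMPILE_MS_PER_BC[j] * bytecodes;
            double futureAtJ =
                futureAtCurrent * SPEEDUP[currentLevel] / SPEEDUP[j];
            if (compileCost + futureAtJ < bestCost) {
                bestCost = compileCost + futureAtJ;
                best = j;
            }
        }
        return best;
    }
}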
Sovereign JVM and JIT
Toshio Suganuma, Takeshi Ogasawara, Mikio Takeuchi, Toshiaki Yasue,
Motohiro Kawahito, Kazuaki Ishizaki, Hideaki Komatsu, Toshio Nakatani.
"Overview
of the IBM Java Just-In-Time Compiler", IBM Systems Journal,
39(1), February 2000.
Toshio Suganuma, Takeshi Ogasawara, Kiyokuni Kawachiya,
Mikio Takeuchi, Kazuaki Ishizaki, Akira Koseki, Tatsushi Inagaki,
Toshiaki Yasue, Motohiro Kawahito, Tamiya Onodera, Hideaki Komatsu,
Toshio Nakatani. "Evolution
of a Java Just-in-Time Compiler for IA-32 Platforms", IBM
Journal of Research and Development, 48(5/6), September 2004.
Jikes RVM
Bowen Alpern, C. R. Attanasio, John J. Barton, Michael G. Burke,
Perry Cheng, Jong-Deok Choi, Anthony Cocchi, Stephen J. Fink, David
Grove, Michael Hind, Susan F. Hummel, Derek Lieber, Vassily Litvinov,
Mark F. Mergen, Ton Ngo, James R. Russell, Vivek Sarkar, Mauricio
J. Serrano, Janice C. Shepherd, Stephen E. Smith, V. C. Sreedhar,
Harini Srinivasan, and John Whaley. "The
Jalapeño Virtual Machine", IBM Systems Journal,
39(1), February 2000.
Bowen Alpern, Steve Augart, Steve M. Blackburn, Maria Butrico, Antony
Cocchi, Perry Cheng, Julian Dolby, Stephen Fink, David Grove, Michael
Hind, Kathryn S. McKinley, Mark Mergen, J. Eliot B. Moss, Ton Ngo,
and Vivek Sarkar. "The
Jikes RVM Project: Building an Open Source Research Community",
IBM Systems Journal, 44(2), May 2005.
Real-time GC
David F. Bacon, Perry Cheng, and V.T. Rajan. "A
Real-time Garbage Collector with Low Overhead and Consistent Utilization",
In Conference Record of the ACM POPL Symposium, January 2003.
David F. Bacon, Perry Cheng, and V.T. Rajan. "Controlling
Fragmentation and Space Consumption in the Metronome, a Real-time
Garbage Collector for Java", In Proceedings of the ACM
LCTES Conference, June 2003.
Large heap GC
Yoav Ossia, Ori Ben-Yitzhak, Irit Goft, Elliot K. Kolodner, Victor
Leikehman and Avi Owshanko. "A
Parallel, Incremental and Concurrent GC for Servers", In
Proceedings of the ACM PLDI Conference, June 2002.
Ori Ben-Yitzhak, Irit Goft, Elliot K. Kolodner, Victor Leikehman.
"An
Algorithm for Parallel Incremental Compaction", In Proceedings
of the ACM ISMM Symposium, June 2002.
Katherine Barabash, Yoav Ossia and Erez Petrank. "Mostly
Concurrent Garbage Collection Revisited", In Proceedings
of the ACM OOPSLA Conference, October 2003.
Diab Abuaiadh, Yoav Ossia, Erez Petrank and Uri Silbershtein. "An
Efficient Parallel Heap Compaction Algorithm", In Proceedings
of the ACM OOPSLA Conference, October 2004.
Lock optimizations
(Retrospective: Thin Locks was selected as one of the most influential papers to appear at the PLDI conference.)
David F. Bacon, Ravi Konuru, Chet Murthy, and Mauricio Serrano.
"Thin
Locks: Featherweight Synchronization for Java", In Proceedings
of the ACM PLDI Conference, June 1998.
Tamiya Onodera and Kiyokuni Kawachiya. "A
Study of Locking Objects with Bimodal Fields", In Proceedings
of the ACM OOPSLA Conference, November 1999.
Kiyokuni Kawachiya, Akira Koseki, and Tamiya Onodera. "Lock
Reservation: Java Locks Can Mostly Do Without Atomic Operations",
In Proceedings of the ACM OOPSLA Conference, November 2002.
Dynamic and adaptive optimizations
Matthew Arnold, Stephen Fink, David Grove, Michael Hind, and Peter
F. Sweeney. "Adaptive
Optimization in the Jalapeño JVM", In Proceedings
of the ACM OOPSLA Conference, October 2000.
Matthew Arnold and Barbara G. Ryder. "A
Framework for Reducing the Cost of Instrumented Code",
In Proceedings of the ACM PLDI Conference, June 2001.
Takeshi Ogasawara, Hideaki Komatsu, and Toshio Nakatani. "A
Study of Exception Handling and Its Dynamic Optimization in Java",
In Proceedings of the ACM OOPSLA Conference, October 2001.
Matthew Arnold, Michael Hind, and Barbara G. Ryder. "Online
Feedback-Directed Optimization of Java", In Proceedings
of the ACM OOPSLA Conference, November 2002.
Toshio Suganuma, Toshiaki Yasue, and Toshio Nakatani. "A
Region-Based Compilation Technique for a Java Just-in-Time Compiler",
In Proceedings of the ACM PLDI Conference, June 2003.
Related Publications
http://jikesrvm.sourceforge.net/info/papers.shtml
http://www.haifa.il.ibm.com/projects/systems/rs/bibliography.html
http://www.research.ibm.com/trl/projects/jit/pub_int.htm
What is the most exciting potential
future use for the work you're doing?
I’ve been involved with
the Jikes RVM project since I joined IBM in 1998. Over the
last couple of years we’ve seen a large, active community
of academic researchers form around the infrastructure.
It’s exciting to see so many graduate students using it
in their thesis work, especially when they use it to do
things we never thought of when we were building Jikes RVM.
By providing a quality open source infrastructure for VM research, I believe the project has enabled a much larger group of people to do high-quality work, because they are free to focus their energies on tackling the “interesting” problems instead of building a base infrastructure from scratch.
Recently, I’ve also been involved with the real-time GC project. This is an area with challenging technical and engineering problems still to be solved, but with huge potential for future impact. If we are successful, the project could revolutionize the development of real-time systems by enabling the productivity and reliability gains that have been realized in other domains that have switched from C/C++ to garbage-collected languages.
What is the most interesting part
of your research?
Identifying and solving hard technical problems
and then implementing those solutions in real systems. Pushing
a solution all the way to a real implementation can be quite
interesting because in the process you often discover that
you overlooked some subtle aspect of the problem or that
there is an unexpected interaction between your solution
and other aspects of the system.
What inspired you to go into this
field?
I got into optimizing compilers because I found the combination of abstract theory (iterative data flow, abstract interpretation) and hard-core engineering and system programming (building and debugging a real working optimizer) addictive. Compilers and runtime systems also represent an interesting leverage point: they are part of that base layer of software on top of which everything else is built. Improvements at this base level can have a large impact, especially when they enable shifts in programming models or widespread adoption of higher-level language features.
What is your favorite invention
of all time?
The printing press.