
Cluster Virtual Machine for Java

Distributed Computing Systems


Motivation
  Background

Today, a Java application, even a multithreaded one, runs on a single node of a cluster. A programmer who wants to exploit the full strength of a cluster must deal explicitly with the distribution aspects of the application. Building on top of standard communication libraries (RMI, CORBA, etc.) helps, but only slightly reduces the complexity of distributing a Java program. Cluster Virtual Machine for Java is a Java Virtual Machine built on top of cluster infrastructure, so that any multithreaded Java application can exploit the full power of the underlying system transparently (i.e., no explicit coding is required from the programmer). This distributed implementation of the Java Virtual Machine preserves the original JVM semantics and presents a single system image to the application; it can therefore safely run any Java application without program modification.

Our intent is to enable large multithreaded applications (e.g., Jigsaw) to run transparently on a cluster, leveraging its full power and attaining high scalability.

  Challenges

Designing and building a cluster JVM presents many interesting research challenges. First, one needs to choose the most suitable memory model paradigm. To allow Java objects to be shared by (potentially) all threads, one could implement the cluster JVM on top of a distributed shared memory model, an object shipping model, a function shipping model, or some hybrid combination. In a distributed shared memory or object shipping model, the object, or a part of it (e.g., a page), migrates to the node where the thread is running, whereas in the function shipping model, a method invocation is shipped to the node "owning" the object. Depending on the memory model, there are interesting issues related to distributed garbage collection, object clustering, migration, caching, and replication.
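The difference between object shipping and function shipping can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the project's implementation: the nodes are simulated inside a single JVM, and the names ShippingModels, Node, Counter, objectShipping, and functionShipping are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ShippingModels {
    /** A shared object with some state; stands in for any Java object. */
    static final class Counter {
        int value;
        int increment() { return ++value; }
    }

    /** A simulated cluster node (hypothetical) holding the objects it owns. */
    static final class Node {
        final String name;
        final Map<String, Counter> heap = new HashMap<>();
        int objectsReceived = 0;      // objects migrated to this node
        int invocationsReceived = 0;  // method invocations shipped to this node
        Node(String name) { this.name = name; }
    }

    /** Object shipping: move the object to the caller's node, then invoke locally. */
    static int objectShipping(Node caller, Node owner, String id) {
        Counter c = owner.heap.remove(id);  // the object leaves the owning node
        caller.heap.put(id, c);             // and now lives where the thread runs
        caller.objectsReceived++;
        return c.increment();
    }

    /** Function shipping: leave the object in place and run the method on its owner. */
    static int functionShipping(Node caller, Node owner, String id) {
        owner.invocationsReceived++;        // only the invocation request travels
        return owner.heap.get(id).increment();
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = new Node("B");
        b.heap.put("counter", new Counter());

        // A thread on node A invokes a method on an object owned by node B.
        int first = functionShipping(a, b, "counter");
        System.out.println("function shipping result: " + first
                + ", object still on B: " + b.heap.containsKey("counter"));

        int second = objectShipping(a, b, "counter");
        System.out.println("object shipping result: " + second
                + ", object now on A: " + a.heap.containsKey("counter"));
    }
}
```

In this sketch, function shipping keeps ownership fixed and pays one remote call per invocation, while object shipping pays once for the migration and makes later accesses from the same node local. A hybrid design would choose between the two per object or per access pattern.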

There are many performance aspects to tackle. First, the performance of a single thread should be comparable to that of a single-threaded application in a non-cluster JVM. In addition, the implementation of the cluster JVM should allow the throughput of a parallel application (one whose threads share only limited data) to increase as the number of nodes increases.

Several issues can be tackled in a secondary stage but may need to be taken into consideration from day one. A partial list includes load balancing and fault tolerance (to allow the JVM to survive node failures).

Finally, issues more tightly related to Java, such as direct access to public data members and static code execution, need to be handled. In addition, there are some potential optimizations, such as off-loading part of the "class loading" process to other nodes.

 
