Distributed & Fault-Tolerant Computing
IBM Research has a rich history in distributed and fault-tolerant computing. Our legacy includes foundational work in fault-tolerant hardware, distributed programming languages, network technologies, and many other areas. Today, our research is focused on improving existing technology, such as creating highly scalable middleware, as well as cultivating new application areas, such as utility-based computing.

Middleware and Coordination Languages

Modern distributed applications depend on middleware architectures that define communication abstractions. A key advantage of this approach is that middleware and applications are implemented separately. Thus, applications may take advantage of advances in middleware technology without costly redesign and testing. We are actively developing middleware technology for use in a variety of applications.

The Gryphon project focuses on message brokering middleware for advanced publish/subscribe applications. A key feature of Gryphon is its support for content-based pub/sub for Internet-scale applications.

The Dependency-Spheres (D-Spheres) project explores transaction processing in heterogeneous middleware environments. D-Spheres is a new transaction model that allows for the execution of both standard object transactions and asynchronous messages within one global transaction context.

The Distributed Connection Language (DCL) project provides a more abstract approach. DCL exposes middleware components as language abstractions, which may then be customized and redeployed in new applications. Thus, DCL focuses on deploying middleware, rather than on developing a specific middleware architecture.

Utility-Based Computing

Corporate computing infrastructures include both hardware components, such as servers and networks, and software components, such as client applications and databases. These infrastructures are costly to implement, and they waste resources: infrastructure may be duplicated within a large corporation.
Computing utilities seek to reduce this cost by providing a shared resource pool that would replace expensive infrastructure with a metered computing service. We have recently initiated several projects for developing and exploiting computing utilities.

The eUtopia project is an attempt to design and prototype a general-purpose computing utility platform. Based on a scalable and composable infrastructure, the eUtopia platform will provide tools and middleware for creating, deploying, subscribing to, and executing utilities.

The e-Utilities project provides a publicly accessible testbed for developing new computing utilities. The current e-Utilities prototype provides a pub/sub utility using Gryphon technology (see above).

Clusters

A cluster is a collection of tightly coupled commodity hardware components. Clustered systems promise a low-cost, highly scalable computing platform, but require new software architectures to make full use of available resources. We have several projects that are developing clustered software architectures.

The Cluster Virtual Machine for Java project is implementing a Java Virtual Machine (JVM) that provides a single system image of a traditional JVM while executing on a cluster. This approach allows standard Java programs to execute without modification: the underlying runtime system automatically distributes threads and objects among the nodes of the cluster.

In contrast, the Océano project is developing a clustered testbed for prototyping advanced web hosting and other e-business architectures. In Océano, cluster resources will be used to simultaneously host multiple business applications and dynamically respond to peak workloads by reassigning resources.

Future Technology

In addition to the work described above, we are exploring new distributed models and applications. For example, the TSpaces project views the network as both a communication medium and a data store.
TSpaces is a network communication buffer with database capabilities: applications may communicate in a heterogeneous network using group communication services, database services, file transfer services, and rules-based event notification services.

New distributed models are also being explored in the Blue Gene project. Blue Gene researchers are developing a software architecture that can optimize locality and communication trade-offs as well as operate in the presence of faulty computing elements.

Please contact Paridhi Verma to obtain copies of the Computer Science Brochure.