| Distributed and Fault Tolerant Computing (DFTC) addresses many of the
core issues in computer science and is a critical area of research
for IBM. Our researchers work at research locations across
the world and are interested in a wide range of topics. Our current
focus is on addressing key challenges in Grid and Autonomic computing,
and on the flexible integration of disparate IT and business processes
to enable on-demand services, all of which have a significant
impact in transforming today's businesses:
Integration -- creating business flexibility by integrating disparate,
unconnected business and IT processes.
Automation -- reducing costs and increasing business responsiveness
through IT and business linkage.
Virtualization -- improving working capital and asset utilization.
Links to a few representative projects are listed below, along with a
sampling of different aspects of our activities in the larger research
community.
WSLA: Web Service Level Agreement for On Demand Services
WSLA is a novel framework for specifying and monitoring Service
Level Agreements (SLAs) for Web Services. The framework components
include tooling for online creation of SLAs, dynamic deployment
of SLAs, measurement and monitoring of QoS parameters to check
the agreed-upon service levels, and reporting of violations to the
authorized parties involved in the SLA management process.
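The monitoring step can be pictured with a short sketch. It is not the WSLA language or toolkit API: the `ServiceLevelObjective` class, the metric names, and the thresholds are hypothetical, chosen only to illustrate comparing measured QoS values against agreed levels and collecting violations for reporting.

```python
from dataclasses import dataclass


# Hypothetical illustration only: not the WSLA schema or toolkit API.
@dataclass(frozen=True)
class ServiceLevelObjective:
    metric: str        # name of the measured QoS parameter
    limit: float       # agreed-upon bound
    upper_bound: bool  # True: the measured value must not exceed the limit

    def is_met(self, measured: float) -> bool:
        if self.upper_bound:
            return measured <= self.limit
        return measured >= self.limit


def find_violations(objectives, measurements):
    """Return (metric, measured, limit) for each objective that is not met.

    Metrics without a measurement are skipped rather than counted as violations.
    """
    violations = []
    for slo in objectives:
        measured = measurements.get(slo.metric)
        if measured is not None and not slo.is_met(measured):
            violations.append((slo.metric, measured, slo.limit))
    return violations


sla = [
    ServiceLevelObjective("response_time_ms", 500.0, upper_bound=True),
    ServiceLevelObjective("throughput_rps", 100.0, upper_bound=False),
]
report = find_violations(sla, {"response_time_ms": 620.0, "throughput_rps": 140.0})
print(report)
```

In a real deployment the violation list would be delivered to the parties named in the agreement; here it is simply printed.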
WSMM - Web Services Management Middleware for On Demand Services
WSMM provides fundamental support for differentiated services based
on Service Level Agreements (SLAs) for web services. This enables
service providers to offer the same web service, on demand, at different
performance levels (e.g., response time thresholds and throughput
limits). WSMM dynamically allocates resources to web service requests,
with the goal of optimizing the system utility. WSMM and WSLA technologies
are available for download as part of the Emerging
Technologies Toolkit.
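As a rough illustration of utility-driven allocation, the sketch below hands a fixed serving capacity to service classes in order of their value per request. This is an assumed, simplified model and not WSMM's actual algorithm; the class names, demands, and utility values are hypothetical.

```python
def allocate(capacity, demands, utility_per_request):
    """Greedily assign capacity to the classes with the highest utility.

    capacity: total requests that can be served in this interval
    demands: pending requests per service class
    utility_per_request: assumed value of serving one request of each class
    """
    allocation = {name: 0 for name in demands}
    remaining = capacity
    # Serve the most valuable classes first until capacity runs out.
    for name in sorted(demands, key=lambda n: utility_per_request[n], reverse=True):
        granted = min(demands[name], remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


plan = allocate(
    capacity=100,
    demands={"gold": 40, "silver": 50, "bronze": 60},
    utility_per_request={"gold": 5.0, "silver": 2.0, "bronze": 1.0},
)
print(plan)
```

With a constant utility per request and equal cost per request, this greedy order maximizes total utility; utilities tied to response time thresholds would need a richer model.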
Rainforest
The goal of Rainforest (the follow-on to the Océano project) is to
enable policy-directed, dynamic, componentized autonomic provisioning
and management. The primary objectives are to provide a framework that
accommodates heterogeneous resources and solutions; to autonomically
create complex provisioning tasks and policies from basic elements;
and to utilize existing provisioning products and management systems.
ReGS - Reporting Grid Service
ReGS is an OGSA-based Meta-OS Grid Service for logging, tracing,
and monitoring applications in a distributed, heterogeneous computing
environment, with extensible filtering capabilities. It exploits
OGSI interfaces and provides standard logging interfaces for use
by other Grid Services and applications by virtualizing existing
logging systems, e.g., z/OS logging, NT events, and Unix syslogs.
Optimal Grid
The Optimal Grid middleware provides a grid-enabled collaboration
framework, sophisticated management infrastructure, and problem-solving
environment for grid computing. It is designed to help users harness
the power of future grid utilities by hiding the complexities of
partitioning, distributing, load balancing, and adapting a grid
application to dynamic changes in available compute resources.
Gryphon
The Gryphon project focuses on advancing messaging middleware.
It extends the scalable publish/subscribe framework to support efficient
content-based routing in wide-area networks and stronger guarantees
for message delivery by developing protocols tolerant to failures
in the overlay network and in the clients.
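Content-based routing means a message is delivered according to its attribute values rather than a fixed topic name. The sketch below shows only the matching step on a single node, under assumed conventions: subscriptions are lists of hypothetical `(attribute, operator, value)` constraints, and the subscriber names and event fields are made up. It is not Gryphon's matching algorithm and leaves out the wide-area and fault-tolerance aspects.

```python
import operator

# Comparison operators a subscription constraint may use.
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def matches(subscription, event):
    """Return True if the event satisfies every constraint in the subscription."""
    for attr, op, value in subscription:
        if attr not in event or not OPS[op](event[attr], value):
            return False
    return True


def route(subscriptions, event):
    """Return the sorted names of subscribers whose constraints the event meets."""
    return sorted(name for name, sub in subscriptions.items() if matches(sub, event))


subscriptions = {
    "alice": [("symbol", "==", "IBM"), ("price", ">", 80)],
    "bob": [("symbol", "==", "IBM"), ("volume", ">=", 1000)],
    "carol": [("price", "<", 50)],
}
event = {"symbol": "IBM", "price": 85.5, "volume": 500}
print(route(subscriptions, event))
```

This linear scan tests every subscription against every event; a scalable system would index subscriptions so that most of them are never examined.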
Policy-based Computing Systems
In policy-based computing, management operations are specified in
terms of the objectives or goals that need to be met, rather than
the detailed instructions that need to be executed. A number of
activities have been undertaken to develop policy schemas and architectures
in different domains, including: automated provisioning of computing
systems, auditing of configuration constraints in Storage Area Networks
(SANs), and guiding analysis and problem determination processes.
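The difference between stating a goal and scripting instructions can be shown with a small provisioning sketch. The policy here is just a desired server count per tier, and the steps are derived from it rather than written by hand. The tier names, the `plan_actions` helper, and the action tuples are hypothetical and do not come from any of the policy schemas mentioned above.

```python
def plan_actions(goal, current):
    """Derive provisioning steps from a goal instead of scripting them by hand.

    goal: desired number of servers per tier (the policy)
    current: servers currently provisioned per tier
    """
    actions = []
    for tier in sorted(goal):
        diff = goal[tier] - current.get(tier, 0)
        if diff > 0:
            actions.append(("add", tier, diff))
        elif diff < 0:
            actions.append(("remove", tier, -diff))
    return actions


actions = plan_actions(goal={"web": 4, "db": 2}, current={"web": 2, "db": 3})
print(actions)
```

The operator edits only the goal; when the current state changes, re-running the planner yields a new set of steps without anyone rewriting instructions.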
XMT - Policy-based Extended Web Services and Messaging Transactions
The XMT project addresses the main challenges in transactional
coordination, including: defining transaction policies in an XML Web
services/messaging-based environment, managing transaction policies and
corresponding actions through an effective middleware system, and
integrating such a transaction policy system with Web services and
messaging-based transaction middleware.
DSF - Data Sharing Facility
DSF is an experimental project to build a serverless file system
that distributes all aspects of file and storage management over
cooperating machines interconnected by a fast switched network.
DSF is aimed at scaling to hundreds of machines using commodity
components. DSF provides a global memory cache, distributed file
management, and a distributed storage repository.
SINTRA - Distributing Trust on the Internet
SINTRA (Secure INtrusion-Tolerant Replication Architecture) is a
protocol suite for secure and fault-tolerant service replication
in asynchronous networks such as the Internet. Using randomization,
novel customized cryptographic tools, and optimistic methods, SINTRA
provides the first practical protocols that do not rely on any timing
assumptions, while tolerating active coordinated attacks. |
Web Services On Demand
|
| Leaders Seminar Series |
Research hosts Prof. Ian Foster on June 2, 2003, to speak on "The Grid's
First 50 Years" as part of the Leaders Seminar Series.
|
|
Selected Publications
|
|
Anton Riabov, Zhen Liu, Joel L. Wolf, Philip S. Yu, Li Zhang, New
Algorithms for Content-Based Publication-Subscription Systems,
ICDCS 2003 - The 23rd International Conference on Distributed
Computing Systems, May 2003, Providence, Rhode Island.
Avraham Leff, James T. Rayfield, Daniel Dias, Meeting
Service Level Agreements In a Commercial Grid, IEEE Internet
Computing, July/August 2003 (Special issue on Grid Computing).
Heiko Ludwig, Alexander Keller, Asit Dan, Richard King and Richard
Franck, A Service Level Agreement
Language for Dynamic Electronic Services, Electronic
Commerce Research, Vol. 3, No. 1-2, January/April 2003.
Jeffrey O. Kephart, David M. Chess, The
Vision of Autonomic Computing, IEEE
Computer 36(1): 41-50, 2003.
Liana Fong, Michael Kalantar, Don Pazel, German Goldszmidt, K.
Appleby, T. Eilam, S. Fakhouri, S. Krishnakumar, Dynamic
Resource Management in an eUtility, Network Operations
and Management Symposium, April 2002.
Melissa J. Buco, Rong N. Chang, Laura Zaihua Luan, Christopher Ward,
Joel L. Wolf, Philip S. Yu, Tevfik Kosar, Syed Umair Shah,
Managing eBusiness on Demand
SLA Contracts in Business Terms Using the Cross-SLA Execution
Manager SAM, ISADS 2003 - International Symposium on
Autonomous Decentralized Systems, April 2003, Pisa, Italy.
Stefan Tai, Thomas A. Mikalsen, Isabelle Rouvellou, Stanley M. Sutton
Jr., Conditional Messaging:
Extending Reliable Messaging with Application Conditions,
Proceedings of the 22nd IEEE International Conference on Distributed
Computing Systems (ICDCS 2002, Vienna, Austria), IEEE, pp.
123-132, July 2002.
Sumeer Bhola, Rob Strom, Saurabh Bagchi, Y. Zhao and Josh Auerbach,
Exactly Once Delivery in a Content-Based
Publish-Subscribe System, Proc. International Conference
on Dependable Systems and Networks, June 2002, Washington,
D.C.
|
| |
| Recent Accomplishments |
|
Alfred Spector receives the IEEE Tsutomu Kanai Award on April 9, 2003,
in recognition of his contributions in the area of distributed
computing systems.
Alan Ganek invited as keynote speaker at IM 2003, March 2003.
German Goldszmidt was the Technical Program Committee Co-chair for
IM 2003, March 2003.
Ron Levy, Jay Nagarajarao, Giovanni Pacifici, Mike Spreitzer, Asser
Tantawi, and Alaa Youssef, Performance
Management for Cluster Based Web Services, received the Best
Paper Award at IM 2003, March 2003.
|
|