Our work in Scalable Parallel Systems focuses on the technologies needed for the design and effective use of scalable systems such as the IBM PowerParallel SP2 or workstation clusters. The effort was established in 1986 to develop the software support for the Vulcan massively parallel computer prototype. The technology that we developed provided much of the foundation for the SP product line: communication software, parallel I/O, tools, libraries, application codes, etc.
Our current research activity is focused in the following areas:
Scalable computer architectures.
What communication models will future architectures support for parallel
processes? Our past work has focused on message passing. After
contributing to the design and development of the MPL message passing
library of native SP communication commands, we contributed to the
design of the de facto industry standard MPI (Message Passing Interface)
and developed MPI-F, a complete, high-performance implementation of MPI
on the SP2. This technology has been incorporated into the SP parallel
environment product, which supports both MPI and MPL. We are involved in
the MPI2 forum, which considers extensions to MPI such as Remote Memory
Access (put/get). Our architecture work focuses on supporting and
exploiting communication primitives that let us support shared memory
programming models efficiently, without the scalability problems of
conventional shared memory support.
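To make the contrast between two-sided message passing and one-sided
Remote Memory Access (put/get) concrete, here is a minimal sketch in
Python. It is not MPL, MPI, or MPI-F code: threads stand in for parallel
processes, a queue stands in for the interconnect, and the Window class
and both demo functions are hypothetical names invented for this
illustration.

```python
import queue
import threading


def message_passing_demo():
    """Two-sided transfer: the data moves only when the receiver asks for it."""
    channel = queue.Queue()  # assumption: stands in for the network link
    received = []

    def sender():
        channel.put([1, 2, 3])          # plays the role of a send

    def receiver():
        received.append(channel.get())  # plays the role of the matching receive

    threads = [threading.Thread(target=receiver), threading.Thread(target=sender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received[0]


class Window:
    """Hypothetical region of memory that a target exposes for remote access."""

    def __init__(self, size):
        self._data = [0] * size
        self._lock = threading.Lock()

    def put(self, offset, values):
        # One-sided write: the owner of the memory runs no receive code.
        if offset < 0 or offset + len(values) > len(self._data):
            raise IndexError("put outside the exposed window")
        with self._lock:
            self._data[offset:offset + len(values)] = values

    def get(self, offset, count):
        # One-sided read of the exposed memory.
        with self._lock:
            return self._data[offset:offset + count]


def remote_memory_access_demo():
    window = Window(8)  # memory owned by the "target"

    def origin():
        window.put(2, [7, 8, 9])  # the origin writes straight into the window

    t = threading.Thread(target=origin)
    t.start()
    t.join()
    return window.get(0, 8)
```

In the first function both sides take part in every transfer. In the
second, only the origin acts, which is the property that makes put/get a
natural building block for shared memory programming models.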
Scalable system services.
How does one build, atop a distributed operating system, the global
system services needed to support tightly coupled parallel applications
well? Our past work has focused on parallel I/O. The Vesta parallel file
system prototype, developed in Research, provided much of the technology
for the PIOFS parallel I/O file system product. We are currently
involved in the design and implementation of an MPI-IO Portable Parallel
I/O Library. Other ongoing parallel system activities are in the areas
of parallel job scheduling and dynamic binding to parallel objects.
Programming environments and tools.
The DRMS Distributed Resource Management System gives users dynamic
control of their parallel run-time environments, and the UTE Unified
Trace Environment is a powerful, versatile tool for understanding
parallel program behavior.
Parallel benchmarks and applications.
By paying attention to parallel computation as well as parallel I/O,
we've created high-performance algorithms and applications in a range
of disciplines: molecular dynamics, seismic processing, acoustical
modeling, finite element methods, computational fluid dynamics,
meteorological modeling, and data mining.
We also contributed significantly to the development of record-setting
NAS Parallel Benchmarks and Linpack TPP benchmarks for the SP2.
More information about our work and some of our people: