Volume 38, Number 5, 1994
POWER2 and PowerPC architecture
DOI: 10.1147/rd.385.0563
 |
 |
 |
 |
Exploiting functional parallelism of POWER2 to design high-performance numerical algorithms
 |
by R. C. Agarwal, F. G. Gustavson, and M. Zubair
 |
 |
 |
 |
We describe the algorithm and architecture approach to producing
high-performance codes for numerically intensive computations.
In this approach, for a given computation, we design algorithms
so that they perform optimally when run on a target machine--in
this case, the new POWER2* machines from the RS/6000 family of
RISC processors. The algorithmic features that we emphasize are
functional parallelism, cache/register blocking, algorithmic
prefetching, loop unrolling, and algorithmic restructuring. The
architectural features of the POWER2 machine that we describe
and that lead to high performance are multiple functional units,
high bandwidth between registers, cache, and memory, a large
number of fixed- and floating-point registers, and a large cache
and TLB (translation lookaside buffer). The paper gives two
examples that illustrate how the algorithms and architectural
features interplay to produce high-performance codes: the BLAS
(Basic Linear Algebra Subprograms) and narrow-band matrix
routines. These routines are included in ESSL (Engineering and
Scientific Subroutine Library); an overview of ESSL is also
given in this paper.
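
To make two of the techniques named above concrete, here is a minimal sketch in C of loop unrolling used to expose functional parallelism in a BLAS-style dot product. It is not taken from the paper or from ESSL; the function names `dot_naive` and `dot_unrolled4` and the unroll factor of four are assumptions chosen for illustration. The idea is that independent accumulators break the single dependency chain of the naive loop, so a processor with more than one floating-point unit can keep several multiply-adds in flight at once.

```c
#include <stddef.h>

/* Reference version: a single accumulator, so each multiply-add
   must wait for the result of the previous one. */
double dot_naive(const double *x, const double *y, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}

/* Illustrative version (hypothetical, not the ESSL implementation):
   the loop is unrolled by four and each unrolled statement updates
   its own accumulator. The four chains are independent of one
   another, which gives the hardware and compiler room to schedule
   them on separate floating-point units. */
double dot_unrolled4(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main unrolled loop: processes four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    /* Cleanup loop: handles the remaining 0 to 3 elements. */
    for (; i < n; i++)
        s0 += x[i] * y[i];

    /* Combine the partial sums. */
    return (s0 + s1) + (s2 + s3);
}
```

Because the unrolled version adds terms in a different order, its result can differ from the naive one in the last few bits for general floating-point data; the two agree exactly when all intermediate sums are exactly representable.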
 |
 |
*IBM, RISC System/6000, AIX, POWER Architecture, PowerPC, PowerPC Architecture, PowerPC 601, PowerPC 603, PowerPC 604, PowerPC 620, POWER2, POWER Parallel SP2, and Micro Channel are all trademarks or registered trademarks of International Business Machines Corporation.