|
The IBM ASTI optimizer provides the foundation for
high-order transformations and automatic shared-memory parallelization
in the latest IBM XL FORTRAN (XLF) compilers for RS/6000* and
PowerPC® uniprocessors and symmetric multiprocessors (SMPs), and
for automatic distributed-memory parallelization in the IBM XL
High-Performance FORTRAN (XLHPF) compiler for the SP2®
distributed-memory multiprocessor. In this paper, we describe how the
transformer component of the ASTI optimizer automatically selects
high-order transformations for a given input program and a target
uniprocessor, so as to improve utilization of the memory hierarchy
(including cache and registers) and instruction-level parallelism. Our
approach formulates the optimization problems in terms of
quantitative cost models. The loop and
data transformations currently employed by the ASTI transformer for
optimizing uniprocessor performance are loop
distribution, loop interchange, loop reversal, loop skewing, loop
tiling/blocking (with compiler-selected tile sizes), loop fusion,
unrolling of multiple loops (with compiler-selected unroll factors),
and scalar replacement of selected array references. The design and
initial implementation of the ASTI optimizer were completed between
1991 and 1993. To the best of our knowledge, the ASTI
transformer is the first system to perform automatic selection of this
wide range of transformations using a cost-based framework.
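
To make three of the listed transformations concrete, the following minimal sketch applies tiling, unrolling of an outer loop, and scalar replacement by hand to a matrix multiply. It is an illustration only and not ASTI's implementation: ASTI operates on Fortran programs, Python is used here purely for readability, the function names are hypothetical, and the tile size and unroll factor are fixed by hand, whereas the point of the transformer is that such parameters are selected by the compiler from cost models.

```python
def matmul_baseline(a, b, n):
    """Reference triple loop: c[i][j] is read and written on every k iteration."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_transformed(a, b, n, tile=2):
    """Same computation after three hand-applied transformations.

    Assumptions for illustration: `tile` and the unroll factor of 2 are
    arbitrary choices, not values derived from any cost model.
    """
    c = [[0] * n for _ in range(n)]
    # Tiling: the j loop is split into blocks of `tile` columns.
    for jj in range(0, n, tile):
        j_end = min(jj + tile, n)
        i = 0
        # Unrolling of the outer i loop by 2, with both copies fused
        # into one inner k loop (often called unroll-and-jam).
        while i + 1 < n:
            for j in range(jj, j_end):
                # Scalar replacement: the two accumulators stand in for
                # c[i][j] and c[i+1][j], and bkj for the shared b[k][j].
                s0 = 0
                s1 = 0
                for k in range(n):
                    bkj = b[k][j]
                    s0 += a[i][k] * bkj
                    s1 += a[i + 1][k] * bkj
                c[i][j] = s0
                c[i + 1][j] = s1
            i += 2
        # Remainder row when n is odd.
        if i < n:
            for j in range(jj, j_end):
                s0 = 0
                for k in range(n):
                    s0 += a[i][k] * b[k][j]
                c[i][j] = s0
    return c
```

Both functions compute the same product; the transformed version loads each `b[k][j]` once for two rows and keeps the running sums in scalars, which is the kind of register and cache reuse these transformations are intended to expose.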
|