Performance programming

Performance programming is the design, writing, and tuning of programs to sustain near-peak performance. Beyond selecting algorithms with good asymptotic complexity, developing high-performance programs has always required acute sensitivity to the details of processor and memory-hierarchy architecture. The advent of modern workstations and supercomputers brings another concern to the fore: parallelism.

Performance programming seeks to improve performance beyond what is achieved by programming an algorithm in the most expedient manner. The goal is that each processing element be kept as busy as possible doing useful work. This entails satisfying four requirements: breaking problems into independent subproblems that can be executed concurrently, distributing these subproblems appropriately among the processing elements, making sure that the necessary data is close to its processing element, and overlapping communication with computation where possible. To attain high performance, these requirements must be satisfied whether one views "processing elements" as stages of an arithmetic or vector pipeline, functional units of a CPU, processors of a tightly coupled shared-memory multiprocessor, nodes of a distributed-memory supercomputer, or heterogeneous computers on a network.
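As a rough illustration of the first two requirements (breaking a problem into independent subproblems and distributing them among processing elements), here is a minimal Python sketch. The function names (split_range, partial_sum_of_squares, parallel_sum_of_squares) and the choice of a sum of squares as the workload are hypothetical and not taken from the text above. Threads stand in for "processing elements" only to keep the sketch short; in the standard CPython build the global interpreter lock means this pure-Python loop will not actually run faster, so the point is the structure, not the timing.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous, nearly equal (start, stop) chunks."""
    base, extra = divmod(n, parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each,
        # so chunk sizes differ by at most one (balanced distribution).
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def partial_sum_of_squares(data, start, stop):
    """Independent subproblem: reads only data[start:stop], writes nothing shared."""
    total = 0
    for i in range(start, stop):
        total += data[i] * data[i]
    return total


def parallel_sum_of_squares(data, workers=4):
    """Distribute one chunk per worker, then combine the partial results."""
    chunks = split_range(len(data), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(partial_sum_of_squares, data, start, stop)
            for start, stop in chunks
        ]
        return sum(f.result() for f in futures)
```

Each chunk is contiguous, so a worker walks a compact region of the input, which loosely mirrors the requirement that data be close to its processing element. Overlapping communication with computation is not shown here.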

See also the performance programming course and some talks.


