Performance Optimization and Verification Technologies (POVT)
The team explores and develops technologies in two separate but complementary areas that require a similar technical background and knowledge: Performance Optimization and Hardware Verification.
Performance analysis and optimization have become increasingly important tasks as underlying hardware complexity increases. Over the years, the POVT group has developed leading-edge technologies to enable better utilization of computer system architectural resources. The technologies are embodied in several contexts such as automatic performance optimization and tuning, binary instrumentation, and performance analysis via visualization. Our engagement with customers has resulted in significant performance improvements for many important enterprise applications and systems as well as in enhanced benchmarking measurements such as SPEC CPU and TPC-C.
In the Hardware Verification domain, we work on a solution to the growing problem of pervasive hardware and software verification. More specifically, the team explores and develops a new model-based approach to implementing pervasive code sequences and procedures. These sequences, which are part of the system firmware, are responsible for a variety of important processor functions that complement regular processor functionality, such as the boot process, power on/off, power management, and unit and block management. This approach will allow hardware and software development teams to formally model the relevant aspects of the pervasive logic and the corresponding firmware code to facilitate their co-design and co-implementation. We aim to develop a technology that will enable model-based generation of the actual code for pervasive sequences and procedures, their system-configuration-dependent variants, behavioral simulation models for pervasive hardware, and test cases for hardware and software verification.
In addition, the team has recently started to explore a new Performance Verification domain that complements our Performance Optimization and Hardware Verification activities and requires deep knowledge of both. The goal is to learn the performance verification challenges and performance-related bugs of real processor designs and to develop new methodologies and technologies to address them.
Feedback Directed Program Restructuring (FDPR)
FDPR is a feedback-based post-link optimization tool. It optimizes the executable image of a program by collecting a behavior profile while the program runs a typical workload, and then creating a new version of the program that is optimized for that workload. The new program generated by FDPR typically runs faster and uses less memory. FDPR performs global optimizations at the level of the entire executable. Since the executable to be optimized by FDPR is not re-linked, the compiler and linker conventions do not need to be preserved, which allows aggressive optimizations that are not usually available to optimizing compilers. The tool is suitable for very large programs and DLLs. The major optimizations of FDPR include global code and data reordering, function inlining, loop unrolling, inter-procedural register re-allocation, improved instruction scheduling, and data prefetching.
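To make the idea of profile-driven code reordering more concrete, the following is a minimal sketch of one well-known greedy layout heuristic: blocks connected by the hottest profiled edges are chained together so that the hot path falls through without taken branches. This is an illustration only and is not FDPR's actual algorithm. The function name `reorder_blocks`, the block names, and the edge counts are all hypothetical, and the sketch assumes every profiled edge refers to a known block.

```python
from collections import defaultdict


def reorder_blocks(blocks, edge_counts):
    """Greedy hot-path layout: chain blocks along the hottest edges first.

    blocks      -- block names in original layout order; blocks[0] is the entry
    edge_counts -- {(src, dst): execution count} taken from a profile run
    """
    entry = blocks[0]
    chain_of = {b: [b] for b in blocks}  # every block starts in its own chain

    # Visit edges from hottest to coldest; ties are broken by name so the
    # result is deterministic.
    for (src, dst), _ in sorted(edge_counts.items(),
                                key=lambda kv: (-kv[1], kv[0])):
        a, b = chain_of[src], chain_of[dst]
        # Merge only when src ends one chain and dst starts a different one,
        # so dst becomes the fall-through successor of src. The entry block
        # is never placed behind another block.
        if a is not b and a[-1] == src and b[0] == dst and dst != entry:
            a.extend(b)
            for blk in b:
                chain_of[blk] = a

    # Total profile weight touching each block, used to order the chains.
    heat = defaultdict(int)
    for (src, dst), count in edge_counts.items():
        heat[src] += count
        heat[dst] += count

    # Collect the distinct chains in original layout order.
    chains, seen = [], set()
    for b in blocks:
        chain = chain_of[b]
        if id(chain) not in seen:
            seen.add(id(chain))
            chains.append(chain)

    # Entry chain first, then the remaining chains from hottest to coldest.
    first = chain_of[entry]
    rest = [c for c in chains if c is not first]
    rest.sort(key=lambda c: -sum(heat[b] for b in c))
    return [b for c in [first] + rest for b in c]


if __name__ == "__main__":
    # Hypothetical profile: the error path is taken once in 100 runs.
    layout = ["entry", "check", "error", "work", "exit"]
    profile = {
        ("entry", "check"): 100,
        ("check", "error"): 1,
        ("check", "work"): 99,
        ("work", "exit"): 99,
        ("error", "exit"): 1,
    }
    print(reorder_blocks(layout, profile))
```

With this hypothetical profile, the rarely executed `error` block is moved out of the hot path and placed after `exit`, so the frequently executed blocks become contiguous.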
FDPR is currently available on the following platforms:
- AIX/POWER - a part of the operating system since AIX Version 5
- Linux for POWER - available from the IBM Linux SDK site (https://www-304.ibm.com/webapp/set2/sas/f/lopdiags/sdklop.html)
- z/OS and z/Linux - IBM internal prototype
- Linux/x86 - IBM internal prototype
Hands On Performance Consulting (HOPC)
Emerging multicore platforms introduce a challenge to the development cycle. The sole purpose of many multicore platforms is to provide better performance. On these platforms, performance debugging and performance analysis have become essential to the development cycle, because correctness on a single core is no longer sufficient.
Performance debugging differs from traditional correctness debugging because regular debuggers cannot detect any of the following:
- Delays from synchronization effects between running threads
- Delays from improper load balancing on CPUs
- Delays from memory affinity
- Delays from mutual cache invalidations of global shared data
- Delays from pipeline stalls in each core
- Potential for parallelization and simultaneous execution
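As a small illustration of the first item in the list above, the sketch below wraps a lock so that the time threads spend blocked while acquiring it is accumulated and can be reported. A correctness debugger would show nothing wrong with such a program, yet the measured wait time exposes the synchronization delay. The `TimedLock` class and the `run_workers` helper are hypothetical names introduced for this example, and the sleep stands in for real work inside a critical section.

```python
import threading
import time


class TimedLock:
    """Lock wrapper that accumulates the time threads spend blocked on it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.wait_seconds = 0.0   # total time spent blocked in acquire()
        self.acquisitions = 0     # number of successful acquisitions

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        waited = time.perf_counter() - start
        # The counters are updated while the lock is held, so no additional
        # synchronization is needed to protect them.
        self.wait_seconds += waited
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False  # do not suppress exceptions


def run_workers(lock, n_threads, hold_seconds):
    """Start n_threads workers that each hold the lock for hold_seconds."""

    def worker():
        with lock:
            time.sleep(hold_seconds)  # stand-in for critical-section work

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    lock = TimedLock()
    run_workers(lock, n_threads=4, hold_seconds=0.05)
    print(f"acquisitions: {lock.acquisitions}")
    print(f"total time blocked on the lock: {lock.wait_seconds:.3f} s")
```

Because the four workers serialize on one lock, the total blocked time grows much faster than the time any single worker holds the lock. The same instrumentation idea applies to other synchronization primitives, although production tools typically gather this data through hardware counters or tracing rather than wrappers.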
HOPC is a complete service provided by HRL performance experts that includes:
- Effective and practical performance debugging on IBM AIX and Linux systems, carried out together with the customer
- Isolation and resolution of performance issues and complicated development bugs in multicore applications on IBM POWER platforms
- Recommendations for development patterns and compilation options