LeProf, a source-level profiling tool, and its companion post-processing tool LeProft, are used to
profile programs, providing information regarding the programs' behavior. LeProf dynamically instruments the program to be profiled, and is thereby able to capture the effects of both user
code and shared libraries. LeProf gathers extensive data about the instructions executed by the program, the function call behavior, the behavior of branches, and the data cache behavior (i.e., cache
misses). Data is gathered at the following granularities:
- for the entire program;
- per function;
- per line of source code (if the program has been compiled using the option "-g");
- per PowerPC instruction.
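The granularities listed above nest inside one another, so coarser totals can be obtained by summing finer ones. The following is a minimal sketch of that roll-up, not LeProf's implementation: the record layout (function, source line, instruction address, execution count) and the name `aggregate` are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical per-instruction records:
# (function, source line, instruction address, execution count).
# This layout is an assumption; it is not LeProf's actual output format.
records = [
    ("main", 10, 0x1000, 1),
    ("main", 11, 0x1004, 1),
    ("dot", 20, 0x2000, 500),
    ("dot", 20, 0x2004, 500),
    ("dot", 21, 0x2008, 500),
]

def aggregate(records):
    """Roll per-instruction counts up to line, function, and program totals."""
    per_instruction = Counter()
    per_line = Counter()
    per_function = Counter()
    for function, line, address, count in records:
        per_instruction[address] += count
        per_line[(function, line)] += count
        per_function[function] += count
    program_total = sum(per_function.values())
    return program_total, per_function, per_line, per_instruction

total, by_function, by_line, by_instruction = aggregate(records)
print(total)                       # count for the entire program
print(by_function.most_common(1))  # function executing the most instructions
print(by_line.most_common(1))      # source line executing the most instructions
```

Per-line totals are only meaningful when the instruction-to-line mapping is available, which is why the per-line granularity requires compiling with "-g".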
Moreover, the tools generate the function-call graph, weighted by call frequency. The tracing/profiling mechanism relies on Aria, the tool for dynamic instrumentation of
programs. Aria dynamically instruments the program and generates an execution trace, which is used as input to a trace analyzer. The trace analyzer collects the data regarding the execution of the program
and generates a file with the results. This file can be used as is to extract information regarding the program, or can be used as input to LeProft for generating summary results at the source code line
level (assuming that the program has been compiled with the -g option; note that the xlc family of compilers permits the use of -g with various levels of optimization, including -O2 and -O3).
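To illustrate how a trace analyzer might derive a weighted function-call graph from an execution trace, here is a small sketch. The event format (("call", name) and ("ret",) tuples), the `<root>` placeholder node, and the function `build_call_graph` are all hypothetical; the actual Aria trace format is not described here.

```python
from collections import Counter

def build_call_graph(trace, root="<root>"):
    """Count caller -> callee edges from ("call", name) / ("ret",) events."""
    edges = Counter()
    stack = [root]  # current call stack; the top is the active function
    for event in trace:
        if event[0] == "call":
            callee = event[1]
            edges[(stack[-1], callee)] += 1
            stack.append(callee)
        elif event[0] == "ret" and len(stack) > 1:
            stack.pop()
    return edges

# Hypothetical trace: main calls f twice, and each f calls g once.
trace = [
    ("call", "main"),
    ("call", "f"), ("call", "g"), ("ret",), ("ret",),
    ("call", "f"), ("call", "g"), ("ret",), ("ret",),
    ("ret",),
]
graph = build_call_graph(trace)
for (caller, callee), weight in sorted(graph.items()):
    print(f"{caller} -> {callee}: {weight}")
```

Each edge weight is the number of times that caller invoked that callee, which is one plausible reading of a call graph weighted by call frequency.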
For example, finding the "hot spots" in the program foo.c is achieved as follows:
- compile the program using the -g flag (required to gather data at the source code line level)
    xlc -O2 -g -o foo foo.c
- profile the program with LeProf, writing the results to foo.stats
    leprof -o foo.stats foo inputs
- find the source code line from which the most instructions are executed
    leproft total foo.stats
The tool gathers information regarding the data cache behavior of a program by simulating the directory of a data cache memory. Currently, the following configurations are available (the configuration is selected by a run-time option):
- 603, 603e, 604: 16 KB, 4-way, 32-byte line
- 604e: 32 KB, 4-way, 32-byte line
- POWER, RIOS: 64 KB, 4-way, 128-byte line
- P2SC: 128 KB, 4-way, 128-byte line
- POWER2, RIOS2: 256 KB, 4-way, 256-byte line
.....
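Simulating only the directory of a cache means tracking which line tags are resident, without storing any data. The sketch below shows the idea for a set-associative cache. It is an illustration under stated assumptions, not LeProf's simulator: the class name `CacheDirectory` is hypothetical, and the LRU replacement policy is assumed because the replacement policy is not specified above.

```python
from collections import OrderedDict

class CacheDirectory:
    """Tag directory of a set-associative cache; only tags are kept, no data."""

    def __init__(self, size_bytes, ways, line_bytes):
        self.ways = ways
        self.line_bytes = line_bytes
        self.num_sets = size_bytes // (ways * line_bytes)
        # One OrderedDict per set, ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Record one access; return True on a hit, False on a miss."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)     # mark as most recently used
            self.hits += 1
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)  # evict least recently used (assumed policy)
        entries[tag] = True
        self.misses += 1
        return False

# 16 KB, 4-way, 32-byte line: the configuration listed for the 603, 603e, and 604.
cache = CacheDirectory(16 * 1024, 4, 32)
for address in range(0, 4096, 4):   # first pass over 4 KB, one access per word
    cache.access(address)
for address in range(0, 4096, 4):   # second pass: 4 KB fits, so every access hits
    cache.access(address)
print(cache.num_sets, cache.misses, cache.hits)  # prints: 128 128 1920
```

With these parameters the cache has 16384 / (4 * 32) = 128 sets. The first pass touches 128 distinct lines and misses once per line; every other access hits.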
See the Publications and Presentations for further information regarding LeProf and LeProft.