Chapter 43. OProfile

OProfile is a low-overhead, system-wide performance monitoring tool. It uses the performance monitoring hardware on the processor to retrieve information about the kernel and executables on the system, such as when memory is referenced, the number of L2 cache requests, and the number of hardware interrupts received. On a Red Hat Enterprise Linux system, the oprofile RPM package must be installed to use this tool.

Many processors include dedicated performance monitoring hardware. This hardware makes it possible to detect when certain events happen (such as the requested data not being in cache). The hardware normally takes the form of one or more counters that are incremented each time an event takes place. When the counter value "rolls over," an interrupt is generated. Because the number of events needed to reach rollover can be chosen, it is possible to control the amount of detail (and therefore, overhead) produced by performance monitoring.
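The trade-off between detail and overhead can be illustrated with simple arithmetic. The following is a minimal sketch, not part of OProfile: the function name samples_per_second and both input numbers are assumptions chosen for illustration. It divides an assumed event rate by the number of events between counter rollovers to estimate how many sampling interrupts occur each second.

```shell
#!/bin/sh
# Estimate how many sampling interrupts per second a counter produces.
# The function name and the numbers below are illustrative assumptions.
samples_per_second() {
    events_per_second=$1   # how often the monitored event occurs
    count=$2               # events between rollovers (one sample per rollover)
    echo $(( events_per_second / count ))
}

# Hypothetical event that occurs one billion times per second.
samples_per_second 1000000000 1000000   # larger count: fewer interrupts, less overhead
samples_per_second 1000000000 100000    # smaller count: ten times as many interrupts
```

With these assumed inputs the two calls print 1000 and 10000. A smaller count gives finer detail at the cost of more interrupts.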

OProfile uses this hardware (or a timer-based substitute in cases where performance monitoring hardware is not present) to collect samples of performance-related data each time a counter generates an interrupt. These samples are periodically written out to disk; the data they contain can later be used to generate reports on system-level and application-level performance.

Important

The kernel support for OProfile in Red Hat Enterprise Linux 3 is based on code back-ported from the 2.5 development kernel. When referring to OProfile documentation, 2.5-specific features apply to OProfile in Red Hat Enterprise Linux 3, even though the kernel version is 2.4. Likewise, OProfile features specific to the 2.4 kernel do not apply to Red Hat Enterprise Linux 3.

OProfile is a useful tool, but be aware of some limitations when using it:

  • Use of shared libraries — Samples for code in shared libraries are not attributed to the particular application unless the --separate=library option is used.

  • Performance monitoring samples are inexact — When a performance monitoring register triggers a sample, the interrupt is not precise in the way that a divide-by-zero exception is. Due to the out-of-order execution of instructions by the processor, the sample may be recorded on a nearby instruction.

  • oprofpp does not associate samples for inline functions properly — oprofpp uses a simple address range mechanism to determine which function an address is in. Inline function samples are not attributed to the inline function but rather to the function the inline function was inserted into.

  • OProfile accumulates data from multiple runs — OProfile is a system-wide profiler and expects processes to start up and shut down multiple times. Thus, samples from multiple runs accumulate. Use the command opcontrol --reset to clear out the samples from previous runs.

  • Non-CPU-limited performance problems — OProfile is oriented to finding problems with CPU-limited processes. OProfile does not identify processes that are asleep because they are waiting on locks or for some other event to occur (for example, an I/O device finishing an operation).
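Two of the limitations above are addressed with opcontrol options named in the list. The following is a minimal sketch of applying both before a measurement. The wrapper function prepare_profile_run and its preview argument are hypothetical; only the --separate=library and --reset options come from the text. It assumes that opcontrol is run as root when the commands are actually executed.

```shell
#!/bin/sh
# Prepare for a fresh profiling run.
# prepare_profile_run is a hypothetical helper, not part of the oprofile package.
prepare_profile_run() {
    run=${1:-opcontrol}       # pass "echo opcontrol" to preview without executing
    $run --separate=library   # attribute shared-library samples to each application
    $run --reset              # clear out samples accumulated by previous runs
}

# Preview the commands only; call without an argument (as root) to execute them.
prepare_profile_run "echo opcontrol"
```

As written, the script only prints the two opcontrol command lines, so it can be tried safely on a system where OProfile is not yet configured.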

In Red Hat Enterprise Linux, only the multi-processor (SMP) kernels have OProfile support enabled. To determine which kernel is running, issue the following command:

uname -r

If the kernel version returned ends in .entsmp, the multi-processor kernel is running. If it does not, install the multi-processor kernel via Red Hat Network or from the distribution CDs, even if the system is not a multi-processor system. The multi-processor kernel can run on a single-processor system.
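The suffix check can be scripted. The following is a minimal sketch that tests the output of uname -r for the .entsmp suffix described above. The function name is_smp_kernel and the messages are assumptions made for illustration.

```shell
#!/bin/sh
# Succeed if the given kernel release string ends in ".entsmp".
# is_smp_kernel is a hypothetical helper name.
is_smp_kernel() {
    case "$1" in
        *.entsmp) return 0 ;;
        *)        return 1 ;;
    esac
}

release=$(uname -r)
if is_smp_kernel "$release"; then
    echo "$release: multi-processor kernel is running"
else
    echo "$release: multi-processor kernel is not running; install it before using OProfile"
fi
```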

43.1. Overview of Tools

Table 43-1 provides a brief overview of the tools provided with the oprofile package.

Command        Description
opcontrol      Configures what data is collected. Refer to Section 43.2 Configuring OProfile for details.
op_help        Displays available events for the system's processor along with a brief description of each.
op_merge       Merges multiple samples from the same executable. Refer to Section 43.5.4 Using op_merge for details.
op_time        Gives an overview of all profiled executables. Refer to Section 43.5.1 Using op_time for details.
op_to_source   Creates annotated source for an executable if the application was compiled with debugging symbols. Refer to Section 43.5.3 Using op_to_source for details.
oprofiled      Runs as a daemon to periodically write sample data to disk.
oprofpp        Retrieves profile data. Refer to Section 43.5.2 Using oprofpp for details.
op_import      Converts sample database files from a foreign binary format to the native format for the system. Only use this command when analyzing a sample database from a different architecture.

Table 43-1. OProfile Commands
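To confirm that the commands in Table 43-1 are available after installing the oprofile package, a quick check such as the following can be used. This is a minimal sketch: the function name check_tools is an assumption, and the check only tests whether each command is found on the current PATH.

```shell
#!/bin/sh
# Report whether each named command is found on the current PATH.
# check_tools is a hypothetical helper name.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: not found"
        fi
    done
}

# Commands listed in Table 43-1.
check_tools opcontrol op_help op_merge op_time op_to_source oprofiled oprofpp op_import
```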