Ever written a program that did everything it should but was just too slow to be usable? Before you can actually improve the speed of your program, you have to figure out which part is the bottleneck. That's what profilers are for.
OProfile runs exclusively under Linux and consists of a kernel module that has been integrated into the 2.6 kernel tree. A userspace daemon controls the kernel module, much as iptables controls the kernel packet filtering modules. In addition there are various reporting tools, so you don't need root privileges to examine the collected profiles. In general, however, it is not a good idea to use OProfile in multi-user environments: OProfile allows every user to view the code paths of all running processes, which may lead to the compromise of secret keys.
To generate the data, OProfile uses hardware interrupts to monitor the executed programs at fixed time intervals. Each time an interrupt is received, the kernel module checks which part of a program is currently being executed and stores this information. This even works for kernel-space code, which means that it is possible to do kernel profiling with OProfile. You can then generate a report, for example one showing how many times the profiler saw your program executing each line of its code.
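The sampling idea described above can be illustrated with a small simulation. This is only a sketch of the principle, not OProfile's implementation: the trace, the symbol names, and the functions `sample_trace` and `report` are hypothetical, and a real profiler reads the program counter from the interrupted CPU rather than walking a prepared list.

```python
from collections import Counter


def sample_trace(trace, interval):
    """Sample a simulated execution trace at fixed tick intervals.

    trace: list of (symbol, ticks) pairs in execution order (made-up data).
    interval: number of ticks between two simulated interrupts.
    Returns a Counter mapping symbol -> number of samples that hit it.
    """
    samples = Counter()
    clock = 0
    next_sample = interval  # tick at which the next "interrupt" fires
    for symbol, ticks in trace:
        clock += ticks
        # Record one sample for every interrupt that fired while this
        # symbol was running.
        while next_sample <= clock:
            samples[symbol] += 1
            next_sample += interval
    return samples


def report(samples):
    """Return (symbol, samples, percent) rows, most-sampled symbol first."""
    total = sum(samples.values())
    return [(symbol, count, 100.0 * count / total)
            for symbol, count in samples.most_common()]


# Hypothetical program: parse, compute, parse again, then write.
trace = [("parse", 30), ("compute", 150), ("parse", 20), ("write", 50)]
for symbol, count, percent in report(sample_trace(trace, 10)):
    print(f"{symbol:10s} {count:5d} {percent:6.1f}%")
```

With this made-up trace, "compute" collects 15 of the 25 samples, which matches its 60% share of the simulated run time.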
There are some advantages and disadvantages to using hardware-based counters. Hardware interrupts are good because they generate little overhead, which means that the program is almost unaffected by the profiler. More importantly, this approach does not change the object code of a program, thus avoiding nasty things like heisenbugs (bugs that go away when you look at them). The price you have to pay is that you only get statistical data and not the actual numbers on how many times an instruction has been executed. And what's even worse, unavoidable delays and hardware optimizations may trick the profiler into thinking that a different instruction is being executed. Despite relying on hardware-specific interrupts, OProfile is supported on surprisingly many architectures, although not on every architecture Linux has been ported to.
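To make the difference between statistical samples and exact execution counts concrete, here is a minimal simulation. It is a sketch under assumptions: the instruction names, cycle costs, and the `profile` function are invented for illustration and do not model any real hardware counter.

```python
from collections import Counter


def profile(trace, interval):
    """Return (exact execution counts, sample counts) for a simulated trace.

    trace: list of (instruction, cycles) pairs in execution order (made-up data).
    interval: cycles between two simulated sampling interrupts.
    """
    executions = Counter()
    samples = Counter()
    clock = 0
    next_sample = interval
    for instruction, cycles in trace:
        executions[instruction] += 1  # what an instrumenting profiler would count
        clock += cycles
        # What a sampling profiler sees: whichever instruction was running
        # when the interrupt fired.
        while next_sample <= clock:
            samples[instruction] += 1
            next_sample += interval
    return executions, samples


# Hypothetical workload: "cheap" runs 1000 times at 1 cycle each,
# "costly" runs 10 times at 400 cycles each.
trace = ([("cheap", 1)] * 100 + [("costly", 400)]) * 10
executions, samples = profile(trace, interval=50)
print("executions:", dict(executions))
print("samples:   ", dict(samples))
```

In this toy run "cheap" is executed 1000 times but receives only 20 of the 100 samples, while "costly" is executed 10 times and receives 80. The samples tell you where the time goes, not how often something ran.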
In practice you shouldn't have to deal with these issues. OProfile requires only a few steps to generate useful results. If you do run into trouble when profiling your code, the well-written documentation will probably help you out.
There are various types of reports available, ranging from source code annotated with execution times to call graphs. It is possible to compare two different profiles in order to decide which code runs faster. The only thing that is difficult is profiling code in shared libraries. You have to merge the library profile with the application profile first, which is a bit inconvenient.
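Conceptually, comparing two profiles boils down to comparing sample counts per symbol. The following sketch shows that idea only; it is not how OProfile's reporting tools work internally. The function `compare_profiles` and the sample data are hypothetical, and the comparison assumes both runs used the same workload and the same sampling interval.

```python
def compare_profiles(before, after):
    """Return the per-symbol change in sample counts between two runs.

    before, after: dicts mapping symbol -> sample count (made-up data).
    A negative value means the symbol was sampled less often in the second
    run, i.e. it took less time, assuming identical workload and sampling
    interval in both runs.
    """
    symbols = sorted(set(before) | set(after))
    return {s: after.get(s, 0) - before.get(s, 0) for s in symbols}


# Hypothetical sample counts before and after optimizing "compute".
before = {"parse": 50, "compute": 150}
after = {"parse": 50, "compute": 50, "cache_lookup": 5}

for symbol, delta in compare_profiles(before, after).items():
    print(f"{symbol:14s} {delta:+d}")
```

Symbols that appear in only one of the two profiles are treated as having zero samples in the other, so newly introduced code shows up as a positive change.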
When it comes to generating reliable statistical information on code performance, OProfile is the right choice. It is easy to use and offers important insight into your programs.