I've always used -fprofile-arcs to instrument and profile my code, and I've often been quite unimpressed by the verbosity and presentation of the output. My opinion of profilers was down in the dumps until I ran into KCacheGrind. It really is a killer application for profiling code.
KCacheGrind has everything I need: call graphs, call counters, callee counters, and it even lets me view code with profiling data in place. That last one is quite a kick-ass thing, really. Now I can actually see which branch is taken what percentage of the time, pre-compute branch conditions and basically understand how the code is being used.
But what are words when a picture could do more justice to the tool? The screenshots below show source annotations for a bit of APC code; the numbers might scare a few, but that's what it deals with on an average web server. This is how I start callgrind:

    bash# valgrind --tool=callgrind --dump-instr=yes --trace-jump=yes -v \
              /usr/sbin/apache2 -f /opt/php5/etc/apache2/apache2.conf -X
    bash# chmod 0666 callgrind.*        # so that www-data can write to it
    bash# kill -USR2 <callgrind-pid>    # to kill apache2 gracefully
After running my code, which is actually a bit of PHP non-code (you know, the Ruby on Rails or J2EE flavour of non-code) generated by symfony, I get a callgrind dump, which I can then analyze with KCacheGrind.
![](http://notmysock.org/blog/images/kcachegrind-base.png)
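For anyone repeating this, here is a rough sketch of how one might find the newest dump and open it. It assumes the dumps use callgrind's default `callgrind.out.<pid>` naming and sit in the current directory; `latest_dump` is a hypothetical helper of mine, not part of valgrind or KCacheGrind.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified callgrind dump
# in a directory (default: current directory), or nothing if none exist.
# Assumes the default callgrind.out.<pid> naming and paths without whitespace.
latest_dump() {
    dir="${1:-.}"
    ls -t "$dir"/callgrind.out.* 2>/dev/null | head -n 1
}

dump="$(latest_dump .)"
if [ -n "$dump" ]; then
    echo "newest dump: $dump"
    # Open it in the GUI, or get a plain-text summary on the terminal:
    echo "try: kcachegrind $dump"
    echo "or:  callgrind_annotate $dump"
else
    echo "no callgrind.out.* files found here" >&2
fi
```

The script only prints the commands rather than running them, so it is safe to try on a box without the tools installed.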
I can then see the call graph, which looks something like this.
![](http://notmysock.org/blog/images/kcachegrind-callgraph.png)
The really cool part comes when you add source directory annotations for the corresponding binary files in the configuration. Then I get something that actually makes sense to a programmer.
![](http://notmysock.org/blog/images/kcachegrind-annotate.png)
I guess the screenshots speak for themselves. Thanks to this, and after moving pre-computations into place in the code, APC is about 3-10% faster for PHP 5.1.x.
--You cannot have a science without measurement.
-- R. W. Hamming