Wed, 18 Feb 2009:

The really hard part of APC is the internal locking code it has - it's not that hard to do, just hard to figure out if you've done it wrong. And I'm just about to really mess around with the assembly spin locks and pthread mutex locks to make them cross-process locks which live in shared memory (remember that "volatile" keyword in C?). The other couple of lock modes are already cross-process and slow (because of the syscall). If these work right, I won't really have to cripple the fast part of the code to implement the features I have in mind.

But before I go MIA in the locking code, I'd like to get my testing in place. So I've written a tiny test app called lockhammer - read the makefile and please run it on every platform you want APC to work on. (make APC_DIR=~/apc link; make)

The code in lockhammer.c should be easily understood - basically it allocates some shared memory, creates a lock in it, forks, re-attaches the memory in each process. Every process is a loop of lock, write PID into shm, sleep, check the PID. If someone has a better idea of how to test locks, or thinks there's some corner case I missed, I'd also like modifications to it (yes, random sleep & random fork-order are also on my list of TODOs).

Fundamentally, the information about locks is privately held within the lock type code in APC. The information needs to be moved into a shared mode (or at least, transparent) for multiple un-related processes to be able to share the cache without collisions. Eventually, you should be able to use APC in a standard FastCGI deployment without allocating a cache per-process.

And if you're a user, I'd like to read something other than a bug report, occasionally.

They're gonna lock me up and throw away the key!

posted at: 19:07 | path: /php

Tue, 16 Dec 2008:

Finally, after nearly a year of work, it's into a release. Some new stuff has sneaked into it undocumented, that people might find interesting - apc.preload_path would be one of them. The backend memory allocation has been re-done - the api part by me and the internals by shire. There's a hell of a lot of new code in there, both rewritten and added. Tons of php4 cruft removed, php5 stuff optimized, made more stable, then less stable, made faster, then applied brakes. Made leak-proof, quake-proof and in general, idiot-proof. So, on & so forth.

 apc/ $ cvs diff -u -N -r HEAD -r RELEASE_3_0_19 | diffstat /dev/stdin 
 68 files changed, 3255 insertions(+), 5545 deletions(-)

Sorry about the b0rked 3.1.1 release, so please test this one! :)

Each new user of a new system uncovers a new class of bugs.
                -- Kernighan

posted at: 15:27 | path: /php

Sat, 18 Oct 2008:

In the development of things, there comes a point when it escapes the vision and control of one man/one mind. PHP frameworks are such ... beasts. But the simplicity a machine took away can be made to return. And such an attempt at zooming out of the complex file structure bureaucracy of most php projects was inclued.

When I hacked up that extension, nearly a year back, I wished that it would shame at least some php programmers into writing better code. And slowly, thanks to a few slides from Rasmus, people are actually realizing how messy their include hierarchy really is. And here's an example of what I'm talking about.

That was the Zend Framework 1.5.2, as blogged by phpimpact - download the big one and look at it. The Joomla CMS has also got its very own pretty picture elsewhere. Rasmus has a bunch of inclued traces from various frameworks - CakePHP, Symfony, Drupal, and perhaps the cleanest of them all, CodeIgniter.

Now, all that remains is a php-graphviz + svg mode which renders these in-browser as an iframe - or maybe someone can help me with the graph reduction to take a collection of the inclued dumps & create a "package". There's no end to the bells and whistles I want to tack onto this.

But as long as people are scrambling head over heels to reduce the number of includes & include_onces, I think I've done my part here.

Must I hold a candle to my shames?
                -- William Shakespeare, "The Merchant of Venice"

posted at: 01:27 | path: /php

Mon, 15 Sep 2008:

It's a protest. A protest against all the gags the establishment has put on php functions - big, small and useless alike. No more shall they remain ignored and voiceless. Hear me now, as the day has come for them to shake off their silence and SCREAM!

@Error: PHP uses the @ operator to silence errors from functions, so that they fail silently. But while tracing through code which uses it, it becomes nearly impossible to properly figure out what is going wrong. The band-aid that is '@' makes it a complete pain to debug code.

Introducing the SCREAM 0.0.1, which has come out of someone else's frustrations with some pear modules which are liberally peppered with such gag instructions. Essentially, it uses the user opcode functionality to override the silence functions into NOPs (literally).

php -dscream.enabled=1 -r '@foo();'

  Fatal error: Call to undefined function foo() in Command line code on line 1

Dump it into php and hopefully debugging sloppily coded libraries should become much easier. This message was brought to you by the dread of Mondays. It's all over now - End Transmission.

You have not convinced a man because you have silenced him.

posted at: 21:09 | path: /php

Mon, 26 May 2008:

That annoying file descriptor leak that snuck into 3.0.17 has finally been laid to rest. A few double free issues were fixed as I spent quite a long time staring at the same code, till enlightenment hit me like a clue bat. Along with that, there are a bunch of quickfixes for 5.3 quirks. I'm not happy with those, but this is the 3_0 stable branch and by the time 5.3 is popular enough the HEAD should be taking care of those problems. The build is broken in VC++ in this release, but excluding apc_pool.c from the build should work.

Expect more changes... as soon as I get back home.

Delay always breeds danger and to protract a great design is often to ruin it.
              -- Miguel De Cervantes

posted at: 04:27 | path: /php

Tue, 06 May 2008:

There's a certain cultural bankruptcy which shows itself in sequels. It indicates that you're reduced to imitating yourself. But this isn't that kind of a sequel. No, not the kind where there are T Rexes in the city, trying to make a living drawing cartoons or Arnie switching from ammo boxes to ballots. This is the kind which gives a New Hope.

Yesterday, I had an outpouring of hate against the linux capability model. But the problem turned out to be that setuid resets all the capabilities. In hindsight that makes a lot of sense, but it didn't even strike me until the kernel people (y! has those too) got involved (and I didn't RTFM).

Enter Prctl: The solution was to use the prctl() call with PR_SET_KEEPCAPS to ensure that the capabilities are not discarded when the effective user-id of a process is changed. But, even then, only the CAP_PERMITTED flags are retained and the CAP_EFFECTIVE are masked to zeros.

So, with the prctl call and another cap_set_proc to reset CAP_EFFECTIVE, it was on a roll. Here's the patch on top of unnice.c.

 #include <sys/resource.h>
+#include <sys/prctl.h>
@@ -26,12 +27,14 @@

+       prctl(PR_SET_KEEPCAPS, 1, 0, 0, 0);

        /* child */
        if(setuid(nobody_uid) < 0)
+       cap_set_proc(lcap);

        if(setpriority(PRIO_PROCESS, 0, getpriority(PRIO_PROCESS, 0) - 1) < 0)

Thus concludes this adventure, and I hope that this blog entry serves as a warning of things to come. Watch this space for more Tales! Of! INTEREST!

Only great masters of style can succeed in being obtuse.

posted at: 18:34 | path: /php

Mon, 05 May 2008:

Running infinite loops is a tricky challenge. What happens to a process when a programmer writes an infinite loop should be familiar to all. But the challenge is to not let that affect the *other* processes. There seemed to be a perfect solution to it - setrlimit().

The function lets you set soft and hard limits on CPU, so that if a process does exceed the soft limit CPU usage, a SIGXCPU is raised. The process can catch the signal and do something sensible. Basically, all that was required was for the process to call setpriority and let the linux process scheduler slow it down to a trickle.

A process can lower its priority, but not raise it - if it is a non-privileged process. But linux capabilities allow you to grant CAP_SYS_NICE to the process, which essentially lets a non-privileged process muck around with priority - down and up.

To begin with, /proc/sys/kernel/cap-bound is unbelievably confusing to use. It is a 32-bit wide bit-mask in which bit 23 apparently seems to be the CAP_SYS_NICE value. After much mucking around, I came to the conclusion that "-257" would be 0xFFFFFEFF, which only disables CAP_SETPCAP. But even then the setpriority call kept failing. Here's my test code.

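cap_t lcap = cap_get_proc();    /* start from the current capability set */
const unsigned cap_size = 1;
cap_value_t cap_list[] = {CAP_SYS_NICE};

cap_set_flag(lcap, CAP_EFFECTIVE, cap_size, cap_list, CAP_SET);
cap_set_flag(lcap, CAP_PERMITTED, cap_size, cap_list, CAP_SET);
cap_set_proc(lcap);             /* presumably applied here in the full test case */

/* nobody_uid: uid of the unprivileged user, looked up elsewhere */
if(setuid(nobody_uid) < 0)
        perror("setuid");

if(setpriority(PRIO_PROCESS, 0, getpriority(PRIO_PROCESS, 0) - 1) < 0)
        perror("setpriority");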

Here's a link to the test case in a more compilable condition. Build it with gcc -lcap and run with sudo to test it. Right now, my ubuntu (2.6.22) errors out with this message.

bash$ gcc -lcap -o unnice unnice.c
bash$ sudo ./unnice 
0: =ep cap_setpcap-ep
setpriority: Permission denied
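As an aside on the cap-bound arithmetic from earlier: the "-257" claim is easy to check by hand. The sketch below assumes the mask layout as described above (one bit per capability number) and copies the two capability numbers from <linux/capability.h> under made-up EX_ names, so it builds without kernel headers. The helper names are hypothetical too.

```c
#include <assert.h>
#include <stdint.h>

/* Capability numbers as defined in <linux/capability.h>, copied here under
 * example-only names so the sketch has no kernel header dependency. */
enum { EX_CAP_SETPCAP = 8, EX_CAP_SYS_NICE = 23 };

/* The value echoed into cap-bound is a signed integer; reinterpret it as
 * the 32-bit mask it stands for. */
uint32_t cap_bound_mask(int32_t written_value)
{
    return (uint32_t)written_value;
}

/* Non-zero if capability number `cap` is still set in the mask. */
int cap_bound_allows(int32_t written_value, int cap)
{
    assert(cap >= 0 && cap < 32);
    return (cap_bound_mask(written_value) >> cap) & 1u;
}
```

cap_bound_mask(-257) comes out as 0xFFFFFEFF: bit 8 (CAP_SETPCAP) is the only one cleared and bit 23 (CAP_SYS_NICE) is still set, so the bounding mask by itself does not explain the failing setpriority call.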

The core issue has to do with apache child-process lifetimes. The only recourse for me is to kill the errant process after the bad infinite loop and have the parent process spawn a new process with a normal priority. But that means blowing away nearly all the local process cache, causing memory churn and more than that, the annoyance of a documented feature not working.

This story currently has no ending, but if any kernel hackers are reading this and should happen to know an answer, please email gopalv shift+2 php noshift+> net. And thus we prepare for a sequel (hopefully).

I use technology in order to hate it more properly.
                -- Nam June Paik

posted at: 22:03 | path: /php

Wed, 26 Mar 2008:

In response to CVE-2008-1488, APC 3.0.17 has just been pushed out with the requisite security fixes. But in the process of producing a php4 compatible release, a significant amount of code has been reverted in the merge into an APC_3_0 branch for future bugfixes.

I've spent a couple of hours unmerging my "bye bye php4" cleanups with the help of Kompare. And my sanity is simply due to the fact that I can "cvs diff -u | kompare -" to look at the resulting huge patch. But it is not unpossible that the new code merged from HEAD has regressions, so you could also apply the unofficial patch onto 3.0.16.

I took your advice and did my own thing. Now I've got to undo it.

posted at: 22:03 | path: /php

Sun, 30 Dec 2007:

I got a nice little present for Christmas.

It had +4 lines, a huge mail explaining why and made me feel happy & stupid at the same time (there's some correlation, I think).

The patch fixes an off-by-one error in the APC shm allocator (read my mail for a shorter paraphrasing) and has triggered the 3.0.16 release. Now, APC should be stable even when running under cache full/heavy load conditions. And I've been barking up the wrong tree of race conditions for months & months. But the important thing is that this is fixed now.

Merry XMas and a happy New Year! [ citation needed ]

Patience is a minor form of despair, disguised as virtue.
                -- Ambrose Bierce

posted at: 06:27 | path: /php

Fri, 19 Oct 2007:

APC 3.0.15 has been released - read the release announcement. Not too many changes since 3.0.14, but there's a reason it took this long to make so few changes.

To begin with, I've just been lazy. Just kidding! This release was actually delayed to make sure this could be the very last PHP4 release of APC. And with the amount of major changes coming in, the next release is definitely going to be 3.1.0 rather than a 3.0.16 release. Now I can start working on making some fairly big changes.

Bye Bye PHP4, it was a nice ride while it lasted.

While most peoples' opinions change, the conviction of their correctness never does.

posted at: 03:11 | path: /php

Mon, 24 Sep 2007:

PHP programmers don't really understand PHP.

They know how to use PHP - but they hardly know how it works, mainly because it Just Works most of the time. But such wilful ignorance (otherwise known as abstraction) often runs them aground on some issues when their code meets the stupidity that is APC. Bear with me while I explain something very simple about PHP - how includes work.

Every single include that you do in PHP is evaluated at runtime. This is necessary so that you could technically write an include inside an if condition or a while loop and have it behave as you would expect. But executing PHP in Zend is actually a two step process - compile *and* execute, of which APC gets to handle only the first.

Compilation: Compiling a php file gives a single opcode stream, a list of functions & yet another list of classes. The includes in that file are only processed when you actually execute the code compiled. To simplify things a bit, take a look at how the following code would be executed.


if (false) {
    include_once "a.php";
}

The PHP compiler does generate an instruction to include file "a.php", but since the engine never executed it, no error is thrown for the absence of a.php. With that understanding of how includes work, here is the unique problem that classes & OOP face during compilation.

include_once "parent.php";

class Child extends Parent
{
}

Even though the class Child is created at compile time, its parent class is not available in the class table until the include instruction is actually executed & the parent.php compiled up. So, php generates a runtime class declaration which is an actual pair of opcodes.

ZEND_FETCH_CLASS              :1, 'Parent'
ZEND_DECLARE_INHERITED_CLASS  null, '<mangled>', 'child'

But what if the class parent was already in the class table when the file was being compiled? Like the following index.php

include_once "parent.php";
include_once "child.php";

$a = new Child();

Since obviously the parent class is already compiled & ready, Zend does something intelligent by removing the two instructions and replacing them by NOPs. That makes for fewer opcodes and therefore faster execution.

Here's the kicker of the problem. Which of these versions should APC cache? Obviously, the dynamically inherited version is valid for both cases - but APC caches whatever it encounters initially. The static version is obviously incompatible in a dynamic scenario.

So whenever APC detects that it has cached a static version, but this case actually requires a dynamic version, it decides to not cache that file *at* all from that point onwards. That's what the APC autofiltering does.

Now, you ask - how could it appear in perfectly normal code?

Assume child1 and child2 inherit from the parent class. And here is what the first hit on index.php looks like from an inclusion perspective. Now, it is obvious that the child2 in this case is actually compiled with the faster static inheritance (marked in orange) while child1 suffers the performance hit of not having Parent available till execution time.

Then we have a profile.php which only requires the child2 class. But while executing this file, APC fetches the copy of child2.php which was in cache - which is the statically inherited one.

As you could've guessed, the cached version is not usable for this case - and APC drops it out of cache. And for all requests henceforth, even for the index.php case, APC actually ignores the cached version and insists on compiling the file with Zend. If you enable apc.report_autofilter, this information will be printed out into the server error log.

Part of the culprit here is the conditional inclusion using include_once. With mere includes, you get an error whenever parent.php is included multiple times - but that can be annoying too. Where include_once/require_once can be debugged with Inclued, userspace hacks like the rinclude_once or !class_exists() checks make it really hard for me to figure out what's going wrong.

So, if you write One File per Class PHP and use such methods of inclusion, be prepared to sacrifice a certain amount of performance by doing so.

Doubt is not a pleasant condition, but certainty is absurd.
              -- Voltaire

posted at: 13:55 | path: /php

Fri, 07 Sep 2007:

After procrastinating for nearly two weeks with the code nearly done, I've managed to find the energy (and some caramel coffee) required to fix it up for the public to use - and here it is. In the process, I also threw out all the ZendEngine2 hacks and started to use zend_user_opcode_set_handler, which should let people use this with the faster CGOTO vm core, though I would advise against using that just yet.

The new & improved inclued can dump out class inheritance dependencies (though not the interfaces, as of now). This gives a slightly bigger picture view of what files depend on what other files and provides a tree of the classes clustered into their own files. For example, this is the graph pulled out from the relatively minimal PEAR::HTML_QuickForm2 library.

The usage is as before, the gengraph.php script now has a -t option which will accept either "classes" or "includes". At the very least, it should help people documenting OOP php code. Next up are interface implementations, the data is already in the dumped files, but not output in any human readable format.

It is a very sad thing that nowadays there is so little useless information.
        -- Oscar Wilde

posted at: 06:07 | path: /php

Sat, 11 Aug 2007:

Yak Shaving: So you start out with that simple problem. But half-way through fixing it, it explodes into this whole exercise in pointless dependencies. It is a rather recent wordification (never heard of that word ? it's a perfectly cromulent word). But considering the fate of the "pre-shaved yaks" guy, who ended up saying "It's a band.", I'd say it is not quite popular enough ... yet.

Now, before I start onto the real topic - let me first say that the next release of APC will be the last release compatible with PHP 4.x. Now, what is wrong with just letting the #ifdefs stay? That's where this snippet of code comes into play.

  apc_store("a", array(new stdclass()));

It doesn't work. Now, the problem is very simple - the original patch by Marcus only checks for objects in a very shallow way. It will detect & serialize objects which are passed to apc_store - but the check does not extend deeper into the recursive copy functions.

Symmetry: But the zval* copy functions were written to be beautifully symmetric. A copy into cache is nearly the same as copying out of it. And when I say "nearly", I actually mean that until the *_copy_for_execution() optimisations were thrown in, they were actually symmetric - in & out. But objects don't play nicely with that - because they are much more than just data.

In & Out: Objects require asymmetric caching. Storing into cache is a serialize operation, while retrieving from storage is a deserialize. This ensures that they end up with the right kind of pointers, class object initialization and that the resources they hold in their opaque boxes are properly handled. The objects have to implement their appropriate magic persistence methods.

And thus begins the Yak Shaving. I need to rewrite most of the cache copy-in and copy-out functions to handle the basic asymmetry. But consider this, most of the code in there has been limited for months because of the fact that I cannot optimize on PHP data structures without breaking the symmetry.

A couple of years ago, I sat through a full hour talk by Rusty Russell about talloc(). Built on top of the trusty old malloc() calls, it simplifies memory management a lot for Samba4. So bear with me as I take a brain dump of my idea - for my very intelligent reader to poke holes in (gopalv shift+2 php net).

APC's allocation strategy is a little brain dead. To allocate 4 bytes of data, it actually requires 24 bytes of space. But much more than the space wastage, I'm more concerned about the number of lock() calls required to cache a single php file - a hello world program takes about 22 lock operations (11 locks, 11 unlocks). Yes, that's actually 22 syscalls just to cache echo "hello world";.

I've previously tried to fix it with partitioned locks. The problem with that was actually cleaning up the locks, because the extension code would have to have special cases for every SAPI - because of some bugs in PHP 5.x. So, the "if you don't succeed, destroy all evidence" principle made me throw out that idea. But the cache-copy, zend-copy separation should help me revive another approach to this.

Pools: So, now that I'm officially b0rking up APC, I might as well slap on a new pool allocator, right on top of sma_allocate - à la talloc(). The allocation speed would skyrocket, because the in-pool allocs are sequential and do not have any fragmentation issues due to blocks in the middle being free'd. As important as allocations are, the real advantage of this would be that I could basically speed up cache expunges by an order of magnitude or more. The 22 syscall cache expunge for hello world would be reduced to a potential pair of syscalls - because it would be a single free of the entire pool space.

Right now the pool is actually built up to be of the following structure.

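typedef struct apc_pool_t apc_pool_t;

struct apc_pool_t {
	int capacity;           /* size of the data area */
	int avail;              /* bytes still unallocated */
	void *head;
	apc_pool_t *overflow;   /* next pool in the chain */
	unsigned char data[0];  /* allocations are carved out of here */
};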

I've yet to run this through an x86_64 build, but an even multiple of int/void* should align the data area right onto a wordsize. And I think nearly every pool should be around 4k (i.e. 4096 - sizeof(apc_pool_t)) for opcode cache and 1k for data cache. I might make the latter a runtime tuneable, just to pad the APC manual up into an entire book (just in case someone asks me to write one .. *heh*).
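To show why in-pool allocation is cheap, here is a minimal bump-allocator sketch in the spirit of the structure above. It is not the APC code: plain malloc()/free() stand in for sma_allocate, the fields are size_t rather than int, every ex_ name is invented for the example, and the way overflow pools are chained is my own assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ex_pool ex_pool;

struct ex_pool {
    size_t capacity;        /* size of the data area in bytes */
    size_t avail;           /* bytes not yet handed out */
    unsigned char *head;    /* next free byte inside data */
    ex_pool *overflow;      /* next pool, used once this one is full */
    unsigned char data[];   /* allocations are carved out of here */
};

/* Round n up to a multiple of the pointer size to keep allocations aligned. */
#define EX_ALIGN(n) (((n) + sizeof(void *) - 1) & ~(sizeof(void *) - 1))

ex_pool *ex_pool_create(size_t capacity)
{
    capacity = EX_ALIGN(capacity);
    ex_pool *p = malloc(sizeof *p + capacity);  /* the one "expensive" call */
    if (p == NULL)
        return NULL;
    p->capacity = p->avail = capacity;
    p->head = p->data;
    p->overflow = NULL;
    return p;
}

/* Sequential allocation: bump the head pointer, no free list, no locking. */
void *ex_pool_alloc(ex_pool *p, size_t size)
{
    size = EX_ALIGN(size);
    for (;;) {
        if (size <= p->avail) {
            void *result = p->head;
            p->head += size;
            p->avail -= size;
            return result;
        }
        if (p->overflow == NULL) {
            /* chain a new pool, big enough for this request */
            size_t cap = size > p->capacity ? size : p->capacity;
            p->overflow = ex_pool_create(cap);
            if (p->overflow == NULL)
                return NULL;
        }
        p = p->overflow;
    }
}

/* Expunge: everything allocated from the pool goes away together. */
void ex_pool_destroy(ex_pool *p)
{
    while (p != NULL) {
        ex_pool *next = p->overflow;
        free(p);
        p = next;
    }
}
```

Individual allocations are never freed on their own. In this sketch ex_pool_destroy() makes one free() call per pool in the chain, so a cache entry that fits in a single pool is expunged with exactly one call.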

None of this is included in APC 3.0.15, which will exit out of the gates as soon as I'm sure I'm happy with its stability. The new code will probably be an APC 3.1 release, marking the end of php4 compat & opening up the door for php6 compat.

A two line bug report which exploded into nearly two thousand lines of C code - that's just classic yak shaving.

10 If it ain't broke, break it;
20 Fix it.
30 Goto 10

posted at: 09:27 | path: /php

Wed, 08 Aug 2007:

Finally, I got bored enough to update my inclued extension (as promised at OSCON). The extension now comes with a nearly completely non-intrusive data dumping mode. The new inclued.dumpdir can be used to dump the inclued data onto a temporary file without ever modifying any of your php scripts. Also included is some php code to transform the dump data into graphviz formatted .dot files.

Pick up your free & complimentary copy of the source code on your way out. And stay clued-in about your includes.

This quote intentionally not included.

posted at: 04:27 | path: /php

Mon, 09 Jul 2007:

Recently, one of the php lists I'm on was asked how to implement a stable sort using php's sort functions. But since all of php's sort functions eventually seem to land up in zend_qsort, the default sort is not stable. The query on list had this simple example which illustrates the problem clearly.

bash$ php -r  '$a= array(1,1,1); asort($a); print_r($a);'

    [2] => 1
    [1] => 1
    [0] => 1

The basic problem here is to produce a stable sort which still operates in quicksort O(n*lg(n)) time. Essentially, falling back onto a bubble sort is ... well ... giving up :)

Schwartzian transform: The programming idiom, named after Randal L. Schwartz, is a carry-over of the decorate-sort-undecorate lisp memoization into perl. The real problem here however was putting it into a php syntactic form which was clean and as similar to the original as possible. For example, here's what the python version of the code would look like (despite the fact that the current python sort is stable).

a = [1,1,1]
b = list(zip(a, range(len(a))))  # decorate
b.sort()                         # sort
a = [x[0] for x in b]            # undecorate

array_walk() magic: Coming from a world of constant iterators, I had read the array_walk documentation and sort of read the "Users may not change array itself ..." as boilerplate. But as it turns out, the callback function is allowed to change the current value *in place* and for that purpose it gets a reference to the value. With that in mind, array_walk becomes a faux in-place map/transform function.

$a = array(1,1,1);

function dec(&$v, $k) { $v = array($v, $k);}
function undec(&$v, $k) { $v = $v[0]; }

array_walk($a, 'dec');   // decorate
asort($a);               // sort
array_walk($a, 'undec'); // undecorate

And there you have it, a lispism made famous by perl, implemented nearly exactly in php.
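The same trick carries over to C, where qsort() makes no stability promise either. The sketch below is a translation of the idiom rather than anything from zend_qsort: it decorates each element with its original position, sorts on (key, position), and undecorates. The record type and the function names are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Example payload: sorted by key, label only shows the original order. */
typedef struct {
    int key;
    const char *label;
} record;

/* A record decorated with the position it started at. */
typedef struct {
    record value;
    size_t index;
} decorated;

/* Compare by key first; ties are broken by original position. */
static int cmp_decorated(const void *a, const void *b)
{
    const decorated *x = a;
    const decorated *y = b;
    if (x->value.key != y->value.key)
        return x->value.key < y->value.key ? -1 : 1;
    return (x->index > y->index) - (x->index < y->index);
}

/* Stable sort of items[0..n) by key. Returns 0 on success, -1 on OOM. */
int stable_sort_records(record *items, size_t n)
{
    if (n == 0)
        return 0;
    decorated *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {            /* decorate */
        tmp[i].value = items[i];
        tmp[i].index = i;
    }
    qsort(tmp, n, sizeof *tmp, cmp_decorated);  /* sort */
    for (size_t i = 0; i < n; i++)              /* undecorate */
        items[i] = tmp[i].value;

    free(tmp);
    return 0;
}
```

Because no two decorated elements ever compare equal, the result is the same whatever order the underlying sort visits them in. That is all a stable sort needs, at the cost of one extra index per element.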

Whoever knows he is deep, strives for clarity;
Whoever would like to appear deep to the crowd, strives for obscurity.
                        -- Nietzsche

posted at: 08:06 | path: /php

Sat, 26 May 2007:

Brian Shire has put up his slides (835k PDF) of his php|tek talk.

Quite interesting procedures are followed to prevent the very obvious cache slam issues, by firewalling apache while restarting it, as well as the priming sub-system they use. The cross-server variables (aka site-vars) seem like a good idea as well - a basic curl POST request moving around json data could potentially serve as a half-reliable cross-server config propagator.

Seeing my code used makes me happy ... very happy, indeed.

I'm willing to make the mistakes if someone else is willing to learn from them.

posted at: 22:27 | path: /php