< June 2006 >
SuMoTuWeThFrSa
     1 2 3
4 5 6 7 8 910
11121314151617
18192021222324
252627282930 
Thu, 08 Jun 2006:

I recently discovered an easy way to inspect php files. So xdebug has a cvs module in their cvs called vle. This prints out the bytecode generated for php code. This lets me actually look at the bytecode generated for a particular php data or control structure. This extension is shining a bright light into an otherwise dark world of the ZendEngine2.

Let me pick on my favourite example of mis-optimisation that people use in php land - the HereDoc. People use heredoc very heavily and over the more mundane ways of putting strings in a file, like the double quoted world of the common man. Some php programmers even take special pride in the fact that they use heredocs rather than use quoted strings. Most C programmers use it to represent multi-line strings, not realizing php quoted strings can span lines.

<?php

echo <<<EOF
	Hello World
EOF;

?>

Generates some real ugly, underoptimised and really bad bytecode. Don't believe me, just look at what the vle dump looks like.

line     #  op            ext  operands
-------------------------------------------------
   3     0  INIT_STRING        ~0
         1  ADD_STRING         ~0, ~0, '%09'
         2  ADD_STRING         ~0, ~0, 'Hello'
         3  ADD_STRING         ~0, ~0, '+'
         4  ADD_STRING         ~0, ~0, 'World'
   4     5  ADD_STRING         ~0, ~0, ''
         6  ECHO                   ~0

That's right, every single word is seperately appended to a new string and after all the appends with their corresponding reallocs, the string is echoed and thrown away. A really wasteful operation, right ? Well, it is unless you run it through APC's peephole add_string optimizer.

Or the other misleading item in the arsenal, constant arrays. I see hundreds of people use php arrays in include files to speed up the code, which does indeed work. But a closer look at the array code shows a few chinks which can actually be fixed in APC land.

<?php

$a = array("x" => "z", 
		"a" => "b",
		"b" => "c",
		"c"	=> "d");
?>

Generating the following highly obvious result. Though it must be said that these are hardly different from what most other VMs store in bytecode, they are limited by the fact that they have to actually write the code (minus pointers) to a file. But Zend is completely in memory and could've had a memory organization for these arrays (which would've segv'd apc months before I ran into the default array args issue).

line     #  op                      ext  operands
-----------------------------------------------------------
   2     0  INIT_ARRAY                   ~0, 'z', 'x'
   3     1  ADD_ARRAY_ELEMENT            ~0, 'b', 'a'
   4     2  ADD_ARRAY_ELEMENT            ~0, 'c', 'b'
   5     3  ADD_ARRAY_ELEMENT            ~0, 'd', 'c'
         4  ASSIGN                           !0, ~0
   7     5  RETURN                           1
         6  ZEND_HANDLE_EXCEPTION            

This still isn't optimized by APC and I think I'll do it sometime soon. After all, I just need to virtually execute the array additions and cache the resulting hash as the operand of the assign instead of going through this stupidity everytime it is executed.

Like rhysw said, "Make it work, then make it work better".

--
Organizations can grow faster than their brains can manage them.
                    -- The Brontosaurus Principle

posted at: 16:22 | path: /php | permalink | Tags: , ,