< August 2008 >
      1 2
3 4 5 6 7 8 9
Tue, 05 Aug 2008:

Frustration is my fuel. I spent an all nighter re-doing up one of my old valgrind patches to work with valgrind-3.3.1. This one was a doozy to patch up the first time (stealing rwalsh's code), but not quite very hard to keep up with the releases. The patch needs to be applied to the 3.3.1 source tree and memcheck rebuilt. It also requires the target code to be instrumented.

#include "valgrind/memcheck.h"

static int foobar = 1;

int main()
	int *x = malloc(sizeof(int));
	int wpid = VALGRIND_SET_WATCHPOINT(x, sizeof(int), &foobar);
	*x = 10;
	foobar = 0;
	*x = 10;

What has been added anew is the foobar conditional (you could just pass in NULL, if you always want an error). In this case the error is thrown only in first line modifying x. Setting the conditional to zero turns off the error reporting.

With the new APC-3.1 branch, I'm again hitting new race conditions whenever I change around stuff. I have no real way of debugging it in a controlled environment. But this patch will let me protect the entire shared memory space and turn on error flag as soon as control exits APC. Just being able to log all unlocked writes from Zend should go a long way in being able to track down race conditions.

Yup, frustration is my fuel.

An intellectual is someone whose mind watches itself.
                -- Albert Camus

posted at: 05:44 | path: /hacks | permalink | Tags: , ,

Wed, 20 Sep 2006:

Currently, I am totally struggling with APC's shm memory. Usually with most memory issues, the system cleanly segfaults. But when the memory is part of a 128 MB mmap() area, a couple of bytes here or there show up way too late in the debug operations to detect and fix.

Now, a lot of the allocator code has things which allocate n bytes + sizeof(header) and return the allocated area - sizeof(header). The problem is that my previous watchpoints code cannot differentiate between these two, unless I put an explicit watchpoint on the location. Not to mention, it has no concept of free operation in conjuction with the original block.

Digging around in valgrind code, I found an elegant answer to the problem - VALGRIND_MALLOCLIKE_BLOCK and VALGRIND_MAKE_NOACCESS. Here's a mocked up version of my code, which seems to work out.

#include <valgrind/valgrind.h>
#include <valgrind/memcheck.h>

void *  alloc(size_t n) 
	void* x = malloc(n + 42);
	x = (unsigned char *)x + 42;
	return x;
void dealloc(void * ptr)
	free(ptr - 42);
int main() 
	size_t n = 200;
	char * a = alloc(n);
	a[0] = '1';
	a[-1] = 'x';

Now, the valgrind has red-zones, which are like canaries or sentinels for memory over-writes. I haven't figured out quite how to use them, but this should be enough right now, I think.

==28706== Invalid write of size 1
==28706==    at 0x80484ED: main (x.c:23)
==28706==  Address 0x4025051 is 41 bytes inside a block of size 242 alloc'd

And the line 23 is the a[-1]. Valgrind is just amazing. Ever since I've hit up on this tool, I've found that my debugging life is a lot easier. Now, to reproduce original bug and nail that son of a b*tch :)

They separate the right from the left, the man from the woman, the plant from the animal,
the sun from the moon. They only want to count to two.
                -- Emma Bull, "Bone Dance

posted at: 15:44 | path: /hacks | permalink | Tags: ,

Thu, 01 Jun 2006:

Valgrind is one of the most common tools people use to debug memory. Recently while I was debugging APC, the primary problem I have is of php Zend code writing into shared memory without acquiring the locks required. I had been debugging that with gdb for a while, but gdb is just dead slow for watching writes to 16 Mb of memory and generating backtraces.

The result of all that pain was a quick patch on valgrind 3.1.1. The patch would log all writes to a memory block with backtraces. But valgrind does not have a terminal to type into midway, unlike gdb. So the question was how to indicate a watchpoint. Valgrind magic functions were the answer. The magic functions can pass a parameter to valgrind while in execution. This is a source hack and is a hell of a lot easier to do than actually breaking in gdb and marking a breakpoint everytime you run it. So here's how the code looks like :-

#include "valgrind/memcheck.h"

int main()
	int * k = malloc(sizeof(int));
	int x = VALGRIND_SET_WATCHPOINT(k, sizeof(int));

That is marked out in the normal code with the following assembly fragment.

    movl    $1296236555, -56(%ebp)
    movl    8(%ebp), %eax
    movl    %eax, -52(%ebp)
    movl    $4, -48(%ebp)
    movl    $0, -44(%ebp)
    movl    $0, -40(%ebp)
    leal    -56(%ebp), %eax
    movl    $0, %edx
    roll $29, %eax ; roll $3, %eax
    rorl $27, %eax ; rorl $5, %eax
    roll $13, %eax ; roll $19, %eax
    movl    %edx, %eax
    movl    %eax, -12(%ebp)

This doesn't do anything at all on a normal x86 cpu but inside the valgrind executor, it is picked up and delivered to mc_handle_client_request where I handle the case and add the address and size, to the watch points list.

So whenever a helperc_STOREV* function is called, the address passed in is checked against the watchpoints list, which is stored in the corresponding primary map of access bits. All of these bright ideas were completely stolen from Richard Walsh patch for valgrind 2.x. But of course, if it weren't for the giants on whose shoulders I stand ...

bash$ valgrind a.out

==6493== Watchpoint 0 event: write
==6493==    at 0x804845E: modify (in /home/gopalv/hacks/valgrind-tests/a.out)
==6493==    by 0x80484EA: main (in /home/gopalv/hacks/valgrind-tests/a.out)
==6493== This watchpoint has been triggered 1 time
==6493== This watchpoint was set at:
==6493==    at 0x80484DB: main (in /home/gopalv/hacks/valgrind-tests/a.out)

Now, I can actually run a huge ass set of tests on php5 after marking the APC shared memory as watched and see all the writes, filter out all the APC writes and continue to copy out the other written segments into local memory for Zend's pleasure.

Writing software gives you that high of creating something out of nearly nothing. Since I am neither a poet nor a painter, there's no other easy way to get that high (unless ... *ahem*).

Mathemeticians stand on each other's shoulders while computer scientists stand on each other's toes.
                -- Richard Hamming

posted at: 23:44 | path: /hacks | permalink | Tags: , ,