< November 2005 >
SuMoTuWeThFrSa
   1 2 3 4 5
6 7 8 9101112
13141516171819
20212223242526
27282930   
Sun, 13 Nov 2005:

Contrary to what I thought, Rhys can indeed make mistakes. For the last two weeks, I've been pissed off by an absolutely untraceable bug in the AMD64 engine. One single test case inside the engine fails to pass.

TestMarshalOffsetOf ... failed: 
    CSUnit.TestAssertFailed: Test failed: OffsetOf (6)
    at CSUnit.TestCase.Fail(String) in ./TestCase.cs:14
    at TestMarshal.TestMarshalOffsetOf() ....

Thanks to foresight by the FFOD [1], debugging the engine is far easier than most people would imagine. All it requires is for you to comment out things till stuff works and then hit on the one that you last commented [2] . The opcodes which aren't JIT converted to native code automatically fall back on the C-version which has been working on AMD64 for a couple of years now (that's what keeps my hair on my head and my stomach ulcer free). The real trick is to imagine what is happening and comment out the right one first off. So after commenting a lot of code, I finally hit on the right one (this was on 29th night, sometime near 3:30 AM). It was the following code

case COP_PWRITE_FIELD:
{
    /* Write a word field to an object */
    unsigned temp = (unsigned)CVM_ARG_BYTE;
    UNROLL_START();
    GetTopTwoWordRegisters(&unroll, &reg, &reg2,
                           MD_REG1_NATIVE | MD_REG2_32BIT);
    CheckForNull(&unroll, reg, pc, (unsigned char *)inst, 0);
    md_store_membase_word_native(unroll.out, reg2, reg, (int)temp);
    FreeTopRegister(&unroll, -1);
    FreeTopRegister(&unroll, -1);
    MODIFY_UNROLL_PC(CVM_LEN_BYTE);
}
break;

Now fastforward to saturday night, November 12th. I was bored, this was pending and valgrind wouldn't load 64 bit binaries, nor would it let me use cachegrind on the 32 bit VM (thanks to libgc's stack probes). OK, now that I have started on this whole bloody thing, the next obvious step was to figure out on what exact arguments the whole thingy goes for a toss. It turns out that it explodes to tiny bits when temp is 0. So the obvious workaround looked like this.

unsigned temp = (unsigned)CVM_ARG_BYTE;
if(temp == 0) goto defaultCase;
UNROLL_START();

But the md_store_membase_word_native was supposed to work even when displacement was zero. So I had a look at the code with gdb while in execution.

if(temp == 0) amd64_breakpoint(unroll.out);
md_store_membase_word_native(unroll.out, reg2, reg, (int)temp);

Now we have to head into assembly programming territory. But the good news is that AMD64 isn't that different from x86 at all. So this is what I do in gdb to debug code which was generated directly in binary form.

Program received signal SIGTRAP, Trace/breakpoint trap.
(gdb) x/5i $pc-2
0x2a97a6e277:   rex64 int3                 # breakpoint
0x2a97a6e279:   mov    %rcx,(%rax)         # mov_membase
0x2a97a6e27c:   sub    $0x20,%r14          # stack update 
0x2a97a6e280:   mov    $0x2a97a6e1c4,%r12  # load next insn
0x2a97a6e28a:   rex64Z jmpq   *(%r12)      # jump there 

Everything there looks as it should. Around 15 minutes after staring at the binary code generation, I came to the conclusion that the displacement isn't the issue at all. There was something subtle I was missing in all this. After the first half hour, I had almost given up on the code - then I put a breakpoint before the registers are loaded, and ran the testcase again.

(gdb) x/12i $pc
0x2a97a6e250:   mov    -8(%r14),%ecx  
0x2a97a6e254:   mov    -16(%r14),%rax 
0x2a97a6e258:   or     %rax,%rax          # null check       
0x2a97a6e25b:   rex64 jne    0x2a97a6e279 # if == 0
0x2a97a6e262:   mov    %rax,-16(%r14)     # restore
0x2a97a6e266:   mov    %ecx,-8(%r14)      # arguments
0x2a97a6e26a:   mov    $0x2a97a6e1b4,%r12 
0x2a97a6e274:   jmpq   0x2b00451a8a       # rexecute 
0x2a97a6e279:   mov    %rcx,(%rax) 
0x2a97a6e27c:   sub    $0x20,%r14         # stack += 2 * 8
0x2a97a6e280:   mov    $0x2a97a6e1c4,%r12
0x2a97a6e28a:   rex64Z jmpq   *(%r12)

So after a couple of minutes, the highlighted code peices hit my brain properly. I was storing to ECX and writing from RCX. The word register was being loaded in 32 bit mode, not the full and glorious 64 bit mode. And this is a pointer write, so some pointer somewhere in the generated code was being truncated by the above code. In hindsight it was a very obvious bug - but hindsight is always 20/20. Fix was very simple, change the MD_REG2_32BIT to MD_REG2_NATIVE and it stops truncating the pointer to be stored. Next in line, array read & write inlining.

There is a moral to this story, so to speak. Do not judge anyone's contribution to any project by mere linecount. This single line fix too me over 6 working hours over a period of 17 days to locate, track down and fix. The results are obvious to anyone who runs my souped up engine.

Benchmark Interpreter Unroller
Sieve 1665 10775
Loop 1773 22638
Logic 1637 14895
String 16021 19465
Float 222 458
Method 1569 8013
PNetMark (Weighted Sum) 1731 7986

This new and improved engine beats the hell out of the 32 bit mode JIT on the same machine (and I barely put in 30-odd hours into building this). All the small CPU-specific optimisations add up to a huge difference in performance. I am a R!c3r and AMD64 R0xx0rs.

[1] Flying Fingers of Death : Rhys Weatherley
[2] ease of debugging : If you thought I was joking, think again. Most JITs follow an all or nothing policy, where if you comment out an opcode, code with that opcode will just fail. Inside pnet code, it falls back gracefully to the interpreter core.

posted at: 14:45 | path: /hacks | permalink | Tags:

I bought a Nokia 6021 more than a month ago. What with all the things going on, I've never had time to play around with it properly. You see, I belong to the rare category of people who can't leave a working thing alone. So I went out and bought a Bluetooth dongle yesterday evening. Came back home and plugged it into my big honkin' AMD64 box.

usb 2-1: new full speed USB device using address 2
usb 2-1: device not accepting address 2, error -71
usb 2-1: new full speed USB device using address 3
Bluetooth: Core ver 2.6
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: HCI USB driver ver 2.7
usbcore: registered new driver hci_usb

I remember the time when I had to spend 3 days to configure X11 on an i810 with more than 16 colours. Recently, linux (yes, the kernel) has been surprising me with the array of hardware it supports. The last time I plugged in a USB DVD+RW, it just popped up the dvd creator. Was scary to see too many things just work, without knowing what to do if something doesn't. Thankfully the trip to a hooked up mobile was just starting.

Turned on the battery eating bluetooth monster on my mobile, and started scanning. It diligently picked up my roommate's laptop, but not my own box. Suddenly, I remembered that there was something called /etc/init.d/bluetooth, duh ?. Started that up and watched what happened.

Bluetooth: L2CAP ver 2.4
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM ver 1.3
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: HIDP (Human Interface Emulation) ver 1.0

"Wohoo !" I said, but a bit too prematurely. All I had managed to get working was the hci interface itself. Which is a good first step, but doesn't really get you anywhere special - just like that road in front of you house.

[root@phoenix ~]# hciconfig 
hci0:   Type: USB
        BD Address: 00:02:72:B0:00:** ACL MTU: 192:8  SCO MTU: 64:8
        UP RUNNING PSCAN ISCAN 
        RX bytes:139992 acl:3551 sco:0 events:2021 errors:0
        TX bytes:37530 acl:1766 sco:0 commands:136 errors:0

The device finally showed up on the pairing screens. But Nokia doesn't allow me to not use a passcode for pairing bluetooth. The OK button appears only after you have entered at least one digit. It was not really obvious to me that you could enter alphabets in there - so my hunts for a default pin code went into a grep for 4 digit numbers. Finally after some time I noticed the file "/etc/bluetooth/pin" which had "BlueZ" inside it. Changed that 0000, restarted bluetooth and tried pairing. Voiced yet another exulted shout which would have raised eyebrows in a bar. And yet again, too premature.

[root@phoenix ~]# hcidump 
HCIDump - HCI packet analyzer ver 1.11
device: hci0 snap_len: 1028 filter: 0xffffffffffffffff
> HCI Event: Connect Request (0x04) plen 10
< HCI Command: Accept Connection Request (0x01|0x0009) plen 7
> HCI Event: Command Status (0x0f) plen 4
> HCI Event: PIN Code Request (0x16) plen 6
< HCI Command: PIN Code Request Reply (0x01|0x000d) plen 23
> HCI Event: Command Complete (0x0e) plen 10
> HCI Event: Link Key Notification (0x18) plen 23
> HCI Event: Connect Complete (0x03) plen 11

Now that the phone is linked up using bluetooth, then the hunt started for things to use it with. Installed gnokii and openobex from my local dag (fc3_x86_64) mirror. It should be worth mentioning that I don't have a net connection at home, if I needed anything from outside it would've been a washout. So I went hunting for services with sdptool.

phoenix# sdptool browse
Inquiring ...
Browsing 00:13:70:C3:**:** ...
Service Name: OBEX Object Push
...

phoenix# sdptool browse --tree  00:13:70:C3:**:** | grep "Text 
  Text : "OBEX Object Push"
  Text : "OBEX File Transfer"
  Text : "Dial-up networking"
  Text : "Nokia PC Suite"
  Text : "COM 1"
  Text : "Voice Gateway"
  Text : "Audio Gateway"
  Text : "SyncML Client"
  Text : "SIM ACCESS"

Lookin' good. But I wasn't home and dry yet, though you could think of me as near home, with a towel handy. Gnokii was what I tried next. But the docs of Gnokii are particularly sparse about how to use an unlisted Nokia phone. So when trying with 6021, all I got was.

[root@phoenix etc]# gnokii --identify
Telephone interface init failed: Model specified isn't supported.
Quitting.
Sorry, phone has not yet been converted to new style. 
Phone.Functions == NULL!

Even worse, the port number for Gnokii requires the identifer of the phone and not your local port. This made sense in hindsight as there would be multiple paired devices with your BT adapter - but wasn't really clear to start with. So the /etc/gnokiirc modding started up in full earnest.

# Use this setting also for the Bluetooth connection:
# port = aa:bb:cc:dd:ee:ff

port = 00:13:70:C3:**:**

connection = bluetooth

Then I tried setting the phone type to 6600, with slightly better results. The phone at least showed a Do you want to connect to 'mordor' message before gnokii crashed.

Serial device: opening device 00:13:70:C3:**:**
Message received: 0x01 / 0x0002
00 01                                           |                 
Received message type 01
gnokii: pkt.c:55: buffer_expand: Assertion `buf->size 
        >= buf->offs + len' failed.
Aborted

Since I had worked with Ericsson phones a LOT, I just tried reading what README-ericsson said. It struck me that plain old AT terminal mode might work with any GSM phone - Nokia or otherwise. That works well enough for me.

# Set model to the model number of your phone
model = AT

Restarted gnokii and it worked. Then I tried connecting to the serial AT port with minicom. Just to make sure the AT terminal interface is functional.

>AT
OK
>ATZ
OK
>AT+CGMM
Nokia 6021
OK
[root@phoenix ~]# gnokii --identify
IMEI         : 35623300132****
Manufacturer : Nokia
Model        : Nokia 6021
Revision     : V 03.83

Encouraged by the results, I pulled out the gnokii GUI (xgnokii) and tried it out. I was able to read my phonebook, update entries, send SMS and make calls with xgnokii. Looks very promising for some hack to sync my phone to a PIM/Calendar organizer.

One of these days, I'll sit down and see if I can find out the exact protocol that 6021 uses and hook this up properly with Gnokii. But that day might be very very far indeed. Because I've got a lot more to do on weekends than just hack on some random thing.

posted at: 14:30 | path: /hacks | permalink | Tags: