< November 2005 >
SuMoTuWeThFrSa
   1 2 3 4 5
6 7 8 9101112
13141516171819
20212223242526
27282930   
Mon, 21 Nov 2005:

Dotgnu has a new task at hand and it is worth 4 lakhs to whoever would care to do it. The task is simple and straight forward, you need to write a fast i686 JIT for dotgnu. Half the task is already done in libjit - which has around 65 KLOC of code. All you need to do is finish the x86 engine in there, with the help me and many other dotgnu folks around the place.

For those who feel that they can help and want to, please contact me. Read the original announcement here. Tell your friends, ask them to tell your friends. Circulate it in your LUG, circulate it in your college. Do whatever it takes, we need this JIT (and other projects like rubydium will also gain).

posted at: 14:44 | path: /dotgnu | permalink | Tags:

Almost all design textbooks for Java or C# talk extensively about Interfaces. Did you ever stop to wonder how expensive an interface call actually is ? It is a very expensive lookup through the class heirarchy practically killing all data in your cache in most cases. But someone has already run into this before.

The RVM team in IBM had a very simple and direct way of caching the interface lookups in the class entry itself. This was published as a paper in OOPSLA in 2001, titled "Efficient Implementation of Java Interfaces: Invokeinterface Considered Harmless". Now I would like to correct that and say Invokeinterface: mostly harmless.

The solution is simple and elegant. Build an "Interface Method Table" (IMT) for a particular class, from the interface information in that class. Each interface method in the system is assigned a unique identifier. The identifier is used to hash into a fixed-sized table of method entry points, similar to a vtable. In most cases, the hashed position will be the right position.

RVM handles hash conflicts by using a set of stubs which are inserted in the slots of the IMT. The pnet engine cannot do that directly because we JIT method by method, unlike RVM which does this for an entire class. So we do not know the address of the other methods in the conflict resolution and it is considered a bad thing to go back and modify running code (*duh*). Or we could convert all of those and waste precious heap space. In short, we don't do what RVM does. We just store a NULL pointer in that slot and takes the old codepath if we hit a NULL during lookup.

#define    IL_IMT_SIZE        37

struct _tagILClassPrivate
{
    ...
#ifdef IL_USE_IMTS
    ILUInt32        imtBase;            /* Base for IMT identifiers */
    ILMethod       *imt[IL_IMT_SIZE];    /* Interface method table */
#endif
};

problem #1: generating a unique identifier for each interface method.
Was an easily solved problem with the following code, for each interface.

            size = (ILUInt32)(classPrivate->vtableSize);
            classPrivate->imtBase = process->imtBase;
            process->imtBase += size;

After all that will be unique in each engine running, till we flip over the bits in imtBase. The I came to the real code generating the IMT table.

    /* Process the implementation records that are attached to this class */
    impl = classPrivate->implements;
    while(impl != 0)
    {
        implPrivate = (ILClassPrivate *)(impl->interface->userData);
        if(implPrivate)
        {
            size = (ILUInt32)(implPrivate->vtableSize);
            /* Process the members of this interface */
            for(posn = 0; posn < size; ++posn)
            {
                imtIndex = (implPrivate->imtBase + posn) % IL_IMT_SIZE;
                vtableIndex = (ILImplPrivate_Table(impl))[posn];
                if(vtableIndex != (ILUInt32)(ILUInt16)0xFFFF)
                {
                    if(!(classPrivate->imt[imtIndex]))
                    {
                        /* No conflict at this table position */
                        classPrivate->imt[imtIndex] =
                            classPrivate->vtable[vtableIndex];
                    }
                    else
                    {
                        /* We have encountered a conflict in the table */
                        classPrivate->imt[imtIndex] =
                            (ILMethod *)(ILNativeInt)(-1);
                    }
                }
            }
        }
        impl = impl->next;
    }

    /* Clear positions in the table that indicate conflicts */
    for(posn = 0; posn < IL_IMT_SIZE; ++posn)
    {
        if(classPrivate->imt[posn] == (ILMethod *)(ILNativeInt)(-1))
        {
            classPrivate->imt[posn] = 0;
        }
    }

All pretty straight forward ? I suppose, you must be reading VM code for the first time. Well, you pull out the imtBase and add the index, hash it into IL_IMT_SIZE buckets and mark with -1 if conflicts exist. Finally roll over all the -1 entries into NULL and you have an interface hash table ready and rockin'.

problem #2 : where to use the imt ?
In the COP_CALL_INTERFACE method, you dolt. Here, watch the code scroll by.

    /* Locate the method to be called */
    #ifdef IL_USE_IMTS
        methodToCall = GetObjectClassPrivate(tempptr)
            ->imt[CVM_ARG_DWIDE2_SMALL];
        if(!methodToCall)
        {
            methodToCall = CVM_ARG_DWIDE_PTR_SMALL(ILMethod *);
            methodToCall = _ILLookupInterfaceMethod
                (GetObjectClassPrivate(tempptr),
                 methodToCall->member.owner, methodToCall->index);
            if(!methodToCall)
            {
                MISSING_METHOD_EXCEPTION();
            }
        }
    #else

Is that all ? Oh, wait. This was what Rhys did up when he implement IMT on 2003-07-13. So why am I saying all this now, because the above code fails when I use an IL_IMT_SIZE = 37, but works when I use 41. Between two strong coffees, three appams, two dosas and an entire half litre of Mountain Dew - I managed to debug the hell out of it.

The above code did not traverse up the parent tree to look for interfaces. Therefore there was a hidden hash collision, but only when I run an entire suite of tests (thanks to process->imtBase value, in hindsight). Thanks to the whole lot of threading, I switched back to printf for debugging. It paid off, here's what I got.

 ModuleBuilder->imt {
 /* IDetachItem */ 
     0 : (empty)
     ....
     31 : System.Reflection.Emit.IDetachItem.Detach
     ...
     36 : (empty)
}; /* ModuleBuilder */

cvm_call.c:1099 found <ModuleBuilder:System.Reflection.Emit.IDetachItem.Detach> for <IClrProgramItem:get_ClrHandle> 

Hey, where is IClrProgramItem in the ->implements of ModuleBuilder. It turns out that ModuleBuilder inherits from Module, which implements IClrProgramItem and was therefore wasn't checked by the IMT code for collisions. So now the code for looking up through the parents for the interfaces they implement is being testing. And it looks promising :)

For more details, hit the code.

--
I'm a two bit coder, but I make things work

posted at: 12:20 | path: /dotgnu | permalink | Tags: