You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@harmony.apache.org by "Sergey Kuksenko (JIRA)" <ji...@apache.org> on 2006/11/13 14:26:37 UTC

[jira] Created: (HARMONY-2171) [drlvm][jit] Loop Unrolling for private method java.util.TreeMap.successor().

[drlvm][jit] Loop Unrolling for private method java.util.TreeMap.successor().
-----------------------------------------------------------------------------

                 Key: HARMONY-2171
                 URL: http://issues.apache.org/jira/browse/HARMONY-2171
             Project: Harmony
          Issue Type: Improvement
          Components: DRLVM
         Environment: IA-32
            Reporter: Sergey Kuksenko


Method  java.util.TreeMap::successor is hot method for iterations over TreeMap (and some others operations).
Generated code for the method may be improved.

Java code of the method:

    static TreeMapEntry successor(TreeMapEntry x) {
        if (x.right != null)
            return minimum(x.right);
        TreeMapEntry y = x.parent;
        while (y != null && x == y.right) {
            x = y;
            y = y.parent;
        }
        return y;
    }
Also method "minimum" is usualy inlined. minimum's code:
    static TreeMapEntry minimum(TreeMapEntry x) {
        while (x.left != null)
            x = x.left;
        return x;
    }

The generated asm code the follwoing:

line address    code                    
1       B7EE60: push        ebx
2       B7EE61: push        ebp
3       B7EE62: mov         eax,dword ptr [esp+0Ch]
4       B7EE66: mov         ebp,eax
5       B7EE68: mov         ebx,dword ptr [eax+14h]
6       B7EE6B: test        ebx,ebx
7       B7EE6D: je          00B7EEA8
8       B7EE73: mov         ebp,dword ptr fs:[00000014h]
9       B7EE7A: mov         eax,ebx
10      B7EE7C: mov         ebx,dword ptr [eax+18h]
11      B7EE7F: test        ebx,ebx
12      B7EE81: je          00B7EEF2
13      B7EE87: mov         ebp,dword ptr fs:[00000014h]
14      B7EE8E: cmp         dword ptr [ebp+00000270h],0
15      B7EE98: je          00B7EE7A
16      B7EE9E: call        00A96850
17      B7EEA3: jmp         00B7EE7A
18      B7EEA8: mov         ebx,dword ptr [eax+1Ch]
19      B7EEAB: mov         eax,dword ptr fs:[00000014h]
20      B7EEB2: mov         eax,ebx
21      B7EEB4: test        eax,eax
22      B7EEB6: je          00B7EEED
23      B7EEBC: cmp         ebp,dword ptr [eax+14h]
24      B7EEBF: jne         00B7EEED
25      B7EEC5: mov         ebp,eax
26      B7EEC7: mov         eax,dword ptr [eax+1Ch]
27      B7EECA: mov         ebx,eax
28      B7EECC: mov         eax,dword ptr fs:[00000014h]
29      B7EED3: cmp         dword ptr [eax+00000270h],0
30      B7EEDD: je          00B7EEB2
31      B7EEE3: call        00A96850
32      B7EEE8: jmp         00B7EEB2
33      B7EEED: pop         ebp
34      B7EEEE: pop         ebx
35      B7EEEF: ret         4
36      B7EEF2: pop         ebp
37      B7EEF3: pop         ebx
38      B7EEF4: ret         4
-------------------

Unrolling of small cycles by factor 4 or 8 will decrease BBP impact.
Unrolling this cycle by factor 2 allows making more effective register usage.
The original loop in lines 9-17:
loop:   mov     eax,ebx
        mov     ebx,dword ptr [eax+18h]
        test    ebx,ebx
        je      exit
        mov     ebp,dword ptr fs:[00000014h]
        cmp     dword ptr [ebp+00000270h],0
        je      loop
        call    suspend
        jmp     loop
        ....
exit:                             ; result already in eax.
        pop ebp
        pop ebx
        ret 4
At this cycle variable x is located at eax register. x.left is loaded into ebx
register and for each iteration we should move ebx to eax. If this cycle will
be unroll at least by factor 2 we may change register assignment. For the first
iteration store x in ebx and load x.left to eax and for second iteration store
x in eax and load x.left to ebx. After such transformation instruction mov eax,
ebx at each cycle became useless. 
Let's move loading TLS address out of cycle and unroll this cycle by 4.

        mov     ebp,dword ptr fs:[00000014h]
        ; at this point ebx contains x
loop:
        mov     eax,dword ptr [ebx+18h]     ;     eax = x.left
        test    eax,eax
        je      exit1
        mov     ebx,dword ptr [eax+18h]     ;     ebx = x.left.left
        test    ebx,ebx
        je      exit
        mov     eax,dword ptr [ebx+18h]     ;     eax = x.left.left.left
        test    eax,eax
        je      exit1
        mov     ebx,dword ptr [eax+18h]
        test    ebx,ebx
        je      exit
        cmp     dword ptr [ebp+00000270h],0
        je      loop
        call    suspend
        jmp     loop
        ....
exit1:
        mov eax, ebx     ; here we should move result from ebx to eax.
exit:                               ; here the result already eax.
        pop ebp
        pop ebx
        ret 4

After such unrolling will have 3.5 instructions per iteration instead of 7
instructions per iteration. 

Such unrolling will be very useful for cycles like:
while (x!=null) {
        ...
        x = x.next;
}




-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira