You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@harmony.apache.org by "Tim Ellison (JIRA)" <ji...@apache.org> on 2007/06/06 13:44:26 UTC

[jira] Commented: (HARMONY-4064) [classlib][luni] Performance improvement of java.util.HashMap

    [ https://issues.apache.org/jira/browse/HARMONY-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501905 ] 

Tim Ellison commented on HARMONY-4064:
--------------------------------------

Robert.  You patch undoes the change I made to make the length of the elements array a prime number, so that it is now always a power of two.  The problem is that identity hash codes are not randomly distributed, they typically reflect object address alignments which will be factors of two, so we don't get a good distribution across all the possible buckets.

You can see the effect in this simple 'simulation':

public class HashingTest {

    public static void main(String[] args) {
        int numBuckets = 16;

        int[] distrib = new int[numBuckets];
        for (int i = 0; i < distrib.length; i++) {
            distrib[i] = 0;
        }

        for (int i = 0; i < 1000; i++) {
            int hashCode = new Object().hashCode();
            int bucket = hashCode & (numBuckets - 1);
            distrib[bucket] = distrib[bucket] + 1;
        }

        System.out.println("Bucket\tList_length");
        for (int i = 0; i < distrib.length; i++) {
            System.out.println(i + "\t" + distrib[i]);
        }
    }
}

I suggest that we have to ensure the elements array length is chosen more carefully, or that we modify the hashCodes to avoid these collisions.

> [classlib][luni] Performance improvement of java.util.HashMap
> -------------------------------------------------------------
>
>                 Key: HARMONY-4064
>                 URL: https://issues.apache.org/jira/browse/HARMONY-4064
>             Project: Harmony
>          Issue Type: Improvement
>          Components: Classlib
>            Reporter: Robert Hu
>            Assignee: Tim Ellison
>         Attachments: HARMONY-4064.diff
>
>
> The performance of HashMap can be improved, the hash method is improved.
> By running the following test code, our HashMap can be improved from 110% time cost (compared by RI) to 90% time cost (compared by RI).
> public class TestHashMap {
>     public static void main(String[] args) {
>         Random ran = new Random();
>         HashMap map = new HashMap();
>         int elementNum = 500;
>         int times = 10000;
>         int[] rans = new int[elementNum];
>         for (int i = 0; i < elementNum; i++)
>             rans[i] = ran.nextInt(Integer.MAX_VALUE);
>         long start = System.currentTimeMillis();
>         for (int i = 0; i < elementNum; i++)
>             map.put(rans[i], "b");
>         System.out.println(System.currentTimeMillis() - start);
>         start = System.currentTimeMillis();
>         for (int i = 0; i < elementNum; i++)
>             for(int j = 0; j< times; j++){
>                 map.get(rans[i]);
>             }
>         System.out.println(System.currentTimeMillis() - start);        
>     }
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.