You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@harmony.apache.org by "Tim Ellison (JIRA)" <ji...@apache.org> on 2007/06/06 13:44:26 UTC
[jira] Commented: (HARMONY-4064) [classlib][luni] Performance
improvement of java.util.HashMap
[ https://issues.apache.org/jira/browse/HARMONY-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501905 ]
Tim Ellison commented on HARMONY-4064:
--------------------------------------
Robert. You patch undoes the change I made to make the length of the elements array a prime number, so that it is now always a power of two. The problem is that identity hash codes are not randomly distributed, they typically reflect object address alignments which will be factors of two, so we don't get a good distribution across all the possible buckets.
You can see the effect in this simple 'simulation':
public class HashingTest {
public static void main(String[] args) {
int numBuckets = 16;
int[] distrib = new int[numBuckets];
for (int i = 0; i < distrib.length; i++) {
distrib[i] = 0;
}
for (int i = 0; i < 1000; i++) {
int hashCode = new Object().hashCode();
int bucket = hashCode & (numBuckets - 1);
distrib[bucket] = distrib[bucket] + 1;
}
System.out.println("Bucket\tList_length");
for (int i = 0; i < distrib.length; i++) {
System.out.println(i + "\t" + distrib[i]);
}
}
}
I suggest that we have to ensure the elements array length is chosen more carefully, or that we modify the hashCodes to avoid these collisions.
> [classlib][luni] Performance improvement of java.util.HashMap
> -------------------------------------------------------------
>
> Key: HARMONY-4064
> URL: https://issues.apache.org/jira/browse/HARMONY-4064
> Project: Harmony
> Issue Type: Improvement
> Components: Classlib
> Reporter: Robert Hu
> Assignee: Tim Ellison
> Attachments: HARMONY-4064.diff
>
>
> The performance of HashMap can be improved, the hash method is improved.
> By running the following test code, our HashMap can be improved from 110% time cost (compared by RI) to 90% time cost (compared by RI).
> public class TestHashMap {
> public static void main(String[] args) {
> Random ran = new Random();
> HashMap map = new HashMap();
> int elementNum = 500;
> int times = 10000;
> int[] rans = new int[elementNum];
> for (int i = 0; i < elementNum; i++)
> rans[i] = ran.nextInt(Integer.MAX_VALUE);
> long start = System.currentTimeMillis();
> for (int i = 0; i < elementNum; i++)
> map.put(rans[i], "b");
> System.out.println(System.currentTimeMillis() - start);
> start = System.currentTimeMillis();
> for (int i = 0; i < elementNum; i++)
> for(int j = 0; j< times; j++){
> map.get(rans[i]);
> }
> System.out.println(System.currentTimeMillis() - start);
> }
> }
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.