Posted to solr-dev@lucene.apache.org by "John Wang (JIRA)" <ji...@apache.org> on 2007/10/23 17:51:50 UTC

[jira] Created: (SOLR-390) HashDocSet initialization of internal array is not efficient

HashDocSet initialization of internal array is not efficient
------------------------------------------------------------

                 Key: SOLR-390
                 URL: https://issues.apache.org/jira/browse/SOLR-390
             Project: Solr
          Issue Type: Bug
          Components: search
            Reporter: John Wang


HashDocSet initializes its internal array by iterating over it instead of using Arrays.fill, which is much faster. Patch included.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (SOLR-390) HashDocSet initialization of internal array is not efficient

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SOLR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537179 ] 

Yonik Seeley commented on SOLR-390:
-----------------------------------

Doing a quick HashDocSet construction test (below) showed the loop to be slightly faster on average than Arrays.fill()... I have no idea why, but I'll close this bug for now and it can be reopened if someone comes up with a better test (or tests on different JVMs, etc).

{code}
import org.apache.solr.search.HashDocSet;

// usage: java TestPerf <array size> <iterations>
public class TestPerf {

  // Build a HashDocSet from the array and use it, so the construction cannot be optimized away.
  private static int go(int[] x) {
    HashDocSet ds = new HashDocSet(x,0,x.length);
    return ds.exists(1) ? 1 : 0;
  }


  public static void main(String[] args) {
    int a=0;
    int sz = Integer.parseInt(args[a++]);
    int iter = Integer.parseInt(args[a++]);
    int[] x = new int[sz];
    int[] x2 = new int[sz];
    // Fill both arrays with scattered values (the multiplications overflow on purpose).
    for (int ii=0; ii<sz; ii++) {
      x[ii]=ii*1234567891;
      x2[ii]=ii*987654323;
    }

    int ret=0;
    long start = System.currentTimeMillis();
    int num=0;
    for (int i=0; i<iter; i++) {
      if (++num>=sz) num=0;
      // Feed each result back into the inputs so every iteration depends on the previous one.
      x[num] += go(x)+ret+x2[num];
      ret += go(x2) + x2[num]++;
    }
    long end = System.currentTimeMillis();
    System.out.println("result=" + ret);
    System.out.println("time=" +(end-start));
  }
}
{code}


[jira] Commented: (SOLR-390) HashDocSet initialization of internal array is not efficient

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SOLR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537062 ] 

Yonik Seeley commented on SOLR-390:
-----------------------------------

As I suspected, it doesn't look like there is yet any JVM acceleration for Arrays.fill() (and I wouldn't hold my breath).
I just tested with Java 1.6 -server, and my current method appears about 88% faster (on a P4 at least).

I used an array size of 1000 (since HashDocSet will normally be between 1 and 3000),
and 10,000,000 iterations.

explicit loop countdown =>  9281 msec
Arrays.fill => 17515 msec

{code}
import java.util.Arrays;

// usage: java TestPerf <array size> <iterations>
public class TestPerf {

  private static int VAL=-1;

  // Swap the commented-out loop with the Arrays.fill() call to time the other variant.
  private static void fill(int[] x) {
/*
    for (int i=x.length-1; i>=0; i--) {
      x[i] = VAL;
    }
*/
    Arrays.fill(x,VAL);
  }


  public static void main(String[] args) {
    int a=0;
    int sz = Integer.parseInt(args[a++]);
    int iter = Integer.parseInt(args[a++]);
    int[] x = new int[sz];
    int ret=0;
    long start = System.currentTimeMillis();
    for (int i=0; i<iter; i++) {
      fill(x);
      ret = ret + x[0];  // use results
    }
    long end = System.currentTimeMillis();
    System.out.println("result=" + ret);
    System.out.println("time=" +(end-start));
  }
}
{code}


[jira] Commented: (SOLR-390) HashDocSet initialization of internal array is not efficient

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SOLR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537171 ] 

Yonik Seeley commented on SOLR-390:
-----------------------------------

I did some further tests with mixed results...
After modifying the test program to do fill() on multiple arrays per iteration (and using an element from each array to try and prevent any dead code elimination), the benefit of the inlined loop vanishes (sneaky hotspot). Sometimes the Arrays.fill() version was faster, and sometimes it wasn't.  So perhaps a real test is needed here.
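
A rough sketch of what such a multi-array test could look like follows. It is not the program that was actually run: the class name MultiFillPerf, the choice of four arrays, the default sizes, and the use of System.nanoTime() instead of System.currentTimeMillis() are all assumptions made for illustration. Each pass fills several arrays and then reads one element from each, so the JIT cannot discard the fills as dead code.

```java
import java.util.Arrays;

// Hypothetical benchmark sketch; names and defaults are illustrative only.
public class MultiFillPerf {

  private static final int VAL = -1;

  // Fill every array with the explicit countdown loop.
  static void fillLoop(int[][] arrays) {
    for (int[] x : arrays) {
      for (int i = x.length - 1; i >= 0; i--) {
        x[i] = VAL;
      }
    }
  }

  // Fill every array with Arrays.fill().
  static void fillLibrary(int[][] arrays) {
    for (int[] x : arrays) {
      Arrays.fill(x, VAL);
    }
  }

  // Run iter passes and return a checksum built from one element of each array.
  // Arrays must be non-empty.
  static long run(int[][] arrays, int iter, boolean useLibrary) {
    long sum = 0;
    for (int it = 0; it < iter; it++) {
      if (useLibrary) {
        fillLibrary(arrays);
      } else {
        fillLoop(arrays);
      }
      for (int[] x : arrays) {
        sum += x[it % x.length];  // use results from every array
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    int sz = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
    int iter = args.length > 1 ? Integer.parseInt(args[1]) : 100000;
    int[][] arrays = new int[4][sz];

    // Warm up both variants so each is compiled before it is timed.
    run(arrays, iter, false);
    run(arrays, iter, true);

    for (boolean useLibrary : new boolean[] {false, true}) {
      long start = System.nanoTime();
      long sum = run(arrays, iter, useLibrary);
      long end = System.nanoTime();
      System.out.println((useLibrary ? "Arrays.fill" : "loop")
          + " result=" + sum + " time(ms)=" + (end - start) / 1000000);
    }
  }
}
```

Both variants are timed in the same JVM run after a warm-up pass, which removes one source of noise, but the numbers are still only indicative and can vary between runs and JVMs.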




[jira] Closed: (SOLR-390) HashDocSet initialization of internal array is not efficient

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/SOLR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley closed SOLR-390.
-----------------------------

    Resolution: Cannot Reproduce


[jira] Commented: (SOLR-390) HashDocSet initialization of internal array is not efficient

Posted by "John Wang (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SOLR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539863 ] 

John Wang commented on SOLR-390:
--------------------------------

Hi Yonik:
    With my tests, for large arrays, e.g. 2M entries, there is a 14% gain.
But it is 14% out of a small number, so I guess it is not a big deal. Sorry
for the false alarm.

-John




[jira] Commented: (SOLR-390) HashDocSet initialization of internal array is not efficient

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SOLR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537055 ] 

Yonik Seeley commented on SOLR-390:
-----------------------------------

That's interesting... does it actually test as faster for you?  Have any JVMs finally done specific optimizations for it?
In the past, my version was always a little faster because counting down to zero can be slightly faster (no explicit compare needed in many instruction sets because the flags are often set by arithmetic operations anyway).
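
For reference, the three ways of initializing the array that come up in this issue can be written side by side as below. This is only a sketch: the class and method names (FillVariants, fillDown, fillUp, fillLibrary) are made up for illustration and are not part of Solr. All three leave the array with the same contents, so any difference between them is purely one of speed, and a JIT compiler is free to turn any of them into something quite different from what the source suggests.

```java
import java.util.Arrays;

// Hypothetical helper class; not part of Solr.
public class FillVariants {

  // Countdown loop, the shape used in HashDocSet: the loop test compares against zero.
  static void fillDown(int[] x, int val) {
    for (int i = x.length - 1; i >= 0; i--) {
      x[i] = val;
    }
  }

  // Count-up loop: as written, the loop test compares i against x.length on each pass.
  static void fillUp(int[] x, int val) {
    for (int i = 0; i < x.length; i++) {
      x[i] = val;
    }
  }

  // Library call proposed in the patch.
  static void fillLibrary(int[] x, int val) {
    Arrays.fill(x, val);
  }

  public static void main(String[] args) {
    int[] a = new int[16];
    int[] b = new int[16];
    int[] c = new int[16];
    fillDown(a, -1);
    fillUp(b, -1);
    fillLibrary(c, -1);
    // All three variants should agree on the result.
    System.out.println(Arrays.equals(a, b) && Arrays.equals(b, c));
  }
}
```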


[jira] Commented: (SOLR-390) HashDocSet initialization of internal array is not efficient

Posted by "John Wang (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SOLR-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537051 ] 

John Wang commented on SOLR-390:
--------------------------------

I am having problems with "Attach file"
Following is the patch:
Index: /Users/john/plum/solr-trunk/src/java/org/apache/solr/search/HashDocSet.java
===================================================================
--- /Users/john/plum/solr-trunk/src/java/org/apache/solr/search/HashDocSet.java (revision 587538)
+++ /Users/john/plum/solr-trunk/src/java/org/apache/solr/search/HashDocSet.java (working copy)
@@ -17,6 +17,8 @@
 
 package org.apache.solr.search;
 
+import java.util.Arrays;
+
 import org.apache.solr.util.BitUtil;
 
 
@@ -63,8 +65,8 @@
     mask=tsize-1;
 
     table = new int[tsize];
-    for (int i=tsize-1; i>=0; i--) table[i]=EMPTY;
-
+    //for (int i=tsize-1; i>=0; i--) table[i]=EMPTY;
+    Arrays.fill(table, EMPTY);
     for (int i=offset; i<len; i++) {
       put(docs[i]);
     }
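
To show why the table has to be filled at all, here is a small self-contained sketch of an open-addressing int set with an EMPTY sentinel. It is not the real HashDocSet: the class name TinyIntHashSet, the table sizing, the hash (simply doc & mask), and the linear probing are assumptions chosen to keep the example short. The point is only that a freshly allocated int[] holds zeros, and 0 is a valid entry, so the table must be set to EMPTY before anything is inserted.

```java
import java.util.Arrays;

// Hypothetical illustration; the real HashDocSet may size, hash, and probe differently.
public class TinyIntHashSet {
  private static final int EMPTY = -1;  // sentinel, so stored values must be >= 0
  private final int[] table;
  private final int mask;

  public TinyIntHashSet(int[] docs) {
    // Power-of-two size greater than twice the input, so a free slot always exists.
    int tsize = Integer.highestOneBit(Math.max(docs.length, 1)) << 2;
    mask = tsize - 1;
    table = new int[tsize];
    Arrays.fill(table, EMPTY);  // new int[] is zero-filled, and 0 is a valid entry
    for (int doc : docs) {
      put(doc);
    }
  }

  private void put(int doc) {
    int slot = doc & mask;
    // Linear probing: step to the next slot until a free one or the same value is found.
    while (table[slot] != EMPTY && table[slot] != doc) {
      slot = (slot + 1) & mask;
    }
    table[slot] = doc;
  }

  public boolean exists(int doc) {
    int slot = doc & mask;
    while (table[slot] != EMPTY) {
      if (table[slot] == doc) {
        return true;
      }
      slot = (slot + 1) & mask;
    }
    return false;
  }
}
```

Without the fill, exists(0) would report true for a set that never contained 0, which is why the initialization cannot simply be dropped.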


