You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Martijn van Groningen (Created) (JIRA)" <ji...@apache.org> on 2012/04/12 00:19:16 UTC

[jira] [Created] (LUCENE-3972) Improve AllGroupsCollector implementations

Improve AllGroupsCollector implementations
------------------------------------------

                 Key: LUCENE-3972
                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
             Project: Lucene - Java
          Issue Type: Improvement
          Components: modules/grouping
            Reporter: Martijn van Groningen


I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Martijn van Groningen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252437#comment-13252437 ] 

Martijn van Groningen commented on LUCENE-3972:
-----------------------------------------------

Dawid - Do you mean the hash method instead of rehash method in SentinelIntSet? The rehash methods doesn't have any parameters. 
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Martijn van Groningen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3972:
------------------------------------------

    Attachment: LUCENE-3972.patch

Tested the current patch on a index containing 10M documents that has ~5M unique groups. The already existing implementation needs ~15.3 seconds to get the total group count and the new impl in the patch needs ~4.4 seconds to get the total group count.
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Martijn van Groningen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn van Groningen updated LUCENE-3972:
------------------------------------------

    Attachment: LUCENE-3972.patch

Attached initial patch. The TermAllGroupsCollector variant attached in this patch seems to be 2 times faster then the one that is already in subversion.
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Dawid Weiss (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252463#comment-13252463 ] 

Dawid Weiss commented on LUCENE-3972:
-------------------------------------

Yes, sorry -- hash of course. The hash method that should redistribute keys space into buckets (but currently doesn't).

As for BytesRefHash vs. BytesRef instances -- maybe it's the source of the speedup, who knows. I would try the hash method though, if nothing else just for curiosity. I would also patch it for the future in either case. Not rehashing input keys is a flaw in my opinion (again -- backed by real life experience from HPPC).
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252469#comment-13252469 ] 

Yonik Seeley commented on LUCENE-3972:
--------------------------------------

bq. I would also patch it for the future in either case. 

Hmmm, if you don't know anything about the integer keys, then a little extra hashing by default is a good idea.
But we know a lot about docids, and extra hashing should just lead to an average-case slowdown.

                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Martijn van Groningen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252467#comment-13252467 ] 

Martijn van Groningen commented on LUCENE-3972:
-----------------------------------------------

I ran a test with the changes to SentinelIntSet and one without. Performance difference between the two tests are minimal. The test with the changes is somewhat slower (15.7 s) than the one without the changes (15.3 s).
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Martijn van Groningen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252458#comment-13252458 ] 

Martijn van Groningen commented on LUCENE-3972:
-----------------------------------------------

The major difference between the committed TermAllGroupsCollector and one in the patch is that committed TermAllGroupsCollector creates a BytesRef instance for each unique group and puts it in a ArrayList (5M in my case). The version in the patch reuses a single BytesRef instance. The BytesRef bytes are copied into the BytesRefHash big bytes array via the BytesRefHash#add(...) method. I think the many BytesRef instances makes the committed TermAllGroupsCollector slow.
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Dawid Weiss (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252486#comment-13252486 ] 

Dawid Weiss commented on LUCENE-3972:
-------------------------------------

Hmmm... it's not collisions then, it was worth a try. I still find the difference puzzling -- I can't justify your version being 3x faster. Curious what it might be.

bq. But we know a lot about docids, and extra hashing should just lead to an average-case slowdown.

Ok.
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252518#comment-13252518 ] 

Michael McCandless commented on LUCENE-3972:
--------------------------------------------

Actually, we are storing term ords here, not docIDs.

I think the high number of unique groups explains why the new patch is
faster: the time is likely dominated by re-ord'ing for each segment?

If you have fewer unique groups (and as the number of docs collected goes up),
I think the current impl should be faster...?

                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Dawid Weiss (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252427#comment-13252427 ] 

Dawid Weiss commented on LUCENE-3972:
-------------------------------------

This is curious indeed. One thing to check would be this: SentinelIntSet uses no key rehashing (rehash simply returns the key). This resulted in very poor performance for certain regular integer sets (my experience from implementing HPPC). So while rehashing may seem like an additional overhead, it actually boosts performance.

Martijn -- could you patch the trunk's SentinelIntSet#rehash with, for example, this (murmur hash3 tail):
{noformat}
    public static int rehash(int k)
    {
        k ^= k >>> 16;
        k *= 0x85ebca6b;
        k ^= k >>> 13;
        k *= 0xc2b2ae35;
        k ^= k >>> 16;
        return k;
    }
{noformat}
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Issue Comment Edited] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Dawid Weiss (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252427#comment-13252427 ] 

Dawid Weiss edited comment on LUCENE-3972 at 4/12/12 1:52 PM:
--------------------------------------------------------------

This is curious indeed. One thing to check would be this: SentinelIntSet uses no key rehashing (rehash simply returns the key). This resulted in very poor performance for certain regular integer sets (my experience from implementing HPPC). So while rehashing may seem like an additional overhead, it actually boosts performance.

Martijn -- could you patch the trunk's SentinelIntSet#rehash with, for example, this (murmur hash3 tail):
{noformat}
    public static int rehash(int k)
    {
        k ^= k >>> 16;
        k *= 0x85ebca6b;
        k ^= k >>> 13;
        k *= 0xc2b2ae35;
        k ^= k >>> 16;
        return k;
    }
{noformat}
and retry your test? Btw. I'm not saying it'll be faster :)
                
      was (Author: dweiss):
    This is curious indeed. One thing to check would be this: SentinelIntSet uses no key rehashing (rehash simply returns the key). This resulted in very poor performance for certain regular integer sets (my experience from implementing HPPC). So while rehashing may seem like an additional overhead, it actually boosts performance.

Martijn -- could you patch the trunk's SentinelIntSet#rehash with, for example, this (murmur hash3 tail):
{noformat}
    public static int rehash(int k)
    {
        k ^= k >>> 16;
        k *= 0x85ebca6b;
        k ^= k >>> 13;
        k *= 0xc2b2ae35;
        k ^= k >>> 16;
        return k;
    }
{noformat}
                  
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252315#comment-13252315 ] 

Michael McCandless commented on LUCENE-3972:
--------------------------------------------

Curious that it's so much faster ... BytesRefHash operates on the byte[] term while the current approach operates on int ord.

How large was the index?  If it was smallish, maybe the time was dominated by re-ord'ing after each reader...?
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations

Posted by "Martijn van Groningen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252581#comment-13252581 ] 

Martijn van Groningen commented on LUCENE-3972:
-----------------------------------------------

bq. If you have fewer unique groups (and as the number of docs collected goes up), I think the current impl should be faster...?
This is true. I ran a few tests on an index containing 10M docs:
||Run||Num unique groups||Perf. collector in patch||Perf. committed collector|| 
|1|~65000|892 ms|132 ms|
|2|~645000|1183 ms|878 ms|
|3|~953000|1291 ms|1517 ms|
|4|~1819000|1783 ms|3762 ms|
|5|~3332000|2703 ms|4882 ms|
|6|~6668000|4840 ms|18989 ms|

All the times are the average over 10 executions with a match all query.

bq. the time is likely dominated by re-ord'ing for each segment?
During run 4 I noticed that 3470 ms of the total 3762 ms was spend on re-ord'ing groups for segments.

It seems that the implementation in the patch is only usable if a search yields many unique groups as result.  
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org