Posted to dev@lucene.apache.org by "Jeff Wartes (JIRA)" <ji...@apache.org> on 2016/03/30 23:35:25 UTC

[jira] [Created] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes

Jeff Wartes created SOLR-8922:
---------------------------------

             Summary: DocSetCollector can allocate massive garbage on large indexes
                 Key: SOLR-8922
                 URL: https://issues.apache.org/jira/browse/SOLR-8922
             Project: Solr
          Issue Type: Improvement
            Reporter: Jeff Wartes


After reaching a point of diminishing returns tuning the garbage collector, I decided to take a look at where the garbage was coming from. To my surprise, for my index and query set, almost 60% of the garbage was coming from this single line:

https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49

This is due to the simple fact that I have 86M documents in my shards. Allocating a scratch array big enough to track a result set 1/64th the size of my index (1.3M entries) is almost certainly excessive, considering that my 99.9th percentile hit count is less than 56k.
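
To put rough numbers on the figures above, here is a minimal back-of-the-envelope sketch. It assumes the scratch array is an int[] sized at maxDoc >> 6 (the 1/64th mentioned above) and counts 4 bytes per entry, ignoring the array header. The class and method names are hypothetical and exist only for this illustration; they are not part of Solr.

```java
public class ScratchArrayEstimate {

    // Hypothetical helper: number of entries in a scratch array
    // sized at 1/64th of the index (maxDoc >> 6).
    static int scratchEntries(int maxDoc) {
        return maxDoc >> 6;
    }

    // Assumption: the scratch array is an int[], so each entry costs
    // Integer.BYTES (4) bytes. The array object header is ignored.
    static long scratchBytes(int maxDoc) {
        return (long) scratchEntries(maxDoc) * Integer.BYTES;
    }

    public static void main(String[] args) {
        int maxDoc = 86_000_000; // approximate document count from the report
        int p999Hits = 56_000;   // upper bound on the 99.9th percentile hit count

        int entries = scratchEntries(maxDoc);
        System.out.println("scratch entries:       " + entries);              // 1343750
        System.out.println("scratch bytes:         " + scratchBytes(maxDoc)); // 5375000
        System.out.println("entries / p99.9 hits:  " + entries / p999Hits);   // 23
    }
}
```

Under these assumptions, each allocation is a little over 5 MB, and the array has room for more than 20 times as many hits as all but the rarest queries produce.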



