Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/08/12 23:10:46 UTC

[jira] [Updated] (SOLR-7918) speed up term->DocSet production

     [ https://issues.apache.org/jira/browse/SOLR-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-7918:
-------------------------------
    Attachment: SOLR-7918.patch

Patch attached. This also introduces a DocSetProducer interface (ported from Heliosearch) to form a basis for future optimizations.

The actual set building was moved out of SolrIndexSearcher into DocSetUtil to avoid bloating that class further.

Performance improvements were quite good. The low end was large SortedInt sets (only a 20% improvement), but large sets saw a 70% improvement and very small sets saw over 120% improvement. The complete request+response time was measured from the client, so the speedups in set building itself were actually even greater.


> speed up term->DocSet production
> --------------------------------
>
>                 Key: SOLR-7918
>                 URL: https://issues.apache.org/jira/browse/SOLR-7918
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>         Attachments: SOLR-7918.patch
>
>
> We can use index statistics to figure out beforehand what type of doc set (sorted int or bitset) we should create.  This should use less memory than the current approach as well as increase performance.
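
To illustrate the idea in the description, here is a minimal sketch of picking the representation up front from a term's document frequency. It is not the code in the attached patch: the class name DocSetSketch, the method names, and the threshold of roughly 1/64 of maxDoc are all assumptions made for illustration, and it uses java.util.BitSet and int[] rather than Solr's own DocSet classes. It also ignores deleted documents, for which docFreq is only an upper bound.

```java
import java.util.BitSet;

// Hypothetical illustration only; names and threshold are assumptions,
// not the API introduced by the SOLR-7918 patch.
final class DocSetSketch {

    // Assumed cutoff: a sorted int[] costs 32 bits per matching doc, a bitset
    // costs 1 bit per doc in the index, so small sets are cheaper as ints.
    static int smallSetSize(int maxDoc) {
        return (maxDoc >> 6) + 5;
    }

    // Decide the representation before reading any postings.
    static boolean useSortedInts(int docFreq, int maxDoc) {
        return docFreq <= smallSetSize(maxDoc);
    }

    // 'postings' stands in for doc ids read in increasing order from the index.
    // Returns a sorted int[] for small sets, a BitSet otherwise.
    static Object build(int[] postings, int docFreq, int maxDoc) {
        if (useSortedInts(docFreq, maxDoc)) {
            // Exact-size allocation: no growth or later conversion needed.
            int[] docs = new int[docFreq];
            System.arraycopy(postings, 0, docs, 0, docFreq);
            return docs;
        }
        BitSet bits = new BitSet(maxDoc);
        for (int i = 0; i < docFreq; i++) {
            bits.set(postings[i]);
        }
        return bits;
    }
}
```

Because the choice is made from the statistic before collection starts, the set is allocated once at its final size, which is where both the memory and the speed benefit would come from in a design like this.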


