Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2018/06/02 19:10:00 UTC
[jira] [Commented] (SOLR-12366) Avoid SlowAtomicReader.getLiveDocs -- it's slow

    [ https://issues.apache.org/jira/browse/SOLR-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499147#comment-16499147 ] 

Yonik Seeley commented on SOLR-12366:
-------------------------------------

Nice catch; this has been broken forever!
Looking back, I think the API originally didn't expose enough to work per-segment, so Lucene's MultiReader.isDeleted(int doc) did a binary search on every call. Once we gained the ability to operate per-segment, some code was never converted.
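
To illustrate the cost difference, here is a minimal stand-alone sketch. It does not use any Lucene or Solr classes; LiveDocsSketch, docBase, and live are hypothetical names that only model a multi-segment index. The composite-style lookup has to binary search for the owning segment on every call, while the per-segment loop never searches at all.

```java
import java.util.Arrays;

/** Stand-alone model of a multi-segment index; these are NOT Lucene classes. */
final class LiveDocsSketch {
    // docBase[i] is the first global doc id of segment i (hypothetical layout).
    final int[] docBase;
    // live[i][d] is true if local doc d of segment i has not been deleted.
    // Assumption: every segment holds at least one document.
    final boolean[][] live;

    LiveDocsSketch(boolean[][] live) {
        this.live = live;
        this.docBase = new int[live.length];
        int base = 0;
        for (int i = 0; i < live.length; i++) {
            docBase[i] = base;
            base += live[i].length;
        }
    }

    int maxDoc() {
        int last = live.length - 1;
        return last < 0 ? 0 : docBase[last] + live[last].length;
    }

    /** Composite-style lookup: a binary search over docBase on every call. */
    boolean isLiveGlobal(int globalDoc) {
        int idx = Arrays.binarySearch(docBase, globalDoc);
        if (idx < 0) {
            idx = -idx - 2; // (insertion point - 1) is the owning segment
        }
        return live[idx][globalDoc - docBase[idx]];
    }

    /** Slow pattern: one binary search per document. */
    int countLiveViaGlobalLookup() {
        int count = 0;
        for (int doc = 0; doc < maxDoc(); doc++) {
            if (isLiveGlobal(doc)) count++;
        }
        return count;
    }

    /** Per-segment pattern: plain array access, no search. */
    int countLivePerSegment() {
        int count = 0;
        for (boolean[] segment : live) {
            for (boolean isLive : segment) {
                if (isLive) count++;
            }
        }
        return count;
    }
}
```

Both counting methods return the same number; the difference is only in how much work each lookup does.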
{quote}IMO some callers of SolrIndexSearcher.getSlowAtomicReader should change to use MultiFields to avoid the temptation to have a LeafReader that has many slow methods.
{quote}
MultiFields has slow methods as well, and if you look at the history, many places were already using MultiFields.getDeletedDocs before (and were perhaps later replaced with the equivalent?).
For example, commit 6ffc159b40 changed getFirstMatch to use MultiFields.getDeletedDocs (which may not have been a bug, since it was probably equivalent at the time).

Anyway, perhaps we should throw an exception from any method in SlowCompositeReaderWrapper that exposes code doing a binary search. I don't think we need a full Reader implementation here.
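
A rough sketch of that fail-fast idea, written against made-up types rather than the real SlowCompositeReaderWrapper (FailFastCompositeSketch and SegmentSketch are hypothetical names, and the real class has a different API): the composite view refuses the slow call and points callers at the per-segment data instead.

```java
import java.util.List;

/** Hypothetical per-segment view; not a Lucene class. */
final class SegmentSketch {
    final int docBase;     // first global doc id of this segment
    final boolean[] live;  // live[d] is true if local doc d is not deleted

    SegmentSketch(int docBase, boolean[] live) {
        this.docBase = docBase;
        this.live = live;
    }
}

/** Hypothetical composite view that refuses the binary-search-backed method. */
final class FailFastCompositeSketch {
    private final List<SegmentSketch> segments;

    FailFastCompositeSketch(List<SegmentSketch> segments) {
        this.segments = segments;
    }

    /** Cheap: callers iterate segments and read each live array directly. */
    List<SegmentSketch> leaves() {
        return segments;
    }

    /** Would require a binary search per lookup, so fail loudly instead. */
    boolean[] getLiveDocs() {
        throw new UnsupportedOperationException(
            "composite liveDocs is slow; iterate leaves() and use per-segment live docs");
    }
}
```

Throwing turns a silent performance problem into an immediate failure, at the cost of breaking any caller that still relies on the composite method.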

A variable name change for "SolrIndexSearcher.leafReader" would really be welcome too... it's a bad name. We've been bitten by the naming before as well: SOLR-9592

> Avoid SlowAtomicReader.getLiveDocs -- it's slow
> -----------------------------------------------
>
>                 Key: SOLR-12366
>                 URL: https://issues.apache.org/jira/browse/SOLR-12366
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>             Fix For: 7.4
>
>         Attachments: SOLR-12366.patch, SOLR-12366.patch, SOLR-12366.patch, SOLR-12366.patch
>
>
> SlowAtomicReader is of course slow, and its getLiveDocs (based on MultiBits) is slow as it uses a binary search for each lookup. There are various places in Solr that use SolrIndexSearcher.getSlowAtomicReader and then get the liveDocs. Most of these places ought to work with SolrIndexSearcher's getLiveDocs method.


