
Posted to issues@lucene.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2021/01/14 20:05:00 UTC

[jira] [Commented] (SOLR-14185) add DocSet.getDocIdSetIterator(LeafReaderContext)

    [ https://issues.apache.org/jira/browse/SOLR-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265184#comment-17265184 ] 

David Smiley commented on SOLR-14185:
-------------------------------------

Now we have only two DocSet implementations (on master): BitDocSet and SortedIntDocSet.

Proposed method on DocSet: {{DocIdSetIterator iterator(LeafReaderContext ctx)}}.

The implementation of that on BitDocSet is obvious; SortedIntDocSet is harder.  See SortedIntDocSet.getTopFilter.getDocIdSet, which looks up the range of indexes based on the doc ID range of the LeafReaderContext.  We could basically copy that logic into this new iterator method to get a DocIdSetIterator over the corresponding slice.  Straightforward, I think.  But notice that getTopFilter tracks "lastEndIdx", which we can't carry forward because there is no intermediary object to hold it.  However, I imagine an optimization in which the segment boundaries are tracked _at the time the SortedIntDocSet is built_, so that iterator() merely looks up the boundaries in an additional small array mapping leaf ordinal -> index into the sorted docIDs.
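
To make the boundary-array idea concrete, here is a minimal sketch in plain Java with no Lucene or Solr types.  Everything in it is an assumption made for illustration: the class name {{SegmentSlices}}, the {{leafStartIdx}} array, and the {{sliceForLeaf}} method are hypothetical and are not part of SortedIntDocSet.  It takes the sorted top-level doc IDs plus each leaf's docBase, records once where each leaf's slice begins, and later returns a leaf's docs rebased to segment-local IDs.  A real implementation would presumably wrap the slice in a DocIdSetIterator instead of copying it into an array.

```java
/** Hypothetical stand-in for a sorted-int doc set; not Solr's SortedIntDocSet. */
final class SegmentSlices {
  private final int[] docs;         // sorted top-level (global) doc IDs
  private final int[] leafDocBase;  // docBase of each leaf, ascending, leafDocBase[0] == 0
  private final int[] leafStartIdx; // leafStartIdx[ord] = first index in docs for leaf ord

  SegmentSlices(int[] sortedDocs, int[] leafDocBase) {
    this.docs = sortedDocs;
    this.leafDocBase = leafDocBase;
    // One extra slot so that leafStartIdx[ord + 1] is always a valid end bound.
    this.leafStartIdx = new int[leafDocBase.length + 1];
    int idx = 0;
    for (int ord = 0; ord < leafDocBase.length; ord++) {
      // Advance to the first doc ID that is >= this leaf's docBase.
      while (idx < docs.length && docs[idx] < leafDocBase[ord]) {
        idx++;
      }
      leafStartIdx[ord] = idx;
    }
    leafStartIdx[leafDocBase.length] = docs.length;
  }

  /** Returns the segment-local doc IDs that fall in the given leaf. */
  int[] sliceForLeaf(int ord) {
    int start = leafStartIdx[ord];
    int end = leafStartIdx[ord + 1];
    int[] local = new int[end - start];
    for (int i = start; i < end; i++) {
      // Per-leaf iteration reports IDs relative to the leaf's docBase.
      local[i - start] = docs[i] - leafDocBase[ord];
    }
    return local;
  }
}
```

With this layout the per-leaf lookup is two array reads, and no "lastEndIdx"-style state has to be kept between calls.  The cost is one extra int per leaf, computed in a single pass over the sorted doc IDs at build time.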

> add DocSet.getDocIdSetIterator(LeafReaderContext)
> -------------------------------------------------
>
>                 Key: SOLR-14185
>                 URL: https://issues.apache.org/jira/browse/SOLR-14185
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: David Smiley
>            Priority: Major
>
> Many callers of {{DocSet.getTopFilter()}} really just want to call {{getDocIdSet}} on the Filter and then call {{iterator()}}.  The second argument, Bits, is always either null or the live docs, so it doesn't matter, since a Solr DocSet never contains deleted docs.  The goal here is to reduce needless dependencies on the old Filter.


