Posted to issues@lucene.apache.org by "Feng Guo (Jira)" <ji...@apache.org> on 2021/11/14 18:43:00 UTC

[jira] [Updated] (LUCENE-10233) Store docIds as bit set when leafCardinality = 1 to speed up addAll

     [ https://issues.apache.org/jira/browse/LUCENE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feng Guo updated LUCENE-10233:
------------------------------
    Summary: Store docIds as bit set when leafCardinality = 1 to speed up addAll  (was: Store docIds as bit set to speed up addAll)

> Store docIds as bit set when leafCardinality = 1 to speed up addAll
> -------------------------------------------------------------------
>
>                 Key: LUCENE-10233
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10233
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Feng Guo
>            Priority: Major
>
> In low-cardinality point cases, a leaf's doc id block will usually store doc ids that all share the same point value, so intersect ends up in the addAll logic. If we store the ids as a bit set when leafCardinality = 1 and give the IntersectVisitor a bulk-visit method (something like visit(DocIdSetIterator iterator)), we can speed up addAll, because we can simply OR the block's ids into the result.
> Concerns:
> 1. A bit set could occupy more disk space.
> 2. MergeReader will become slower because it needs to iterate docIds one by one. (But maybe merge performance is less sensitive than query performance?)
> I'd like to run some tests on query, merge and space if you think this optimization is worth a try :)
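
For illustration, here is a minimal sketch of the idea described above. It uses java.util.BitSet as a stand-in rather than Lucene's own classes. The class and method names (LeafAddAllSketch, addAllOneByOne, addAllBulk, bitSetBytes, intArrayBytes) are hypothetical and not part of Lucene. The 4-bytes-per-id baseline is an assumption made only for the space comparison and is not necessarily how Lucene encodes doc ids on disk.

```java
import java.util.BitSet;

// Hypothetical sketch, not Lucene code: contrasts per-doc addAll with a bulk OR.
class LeafAddAllSketch {

    // Baseline: the leaf holds doc ids as an int array and they are added one at a time.
    static void addAllOneByOne(int[] leafDocIds, BitSet result) {
        for (int docId : leafDocIds) {
            result.set(docId);
        }
    }

    // Proposed idea: the leaf holds doc ids as a bit set, so adding them all is a single OR.
    static void addAllBulk(BitSet leafDocIds, BitSet result) {
        result.or(leafDocIds);
    }

    // Bytes needed for a bit set covering the doc id range [minDocId, maxDocId].
    static long bitSetBytes(int minDocId, int maxDocId) {
        return ((long) maxDocId - minDocId + 1 + 7) / 8;
    }

    // Bytes needed for 'count' ids at an assumed 4 bytes per id.
    static long intArrayBytes(int count) {
        return 4L * count;
    }

    public static void main(String[] args) {
        int[] ids = {3, 5, 8, 13};
        BitSet leafBits = new BitSet();
        for (int id : ids) {
            leafBits.set(id);
        }

        // Both results start with doc 1 already matched by some earlier leaf.
        BitSet oneByOne = new BitSet();
        oneByOne.set(1);
        BitSet bulk = new BitSet();
        bulk.set(1);

        addAllOneByOne(ids, oneByOne);
        addAllBulk(leafBits, bulk);
        System.out.println("same result: " + oneByOne.equals(bulk));

        // Space trade-off (concern 1): dense ranges favor the bit set, sparse ranges do not.
        System.out.println("dense, 512 ids in [0, 511]: bit set " + bitSetBytes(0, 511)
                + " bytes vs int array " + intArrayBytes(512) + " bytes");
        System.out.println("sparse, 512 ids in [0, 1000000]: bit set " + bitSetBytes(0, 1_000_000)
                + " bytes vs int array " + intArrayBytes(512) + " bytes");
    }
}
```

In this sketch the bulk path does the same work as the per-doc path, one machine word at a time. The byte counts suggest why the space concern depends on how tightly the doc ids in a leaf are clustered.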



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
