Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2019/10/10 13:00:00 UTC

[jira] [Created] (LUCENE-9003) Should FilterDirectoryReader compute numDocs lazily?

Adrien Grand created LUCENE-9003:
------------------------------------

             Summary: Should FilterDirectoryReader compute numDocs lazily?
                 Key: LUCENE-9003
                 URL: https://issues.apache.org/jira/browse/LUCENE-9003
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Adrien Grand


FilterDirectoryReader extends BaseCompositeReader, which computes both maxDoc and numDocs eagerly in its constructor by summing up these values across all sub leaves.

This is problematic for readers that hide additional documents. Computing numDocs on such leaf readers usually requires iterating over all live documents to count them. This makes creating a FilterDirectoryReader on top of them run in time linear in the number of documents, which has caused us several performance bugs over time. This is especially frustrating given that numDocs is a rarely used index statistic.

I think computing numDocs lazily would be less surprising?
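
To make the idea concrete, here is a rough sketch of what lazy computation could look like. It deliberately does not use Lucene's classes: Leaf, FilteringLeaf and CompositeReader are hypothetical stand-ins invented for illustration, and the scans counter exists only to show when the linear scan happens. maxDoc is still summed in the constructor since it is cheap, while numDocs is summed on first use and then cached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for illustration only; these are NOT Lucene classes.
final class LazyNumDocsSketch {

    /** Minimal view of a leaf: maxDoc is cheap, numDocs may be expensive. */
    interface Leaf {
        int maxDoc();
        int numDocs();
    }

    /** A leaf that hides documents, so counting visible docs is a linear scan. */
    static final class FilteringLeaf implements Leaf {
        private final boolean[] visible;
        // Counts how many linear scans were performed (demonstration only).
        final AtomicInteger scans = new AtomicInteger();

        FilteringLeaf(boolean[] visible) {
            this.visible = visible.clone();
        }

        @Override
        public int maxDoc() {
            return visible.length;
        }

        @Override
        public int numDocs() {
            scans.incrementAndGet();
            int count = 0;
            for (boolean v : visible) {
                if (v) {
                    count++;
                }
            }
            return count;
        }
    }

    /** Composite that sums maxDoc eagerly but defers numDocs until first use. */
    static final class CompositeReader {
        private final List<Leaf> leaves;
        private final int maxDoc;
        private volatile int numDocs = -1; // -1 means "not computed yet"

        CompositeReader(List<? extends Leaf> leaves) {
            this.leaves = new ArrayList<>(leaves);
            long sum = 0;
            for (Leaf leaf : this.leaves) {
                sum += leaf.maxDoc(); // cheap, so still done in the constructor
            }
            this.maxDoc = Math.toIntExact(sum);
        }

        int maxDoc() {
            return maxDoc;
        }

        int numDocs() {
            int n = numDocs;
            if (n == -1) {
                // Benign race: concurrent callers may each compute the sum,
                // but they all arrive at the same value.
                int sum = 0;
                for (Leaf leaf : leaves) {
                    sum += leaf.numDocs();
                }
                n = sum;
                numDocs = n;
            }
            return n;
        }
    }

    public static void main(String[] args) {
        FilteringLeaf leaf = new FilteringLeaf(new boolean[] {true, false, true});
        CompositeReader reader = new CompositeReader(Arrays.asList(leaf));
        System.out.println("scans after construction: " + leaf.scans.get());
        System.out.println("numDocs: " + reader.numDocs());
        System.out.println("scans after numDocs(): " + leaf.scans.get());
    }
}
```

With this shape, constructing the composite never triggers the per-leaf count, and callers that do need numDocs pay for it once. Whether a volatile field with a benign race is the right caching strategy for the real class is an open question; it is only one possible choice.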



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
