Posted to dev@lucene.apache.org by "Alan Woodward (JIRA)" <ji...@apache.org> on 2016/06/29 18:38:05 UTC

[jira] [Updated] (LUCENE-7365) Don't use BooleanScorer for small segments

     [ https://issues.apache.org/jira/browse/LUCENE-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Woodward updated LUCENE-7365:
----------------------------------
    Attachment: LUCENE-7365.patch

Patch attached.  It prevents the use of BooleanScorer when the segment contains fewer than 1024 docs.  I'm not sure that's the best cutoff, though, and I'd like to run some benchmarks to check performance.
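
For readers without the patch to hand, the cutoff described above might be sketched roughly as follows. This is not the attached patch: the class, method, and constant names are hypothetical, and only the 1024-doc threshold comes from the comment itself.

```java
// A standalone sketch of the size check, under stated assumptions.
// It does not use Lucene classes; all names here are hypothetical.
public final class SmallSegmentCutoff {

    // Threshold from the comment above; whether 1024 is the best
    // value is left open there, pending benchmarks.
    public static final int MIN_DOCS_FOR_BULK_SCORER = 1024;

    /**
     * Decide whether the bulk scorer should be used.
     *
     * @param segmentMaxDoc  number of docs in the segment
     * @param queryQualifies whether the query already meets the other
     *                       criteria for bulk scoring
     */
    public static boolean useBulkScorer(int segmentMaxDoc, boolean queryQualifies) {
        // Segments smaller than the cutoff never get the bulk scorer.
        return queryQualifies && segmentMaxDoc >= MIN_DOCS_FOR_BULK_SCORER;
    }

    public static void main(String[] args) {
        // A one-doc segment, such as a MemoryIndex, falls below the cutoff.
        System.out.println("1 doc:     " + useBulkScorer(1, true));
        System.out.println("1024 docs: " + useBulkScorer(1024, true));
    }
}
```

Keeping the threshold in a single constant would make it easy to vary when benchmarking alternative cutoffs.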

> Don't use BooleanScorer for small segments
> ------------------------------------------
>
>                 Key: LUCENE-7365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7365
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>         Attachments: LUCENE-7365.patch
>
>
> If a BooleanQuery meets certain criteria (it contains only disjunctions and is likely to match a large number of docs), we use a BooleanScorer to score groups of 1024 docs at a time.  This allocates arrays of 1024 Bucket objects up front.  On very small segments (for example, a MemoryIndex) this wastes a lot of memory, particularly if the query is large or deeply nested.  We should avoid using a bulk scorer on these segments.
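
To make the memory cost in the description concrete, here is a rough sketch of the up-front allocation it refers to. The Bucket class below is a stand-in with assumed fields, not Lucene's actual class, and the count of 50 scorers is an arbitrary illustration of a large or nested query.

```java
// A standalone sketch of the up-front allocation cost, under stated
// assumptions. Bucket is a stand-in, not Lucene's class.
public final class BucketAllocationSketch {

    // Window size from the issue description: 1024 docs at a time.
    public static final int WINDOW_SIZE = 1024;

    // Hypothetical per-doc accumulator; the field choice is an assumption.
    public static final class Bucket {
        double score;
        int freq;
    }

    // Allocate one full window of buckets eagerly, as the description
    // says happens regardless of how many docs the segment holds.
    public static Bucket[] allocateWindow() {
        Bucket[] buckets = new Bucket[WINDOW_SIZE];
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] = new Bucket();
        }
        return buckets;
    }

    // Total Bucket objects created if each of the given number of
    // bulk scorers allocates its own window.
    public static long bucketsAllocated(int bulkScorers) {
        return (long) bulkScorers * WINDOW_SIZE;
    }

    public static void main(String[] args) {
        int bulkScorers = 50; // arbitrary example value
        System.out.println(bulkScorers + " scorers allocate "
            + bucketsAllocated(bulkScorers)
            + " buckets, even for a one-doc segment");
    }
}
```

The allocation depends only on the number of scorers, not on segment size, which is why a one-document segment pays the same cost as a large one.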



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
