Posted to dev@lucene.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/07/10 17:19:00 UTC

[jira] [Commented] (LUCENE-8875) Should TopScoreDocCollector Always Populate Sentinel Values?

    [ https://issues.apache.org/jira/browse/LUCENE-8875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882278#comment-16882278 ] 

ASF subversion and git services commented on LUCENE-8875:
---------------------------------------------------------

Commit ee79a20174528a99b1a805af5ce2212276db1630 in lucene-solr's branch refs/heads/master from Atri Sharma
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ee79a20 ]

LUCENE-8875: Introduce Optimized Collector For Large Number Of Hits (#754)

This commit introduces a new collector which is optimized for
cases when the number of hits is large and/or the actual hits
collected are sparse in comparison to the number of hits
requested.

> Should TopScoreDocCollector Always Populate Sentinel Values?
> ------------------------------------------------------------
>
>                 Key: LUCENE-8875
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8875
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
>          Time Spent: 9h
>  Remaining Estimate: 0h
>
> TopScoreDocCollector always initializes HitQueue as the PQ implementation and instructs HitQueue to pre-populate itself with sentinels. While this is a great safety mechanism, for very large datasets where the query's selectivity is high, the sentinel population can be redundant and can become a significant bottleneck in itself. Does it make sense to introduce a new parameter in TopScoreDocCollector that uses a heuristic (say, number of hits > 10k) to skip populating sentinels?
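
To make the trade-off in the description concrete, here is a minimal, self-contained sketch. It is not Lucene code: SentinelSketch, Hit, Result, collectWithSentinels and collectLazily are hypothetical names, and java.util.PriorityQueue stands in for HitQueue. It assumes numHits >= 1 and compares a queue pre-filled with numHits sentinel entries against one that grows only as real hits arrive, counting how many entries each strategy allocates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class SentinelSketch {

    /** Hypothetical stand-in for a scored hit; not Lucene's ScoreDoc. */
    public static final class Hit {
        int doc;
        float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Doc ids ordered best-first, plus the number of Hit objects allocated. */
    public static final class Result {
        public final List<Integer> topDocs;
        public final int allocated;
        Result(List<Integer> topDocs, int allocated) {
            this.topDocs = topDocs;
            this.allocated = allocated;
        }
    }

    // Min-heap order: the weakest hit is on top. On equal scores the higher doc id is weaker.
    static final Comparator<Hit> WEAKEST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    /** Pre-fills the queue with numHits sentinels, then overwrites the weakest entry per hit. */
    public static Result collectWithSentinels(float[] scores, int numHits) {
        PriorityQueue<Hit> pq = new PriorityQueue<>(numHits, WEAKEST_FIRST);
        for (int i = 0; i < numHits; i++) {
            pq.add(new Hit(Integer.MAX_VALUE, Float.NEGATIVE_INFINITY)); // sentinel entry
        }
        for (int doc = 0; doc < scores.length; doc++) {
            if (scores[doc] > pq.peek().score) {
                Hit weakest = pq.poll(); // reuse the object instead of allocating a new one
                weakest.doc = doc;
                weakest.score = scores[doc];
                pq.add(weakest);
            }
        }
        return new Result(drain(pq), numHits);
    }

    /** Allocates entries only while the queue holds fewer than numHits real hits. */
    public static Result collectLazily(float[] scores, int numHits) {
        PriorityQueue<Hit> pq = new PriorityQueue<>(WEAKEST_FIRST);
        int allocated = 0;
        for (int doc = 0; doc < scores.length; doc++) {
            if (pq.size() < numHits) {
                pq.add(new Hit(doc, scores[doc]));
                allocated++;
            } else if (scores[doc] > pq.peek().score) {
                Hit weakest = pq.poll();
                weakest.doc = doc;
                weakest.score = scores[doc];
                pq.add(weakest);
            }
        }
        return new Result(drain(pq), allocated);
    }

    // Empties the queue, drops any leftover sentinels, and returns doc ids best-first.
    static List<Integer> drain(PriorityQueue<Hit> pq) {
        List<Integer> docs = new ArrayList<>();
        while (!pq.isEmpty()) {
            Hit h = pq.poll();
            if (h.doc != Integer.MAX_VALUE) {
                docs.add(h.doc);
            }
        }
        Collections.reverse(docs);
        return docs;
    }

    public static void main(String[] args) {
        float[] scores = {0.5f, 2.0f, 1.0f}; // only three documents match
        Result eager = collectWithSentinels(scores, 10_000);
        Result lazy = collectLazily(scores, 10_000);
        System.out.println(eager.topDocs + " allocated=" + eager.allocated);
        System.out.println(lazy.topDocs + " allocated=" + lazy.allocated);
    }
}
```

In this sketch both strategies return the same ranking, but the sentinel variant allocates numHits entries regardless of how many documents match, while the lazy variant allocates at most one entry per matching document. The price of the lazy variant is an extra size check on every collected hit, which is the branch the sentinels exist to avoid.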


