Posted to dev@lucene.apache.org by "Mayya Sharipova (JIRA)" <ji...@apache.org> on 2019/07/19 21:29:00 UTC
[jira] [Comment Edited] (LUCENE-8727) IndexSearcher#search(Query,int) should operate on a shared priority queue when configured with an executor

    [ https://issues.apache.org/jira/browse/LUCENE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16889181#comment-16889181 ] 

Mayya Sharipova edited comment on LUCENE-8727 at 7/19/19 9:28 PM:
------------------------------------------------------------------

Some comments about design option #1.

I think we should share only the min competitive score between collectors (it could be an AtomicLong or something similar), and not the top hits. The reason for not sharing top hits is that collectors expect leaves to arrive in [sequential order|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/TopScoreDocCollector.java#L240-L242]. If the executor happens to process leaves with higher doc IDs first, we may fill the global priority queue with docs that have higher IDs and set the global min competitive score to the next float. When we then process leaves with smaller doc IDs, the global priority queue is already full and the updated global min competitive score applies, so we would have to skip all of these docs, even if they have the same scores as the docs with higher IDs and should have been selected instead.
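
To make the tie problem concrete, here is a small standalone simulation. It is only a sketch and does not use Lucene's classes: Hit, collectShared and demo are hypothetical names, all scores are equal on purpose, and the "reject anything below the next float once the queue is full" rule is my simplified stand-in for the collector's in-order assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical simulation, not Lucene code.
class SharedQueueTieDemo {
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Weakest hit first: lowest score, and among equal scores the highest doc ID.
    static final Comparator<Hit> WEAKEST_FIRST = (a, b) ->
        a.score != b.score ? Float.compare(a.score, b.score) : Integer.compare(b.doc, a.doc);

    /** Collects the top n hits into ONE shared queue, visiting leaves in the given order (n >= 1). */
    static List<Integer> collectShared(List<List<Hit>> leaves, int n) {
        PriorityQueue<Hit> queue = new PriorityQueue<>(WEAKEST_FIRST);
        float minCompetitive = Float.NEGATIVE_INFINITY;
        for (List<Hit> leaf : leaves) {
            for (Hit hit : leaf) {
                if (queue.size() == n) {
                    // Once the queue is full, a tie with the weakest hit is rejected.
                    // That is only safe if docs arrive in increasing doc ID order.
                    if (hit.score < minCompetitive) continue;
                    queue.poll();
                }
                queue.add(hit);
                if (queue.size() == n) {
                    minCompetitive = Math.nextUp(queue.peek().score);
                }
            }
        }
        // Return the collected doc IDs in ascending order for readability.
        List<Integer> docs = new ArrayList<>();
        for (Hit hit : queue) docs.add(hit.doc);
        Collections.sort(docs);
        return docs;
    }

    /** Two leaves with identical scores, top 2 requested. */
    static List<Integer> demo(boolean highIdsFirst) {
        List<Hit> low = Arrays.asList(new Hit(0, 1f), new Hit(1, 1f), new Hit(2, 1f));
        List<Hit> high = Arrays.asList(new Hit(3, 1f), new Hit(4, 1f), new Hit(5, 1f));
        List<List<Hit>> leaves = highIdsFirst ? Arrays.asList(high, low) : Arrays.asList(low, high);
        return collectShared(leaves, 2);
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // leaves in doc ID order: [0, 1]
        System.out.println(demo(true));  // high doc IDs first: [3, 4]
    }
}
```

With the leaves visited in doc ID order the result is docs 0 and 1, as expected. With the higher leaf visited first, docs 3 and 4 fill the queue and docs 0 and 1 are skipped despite having the same score.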

If every collector has its own priority queue, each one first fills its queue to N hits and only then sets the min competitive score.
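
A rough sketch of that alternative, again as a standalone simulation rather than Lucene code. SharedMinScore, collectSlice and search are hypothetical names. The sketch makes several assumptions of its own: scores are non-negative (so their int bit patterns order the same way as the floats), slices are processed one after another instead of on an executor, and the shared bound is the weakest score of a full queue rather than the next float, so that ties survive until the merge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation, not Lucene code.
class SharedMinScoreSketch {
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Weakest hit first: lowest score, and among equal scores the highest doc ID.
    static final Comparator<Hit> WEAKEST_FIRST = (a, b) ->
        a.score != b.score ? Float.compare(a.score, b.score) : Integer.compare(b.doc, a.doc);

    /** Shared bound kept as float bits; assumes scores are >= 0. */
    static final class SharedMinScore {
        private final AtomicInteger bits = new AtomicInteger(Float.floatToIntBits(0f));
        float get() { return Float.intBitsToFloat(bits.get()); }
        void raiseTo(float score) {
            // Only ever move the bound upwards, atomically.
            bits.accumulateAndGet(Float.floatToIntBits(score), Math::max);
        }
    }

    /** Per-slice collector: own queue, shared bound consulted only once the queue is full (n >= 1). */
    static PriorityQueue<Hit> collectSlice(List<Hit> slice, int n, SharedMinScore shared) {
        PriorityQueue<Hit> queue = new PriorityQueue<>(WEAKEST_FIRST);
        for (Hit hit : slice) {
            if (queue.size() == n) {
                // Skip hits below the shared bound, and local ties (docs are in order within a slice).
                if (hit.score < shared.get() || hit.score <= queue.peek().score) continue;
                queue.poll();
            }
            queue.add(hit);
            if (queue.size() == n) shared.raiseTo(queue.peek().score);
        }
        return queue;
    }

    /** Runs every slice, then merges by score descending and doc ID ascending. */
    static List<Integer> search(List<List<Hit>> slices, int n) {
        SharedMinScore shared = new SharedMinScore();
        List<Hit> candidates = new ArrayList<>();
        for (List<Hit> slice : slices) candidates.addAll(collectSlice(slice, n, shared));
        candidates.sort(WEAKEST_FIRST.reversed());
        List<Integer> docs = new ArrayList<>();
        for (Hit hit : candidates.subList(0, Math.min(n, candidates.size()))) docs.add(hit.doc);
        return docs;
    }

    static List<Integer> demo(boolean highIdsFirst) {
        List<Hit> low = Arrays.asList(new Hit(0, 1f), new Hit(1, 1f), new Hit(2, 1f));
        List<Hit> high = Arrays.asList(new Hit(3, 1f), new Hit(4, 1f), new Hit(5, 1f));
        List<List<Hit>> slices = highIdsFirst ? Arrays.asList(high, low) : Arrays.asList(low, high);
        return search(slices, 2);
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // [0, 1]
        System.out.println(demo(true));  // [0, 1]
    }
}
```

In this simulation the order in which slices run no longer changes the result: both orders return docs 0 and 1, because each slice fills its own queue before the shared bound can reject anything.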


> IndexSearcher#search(Query,int) should operate on a shared priority queue when configured with an executor
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8727
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8727
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> If IndexSearcher is configured with an executor, then the top docs for each slice are computed separately before being merged once the top docs for all slices are computed. With block-max WAND this is a bit of a waste of resources: it would be better if an increase of the min competitive score could help skip non-competitive hits on every slice and not just the current one.


