Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2022/03/30 14:20:00 UTC

[jira] [Commented] (LUCENE-10486) Avoid unnecessary overhead in TopScoreDoc and TopField collector manager

    [ https://issues.apache.org/jira/browse/LUCENE-10486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514733#comment-17514733 ] 

Adrien Grand commented on LUCENE-10486:
---------------------------------------

It would be nice if we could avoid this overhead when possible, but I don't like that it requires the user to pass this information to the collector manager; it feels wrong.

I don't remember whether the overhead of the shared manager has been benchmarked. Does it really make things slower than the serial one? Can we reduce its overhead?

> Avoid unnecessary overhead in TopScoreDoc and TopField collector manager
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-10486
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10486
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Luca Cavanna
>            Priority: Minor
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> TopScoreDocCollector and TopFieldCollector expose a createSharedManager method that returns a collector manager for concurrent search, which relies on a shared global counter for hit counting as well as a shared max score accumulator.
> As part of LUCENE-10002 we are going to deprecate the ability to search by providing a collector, in favour of using a corresponding collector manager. The above-mentioned shared collector managers are great for concurrent searches, yet they add overhead when the search is single-threaded, which can be the case even though a collector manager is used. That happens when no executor is set on the index searcher, or when there is only one slice to search.
> We could adapt the hits threshold checker as well as the max score accumulator depending on whether a search is effectively executed by multiple threads or not.
> An additional idea along the same lines could be to introduce a new hits threshold checker for the case when totalHitsThreshold is set to Integer.MAX_VALUE, which does no counting at all. This could be safely used in both the single-threaded and the concurrent scenario.
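
For illustration only, the two ideas in the description above could be sketched roughly as follows. This is not Lucene's actual code: ThresholdChecker, LocalChecker, SharedChecker, NoCountChecker and ThresholdCheckers.choose are hypothetical names, and the hasExecutor and sliceCount parameters are assumed stand-ins for whatever the index searcher knows about how the search will run.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical interface for this sketch; not Lucene's API.
interface ThresholdChecker {
  void incrementHitCount();
  boolean isThresholdReached();
}

// Plain counter: no synchronization, suitable only for a single-threaded search.
final class LocalChecker implements ThresholdChecker {
  private final int threshold;
  private int hits;

  LocalChecker(int threshold) {
    this.threshold = threshold;
  }

  @Override
  public void incrementHitCount() {
    hits++;
  }

  @Override
  public boolean isThresholdReached() {
    return hits > threshold;
  }
}

// Shared counter: one atomic counter updated by all slices of a concurrent search.
final class SharedChecker implements ThresholdChecker {
  private final int threshold;
  private final AtomicLong hits = new AtomicLong();

  SharedChecker(int threshold) {
    this.threshold = threshold;
  }

  @Override
  public void incrementHitCount() {
    hits.incrementAndGet();
  }

  @Override
  public boolean isThresholdReached() {
    return hits.get() > threshold;
  }
}

// No counting at all: assumes a threshold of Integer.MAX_VALUE is never reached.
final class NoCountChecker implements ThresholdChecker {
  @Override
  public void incrementHitCount() {
    // intentionally empty
  }

  @Override
  public boolean isThresholdReached() {
    return false;
  }
}

final class ThresholdCheckers {
  private ThresholdCheckers() {}

  // Picks the cheapest checker that is still correct for the given search setup.
  static ThresholdChecker choose(int totalHitsThreshold, boolean hasExecutor, int sliceCount) {
    if (totalHitsThreshold == Integer.MAX_VALUE) {
      return new NoCountChecker();
    }
    // Assumption: the search is effectively concurrent only with an executor and more than one slice.
    boolean concurrent = hasExecutor && sliceCount > 1;
    return concurrent
        ? new SharedChecker(totalHitsThreshold)
        : new LocalChecker(totalHitsThreshold);
  }
}
```

In this sketch the choice is made where the slice count is known rather than by the caller, which is one possible way to avoid asking the user to pass that information in.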



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
