Posted to issues@lucene.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2020/08/12 20:26:00 UTC

[jira] [Commented] (LUCENE-8319) A Time-limiting collector that works with CollectorManagers

    [ https://issues.apache.org/jira/browse/LUCENE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176585#comment-17176585 ] 

David Smiley commented on LUCENE-8319:
--------------------------------------

Note that there is also ExitableDirectoryReader, which seems to compete with TimeLimitingCollector.  IMO EDR is better because it also covers the earlier query rewrite phase, and TLC may have no advantages over it.  I'd rather see TLC removed.  Anyway, I bring this up because I'm not sure how EDR plays with concurrent search.  Maybe it works just fine, or maybe it has a parallel concern to the one in the proposal above.

> A Time-limiting collector that works with CollectorManagers
> -----------------------------------------------------------
>
>                 Key: LUCENE-8319
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8319
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Tony Xu
>            Priority: Minor
>
> Currently Lucene has *TimeLimitingCollector* to support time-bound collection, and it throws *TimeExceededException* when the timeout is reached. This only works nicely with the single-threaded low-level API of IndexSearcher. The method signature is --
> *void search(List<LeafReaderContext> leaves, Weight weight, Collector collector)*
> The intended use is to always enclose the searcher.search(query, collector) call in a try ... catch and handle the timeout exception. Unfortunately, when working with a *CollectorManager* in a multi-threaded search, a *TimeExceededException* thrown while collecting one leaf slice is re-thrown by *IndexSearcher* without calling *CollectorManager*'s reduce(), even if the other slices were collected successfully. The signature of the search API with *CollectorManager* is --
> *<C extends Collector, T> T search(Query query, CollectorManager<C, T> collectorManager)*
>  
> The good news is that IndexSearcher handles *CollectionTerminatedException* gracefully by ignoring it. We can either wrap TimeLimitingCollector and throw *CollectionTerminatedException* when the timeout happens, or simply replace *TimeExceededException* with *CollectionTerminatedException*. Either way, we also need to maintain a flag that indicates whether a timeout occurred so that the user knows it's a partial collection.
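
To make the proposal in the description concrete, here is a minimal sketch of the "wrap the collector, throw a terminating exception, keep a flag" idea. It deliberately does not use Lucene classes: TerminatedException, DocCollector, TimeLimitedCollector and searchSlices are hypothetical stand-ins for CollectionTerminatedException, a per-slice Collector, the proposed wrapper, and IndexSearcher's per-slice loop plus reduce(). The slices are processed sequentially and the clock is injected so the behaviour is deterministic; a real implementation would presumably run slices on an executor and share the flag across threads, which is why an AtomicBoolean is assumed here.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongSupplier;

class TimeoutSketch {
    /** Hypothetical stand-in for Lucene's CollectionTerminatedException. */
    static class TerminatedException extends RuntimeException {}

    /** Hypothetical stand-in for a per-slice collector. */
    interface DocCollector {
        void collect(int doc);
    }

    /** Wraps a collector; on timeout it records the fact and stops this slice only. */
    static class TimeLimitedCollector implements DocCollector {
        private final DocCollector delegate;
        private final LongSupplier clock;     // injected so tests can control time
        private final long deadline;          // same unit as the clock
        private final AtomicBoolean timedOut; // shared flag: "results are partial"

        TimeLimitedCollector(DocCollector delegate, LongSupplier clock,
                             long deadline, AtomicBoolean timedOut) {
            this.delegate = delegate;
            this.clock = clock;
            this.deadline = deadline;
            this.timedOut = timedOut;
        }

        @Override
        public void collect(int doc) {
            // Subtraction keeps the comparison safe if the clock wraps around.
            if (clock.getAsLong() - deadline >= 0) {
                timedOut.set(true);
                throw new TerminatedException();
            }
            delegate.collect(doc);
        }
    }

    /**
     * Mimics the searcher side: collect every slice, ignore TerminatedException,
     * and always "reduce" (here: sum the per-slice hit counts).
     */
    static int searchSlices(List<int[]> slices, LongSupplier clock,
                            long deadline, AtomicBoolean timedOut) {
        int total = 0;
        for (int[] slice : slices) {
            int[] count = {0};
            DocCollector collector =
                new TimeLimitedCollector(doc -> count[0]++, clock, deadline, timedOut);
            try {
                for (int doc : slice) {
                    collector.collect(doc);
                }
            } catch (TerminatedException e) {
                // Swallowed on purpose: hits collected so far are kept.
            }
            total += count[0]; // the reduce step still sees partial results
        }
        return total;
    }
}
```

The caller checks the flag after the search returns: if timedOut is set, the reduced result should be treated as a partial collection rather than as a complete one.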


