Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2018/07/26 17:54:00 UTC

[jira] [Commented] (LUCENE-8060) Enable top-docs collection optimizations by default

    [ https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558683#comment-16558683 ] 

Adrien Grand commented on LUCENE-8060:
--------------------------------------

{quote}why have a setDefaultNumTotalHitsToTrack(int) just for this concept, and not a setter for all the other collector concepts that we currently have defaults for in the simple search/searchAfter methods (like Sort sort, boolean doDocScores, boolean doMaxScore, etc.)
{quote}
Actually, some concepts such as the similarity and the query caching policy are already set as members of IndexSearcher, so this isn't really new. I think the assumption is that you most likely need the same values for most of your requests and do not need them to be configurable on a per-request basis, unlike the sort or the number of hits to collect.
{quote}do we want to go down the route of an IndexSearcherConfig ?
{quote}
A user suggested adding this class last year: LUCENE-7902. I don't have a strong opinion on this one, besides wanting to keep a simple IndexSearcher ctor that only takes a reader and has sensible defaults.
{quote}this seems like it introduces divergent "intermediate APIs" for users to learn about that might frustrate them down the road...
{quote}
This is a good point. I am also somewhat reluctant to add new setters/configuration options if we can come up with a default value that is reasonable and should work for most users, at least as long as their use-case remains simple. I'm seeing pros and cons either way and I would probably be fine with either.

Based on your comments, I am getting the feeling that you are leaning towards exposing this configuration option, having a sensible default, and pointing users to creating collectors manually if they have more specific needs. Do I get it right?
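
To make the trade-off concrete, here is a minimal, self-contained sketch of the two styles discussed above: a searcher-level default (in the spirit of the setDefaultNumTotalHitsToTrack(int) setter mentioned in the quote) next to an explicit per-request argument. Everything in it is hypothetical: SketchSearcher, Result and both search methods are stand-ins invented for illustration, not Lucene's IndexSearcher or its actual API, and the "index" is just an array of matching doc ids.

```java
/** Hypothetical stand-in for a searcher; this is NOT Lucene's IndexSearcher. */
final class SketchSearcher {

    /** Hit count plus a flag telling whether the count is exact or only a lower bound. */
    static final class Result {
        final int totalHits;
        final boolean exact;

        Result(int totalHits, boolean exact) {
            this.totalHits = totalHits;
            this.exact = exact;
        }
    }

    // Assumption for the sketch: every query matches exactly these doc ids.
    private final int[] matchingDocs;

    // Searcher-level default: count every hit unless configured otherwise.
    private int defaultNumTotalHitsToTrack = Integer.MAX_VALUE;

    /** Simple constructor with a sensible default, as argued for in the comment. */
    SketchSearcher(int[] matchingDocs) {
        this.matchingDocs = matchingDocs.clone();
    }

    /** Hypothetical setter, named after the one mentioned in the quoted question. */
    void setDefaultNumTotalHitsToTrack(int numTotalHitsToTrack) {
        if (numTotalHitsToTrack < 0) {
            throw new IllegalArgumentException("numTotalHitsToTrack must be >= 0");
        }
        this.defaultNumTotalHitsToTrack = numTotalHitsToTrack;
    }

    /** Simple entry point: relies on the searcher-level default. */
    Result search() {
        return search(defaultNumTotalHitsToTrack);
    }

    /** Per-request variant: the caller states how many hits must be counted exactly. */
    Result search(int numTotalHitsToTrack) {
        int counted = 0;
        for (int i = 0; i < matchingDocs.length; i++) {
            if (counted >= numTotalHitsToTrack) {
                // Stop counting early; the count is now only a lower bound.
                return new Result(counted, false);
            }
            counted++;
        }
        return new Result(counted, true);
    }
}
```

With this shape, callers who want the same behaviour for every request configure the searcher once, while callers with more specific needs pass the threshold explicitly, which loosely mirrors the option of building collectors manually.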

> Enable top-docs collection optimizations by default
> ---------------------------------------------------
>
>                 Key: LUCENE-8060
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8060
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>             Fix For: master (8.0)
>
>
> We are getting optimizations when hit counts are not required (sorted indexes, MAXSCORE, short-circuiting of phrase queries) but our users won't benefit from them unless we disable exact hit counts by default or we require them to tell us whether hit counts are required.
> I think making hit counts approximate by default is going to be a bit trappy, so I'm rather leaning towards requiring users to tell us explicitly whether they need total hit counts. I can think of two ways to do that: either by passing a boolean to the IndexSearcher constructor or by adding a boolean to all methods that produce TopDocs instances. I like the latter better but I'm open to discussion or other ideas?


