Posted to dev@lucene.apache.org by Kyle Maxwell <ky...@reddit.com.INVALID> on 2020/02/13 17:29:08 UTC

Custom Solr Collector

Hi,
Looking to see if there's any appetite for either:

1. Allowing custom collectors as Solr Plugins, or
2. Taking a patch on TimeLimitingCollector to allow it to be doc-limited as
well.

Motivation:
https://medium.com/@kyle.c.maxwell/some-lucene-tuning-t-45d82a9dfd83

TimeLimitingCollector Patch:
https://github.com/fizx/lucene-solr-1/pull/1/files

Which approach might people prefer?  I'm happy to do the legwork, but
wanted to check in first.

Thanks,
Kyle

Re: Custom Solr Collector

Posted by Kyle Maxwell <ky...@reddit.com.INVALID>.
You understand the min-visited part.

I don’t think EarlyTerminatingSortingCollector is exactly what we want, because we don’t want to sort the query results; we just want to scan the index in roughly sorted order and score normally.

Thankfully, many of the features I’ve written custom collectors for over the years have since made it into Solr, so fully arbitrary collector configuration may be overkill.

> On Feb 13, 2020, at 11:42 AM, Tomás Fernández Löbbe <to...@gmail.com> wrote:
> 
> 
> Hi Kyle,
> For #2, I understand you need this because you want "min-visited-docs", right? Because, for max you could use EarlyTerminatingSortingCollector? (or Lucene's "HitsThresholdChecker", but I don't know if Solr has support for this yet). The "min-visited" would override the "timeAllowed", so even if the collection should expire based on time, you'd let it continue until something hits, is that the idea?

Re: Custom Solr Collector

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
Hi Kyle,
For #2, I understand you need this because you want "min-visited-docs",
right? Because, for max you could use EarlyTerminatingSortingCollector? (or
Lucene's "HitsThresholdChecker", but I don't know if Solr has support for
this yet). The "min-visited" would override the "timeAllowed", so even if
the collection should expire based on time, you'd let it continue until
something hits, is that the idea?
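
For readers following along, here is a minimal sketch of one possible reading of the semantics discussed above: a wrapper that stops once a maximum number of docs has been visited, and that honors the time limit only after a minimum number of docs has been visited. It is not taken from the linked patch and does not use Lucene's or Solr's APIs. The names DocSink, StopCollecting, TimeAndDocLimitedSink, and LimitDemo are hypothetical, and the clock is injected so the behavior can be checked with a fake clock.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a collector; this is NOT Lucene's Collector API. */
interface DocSink {
    void collect(int docId);
}

/** Thrown to stop a scan early (assumption: termination is signaled by an exception). */
class StopCollecting extends RuntimeException {
    StopCollecting(String reason) {
        super(reason);
    }
}

/** Wraps a sink with a time limit plus min/max visited-doc limits (illustrative only). */
class TimeAndDocLimitedSink implements DocSink {
    private final DocSink delegate;
    private final LongSupplier clock;   // e.g. System::nanoTime in real use
    private final long deadline;        // clock value at which the time limit expires
    private final int minDocs;          // the time limit is ignored until this many docs are visited
    private final int maxDocs;          // hard cap on visited docs
    private int visited = 0;

    TimeAndDocLimitedSink(DocSink delegate, LongSupplier clock,
                          long timeAllowed, int minDocs, int maxDocs) {
        this.delegate = delegate;
        this.clock = clock;
        this.deadline = clock.getAsLong() + timeAllowed;
        this.minDocs = minDocs;
        this.maxDocs = maxDocs;
    }

    @Override
    public void collect(int docId) {
        if (visited >= maxDocs) {
            throw new StopCollecting("max docs visited");
        }
        // Subtract before comparing so the check stays correct if the clock wraps around.
        boolean timedOut = clock.getAsLong() - deadline >= 0;
        if (timedOut && visited >= minDocs) {
            throw new StopCollecting("time allowed exceeded");
        }
        delegate.collect(docId);
        visited++;
    }

    int visited() {
        return visited;
    }
}

class LimitDemo {
    /** Feeds up to totalDocs doc ids through the wrapper, using a fake clock that ticks once per reading. */
    static int scan(long timeAllowedTicks, int minDocs, int maxDocs, int totalDocs) {
        long[] now = {0};
        TimeAndDocLimitedSink sink = new TimeAndDocLimitedSink(
                docId -> { }, () -> now[0]++, timeAllowedTicks, minDocs, maxDocs);
        try {
            for (int doc = 0; doc < totalDocs; doc++) {
                sink.collect(doc);
            }
        } catch (StopCollecting e) {
            // Early termination is the expected way for the scan to end.
        }
        return sink.visited();
    }

    public static void main(String[] args) {
        System.out.println(scan(3, 5, 100, 50));         // prints 5: time expires early, min-visited keeps it going
        System.out.println(scan(1_000_000, 0, 3, 50));   // prints 3: the max-docs cap ends the scan
    }
}
```

In this sketch the min-visited check takes priority over the time limit, which matches the "override timeAllowed" reading in the question above. Whether the actual patch behaves this way is an assumption.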
