Posted to solr-user@lucene.apache.org by Jae Joo <ja...@gmail.com> on 2020/12/03 16:55:12 UTC

Facet to part of search results

Is there any way to apply faceting to only part of a search result?
For example, a query for "dog" returns 10M documents and we would like to facet on only the first 10K.
Is that possible?

Jae

Re: Facet to part of search results

Posted by Andy Webb <an...@gmail.com>.
I wonder if you could increase the precision of your result set to reduce
its size? If you have 10M results for a query but only the first 10K
deserve to be represented by the faceting, what is it about those 10K that
makes them better than the other 9.99M? For example if some items are
boosted by some attribute(s) to get higher scores, can you filter out items
that don't have those attributes? Also, maybe setting mm to require more
terms to match could cut out unwanted results (that's not useful for the
"dog" query of course).
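
A minimal sketch of those two ideas as Solr request parameters, in Python. It assumes the edismax query parser is in use (with a default field or qf configured elsewhere); the field names "featured" and "breed" and the helper name are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def narrowed_facet_params(query, facet_field, required_filter=None, mm=None):
    """Build Solr parameters that shrink the result set before faceting.

    required_filter: an fq clause keeping only documents that carry the
                     attribute used for boosting (hypothetical field).
    mm:              edismax "minimum should match", e.g. "100%".
    """
    params = {
        "q": query,
        "defType": "edismax",
        "rows": 0,            # only the facet counts are needed here
        "facet": "true",
        "facet.field": facet_field,
    }
    if required_filter:
        params["fq"] = required_filter
    if mm:
        params["mm"] = mm
    return params


# Hypothetical multi-term query; mm has no effect on a one-word query like "dog".
params = narrowed_facet_params("dog training", "breed",
                               required_filter="featured:true", mm="100%")
print(urlencode(params))
```

The facet counts stay exact for whatever the narrowed query matches, so this only helps if the filter really captures what makes the top results the interesting ones.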

Andy

On Fri, 4 Dec 2020 at 06:43, Radu Gheorghe <ra...@sematext.com>
wrote:

>
> > On 3 Dec 2020, at 20:18, Shawn Heisey <ap...@elyograg.org> wrote:
> >
> > On 12/3/2020 9:55 AM, Jae Joo wrote:
> >> Is there any way to apply facet to the partial search result?
> >> For ex, we have 10m return by "dog" and like to apply facet to first 10K.
> >> Possible?
> >
> > The point of facets is to provide accurate numbers.
> >
> > What would it mean to only apply to the first 10K?  If there are 10 million documents in the query results that contain "dog" then the facet should say 10 million, not 10K.  I do not understand what you're trying to do.
> >
>
> Maybe sampling? I’m not aware of a built-in way to do that. But you could
> index a random float between, say 0 and 100 and then filter out a sample by
> filtering for number<X to keep only X% of the data set.
>
> Or maybe you’d think that faceting on 10K would be enough (e.g. if you
> don’t need the numbers, just some unique values). But I really don’t see a
> good solution to that - you’d have to terminateEarly and do faceting
> somehow…
>
> Best regards,
> Radu
>
>

Re: Facet to part of search results

Posted by Radu Gheorghe <ra...@sematext.com>.
> On 3 Dec 2020, at 20:18, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> On 12/3/2020 9:55 AM, Jae Joo wrote:
>> Is there any way to apply facet to the partial search result?
>> For ex, we have 10m return by "dog" and like to apply facet to first 10K.
>> Possible?
> 
> The point of facets is to provide accurate numbers.
> 
> What would it mean to only apply to the first 10K?  If there are 10 million documents in the query results that contain "dog" then the facet should say 10 million, not 10K.  I do not understand what you're trying to do.
> 

Maybe sampling? I’m not aware of a built-in way to do that. But you could index a random float between, say, 0 and 100 in each document and then take a sample by filtering for number<X, which keeps roughly X% of the data set.
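
A rough sketch of that sampling idea in Python, under stated assumptions: "sample_f" is a hypothetical float field populated at index time, "breed" is a hypothetical facet field, and the helper names are made up for illustration. Scaling the sampled counts back up gives estimates only, not exact numbers.

```python
import random
from urllib.parse import urlencode


def with_sample_value(doc, sample_field="sample_f"):
    """Return a copy of doc with a random float in [0, 100) added at index time."""
    tagged = dict(doc)
    tagged[sample_field] = random.random() * 100
    return tagged


def sampled_facet_params(query, facet_field, sample_pct, sample_field="sample_f"):
    """Build Solr parameters that facet over roughly sample_pct% of the matches."""
    if not 0 < sample_pct <= 100:
        raise ValueError("sample_pct must be in (0, 100]")
    return {
        "q": query,
        # Range filter: lower bound inclusive, upper bound exclusive.
        "fq": f"{sample_field}:[0 TO {sample_pct}}}",
        "rows": 0,
        "facet": "true",
        "facet.field": facet_field,
    }


def estimate_full_counts(sample_counts, sample_pct):
    """Scale sampled facet counts up to estimate counts over all matches."""
    factor = 100 / sample_pct
    return {value: round(count * factor) for value, count in sample_counts.items()}


params = sampled_facet_params("dog", "breed", 10)
print(urlencode(params))
```

Because the random value is fixed per document, the same X always selects the same sample; reindexing with fresh values would be needed to draw a different one.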

Or maybe you’d think that faceting on 10K would be enough (e.g. if you don’t need the numbers, just some unique values). But I really don’t see a good solution to that - you’d have to terminateEarly and do faceting somehow…

Best regards,
Radu


Re: Facet to part of search results

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/3/2020 9:55 AM, Jae Joo wrote:
> Is there any way to apply facet to the partial search result?
> For ex, we have 10m return by "dog" and like to apply facet to first 10K.
> Possible?

The point of facets is to provide accurate numbers.

What would it mean to only apply to the first 10K?  If there are 10 
million documents in the query results that contain "dog" then the facet 
should say 10 million, not 10K.  I do not understand what you're trying 
to do.

Shawn