Posted to solr-user@lucene.apache.org by David Lukowski <da...@gmail.com> on 2020/05/11 19:35:47 UTC

Limiting random results set with facets.

I'm looking for a way, if possible, to run a query with random results where
I limit the number of results I get back, yet still have the facets
accurately reflect that limited set of results.

When I run a search, I use a filter query to randomize the results based on
a modulo of a random seed. This returns a result set with the associated
facets for each documentType.

"response":{"numFound":377895,"start":0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "documentType":[
        "78",374015,
        "3",3021,
        "2",736,
        "1",41,
        "34",41,
        "35",32,
        "72",8,
        "7",1]},

How do I limit the number of results returned to N and have the facets
accurately reflect only those N messages?  I cannot simply say rows=N
because the facets will always reflect the total numFound and not the
limited result set I'm looking for.

Re: Limiting random results set with facets.

Posted by David Lukowski <da...@gmail.com>.
Thanks Srijan, two queries is exactly the route I started down today.

Query 1:
http://mysolr-node:8080/solr/M2_content/select
?q=({!terms f='permissionFilterId'}10,49 AND docBody:(lucky))
&start=0
&rows=100
&fq=channelId:(2 1 3 78 34 35 7 72)
&fq=date:([* TO 2020-05-12T03:59:59.999Z])
&hl=false
&fl=id
&wt=json
&sort=random_123456 desc


Query 2:
http://mysolr-node:8080/solr/M2_content/select
?q=id:(12345 2345 3456 4567...<ids returned from Query 1>)
&start=0
&rows=30
&facet=true
&facet.field=channelId
&f.channelId.facet.limit=10
&f.channelId.facet.mincount=1
&hl=false
&fl=id, text, users
&wt=json
&sort=date desc

Working well so far, but still not ideal.

Thanks for the assist,

David
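
One possible way to avoid the second round trip (a sketch only, not something
proposed in the thread): if Query 1 also returned the facet field, for example
fl=id,channelId, the counts for the 100 sampled docs could be tallied on the
client. This assumes channelId is retrievable in the response (stored or
docValues); the function name and the sample docs below are hypothetical.

```python
from collections import Counter


def facet_counts_from_docs(docs, field):
    """Count values of `field` across the returned docs, most common first."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # doc has no value for this field
        # Multi-valued fields come back as lists in Solr's JSON response.
        values = value if isinstance(value, list) else [value]
        counts.update(str(v) for v in values)
    return counts.most_common()


# Hypothetical "docs" array, shaped like response["response"]["docs"]
# from Query 1 when fl=id,channelId is requested.
sample_docs = [
    {"id": "12345", "channelId": 78},
    {"id": "2345", "channelId": 78},
    {"id": "3456", "channelId": 3},
    {"id": "4567", "channelId": 2},
]

for value, count in facet_counts_from_docs(sample_docs, "channelId"):
    print(value, count)
```

The trade-off is that the client takes over the counting, which is only
practical while N stays small.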

On Tue, May 12, 2020 at 7:31 PM Srijan <sh...@gmail.com> wrote:

> I see what you mean now. You could use two queries - first would return 100
> randomly sorted docs (no faceting) and the second with fq that includes the
> ids of the returned 100 docs + faceting.

Re: Limiting random results set with facets.

Posted by Srijan <sh...@gmail.com>.
I see what you mean now. You could use two queries: the first returns 100
randomly sorted docs (no faceting), and the second uses an fq that includes
the ids of those 100 docs, plus faceting.
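
The two-query idea could be scripted roughly as follows. This is a minimal
sketch that only builds the request parameters; the function names are
hypothetical, "id" is assumed to be the unique key field, and the second
query uses the {!terms} parser for the id filter as one possible way to pass
a long id list.

```python
from urllib.parse import urlencode


def random_sample_params(base_q, filters, seed, rows=100):
    """Query 1: `rows` randomly sorted ids, no faceting."""
    params = [
        ("q", base_q),
        ("start", 0),
        ("rows", rows),
        ("fl", "id"),
        ("wt", "json"),
        # Assumes a random_* dynamic field is defined for random sorting.
        ("sort", f"random_{seed} desc"),
    ]
    params += [("fq", fq) for fq in filters]
    return params


def facet_on_ids_params(ids, facet_field, rows=30):
    """Query 2: restrict to the sampled ids with an fq and facet on them."""
    id_filter = "{!terms f=id}" + ",".join(str(i) for i in ids)
    return [
        ("q", "*:*"),
        ("fq", id_filter),
        ("rows", rows),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.mincount", 1),
        ("wt", "json"),
    ]


query1 = random_sample_params("docBody:(lucky)", ["channelId:(2 1 3)"], 123456)
query2 = facet_on_ids_params([12345, 2345, 3456], "channelId")
print(urlencode(query1))
print(urlencode(query2))
```

The ids passed to the second function would come from the "docs" array of the
first response.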

On Tue, May 12, 2020 at 1:29 PM David Lukowski <da...@gmail.com>
wrote:

> Thanks for the offer of help, this doesn't really seem like what I'm
> looking for though, but I could be misunderstanding.  I'll try to state it
> more clearly and include the query.

Re: Limiting random results set with facets.

Posted by David Lukowski <da...@gmail.com>.
Thanks for the offer of help. This doesn't really seem like what I'm
looking for, though I could be misunderstanding.  I'll try to state it
more clearly and include the query.


-- This will give me back all the documents that have "lucky" in them in
RANDOM sorted order.

http://mysolr-node:8080/solr/M2_content/select
?q=({!terms f='permissionFilterId'}10,49 AND docBody:(lucky))
&start=0
&rows=0
&fq=channelId:(2 1 3 78 34 35 7 72)
&fq=date:([* TO 2020-05-12T03:59:59.999Z])
&facet=true
&facet.field=channelId
&f.channelId.facet.limit=10
&f.channelId.facet.mincount=1
&hl=false
&fl=id
&wt=json
&sort=random_123456 desc

The issue is that I only want 100 random results.  Sure, I could limit
the results returned to the first 100 by specifying &rows=100, but the
facets would match the totals for the whole query, not for the rows returned.

RESULTS I HAVE:
"response":{"numFound":377895,"start":0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "documentType":[
        "78",374015,
        "3",3021,
        "2",736,
        "1",41,
        "34",41,
        "35",32,
        "72",8,
        "7",1]},


RESULTS I WANT:
"response":{"numFound":100,"start":0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "documentType":[
        "78",68,
        "3",22,
        "2",10]},

How would I formulate the above query to give me a specific number of
random results with the correct facet counts?

Thanks for looking,
David
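
To make the difference between the two result shapes concrete, here is a
small offline simulation (no Solr involved, purely illustrative). It takes the
full facet counts from "RESULTS I HAVE", draws a uniform random sample of 100
documents, and counts the facet values within that sample only. The function
name and the seed are hypothetical.

```python
import random
from collections import Counter

# Facet counts over the whole match set, copied from "RESULTS I HAVE".
FULL_FACETS = {
    "78": 374015, "3": 3021, "2": 736, "1": 41,
    "34": 41, "35": 32, "72": 8, "7": 1,
}


def sample_facets(full_counts, n, seed):
    """Draw n docs uniformly at random and count facet values in the sample."""
    # One list entry per matching document, labelled with its facet value.
    population = [
        value for value, count in full_counts.items() for _ in range(count)
    ]
    rng = random.Random(seed)  # seeded so the draw is repeatable
    return Counter(rng.sample(population, n))


counts = sample_facets(FULL_FACETS, 100, seed=123456)
print(sum(FULL_FACETS.values()))  # size of the full match set
print(sum(counts.values()))       # size of the sample
print(counts.most_common())
```

The per-sample counts always add up to N, which is the property the desired
output has. With these particular totals, a uniform sample would be expected
to consist almost entirely of type 78, since it makes up about 99% of the
matches.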

On Mon, May 11, 2020 at 2:09 PM Srijan <sh...@gmail.com> wrote:

> If you can tag your filter query, you can exclude it when faceting. Your
> results will honor the filter query and you will get the N results back,
> and since faceting will exclude the filter, it will still give you facet
> count for the base query.
>
>
> https://lucene.apache.org/solr/guide/8_5/faceting.html#tagging-and-excluding-filters

Re: Limiting random results set with facets.

Posted by Srijan <sh...@gmail.com>.
If you can tag your filter query, you can exclude it when faceting. Your
results will honor the filter query and you will get the N results back,
and since faceting will exclude the filter, it will still give you facet
counts for the base query.

https://lucene.apache.org/solr/guide/8_5/faceting.html#tagging-and-excluding-filters
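
As a rough illustration of the tag/exclude syntax described on that page, the
parameters might be built like this. It is a sketch under assumptions: the tag
name "rnd", the function name, and the example filter are hypothetical. Note
that, as stated above, the facet counts produced this way cover the base query
with the tagged filter ignored.

```python
from urllib.parse import urlencode


def tagged_filter_params(base_q, filter_query, facet_field, tag="rnd", rows=100):
    """Tag a filter query and exclude that tag when computing the facet."""
    return [
        ("q", base_q),
        ("rows", rows),
        # {!tag=...} labels this filter so facets can refer to it.
        ("fq", f"{{!tag={tag}}}{filter_query}"),
        ("facet", "true"),
        # {!ex=...} computes this facet as if the tagged filter were absent.
        ("facet.field", f"{{!ex={tag}}}{facet_field}"),
        ("wt", "json"),
    ]


params = tagged_filter_params("docBody:(lucky)", "documentType:78", "documentType")
print(urlencode(params))
```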

