Posted to users@solr.apache.org by Fiz N <fi...@gmail.com> on 2022/04/11 07:06:21 UTC

Select Random set of data in SOLR

Hi SOLR experts,

In my current project, we have a requirement to select a random set of
N rows from across a result set (unsorted). I have already tried the
options below, but neither was fruitful:



   1. *Providing the start parameter in the query*:

Since we have millions of documents indexed in SOLR, this method is not
usable with very high values of the start parameter: it takes a lot of
memory and sometimes causes an OOM as well.



   2. *Using the cursorMark parameter in the query:*

This method works well compared to the start parameter, but the catch is
that it first sorts the result set based on the sort logic we pass and then
traverses it. Our requirement does not need sorting. We just need a
randomized selection of docs across the result set.

So, can you please let me know whether SOLR has any capability to handle
this requirement, or whether there is a trusted plugin or third-party tool
that does the same?



Thanks

Fiz
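
For reference, the cursorMark traversal described in option 2 can be sketched as below. This is only an illustration under stated assumptions: `fetch` is a hypothetical callable (not part of any Solr client) that sends the given parameters to a /select handler and returns the parsed JSON response, and the sort passed in is assumed to end with the collection's uniqueKey field, which cursorMark requires.

```python
from typing import Any, Callable, Dict, Iterator


def iterate_with_cursor(
    fetch: Callable[[Dict[str, Any]], Dict[str, Any]],
    query: str,
    sort: str,
    rows: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every matching document by following nextCursorMark.

    `fetch` is a hypothetical helper that performs the HTTP request and
    returns the decoded JSON body of the Solr response.
    """
    cursor = "*"  # "*" starts a new cursor
    while True:
        params = {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        response = fetch(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        # Solr signals the end by returning the same cursorMark it was given.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

Memory use stays flat regardless of how deep the traversal goes, but, as noted above, the documents still come back in sorted order rather than at random.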

Re: Select Random set of data in SOLR

Posted by Fiz N <fi...@gmail.com>.
Thanks much, Mikhail. I will go through the URLs below.

Regards
Fiz Fareedh.

On Mon, Apr 11, 2022 at 1:22 PM Mikhail Khludnev <mk...@apache.org> wrote:

> Hi, Fiz.
> Here's an old pointer on an index-time approach:
>
> https://solr.pl/en/2013/04/02/random-documents-from-result-set-giveaway-results/
>
> Also take a look at
>
> https://solr.apache.org/guide/7_6/other-parsers.html#function-range-query-parser
> which lets you cut a certain range out of numeric values.
> Also, https://solr.apache.org/guide/7_6/function-queries.html#ord-function
> lets you turn string field values into numbers. Also check the scale()
> function. Unfortunately, there's no remainder (%) function, which would be
> useful for pseudo-random ordering. However, one can use sqedist, scale +
> frange to toss values somewhat randomly.
>
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>

Re: Select Random set of data in SOLR

Posted by Mikhail Khludnev <mk...@apache.org>.
Hi, Fiz.
Here's an old pointer on an index-time approach:
https://solr.pl/en/2013/04/02/random-documents-from-result-set-giveaway-results/

Also take a look at
https://solr.apache.org/guide/7_6/other-parsers.html#function-range-query-parser
which lets you cut a certain range out of numeric values.
Also, https://solr.apache.org/guide/7_6/function-queries.html#ord-function
lets you turn string field values into numbers. Also check the scale()
function. Unfortunately, there's no remainder (%) function, which would be
useful for pseudo-random ordering. However, one can use sqedist, scale +
frange to toss values somewhat randomly.




-- 
Sincerely yours
Mikhail Khludnev
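
One way the index-time idea and the frange parser might be combined is sketched below. It is not taken from the linked article. It assumes each document was indexed with a hypothetical float field named `rand_f` holding a uniform random value in [0, 1); an frange filter over a randomly placed window of that field then selects roughly the requested fraction of the matching documents.

```python
import random
from typing import Any, Dict


def frange_filter(field: str, low: float, high: float) -> str:
    """Build an frange filter matching low <= field < high."""
    # incu=false excludes the upper bound so adjacent windows do not overlap.
    return f"{{!frange l={low:.6f} u={high:.6f} incu=false}}{field}"


def sample_params(
    query: str, fraction: float, rows: int, rng: Any = random
) -> Dict[str, Any]:
    """Return query parameters selecting a random slice of the results.

    `rand_f` is a hypothetical field; it must be populated with uniform
    random values at index time for the slice to be a fair sample.
    """
    low = rng.uniform(0.0, 1.0 - fraction)  # random window start
    return {
        "q": query,
        "fq": frange_filter("rand_f", low, low + fraction),
        "rows": rows,
    }
```

Because the window moves on every call, repeated requests return different slices without any deep paging. The sample size is only approximate, since it depends on how many matching documents fall inside the window.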

Re: Select Random set of data in SOLR

Posted by Thomas Corthals <th...@klascement.net>.
Hi Fiz,

Does RandomSortField suit your needs?

https://solr.apache.org/docs/8_11_1/solr-core/org/apache/solr/schema/RandomSortField.html

Thomas
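
A minimal sketch of what a RandomSortField query could look like follows. It assumes the schema declares a field type backed by solr.RandomSortField together with a dynamic field pattern such as `random_*`; the base URL and the collection name are placeholders. Sorting on `random_<seed>` gives a pseudo-random order that stays stable for a given seed and index version, so changing the seed yields a different sample.

```python
from urllib.parse import urlencode


def random_sample_url(
    base_url: str, collection: str, query: str, rows: int, seed: int
) -> str:
    """Build a /select URL that returns `rows` docs in pseudo-random order.

    Assumes a dynamic field `random_*` of a RandomSortField-based type
    exists in the schema; the seed is embedded in the field name.
    """
    params = {
        "q": query,
        "rows": rows,
        "sort": f"random_{seed} asc",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and collection, for illustration only.
    print(random_sample_url("http://localhost:8983/solr", "mycollection", "*:*", 10, 42))
```

Since only the first `rows` documents of the shuffled order are requested, no large start offset is involved.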
