Posted to solr-user@lucene.apache.org by Sean Laval <se...@hotmail.com> on 2008/07/07 17:58:40 UTC

implementing a random result request handler - solr 1.2

I have seen various posts about implementing random sorting relating to the 1.3 code base but I am trying to do this in 1.2. Does anyone have any suggestions? The approach I have considered is to implement my own request handler that picks random documents from a larger result list. I therefore need to be able to create a DocList and add documents to it but can't seem to do this. Does anyone have any advice they could offer please?

Regards,

Sean

Re: implementing a random result request handler - solr 1.2

Posted by Sean Laval <se...@hotmail.com>.
Perhaps I can be clearer about my requirement: I need to populate a
portlet with a random list of documents of a specified size from the index,
that's all.

Thanks, and any help would be greatly appreciated.

--------------------------------------------------
From: "Walter Underwood" <wu...@netflix.com>
Sent: Monday, July 07, 2008 5:06 PM
To: <so...@lucene.apache.org>
Subject: Re: implementing a random result request handler - solr 1.2

> Why do you want random hits? If we know more about the bigger
> problem, we can probably make better suggestions.
>
> Fundamentally, Lucene is designed to quickly return the best
> hits for a query. Returning random hits from the entire
> matched set is likely to be very slow. It just isn't what
> Lucene is designed to do.
>
> wunder

Re: implementing a random result request handler - solr 1.2

Posted by Ryan McKinley <ry...@gmail.com>.
The random sort field in Solr 1.3 relies on the field name and dynamic
fields for ordering.  Check the example schema.xml in 1.3:

    <dynamicField name="random*" type="random" />

To get random results, try various field names that match that pattern:
  &sort=random_123 asc
  &sort=random_xyz asc
  &sort=random_{generate your random number on the client} asc

This is good because you will get the same results for the same query
string (as long as the index has not changed), and a new set of random
results for a new URL.
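
A minimal client-side sketch of this idea, in Python. It only builds the
request parameters and does not talk to Solr. The function name
random_sort_params, the seed range, and the query "ipod" are illustrative
assumptions; the field name has to match the random* dynamicField pattern
declared in the schema.

```python
import random
from urllib.parse import urlencode


def random_sort_params(query, rows, seed=None, prefix="random_"):
    """Build Solr request parameters that sort on a randomly named field.

    The prefix is assumed to match a dynamicField of type RandomSortField
    (for example name="random*"). Reusing a seed reuses the field name,
    which should reproduce the same ordering for an unchanged index.
    """
    if seed is None:
        # Hypothetical seed range; any value that yields a valid
        # field name would do.
        seed = random.randrange(1_000_000)
    return {"q": query, "rows": rows, "sort": f"{prefix}{seed} asc"}


# Fixed seed: the same URL, and so the same ordering, on every call.
params = random_sort_params("ipod", rows=10, seed=123)
query_string = urlencode(params)
```

Passing no seed gives a fresh field name per call, which is what the
portlet case needs; passing a fixed seed keeps paging stable within one
user session.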

ryan


On Jul 7, 2008, at 1:40 PM, Sean Laval wrote:
> The RandomSortField in 1.3.... each time you then issue a query, you  
> get the same random sort order right? That is to say the randomness  
> is implemented at index time rather than search time?
>
> Thanks,


Re: implementing a random result request handler - solr 1.2

Posted by Yonik Seeley <yo...@apache.org>.
On Mon, Jul 7, 2008 at 1:40 PM, Sean Laval <se...@hotmail.com> wrote:
> The RandomSortField in 1.3.... each time you then issue a query, you get the
> same random sort order right? That is to say the randomness is implemented
> at index time rather than search time?

See the comment in the example schema:

    <!-- The "RandomSortField" is not used to store or search any
         data.  You can declare fields of this type in your schema
         to generate pseudo-random orderings of your docs for sorting
         purposes.  The ordering is generated based on the field name
         and the version of the index. As long as the index version
         remains unchanged, and the same field name is reused,
         the ordering of the docs will be consistent.
         If you want different pseudo-random orderings of documents,
         for the same version of the index, use a dynamicField and
         change the name.
     -->
    <fieldType name="random" class="solr.RandomSortField" indexed="true" />

-Yonik
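
To illustrate the behaviour that schema comment describes, here is a small
Python sketch of an ordering derived from the field name and the index
version. This is not Solr's implementation: the SHA-1 based key and the
function name pseudo_random_order are assumptions chosen only to show why
the order is stable for a given (field name, index version) pair and
changes when either one changes.

```python
import hashlib


def pseudo_random_order(doc_ids, field_name, index_version):
    """Return doc_ids in a repeatable, pseudo-random-looking order.

    The sort key for each document depends on the field name, the index
    version and the document id, so the same inputs always give the same
    order. Illustration only; Solr uses its own seeding scheme.
    """
    def key(doc_id):
        data = f"{field_name}:{index_version}:{doc_id}".encode("utf-8")
        return hashlib.sha1(data).hexdigest()

    return sorted(doc_ids, key=key)


docs = list(range(50))
first = pseudo_random_order(docs, "random_123", index_version=7)
again = pseudo_random_order(docs, "random_123", index_version=7)
other = pseudo_random_order(docs, "random_xyz", index_version=7)
```

Here first and again are identical, while other is a different shuffle of
the same documents. So the randomness is fixed neither at index time nor
per request; it follows from the field name you sort on and the current
index version.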

Re: implementing a random result request handler - solr 1.2

Posted by Sean Laval <se...@hotmail.com>.
The RandomSortField in 1.3.... each time you then issue a query, you get the 
same random sort order right? That is to say the randomness is implemented 
at index time rather than search time?

Thanks,

--------------------------------------------------
From: "Yonik Seeley" <yo...@apache.org>
Sent: Monday, July 07, 2008 6:22 PM
To: <so...@lucene.apache.org>
Subject: Re: implementing a random result request handler - solr 1.2

> If it's just a random ordering you are looking for, it's implemented
> in the latest Solr 1.3.
> Solr 1.3 should be out soon, so if you are just starting development,
> I'd start with the latest Solr version.
>
> If you really need to stick with 1.2 (even after 1.3 is out?), then
> RandomSortField should be easy to backport to 1.2.
>
> -Yonik

Re: implementing a random result request handler - solr 1.2

Posted by Yonik Seeley <yo...@apache.org>.
If it's just a random ordering you are looking for, it's implemented
in the latest Solr 1.3.
Solr 1.3 should be out soon, so if you are just starting development,
I'd start with the latest Solr version.

If you really need to stick with 1.2 (even after 1.3 is out?), then
RandomSortField should be easy to backport to 1.2.

-Yonik

On Mon, Jul 7, 2008 at 1:15 PM, Sean Laval <se...@hotmail.com> wrote:
> Well, it's simply a business requirement from my perspective. I am not sure I
> can say more than that. I could maybe implement a request handler that did
> an initial search to work out how many hits result from the query, and then
> ran as many further queries as required, each fetching just 1 document
> starting at a random offset. Would that work? Sounds a bit kludgy to me
> even as I say it.
>
> Sean

Re: implementing a random result request handler - solr 1.2

Posted by Walter Underwood <wu...@netflix.com>.
Is it a business requirement that this is fast? If so, you are
going to spend a lot of money on hardware. Might want to request
that the business people think again about their requirements.

Here is one way to do this, using the simplest Solr/Lucene features.
An implementation internal to Solr would probably look similar, but
might not be much faster, especially if the processing is dominated
by disk accesses.

1. Do the query, requesting 100 hits, using the default sort, and
only return the ID field.

2. Randomly sample that set of 100, adding the chosen IDs to a set,
then request the next 100.

3. Stop when you have a big enough sample (this might be before the
end of the list).

4. Make another request with the sampled IDs (up to 100 per request) and
the desired fields. The query will look like: "id:1 OR id:99 OR id:186 OR id:42".
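
The steps above can be sketched in Python roughly as follows. The sketch
assumes a caller-supplied fetch_page(start, rows) function that runs the
query against Solr and returns only the IDs for that page; that function,
the names sample_ids and ids_query, and the per_page setting are all
hypothetical and not part of any Solr API.

```python
import random


def sample_ids(fetch_page, sample_size, page_size=100, per_page=10, rng=None):
    """Steps 1-3: page through the hits and sample IDs from each page.

    fetch_page(start, rows) is assumed to return a list of document IDs
    in the default sort order, or an empty list past the last hit.
    """
    rng = rng or random.Random()
    chosen = []
    start = 0
    while len(chosen) < sample_size:
        page = fetch_page(start, page_size)
        if not page:
            break  # ran out of hits before the sample was full
        want = min(per_page, len(page), sample_size - len(chosen))
        chosen.extend(rng.sample(page, want))
        start += page_size
    return chosen


def ids_query(ids, field="id"):
    """Step 4: build the follow-up query that fetches the sampled docs."""
    return " OR ".join(f"{field}:{doc_id}" for doc_id in ids)


# Stand-in for Solr: a fake index of 1000 IDs served in pages.
fake_index = list(range(1000))
picked = sample_ids(lambda s, r: fake_index[s:s + r], 25, rng=random.Random(0))
query = ids_query(picked)
```

Because the loop stops as soon as the sample is full, documents near the
top of the default ordering are more likely to be picked than those near
the end. Raising page_size or lowering per_page spreads the sample over
more of the result list at the cost of more requests.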

wunder

On 7/7/08 10:15 AM, "Sean Laval" <se...@hotmail.com> wrote:

> Well, it's simply a business requirement from my perspective. I am not sure I
> can say more than that. I could maybe implement a request handler that did
> an initial search to work out how many hits result from the query, and then
> ran as many further queries as required, each fetching just 1 document
> starting at a random offset. Would that work? Sounds a bit kludgy to me
> even as I say it.
> 
> Sean


Re: implementing a random result request handler - solr 1.2

Posted by Sean Laval <se...@hotmail.com>.
Well, it's simply a business requirement from my perspective. I am not sure I
can say more than that. I could maybe implement a request handler that did
an initial search to work out how many hits result from the query, and then
ran as many further queries as required, each fetching just 1 document
starting at a random offset. Would that work? Sounds a bit kludgy to me
even as I say it.
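
A rough Python sketch of that idea, for discussion only. It assumes the
hit count from the initial search is already known and just works out the
start/rows parameters for the follow-up requests; the function names
random_offsets and offset_requests are made up for this example.

```python
import random


def random_offsets(num_found, sample_size, rng=None):
    """Pick distinct random offsets into a result list of num_found hits."""
    rng = rng or random.Random()
    count = min(sample_size, num_found)
    return sorted(rng.sample(range(num_found), count))


def offset_requests(num_found, sample_size, rng=None):
    """One request per sampled document: start=<offset>, rows=1."""
    return [
        {"start": offset, "rows": 1}
        for offset in random_offsets(num_found, sample_size, rng)
    ]


# Example: 5 single-document requests spread over 1000 hits.
requests = offset_requests(1000, 5, rng=random.Random(1))
```

This gives a uniform sample over the whole result list, but it costs one
query per document plus the initial count, and requests with a large
start value are likely to be slow.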

Sean



--------------------------------------------------
From: "Walter Underwood" <wu...@netflix.com>
Sent: Monday, July 07, 2008 5:06 PM
To: <so...@lucene.apache.org>
Subject: Re: implementing a random result request handler - solr 1.2

> Why do you want random hits? If we know more about the bigger
> problem, we can probably make better suggestions.
>
> Fundamentally, Lucene is designed to quickly return the best
> hits for a query. Returning random hits from the entire
> matched set is likely to be very slow. It just isn't what
> Lucene is designed to do.
>
> wunder

Re: implementing a random result request handler - solr 1.2

Posted by Walter Underwood <wu...@netflix.com>.
Why do you want random hits? If we know more about the bigger
problem, we can probably make better suggestions.

Fundamentally, Lucene is designed to quickly return the best
hits for a query. Returning random hits from the entire
matched set is likely to be very slow. It just isn't what
Lucene is designed to do.

wunder

On 7/7/08 8:58 AM, "Sean Laval" <se...@hotmail.com> wrote:

> I have seen various posts about implementing random sorting relating to the
> 1.3 code base but I am trying to do this in 1.2. Does anyone have any
> suggestions? The approach I have considered is to implement my own request
> handler that picks random documents from a larger result list. I therefore
> need to be able to create a DocList and add documents to it but can't seem to
> do this. Does anyone have any advice they could offer please?
> 
> Regards,
> 
> Sean