Posted to solr-user@lucene.apache.org by Reece <li...@gmail.com> on 2009/01/29 19:27:35 UTC

Question about rating documents

Currently I'm using Solr 1.2 to index a few million documents.  I've
been asked to add a way for users to rate the documents, so that
something rated higher shows up higher in search results and
vice versa.

I've been thinking about it, but can't come up with a good way to do
this and still have the "best match" ranking of the results according
to search terms entered by the users.

I was hoping someone had done something similar or would have some
insight on it.

Thanks in advance!

-Reece

Re: Question about rating documents

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Reece,

Solr does have the ability to read custom field values from an external file.  This is suitable for cases where these values change a lot.  You might want to consider that instead of updating the index.
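
As a rough illustration of the external-file idea, the Python sketch below writes ratings to a plain text file, one key=value line per document, keyed by the unique document ID. It assumes the key=value layout used by Solr's ExternalFileField; the file path and the ratings dictionary are hypothetical, and the required file name, location, and version support should be checked against the documentation for the Solr release in use.

```python
import os
import tempfile


def write_ratings_file(path, ratings):
    """Write one 'key=value' line per document, sorted by document key."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first and rename it into place,
    # so a reader never sees a half-written ratings file.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for doc_id in sorted(ratings):
            handle.write(f"{doc_id}={float(ratings[doc_id])}\n")
    os.replace(tmp_path, path)


if __name__ == "__main__":
    # Hypothetical ratings keyed by the unique document ID.
    target = os.path.join(tempfile.mkdtemp(), "external_rating")
    write_ratings_file(target, {"doc2": 3, "doc1": 4.5})
    with open(target, encoding="utf-8") as handle:
        print(handle.read(), end="")
```

Only this file would need rewriting when a user rates a document; the indexed documents themselves stay untouched.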


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Reece <li...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, January 29, 2009 3:31:22 PM
> Subject: Re: Question about rating documents
> 
> Okay, so what if I added a "rating" field that users could set from
> 1 to 5, and then did something like this:
> 
> /solr/select?indent=on&debugQuery=on&rows=99&q=body:+something AND type:I _val_:product(score, rating); _val_ desc, id desc
> 
> Would that sort the result set by the product of the score and the rating?
> 
> -Reece


Re: Question about rating documents

Posted by Reece <li...@gmail.com>.
Okay, so what if I added a "rating" field that users could set from
1 to 5, and then did something like this:

/solr/select?indent=on&debugQuery=on&rows=99&q=body:+something AND type:I _val_:product(score, rating); _val_ desc, id desc

Would that sort the result set by the product of the score and the rating?
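
To make the intended ordering concrete, here is a small Python sketch (plain arithmetic, not Solr itself) that ranks a few hypothetical hits by relevance score times rating, breaking ties by id in descending order. The hit values and the function name are made up for illustration; whether Solr accepts the query above is a separate question.

```python
def rank_by_score_times_rating(hits):
    """Sort hits by score * rating (descending), then by id (descending)."""
    return sorted(
        hits,
        key=lambda hit: (hit["score"] * hit["rating"], hit["id"]),
        reverse=True,
    )


# Hypothetical hits: a relevance score from the engine and a 1-5 user rating.
hits = [
    {"id": 1, "score": 2.0, "rating": 1},  # product 2.0
    {"id": 2, "score": 1.0, "rating": 5},  # product 5.0
    {"id": 3, "score": 1.5, "rating": 2},  # product 3.0
]

ranked_ids = [hit["id"] for hit in rank_by_score_times_rating(hits)]
print(ranked_ids)  # [2, 3, 1]
```

A plain product lets the rating swing the order heavily: here the document with the lowest relevance score ends up first because its rating is 5.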

-Reece

On Thu, Jan 29, 2009 at 2:47 PM, Reece <li...@gmail.com> wrote:
> Re-indexing that much would be a pretty big pain.  I do have a unique
> ID for each document, though, which I use to update them every day as
> they change.
>
> -Reece

Re: Question about rating documents

Posted by Reece <li...@gmail.com>.
Re-indexing that much would be a pretty big pain.  I do have a unique
ID for each document, though, which I use to update them every day as
they change.

-Reece



On Thu, Jan 29, 2009 at 2:40 PM, Erick Erickson <er...@gmail.com> wrote:
> This may not be practical, as it would involve re-indexing
> all your documents periodically, but here goes anyway...
>
> You could think about *index-time* boosts. Somewhere
> you keep a record of the recommendations, then re-index
> your corpus adding some suitable boost to each field in
> your document based upon those recommendations.
>
> From an old post on the Lucene list by Hoss:
>
> <<<...index time field boosts are a way to express things
> like "this document's title is worth twice as much as the title
> of most documents...">>>
>
> Which seems like what you're after.
>
> But it may not be practical to re-index your corpus.
> The other interesting issue would be how you keep
> track of documents: since the Lucene doc ID is probably
> useless, you'd need your own unique, persistent
> field.
>
> Best
> Erick

Re: Question about rating documents

Posted by Erick Erickson <er...@gmail.com>.
This may not be practical, as it would involve re-indexing
all your documents periodically, but here goes anyway...

You could think about *index-time* boosts. Somewhere
you keep a record of the recommendations, then re-index
your corpus adding some suitable boost to each field in
your document based upon those recommendations.
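
A minimal sketch of what such a re-indexing step might produce, written in Python. It assumes the boost attribute on field elements in Solr's XML update messages of this era; the rating-to-boost mapping, the field names, and the function names are hypothetical and would need tuning and checking against your schema.

```python
import xml.etree.ElementTree as ET


def rating_to_boost(rating):
    """Map a 1-5 rating to a boost factor; 3 is neutral (1.0). Hypothetical mapping."""
    return 1.0 + (rating - 3) * 0.25


def build_add_message(doc_id, fields, rating):
    """Build an <add> message whose searchable fields carry a rating-derived boost."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # The unique, persistent ID is what ties the rating record to the document.
    id_field = ET.SubElement(doc, "field", name="id")
    id_field.text = str(doc_id)
    boost = rating_to_boost(rating)
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name, boost=str(boost))
        field.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_message(42, {"body": "something"}, rating=5))
```

Every rating change means re-sending the whole document, which is the re-indexing cost mentioned above.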

From an old post on the Lucene list by Hoss:

<<<...index time field boosts are a way to express things
like "this document's title is worth twice as much as the title
of most documents...">>>

Which seems like what you're after.

But it may not be practical to re-index your corpus.
The other interesting issue would be how you keep
track of documents: since the Lucene doc ID is probably
useless, you'd need your own unique, persistent
field.

Best
Erick

On Thu, Jan 29, 2009 at 2:27 PM, Reece <li...@gmail.com> wrote:

> Hmm, I already boost certain fields, but as far as I know you
> would need to know the boost value ahead of time, which is not possible
> here because each document would need a different boost depending on how
> it was rated.
>
> I did think of one thing, though.  If each document had a field with a
> value of 1-5, could I use that value to boost the fields I'm actually
> searching on (or the final score)?  That would probably work, if it's
> possible.
>
> -Reece

Re: Question about rating documents

Posted by Reece <li...@gmail.com>.
Hmm, I already boost certain fields, but as far as I know you
would need to know the boost value ahead of time, which is not possible
here because each document would need a different boost depending on how
it was rated.

I did think of one thing, though.  If each document had a field with a
value of 1-5, could I use that value to boost the fields I'm actually
searching on (or the final score)?  That would probably work, if it's
possible.

-Reece



On Thu, Jan 29, 2009 at 1:51 PM, Matthew Runo <mr...@zappos.com> wrote:
> You could use a boost function to gently boost up items which were marked as
> more popular.
>
> You would send the function query in the "bf" parameter with your query, and
> you can find out more about syntax here:
> http://wiki.apache.org/solr/FunctionQuery
>
> Thanks for your time!
>
> Matthew Runo
> Software Engineer, Zappos.com
> mruno@zappos.com - 702-943-7833

Re: Question about rating documents

Posted by Matthew Runo <mr...@zappos.com>.
You could use a boost function to gently boost up items which were  
marked as more popular.

You would send the function query in the "bf" parameter with your  
query, and you can find out more about syntax here: http://wiki.apache.org/solr/FunctionQuery
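
For illustration only, a request carrying a "bf" parameter might be assembled as in the Python sketch below. It assumes a dismax-style request handler is configured and selected with qt=dismax, and that a numeric field named "rating" exists; the host, handler name, and the "body" query field are all hypothetical.

```python
from urllib.parse import urlencode


def build_boosted_query_url(base_url, user_query, boost_function):
    """Build a select URL that sends a function query in the 'bf' parameter."""
    params = {
        "qt": "dismax",        # assumption: a handler named "dismax" is configured
        "q": user_query,       # the user's search terms
        "qf": "body",          # hypothetical field to search
        "bf": boost_function,  # function query used as a boost
    }
    return f"{base_url}?{urlencode(params)}"


url = build_boosted_query_url(
    "http://localhost:8983/solr/select", "something", "rating"
)
print(url)
```

As far as I know, the value of the "bf" function is added to the relevance score rather than multiplied into it, so the effect of a 1-5 rating depends on how large the text scores typically are.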

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833

On Jan 29, 2009, at 10:27 AM, Reece wrote:

> Currently I'm using Solr 1.2 to index a few million documents.  I've
> been asked to add a way for users to rate the documents, so that
> something rated higher shows up higher in search results and
> vice versa.
>
> I've been thinking about it, but can't come up with a good way to do
> this and still have the "best match" ranking of the results according
> to search terms entered by the users.
>
> I was hoping someone had done something similar or would have some
> insight on it.
>
> Thanks in advance!
>
> -Reece
>