Posted to java-user@lucene.apache.org by xiong <xi...@gmail.com> on 2007/03/15 11:21:18 UTC

How to customize scoring using user feedback?

Hi there,

Just like Google: the more users click on a search result,
 the higher it ranks.

How can I implement this in Lucene?

I've read the javadoc of the org.apache.lucene.search package,
 but I still don't know how.

Some sample code would be great.

Thanks in advance,

Xiong


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to customize scoring using user feedback?

Posted by "Peter W." <pe...@marketingbrokers.com>.
Xiong,

You have made an excellent point!

It comes down to how you use Sort.
If you want the most suitable results on top, pass in

SortField.FIELD_SCORE
first...

Otherwise, generate all your scores and convert them
to sortable Strings at index time in your "votes" field.

Then, use this for searches:

Sort srt = new Sort(new SortField[] {
    new SortField("votes", SortField.STRING, true),  // reverse=true: most votes first
    SortField.FIELD_SCORE                            // then by relevance score
});

Hits hits = searcher.search(query, srt);

Keep in mind that this changes only presentation-level sorting, not Lucene
scoring, so the set of results still depends on, and is specific to, each query.

Utilizing "user feedback to improve search results"  with clickstream
data could be a sub-project in itself. It moves into future areas of
personalization and would be a cool add-on to Lucene.

Hope that helps,

Peter W.




Because scoring
The way it appears to

On Mar 15, 2007, at 9:19 PM, xiong wrote:

> Peter W. <peter <at> marketingbrokers.com> writes:
>
>>
>> Hello,
>>
>> This is not currently in Lucene.
>>
>> Sounds like you are looking for a voting
>> system to generate float scores that would be
>> inserted as a sortable field at index time.
>>
>> Regards,
>>
>> Peter W.
>
> Hi Peter,
>
> But the voting is query dependent, so just adding a sortable vote field
> may not be enough?
> For example, queries 'Q1' and 'Q2' can both reach result 'R1', and 'Q2' can
> also reach result 'R2'. More votes for 'R1' from 'Q1' will put 'R1' above
> 'R2', even if 'R2' is more suitable for 'Q2'.
>
> Regards,
> Xiong
>


Re: How to customize scoring using user feedback?

Posted by xiong <xi...@gmail.com>.
Peter W. <peter <at> marketingbrokers.com> writes:

> 
> Hello,
> 
> This is not currently in Lucene.
> 
> Sounds like you are looking for a voting
> system to generate float scores that would be
> inserted as a sortable field at index time.
> 
> Regards,
> 
> Peter W.

Hi Peter,

But the voting is query dependent, so just adding a sortable vote field
may not be enough?
For example, queries 'Q1' and 'Q2' can both reach result 'R1', and 'Q2' can
also reach result 'R2'. More votes for 'R1' from 'Q1' will put 'R1' above
'R2', even if 'R2' is more suitable for 'Q2'.
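
To make the concern concrete, here is a minimal sketch of what query-dependent vote storage could look like, in plain Java with no Lucene dependency. QueryClickStore, its methods, and the trim-and-lower-case normalization are all hypothetical names and choices for illustration, not anything Lucene provides.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical store, not a Lucene API: click counts are keyed by
// (normalized query, document id) rather than by document alone.
class QueryClickStore {
    private final Map<String, Map<String, Integer>> clicks =
            new HashMap<String, Map<String, Integer>>();

    // Assumed normalization: trim and lower-case the raw query text.
    static String normalize(String query) {
        return query.trim().toLowerCase(Locale.ROOT);
    }

    // Records one click on docId that came from the given query.
    void recordClick(String query, String docId) {
        String key = normalize(query);
        Map<String, Integer> perDoc = clicks.get(key);
        if (perDoc == null) {
            perDoc = new HashMap<String, Integer>();
            clicks.put(key, perDoc);
        }
        Integer old = perDoc.get(docId);
        perDoc.put(docId, old == null ? 1 : old + 1);
    }

    // Returns the clicks docId received for this query only (0 if none).
    int clicksFor(String query, String docId) {
        Map<String, Integer> perDoc = clicks.get(normalize(query));
        if (perDoc == null) return 0;
        Integer n = perDoc.get(docId);
        return n == null ? 0 : n;
    }
}
```

With a store like this, votes for 'R1' that came from 'Q1' do not count when ranking results for 'Q2'. The cost is that a single indexed "votes" field can no longer hold the data, so the lookup has to happen at search time.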

Regards,
Xiong




Re: How to customize scoring using user feedback?

Posted by "Peter W." <pe...@marketingbrokers.com>.
Hello,

This is not currently in Lucene.

Sounds like you are looking for a voting
system to generate float scores that would be
inserted as a sortable field at index time.

Gathering user feedback on search results is
hard because you need to introduce a layer
which logs the click then redirects to the
destination.

Try this: encode each search result link with a
unique id that points to another app. Log the
vote in that app, then deliver the actual result.

Set a user cookie with each vote and id,
including application logic to help prevent
abuse and/or duplicates.

Next, use your logs to see what was clicked
and generate a float score tally. Prepare a
sortable String suitable for Lucene indexing,
using NumberTools in Lucene or NumberUtils in Solr.

Finally, index all documents with a field
named "votes" and your converted scores.

When it comes to searching, build your
regular query, adding a Sort Object and
passing in an array of SortFields with
"votes" as type SortField.STRING first.

Precedence of sort order kicks in and
your docs with more clicks rank higher.

If everything goes well you will have
results ordered by user generated
scoring.
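
A rough sketch of the tally-and-encode step described above, in plain Java with no Lucene dependency. The class name VoteTally, the 10-digit width, and the use of integer click counts instead of float scores are all assumptions for illustration. In a real index you would more likely use the number-encoding helper that ships with your Lucene or Solr version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: tallies click-log entries per
// document id and encodes each tally as a fixed-width, zero-padded string
// so that lexicographic (String) order matches numeric order.
class VoteTally {
    static final int WIDTH = 10; // assumed maximum number of digits

    // Counts how many times each document id appears in the click log.
    static Map<String, Integer> tally(List<String> clickedDocIds) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String id : clickedDocIds) {
            Integer old = counts.get(id);
            counts.put(id, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Zero-pads a non-negative count so that 9 sorts before 10 as a String.
    static String toSortable(int votes) {
        if (votes < 0) throw new IllegalArgumentException("votes must be >= 0");
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", votes);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = tally(Arrays.asList("doc7", "doc3", "doc7"));
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " -> " + toSortable(e.getValue()));
        }
    }
}
```

The padding matters because a String sort compares character by character: unpadded, "10" would sort before "9".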

Regards,

Peter W.


On Mar 15, 2007, at 7:05 PM, karl wettin wrote:

>
> 16 mar 2007 kl. 02.13 skrev xiong:
>
>> karl wettin <karl.wettin <at> gmail.com> writes:
>>> 15 mar 2007 kl. 11.21 skrev xiong:
>>>
>>>> Just like google: the more user clicks of search results,
>>>>  the higher rank they are.
>>>
>>> Are you really sure Google does this? It would surprise me if  
>>> they did.
>>>
>>
>> I'm not sure, actually.
>> But using user feedback to improve the search results is good,  
>> isn't it?
>
> It would be very easy to tamper with the ranking by hammering  
> specific documents. Perhaps what you are looking for is  
> "collaborative filtering"?




Re: How to customize scoring using user feedback?

Posted by karl wettin <ka...@gmail.com>.
16 mar 2007 kl. 02.13 skrev xiong:

> karl wettin <karl.wettin <at> gmail.com> writes:
>> 15 mar 2007 kl. 11.21 skrev xiong:
>>
>>> Just like google: the more user clicks of search results,
>>>  the higher rank they are.
>>
>> Are you really sure Google does this? It would surprise me if they  
>> did.
>>
>
> I'm not sure, actually.
> But using user feedback to improve the search results is good,  
> isn't it?

It would be very easy to tamper with the ranking by hammering  
specific documents. Perhaps what you are looking for is  
"collaborative filtering"?

>
> I searched on the net, and found your work statement in
> 'http://ginandtonique.org/~kalle/javadocs/didyoumean/org/apache/lucene/search/didyoumean/package-summary.html',
> very interesting, how is it going now?

I believe it works quite well. You can find the code here:
<https://issues.apache.org/jira/browse/LUCENE-626>.

-- 
karl




Re: How to customize scoring using user feedback?

Posted by xiong <xi...@gmail.com>.
karl wettin <karl.wettin <at> gmail.com> writes:

> 
> 
> 15 mar 2007 kl. 11.21 skrev xiong:
> 
> > Just like google: the more user clicks of search results,
> >  the higher rank they are.
> 
> Are you really sure Google does this? It would surprise me if they did.
> 

I'm not sure, actually.
But using user feedback to improve the search results is good, isn't it?

> >
> > How to implement this in lucene?
> >
> > I've read the javadoc of org.apache.lucene.search package,
> >  but still dont know how.
> >
> > Some sample code will be great.
> 
> As far as I know, there is no such thing implemented. Mechanisms that  
> change based on user input are commonly known as "adaptive". You  
> might want to search for something like that. I would personally  
> implement it as a second level scoring in a HitCollector. It might be  
> tricky to get it optimized though.
> 
> I hope this helps.
> 


I searched on the net, and found your work statement in
'http://ginandtonique.org/~kalle/javadocs/didyoumean/org/apache/lucene/search/didyoumean/package-summary.html',
very interesting, how is it going now?


-Xiong




Re: How to customize scoring using user feedback?

Posted by karl wettin <ka...@gmail.com>.
15 mar 2007 kl. 11.21 skrev xiong:

> Just like google: the more user clicks of search results,
>  the higher rank they are.

Are you really sure Google does this? It would surprise me if they did.

>
> How to implement this in lucene?
>
> I've read the javadoc of org.apache.lucene.search package,
>  but still dont know how.
>
> Some sample code will be great.

As far as I know, there is no such thing implemented. Mechanisms that  
change based on user input are commonly known as "adaptive". You  
might want to search for something like that. I would personally  
implement it as a second level scoring in a HitCollector. It might be  
tricky to get it optimized though.

I hope this helps.

-- 
karl



Re: How to customize scoring using user feedback?

Posted by xiong <xi...@gmail.com>.
daniel rosher <daniel.rosher <at> hotonline.com> writes:

> 
> We regularly open a new IndexReader, and before this reader replaces the
> production one, we determine f(D) for all documents so that for the user
> there is almost no performance issue,i.e. f(D) is cached. I suspect you
> can implement something similar.
> 
> Cheers,
> Dan
> 

But if f(D) is dependent on the query, how can it be precomputed and cached?
Can I get a sorted hits enumerator from IndexReader, and just compute the scores
of the top N hits and re-sort them?
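
The "re-sort only the top N" idea could be sketched like this, in plain Java. Everything here is an assumption for illustration: Hit is a stand-in for Lucene's ScoreDoc (a document id plus a score), the click counts are taken to be already looked up for the current query, and the boost formula score * (1 + ln(1 + clicks)) is just one arbitrary choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical re-ranker, not a Lucene class.
class TopNReranker {
    // Minimal stand-in for a search hit: document id plus relevance score.
    static class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Assumed boost formula: score * (1 + ln(1 + clicks)).
    static float adjusted(float score, int clicks) {
        return (float) (score * (1.0 + Math.log(1.0 + clicks)));
    }

    // Re-sorts only the first n hits (already in relevance order) by adjusted
    // score, highest first, and leaves the tail untouched. clicksByDoc[doc]
    // holds the click count for the current query.
    static List<Hit> rerank(List<Hit> hits, final int[] clicksByDoc, int n) {
        int cut = Math.min(n, hits.size());
        List<Hit> head = new ArrayList<Hit>(hits.subList(0, cut));
        Collections.sort(head, new Comparator<Hit>() {
            public int compare(Hit a, Hit b) {
                return Float.compare(adjusted(b.score, clicksByDoc[b.doc]),
                                     adjusted(a.score, clicksByDoc[a.doc]));
            }
        });
        List<Hit> out = new ArrayList<Hit>(head);
        out.addAll(hits.subList(cut, hits.size()));
        return out;
    }
}
```

Because only the first n hits are touched, the cost does not grow with the total number of matches. The trade-off is that a heavily clicked document ranked below position n never gets promoted.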

-Xiong




Re: How to customize scoring using user feedback?

Posted by daniel rosher <da...@hotonline.com>.
Hi Xiong,

Your ranking idea sounds interesting ... are you looking into
something akin to the TrafficRank algorithm? This is moving into the
realm of "Personalized search" or "Personalised search", something I'm
not aware of appearing on the Lucene mailing lists so far, but something
I'm quite interested in exploring.

http://www.earthskater.net/services/marketing/trafficrank-pagerank-webrank.asp

SortComparatorSource returns a ScoreDocComparator, which implements the
method below:

public int compare(ScoreDoc i, ScoreDoc j) {
    // lessThanCriteria / moreThanCriteria stand for your own ordering tests
    if (lessThanCriteria) return -1;
    if (moreThanCriteria) return 1;
    return 0;
}

Section 6.1 of 'Lucene in Action' will help you more here.

From ScoreDoc you can access ScoreDoc.doc, the Lucene document id, and
ScoreDoc.score, the Lucene score for the document.

We regularly open a new IndexReader, and before this reader replaces the
production one, we determine f(D) for all documents so that for the user
there is almost no performance issue, i.e. f(D) is cached. I suspect you
can implement something similar.
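
As a minimal sketch of that caching idea, in plain Java with no Lucene dependency: compute f(D) once per document when a new reader is opened, then apply the final score = doc.score*f(D) formula from this thread with a single multiply per hit. The class name AgeBoostCache, the timestamp array, and the decay formula are assumptions for illustration, not the code we actually run.

```java
// Hypothetical cache, not a Lucene class: precomputes an age-based factor
// f(D) for every document id once, so query-time scoring is one multiply.
class AgeBoostCache {
    private static final double MILLIS_PER_DAY = 24 * 60 * 60 * 1000.0;
    private final float[] factor; // factor[docId] holds f(D)

    // Assumed decay: f(D) = 1 / (1 + ageDays / halfLifeDays), so a document
    // that is halfLifeDays old keeps half of its score. Timestamps in the
    // future are treated as age zero.
    AgeBoostCache(long[] docTimestampsMillis, long nowMillis, double halfLifeDays) {
        factor = new float[docTimestampsMillis.length];
        for (int doc = 0; doc < factor.length; doc++) {
            double ageDays =
                    Math.max(0, nowMillis - docTimestampsMillis[doc]) / MILLIS_PER_DAY;
            factor[doc] = (float) (1.0 / (1.0 + ageDays / halfLifeDays));
        }
    }

    // final score = score * f(D), using the cached factor.
    float finalScore(int docId, float score) {
        return score * factor[docId];
    }
}
```

A cache like this would be rebuilt each time a new reader is opened, since document ids can change between readers. Note that it only works when f(D) depends on the document alone and not on the query.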

Cheers,
Dan

On Fri, 2007-03-16 at 01:50 +0000, xiong wrote:
> daniel rosher <daniel.rosher <at> hotonline.com> writes:
> 
> > 
> > Hi,
> > 
> > This can be achieved by writing your own implementation of the
> > SortComparatorSource interface.
> > 
> > Section 6.1 of Lucene in Action will help you here.
> > 
> > We currently use this method to alter the ranking of documents depending
> > on the age of the document by multiplying the current score by a cached
> > function f(D) ~ age_of_document_D, so that final score = doc.score*f(D);
> > 
> > Regards,
> > Dan
> > 
> Hi Dan,
> 
> By implementing SortComparatorSource, did you recompute the scores of all the hits?
> If the number of returned documents is large, will that be a performance issue?
> 
> Regards,
> Xiong
Daniel Rosher
Developer


d: 0207 3489 912
t: 0870 2020 121
f: 0870 2020 131
m: 
http://recruiter.hotonline.com/








Re: How to customize scoring using user feedback?

Posted by xiong <xi...@gmail.com>.
daniel rosher <daniel.rosher <at> hotonline.com> writes:

> 
> Hi,
> 
> This can be achieved by writing your own implementation of the
> SortComparatorSource interface.
> 
> Section 6.1 of Lucene in Action will help you here.
> 
> We currently use this method to alter the ranking of documents depending
> on the age of the document by multiplying the current score by a cached
> function f(D) ~ age_of_document_D, so that final score = doc.score*f(D);
> 
> Regards,
> Dan
> 
Hi Dan,

By implementing SortComparatorSource, did you recompute the scores of all the hits?
If the number of returned documents is large, will that be a performance issue?

Regards,
Xiong






Re: How to customize scoring using user feedback?

Posted by daniel rosher <da...@hotonline.com>.
Hi,

This can be achieved by writing your own implementation of the
SortComparatorSource interface.

Section 6.1 of Lucene in Action will help you here.

We currently use this method to alter the ranking of documents depending
on the age of the document by multiplying the current score by a cached
function f(D) ~ age_of_document_D, so that final score = doc.score*f(D);

Regards,
Dan


On Thu, 2007-03-15 at 10:21 +0000, xiong wrote:
> Hi there,
> 
> Just like google: the more user clicks of search results,
>  the higher rank they are.
> 
> How to implement this in lucene?
> 
> I've read the javadoc of org.apache.lucene.search package,
>  but still dont know how.
> 
> Some sample code will be great.
> 
> Thanks in advance,
> 
> Xiong