Posted to solr-user@lucene.apache.org by Siddhant Goel <si...@gmail.com> on 2009/12/17 13:52:48 UTC

Adaptive search?

Hi,

Does Solr provide adaptive searching? Can it adapt to user clicks within the
search results it provides? Or does that have to be done externally?

I couldn't find anything by googling for it.

Thanks,

-- 
- Siddhant

Re: Adaptive search?

Posted by Alexey Serba <as...@gmail.com>.

You can add click counts to your index as an additional field and boost
results based on that value.

http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_change_the_score_of_a_document_based_on_the_.2Avalue.2A_of_a_field_.28say.2C_.22popularity.22.29
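
A minimal sketch of what such a boosted request could look like, assuming
a dismax query with a boost function (bf). The field name click_count, the
host/port and the query text are hypothetical; log and sum are standard
function-query functions, and adding 1 keeps the log defined for documents
with zero clicks.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, click_field="click_count"):
    """Build a Solr select URL that adds a click-based boost function.

    click_field is a hypothetical numeric field holding the click count.
    """
    params = {
        "q": user_query,
        "defType": "dismax",
        # Additive boost that grows slowly with the click count.
        "bf": f"log(sum({click_field},1))",
    }
    # Hypothetical local Solr instance.
    return "http://localhost:8983/solr/select?" + urlencode(params)


url = build_boosted_query("solr tips")
```

The log dampens the effect of very popular documents, so a document with
1000 clicks is not boosted 1000 times as much as one with a single click.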

You can keep some kind of buffer for clicks and periodically update the
click count field for documents in the index.
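
One possible shape for such a buffer, as a sketch only: clicks are counted
in memory and drained by a periodic job that would then reindex the
affected documents. The class and function names are made up for
illustration, and the reindexing step itself is left out.

```python
import threading
from collections import Counter


class ClickBuffer:
    """Accumulates clicks in memory until the next periodic flush."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = Counter()

    def record_click(self, doc_id):
        with self._lock:
            self._pending[doc_id] += 1

    def drain(self):
        """Return the buffered counts and reset the buffer."""
        with self._lock:
            drained, self._pending = self._pending, Counter()
        return dict(drained)


def apply_to_totals(totals, drained):
    """Merge drained counts into the running per-document totals."""
    for doc_id, count in drained.items():
        totals[doc_id] = totals.get(doc_id, 0) + count
    return totals
```

A scheduled job would call drain(), merge the result into the stored
totals, and resubmit only the documents whose count changed.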

If you don't want to update whole documents in the index then you
probably should look at ExternalFileField or Lucene ParallelReader as
a custom Solr IndexReader, but this is complex low level Lucene stuff
and requires some hacking.

Alex

On Thu, Dec 17, 2009 at 6:46 PM, Siddhant Goel <si...@gmail.com> wrote:
> Let say we have a search engine (a simple front end - web app kind of a
> thing - responsible for querying Solr and then displaying the results in a
> human readable form) based on Solr. If a user searches for something, gets
> quite a few search results, and then clicks on one such result - is there
> any mechanism by which we can notify Solr to boost the score/relevance of
> that particular result in future searches? If not, then any pointers on how
> to go about doing that would be very helpful.
>
> Thanks,
>
> On Thu, Dec 17, 2009 at 7:50 PM, Paul Libbrecht <pa...@activemath.org> wrote:
>
>> What can it mean to "adapt to user clicks" ? Quite many things in my head.
>> Do you have maybe a citation that inspires you here?
>>
>> paul
>>
>>
>> On 17 Dec 2009 at 13:52, Siddhant Goel wrote:
>>
>>
>>  Does Solr provide adaptive searching? Can it adapt to user clicks within
>>> the
>>> search results it provides? Or that has to be done externally?
>>>
>>
>>
>
>
> --
> - Siddhant
>

Re: Adaptive search?

Posted by Ravi Gidwani <ra...@gmail.com>.

Shalin:
           Can you point me to pages/resources that describe this approach
in detail? Or can you provide more details on the schema and the
function(?) used for ranking the documents?

Thanks,
~Ravi.

On Mon, Jan 11, 2010 at 1:00 AM, Shalin Shekhar Mangar <
shalinmangar@gmail.com> wrote:

> On Fri, Jan 8, 2010 at 3:41 AM, Otis Gospodnetic <
> otis_gospodnetic@yahoo.com
> > wrote:
>
> >
> > ----- Original Message ----
> >
> > > From: Shalin Shekhar Mangar <sh...@gmail.com>
> > > To: solr-user@lucene.apache.org
> > > Sent: Wed, December 23, 2009 2:45:21 AM
> > > Subject: Re: Adaptive search?
> > >
> > > On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog wrote:
> > >
> > > > Nice!
> > > >
> > > > Siddhant: Another problem to watch out for is the feedback problem:
> > > > someone clicks on a link and it automatically becomes more
> > > > interesting, so someone else clicks, and it gets even more
> > > > interesting... So you need some kind of suppression. For example, as
> > > > individual clicks get older, you can push them down. Or you can put a
> > > > cap on the number of clicks used to rank the query.
> > > >
> > > >
> > > We use clicks/views instead of just clicks to avoid this problem.
> >
> > Doesn't a click imply a view?  You click to view.  I must be missing
> > something...
> >
> >
> I was talking about boosting documents using past popularity. So a user
> searches for X and gets 10 results. This view is recorded for each of the
> 10 documents and added to the index later. If a user clicks on result #2,
> the click is recorded for doc #2 and added to index. We boost using
> clicks/view.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>

Re: Adaptive search?

Posted by Chris Hostetter <ho...@fucit.org>.

: I was talking about boosting documents using past popularity. So a user
: searches for X and gets 10 results. This view is recorded for each of the 10
: documents and added to the index later. If a user clicks on result #2, the
: click is recorded for doc #2 and added to index. We boost using clicks/view.

FWIW: I've observed three problems with this type of metric...

1) "render" vs "view" ... what you are calling a "view" is really a
"rendering" -- you are sending the data back to include the item in the
list of 10 items on the page, and the browser is rendering it, but that
doesn't mean the user is actually "viewing" it -- particularly in a
webpage type situation where only the first 3-5 results might actually
appear "above the fold" and the user has to scroll to see the rest.  Even
in a smaller UI element (like a left or right nav info box) there's no
guarantee that the user actually "views" any of the items, which can bias
things.

2) It doesn't take into account people who click on a result, decide it's
terrible, hit the back arrow and click on a different result -- both of
those wind up scoring "equally".  Some really complex session+click
analysis can overcome this, but not a lot of people have the resources to
do that all the time.

3) ignoring #1 and #2 above (because i haven't found many better options)
you face the popularity problem -- or what my coworkers and i used to call
the "TRL Problem" back in the 90s:  MTV's Total Request Live was a Top X
countdown show of videos, featuring the most popular videos of the week
based on requests -- but it was also the number one show on the network,
occupying something like 4/24 broadcast hours of every day, when there was
only a total of 6/24 hours that actually showed music videos.  So for
the most part the only videos people ever saw were on TRL, so those were
the only videos that ever got requested.

In a nutshell: once something becomes "popular" and is what everybody 
sees, it stays popular, because it's what everybody sees and they don't 
know that there is better stuff out there.

Even if everyone looks at the full list of results and actually reads all
of the first 10 summaries, in the absence of any other bias their
inclination is going to be to assume #1 is the best.  So they might click
on that even if another result on the list appears better based on their
opinion.

A variation that i did some experiments with, but never really refined
because i didn't have the time/energy to really go to town on it, is to
weight the "clicks" based on position:  a click on item #1 wouldn't be
worth anything -- it's the number one result, the expectation is that it
better get clicked or something is wrong.  A click on #2 is worth
something to that item, and a click on #3 is worth more to that item, and
so on ... so that if the #9 item gets a click, that's huge.  To do it
right, I think what you really want to do is penalize items that get views
but no clicks -- because if someone loads up results 1-10, and doesn't
click on any of them, that should be a vote in favor of moving all of them
"down" and moving item #11 up (even though it got no views or clicks).

But like i said: i never experimented with this idea enough to come up 
with a good formula, or verify that the idea was sound.
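
To make the position-weighting idea concrete, here is a rough sketch. The
linear weight (rank - 1) and the flat no-click penalty are assumptions
picked purely for illustration; the message above does not propose a
specific formula, and none of this has been validated.

```python
def score_page(doc_ids, clicked_rank=None, no_click_penalty=0.1):
    """Return per-document score adjustments for one rendered results page.

    doc_ids: document IDs in display order.
    clicked_rank: 1-based rank of the clicked result, or None if the
        user clicked nothing.
    """
    if clicked_rank is None:
        # Nothing was clicked: nudge every rendered item down.
        return {doc_id: -no_click_penalty for doc_id in doc_ids}

    adjustments = {doc_id: 0.0 for doc_id in doc_ids}
    # A click at rank 1 earns 0; deeper clicks earn progressively more.
    adjustments[doc_ids[clicked_rank - 1]] = float(clicked_rank - 1)
    return adjustments
```

Summing these adjustments per document over many result pages would give
a value that rewards clicks deep in the list and penalizes items that are
repeatedly rendered but ignored.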

-Hoss

Re: Adaptive search?

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.

On Fri, Jan 8, 2010 at 3:41 AM, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:

>
> ----- Original Message ----
>
> > From: Shalin Shekhar Mangar <sh...@gmail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Wed, December 23, 2009 2:45:21 AM
> > Subject: Re: Adaptive search?
> >
> > On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog wrote:
> >
> > > Nice!
> > >
> > > Siddhant: Another problem to watch out for is the feedback problem:
> > > someone clicks on a link and it automatically becomes more
> > > interesting, so someone else clicks, and it gets even more
> > > interesting... So you need some kind of suppression. For example, as
> > > individual clicks get older, you can push them down. Or you can put a
> > > cap on the number of clicks used to rank the query.
> > >
> > >
> > We use clicks/views instead of just clicks to avoid this problem.
>
> Doesn't a click imply a view?  You click to view.  I must be missing
> something...
>
>
I was talking about boosting documents using past popularity. So a user
searches for X and gets 10 results. This view is recorded for each of the 10
documents and added to the index later. If a user clicks on result #2, the
click is recorded for doc #2 and added to index. We boost using clicks/view.
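
A small sketch of that bookkeeping, with made-up names: every document
rendered on a results page gets a view, only the clicked document gets a
click, and the boost input is the ratio of the two. How the ratio is
written back to the index is not shown.

```python
from collections import defaultdict


class ClickThroughStats:
    """Tracks views and clicks per document and exposes clicks/views."""

    def __init__(self):
        self.views = defaultdict(int)
        self.clicks = defaultdict(int)

    def record_results_page(self, doc_ids):
        # Every document shown on the page counts as one view.
        for doc_id in doc_ids:
            self.views[doc_id] += 1

    def record_click(self, doc_id):
        self.clicks[doc_id] += 1

    def ratio(self, doc_id):
        views = self.views.get(doc_id, 0)
        if views == 0:
            return 0.0  # never shown, so no evidence either way
        return self.clicks.get(doc_id, 0) / views
```

Because a document that is shown often but rarely clicked ends up with a
low ratio, raw exposure alone does not push it up the ranking.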

-- 
Regards,
Shalin Shekhar Mangar.

Re: Adaptive search?

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Shalin,

 
----- Original Message ----

> From: Shalin Shekhar Mangar <sh...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Wed, December 23, 2009 2:45:21 AM
> Subject: Re: Adaptive search?
> 
> On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog wrote:
> 
> > Nice!
> >
> > Siddhant: Another problem to watch out for is the feedback problem:
> > someone clicks on a link and it automatically becomes more
> > interesting, so someone else clicks, and it gets even more
> > interesting... So you need some kind of suppression. For example, as
> > individual clicks get older, you can push them down. Or you can put a
> > cap on the number of clicks used to rank the query.
> >
> >
> We use clicks/views instead of just clicks to avoid this problem.

Doesn't a click imply a view?  You click to view.  I must be missing something...

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

Re: Adaptive search?

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.

On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog <go...@gmail.com> wrote:

> Nice!
>
> Siddhant: Another problem to watch out for is the feedback problem:
> someone clicks on a link and it automatically becomes more
> interesting, so someone else clicks, and it gets even more
> interesting... So you need some kind of suppression. For example, as
> individual clicks get older, you can push them down. Or you can put a
> cap on the number of clicks used to rank the query.
>
>
We use clicks/views instead of just clicks to avoid this problem.

-- 
Regards,
Shalin Shekhar Mangar.

Re: Adaptive search?

Posted by Lance Norskog <go...@gmail.com>.

Nice!

Siddhant: Another problem to watch out for is the feedback problem:
someone clicks on a link and it automatically becomes more
interesting, so someone else clicks, and it gets even more
interesting... So you need some kind of suppression. For example, as
individual clicks get older, you can push them down. Or you can put a
cap on the number of clicks used to rank the query.
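
Both kinds of suppression can be sketched in a few lines. The exponential
decay, the 30-day half-life and the cap of 100 are illustrative
assumptions, not recommended values.

```python
def decayed_clicks(click_ages_days, half_life_days=30.0, cap=100.0):
    """Sum clicks with older clicks counting less, then apply a cap.

    click_ages_days: age in days of each recorded click.
    """
    # A click loses half its weight every half_life_days.
    total = sum(0.5 ** (age / half_life_days) for age in click_ages_days)
    # The cap limits how much any one document can gain from clicks.
    return min(total, cap)
```

With these values a click from today counts as 1.0, a click from 30 days
ago counts as 0.5, and no document can accumulate more than 100.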

On Tue, Dec 22, 2009 at 2:36 AM, Siddhant Goel <si...@gmail.com> wrote:
> On Tue, Dec 22, 2009 at 12:01 PM, Ryan Kennedy <rc...@gmail.com> wrote:
>
>> This approach will be limited to applying a "global" rank to all the
>> documents, which may have some unintended consequences. The most
>> popular document in your index will be the most popular, even for
>> queries for which it was never clicked on.
>
>
> Right. Makes so much sense. Thanks for sharing.
>
> --
> - Siddhant
>

-- 
Lance Norskog
goksron@gmail.com

Re: Adaptive search?

Posted by Siddhant Goel <si...@gmail.com>.

On Tue, Dec 22, 2009 at 12:01 PM, Ryan Kennedy <rc...@gmail.com> wrote:

> This approach will be limited to applying a "global" rank to all the
> documents, which may have some unintended consequences. The most
> popular document in your index will be the most popular, even for
> queries for which it was never clicked on.

Right. Makes so much sense. Thanks for sharing.

-- 
- Siddhant

Re: Adaptive search?

Posted by Ryan Kennedy <rc...@gmail.com>.

On Mon, Dec 21, 2009 at 3:36 PM, Lance Norskog <go...@gmail.com> wrote:
> Solr does have the ExternalFileField available. You could track
> existing clicks from the container search log and generate a file to
> be used with ExternalFileField.
>
> http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
>
> In the solr source, trunk/src/test/test-files/solr/conf/schema11.xml
> and schema-trie.xml show how to use it.

This approach will be limited to applying a "global" rank to all the
documents, which may have some unintended consequences. The most
popular document in your index will be the most popular, even for
queries for which it was never clicked on. We've been working on this
problem in our own implementation and solved it using a
FunctionQuery (http://wiki.apache.org/solr/FunctionQuery). We
create a ValueSourceParser and hook it into our Solr config:

    <valueSourceParser name="qpop" class="QueryPopularity">
        <str name="popfile">/path/to/popularity_file.xml</str>
    </valueSourceParser>

Then we use the new function in our request handler(s):

    <requestHandler name="..." class="...">
        ...
        <str name="bf">
            qpop(id)
        </str>
    </requestHandler>

The QueryPopularity class takes the current (normalized) query and
indexes into popularity_file.xml to find out which document IDs are
popular for the current query. It uses the "id" field because that's
what we specified in the arguments to "qpop"; you could use any field
you want. Documents which are popular get a score greater than zero,
proportional to their popularity. We do offline processing every
night to build the mappings of query -> popular ID and push that file
to our machines. QueryPopularity has a background thread which
periodically refreshes the in-memory copy of the XML file's contents.

The main difference is that this is a two-level hash (query -> id ->
score), whereas the ExternalFileField appears to be a one-level hash
(id -> score).
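
The two-level lookup can be illustrated with a short sketch. This is not
the QueryPopularity class itself (which is a Java ValueSourceParser); the
names and the normalization rule (lowercase, collapsed whitespace) are
assumptions made for the example.

```python
def normalize(query):
    """Hypothetical normalization: lowercase and collapse whitespace."""
    return " ".join(query.lower().split())


class QueryPopularityTable:
    """Two-level mapping: normalized query -> document ID -> score."""

    def __init__(self, table):
        self._table = {
            normalize(query): dict(scores) for query, scores in table.items()
        }

    def score(self, query, doc_id):
        # Unknown queries and unknown documents both fall back to 0.0.
        return self._table.get(normalize(query), {}).get(doc_id, 0.0)


table = QueryPopularityTable({"Solr Boost": {"doc7": 0.8}})
```

Here table.score("solr boost", "doc7") returns 0.8, while the same
document scores 0.0 for any query it was never clicked on.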

Ryan

Re: Adaptive search?

Posted by Lance Norskog <go...@gmail.com>.

Solr does have the ExternalFileField available. You could track
existing clicks from the container search log and generate a file to
be used with ExternalFileField.

http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html

In the solr source, trunk/src/test/test-files/solr/conf/schema11.xml
and schema-trie.xml show how to use it.
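
As a rough sketch of the file-generation step, assuming the key=value
line format described in the ExternalFileField Javadoc linked above
(check the file name and location your Solr version expects). The
function name and the sample values are hypothetical.

```python
def render_external_file(scores):
    """Render per-document values as one key=value line per document."""
    # Sorted only to make the output deterministic.
    return "".join(
        f"{doc_id}={value}\n" for doc_id, value in sorted(scores.items())
    )


content = render_external_file({"doc2": 1.5, "doc1": 3})
```

A batch job that aggregates clicks from the search log could write this
content to the external file on each run.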

On Mon, Dec 21, 2009 at 12:39 PM, Ian Holsman <li...@holsman.net> wrote:
> On 12/18/09 2:46 AM, Siddhant Goel wrote:
>>
>> Let say we have a search engine (a simple front end - web app kind of a
>> thing - responsible for querying Solr and then displaying the results in a
>> human readable form) based on Solr. If a user searches for something, gets
>> quite a few search results, and then clicks on one such result - is there
>> any mechanism by which we can notify Solr to boost the score/relevance of
>> that particular result in future searches? If not, then any pointers on
>> how
>> to go about doing that would be very helpful.
>>
>
> Hi Siddhant.
> Solr can't do this out of the box.
> you would need to use a external field and a custom scoring function to do
> something like this.
>
> regards
> Ian
>>
>> Thanks,
>>
>> On Thu, Dec 17, 2009 at 7:50 PM, Paul Libbrecht<pa...@activemath.org>
>>  wrote:
>>
>>
>>>
>>> What can it mean to "adapt to user clicks" ? Quite many things in my
>>> head.
>>> Do you have maybe a citation that inspires you here?
>>>
>>> paul
>>>
>>>
>>> On 17 Dec 2009 at 13:52, Siddhant Goel wrote:
>>>
>>>
>>>  Does Solr provide adaptive searching? Can it adapt to user clicks within
>>>
>>>>
>>>> the
>>>> search results it provides? Or that has to be done externally?
>>>>
>>>>
>>>
>>>
>>
>>
>
>



-- 
Lance Norskog
goksron@gmail.com

Re: Adaptive search?

Posted by Ian Holsman <li...@holsman.net>.

On 12/18/09 2:46 AM, Siddhant Goel wrote:
> Let say we have a search engine (a simple front end - web app kind of a
> thing - responsible for querying Solr and then displaying the results in a
> human readable form) based on Solr. If a user searches for something, gets
> quite a few search results, and then clicks on one such result - is there
> any mechanism by which we can notify Solr to boost the score/relevance of
> that particular result in future searches? If not, then any pointers on how
> to go about doing that would be very helpful.
>    

Hi Siddhant.
Solr can't do this out of the box.
You would need to use an external field and a custom scoring function to
do something like this.

regards
Ian
> Thanks,
>
> On Thu, Dec 17, 2009 at 7:50 PM, Paul Libbrecht<pa...@activemath.org>  wrote:
>
>    
>> What can it mean to "adapt to user clicks" ? Quite many things in my head.
>> Do you have maybe a citation that inspires you here?
>>
>> paul
>>
>>
>> On 17 Dec 2009 at 13:52, Siddhant Goel wrote:
>>
>>
>>   Does Solr provide adaptive searching? Can it adapt to user clicks within
>>      
>>> the
>>> search results it provides? Or that has to be done externally?
>>>
>>>        
>>
>>      
>
>

Re: Adaptive search?

Posted by Siddhant Goel <si...@gmail.com>.

Let's say we have a search engine (a simple front end - web app kind of a
thing - responsible for querying Solr and then displaying the results in a
human readable form) based on Solr. If a user searches for something, gets
quite a few search results, and then clicks on one such result - is there
any mechanism by which we can notify Solr to boost the score/relevance of
that particular result in future searches? If not, then any pointers on how
to go about doing that would be very helpful.

Thanks,

On Thu, Dec 17, 2009 at 7:50 PM, Paul Libbrecht <pa...@activemath.org> wrote:

> What can it mean to "adapt to user clicks" ? Quite many things in my head.
> Do you have maybe a citation that inspires you here?
>
> paul
>
>
> On 17 Dec 2009 at 13:52, Siddhant Goel wrote:
>
>
>  Does Solr provide adaptive searching? Can it adapt to user clicks within
>> the
>> search results it provides? Or that has to be done externally?
>>
>
>

-- 
- Siddhant

Re: Adaptive search?

Posted by Paul Libbrecht <pa...@activemath.org>.

What would it mean to "adapt to user clicks"? Quite a few things come
to mind.
Do you have a citation that inspired this?

paul


On 17 Dec 2009 at 13:52, Siddhant Goel wrote:

> Does Solr provide adaptive searching? Can it adapt to user clicks  
> within the
> search results it provides? Or that has to be done externally?