Posted to solr-user@lucene.apache.org by Jorge Luis Betancourt González <jl...@uci.cu> on 2015/01/27 19:28:14 UTC

"Contextual" sponsored results with Solr

Hi all,

Recently I got an interesting use case that I'm not sure how to implement. The idea is that the client wants a fixed number of documents, let's call it N, to appear at the top of the results. Let me explain a little: we're working with web documents, so the idea is to promote the documents from a given domain (wikipedia, for example) that match the user's query to the top of the list. So if I apply a boost using the boost parameter:

http://localhost:8983/solr/select?q=search&fl=url&boost=map(query($type1query),0,0,1,50)&type1query=host:wikipedia

I get *all* the documents from the desired host at the top, but there is no way of limiting how many documents from that host are boosted to the top of the result list. This could lead to several pages of content from the same host, which is not desired; the idea is to show only N. I was thinking of something like field collapsing/grouping, but only for the documents that match my $type1query parameter (host:wikipedia), and I don't see any way of grouping/collapsing only one group while leaving the other results untouched.
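To make the desired ordering concrete, here is a minimal sketch in Python of the behaviour described above, applied to an already ranked list on the client side. The function name, the document shape and the sample URLs are illustrative assumptions, not anything Solr provides, and the sketch only reorders whatever window of results has been fetched.

```python
from typing import Callable, Dict, List

Doc = Dict[str, str]


def promote_top_n(ranked: List[Doc], is_promoted: Callable[[Doc], bool], n: int) -> List[Doc]:
    """Move the first n matching docs to the top; keep all other docs in their original order."""
    # Positions of the first n documents that qualify for promotion.
    promoted_idx = [i for i, doc in enumerate(ranked) if is_promoted(doc)][:n]
    chosen = set(promoted_idx)
    top = [ranked[i] for i in promoted_idx]
    # Everything else, including further matches beyond n, stays in ranked order.
    rest = [doc for i, doc in enumerate(ranked) if i not in chosen]
    return top + rest


# Hypothetical ranked results (already sorted by relevance).
ranked = [
    {"url": "a", "host": "example"},
    {"url": "w1", "host": "wikipedia"},
    {"url": "b", "host": "example"},
    {"url": "w2", "host": "wikipedia"},
    {"url": "w3", "host": "wikipedia"},
]

reordered = promote_top_n(ranked, lambda d: d["host"] == "wikipedia", 2)
reordered_urls = [d["url"] for d in reordered]
```

Only the first two wikipedia documents move up; the third keeps its natural position among the remaining results.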

I also thought of using 2 groups, with group.query=host:wikipedia and group.query=-host:wikipedia, but in this case there is no way of controlling how many documents each group will have independently.
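For reference, the two-group request described above could be assembled roughly as follows. This is a sketch only: the helper name is made up, and it relies on the standard result grouping parameters (group, group.query, group.limit). Note that group.limit is a single value shared by every group, which is exactly the limitation mentioned.

```python
from urllib.parse import urlencode


def grouped_query_params(user_query: str, host: str, limit: int) -> list:
    """Build the parameters for a two-group request (helper name is hypothetical)."""
    return [
        ("q", user_query),
        ("fl", "url"),
        ("group", "true"),
        ("group.query", f"host:{host}"),   # group 1: documents from the promoted host
        ("group.query", f"-host:{host}"),  # group 2: everything else
        ("group.limit", str(limit)),       # applies to both groups alike
    ]


params = grouped_query_params("search", "wikipedia", 3)
query_string = urlencode(params)
```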

In this particular case the QueryElevationComponent is not helping, because I don't want to map all the possible queries. I just want to put some of the results from a certain host at the top of the list, without boosting all the documents from the same host.

Any thoughts or recommendations on this? 

Thank you,

Regards,


---------------------------------------------------
12th Anniversary of the founding of the Universidad de las Ciencias Informáticas. 12 years of history alongside Fidel. December 12, 2014.


Re: [MASSMAIL]Re: "Contextual" sponsored results with Solr

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
If you have a finite known set of hosts, you could do something truly awful:

create a field for each distinct host and set all of them to have 
value={id of the document} except for the host to which the document 
belongs: assign that hostname field some constant value, like "true".

Then query using group.field=host, group.limit=N, and apply a high boost 
to an optional term: host-wikipedia:true^100

Each group will contain a single entry except the top one.
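One possible reading of the field-per-host idea above, sketched in Python. The helper names and document shape are assumptions for illustration, and the sketch assumes the grouping field is the marker field of the promoted host (host-wikipedia), so that all wikipedia documents share the value "true" while every other document forms its own single-entry group.

```python
def add_host_marker_fields(doc: dict, known_hosts: list) -> dict:
    """Return a copy of doc with one marker field per known host (index-time step)."""
    enriched = dict(doc)
    for host in known_hosts:
        # Own host gets the constant "true"; every other host field gets the unique doc id.
        enriched[f"host-{host}"] = "true" if doc["host"] == host else doc["id"]
    return enriched


def grouped_boost_params(user_query: str, promoted_host: str, n: int) -> list:
    """Query-time parameters: group on the marker field and boost the optional term."""
    marker = f"host-{promoted_host}"
    return [
        ("q", f"{user_query} {marker}:true^100"),  # optional clause with a high boost
        ("group", "true"),
        ("group.field", marker),
        ("group.limit", str(n)),                   # caps the promoted group at N docs
    ]


enriched = add_host_marker_fields({"id": "42", "host": "wikipedia"}, ["wikipedia", "example"])
boost_params = dict(grouped_boost_params("search", "wikipedia", 3))
```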

But I bet you will get better performance from two queries.

-Mike

On 1/28/2015 10:51 AM, Jorge Luis Betancourt González wrote:
> We are trying to avoid firing 2 queries per request. I've started to play with a PostFilter to see how it goes; perhaps something along the lines of the ReRankQParserPlugin could be used to avoid using two queries and instead rerank the results?


Re: [MASSMAIL]Re: "Contextual" sponsored results with Solr

Posted by Jorge Luis Betancourt González <jl...@uci.cu>.
We are trying to avoid firing 2 queries per request. I've started to play with a PostFilter to see how it goes; perhaps something along the lines of the ReRankQParserPlugin could be used to avoid using two queries and instead rerank the results?
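As a rough illustration of the rerank idea, a single request using Solr's rerank query parser (the rq parameter with {!rerank ...}) might be built like this. The helper name and the chosen values are assumptions. Note that reRankDocs limits how many of the top documents from the main query get rescored; on its own it does not cap how many documents from the host end up at the top, so this is a starting point rather than a solution for "only N".

```python
def rerank_params(user_query: str, host: str, rerank_docs: int, weight: int) -> dict:
    """Build parameters for a single rerank request (helper name is hypothetical)."""
    return {
        "q": user_query,
        "fl": "url",
        # Rescore the top `rerank_docs` results using the query referenced by $rqq.
        "rq": f"{{!rerank reRankQuery=$rqq reRankDocs={rerank_docs} reRankWeight={weight}}}",
        "rqq": f"host:{host}",
    }


rerank_request = rerank_params("search", "wikipedia", 200, 50)
```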

----- Original Message -----
From: "Ahmet Arslan" <io...@yahoo.com.INVALID>
To: solr-user@lucene.apache.org
Sent: Tuesday, January 27, 2015 11:06:29 PM
Subject: [MASSMAIL]Re: "Contextual" sponsored results with Solr

Hi Jorge,

We have done a similar thing with N=3. We issue two separate queries/requests and display the 'special N' above the results.
We exclude the 'special N' from the main results with a -id:(1 2 3 ... N) type query. All done on the client side.

Ahmet





Re: "Contextual" sponsored results with Solr

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi Jorge,

We have done a similar thing with N=3. We issue two separate queries/requests and display the 'special N' above the results.
We exclude the 'special N' from the main results with a -id:(1 2 3 ... N) type query. All done on the client side.
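A minimal client-side sketch of the two-request approach described above. The helper names are made up, and selecting the 'special N' with an fq on the host is an assumption (the message does not say how they are chosen). It also assumes document ids are simple tokens that need no escaping.

```python
def special_params(user_query: str, host: str, n: int) -> dict:
    """First request: the top N matches restricted to the promoted host (assumed fq)."""
    return {"q": user_query, "fq": f"host:{host}", "rows": str(n), "fl": "id,url"}


def main_params(user_query: str, special_ids: list, rows: int) -> dict:
    """Second request: regular results, excluding the ids already shown on top."""
    params = {"q": user_query, "rows": str(rows), "fl": "id,url"}
    if special_ids:
        params["fq"] = "-id:(" + " ".join(special_ids) + ")"
    return params


def merge(special_docs: list, main_docs: list) -> list:
    """Display order: the 'special N' first, then the regular results."""
    return list(special_docs) + list(main_docs)


second_request = main_params("search", ["1", "2", "3"], 10)
page = merge([{"id": "1"}, {"id": "2"}], [{"id": "7"}, {"id": "9"}])
```

Because the exclusion lives in the second request, the regular results need no further de-duplication on the client.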

Ahmet




Re: "Contextual" sponsored results with Solr

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
I think it can, but it's a rather tricky thing to implement.

On Tue, Jan 27, 2015 at 10:29 PM, Alexandre Rafalovitch <ar...@gmail.com>
wrote:

> Can this be done as a custom post-filter with the recent Solr improvements?
>
> Regards,
>    Alex.
> ----
> Sign up for my Solr resources newsletter at http://www.solr-start.com/



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

Re: "Contextual" sponsored results with Solr

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Can this be done as a custom post-filter with the recent Solr improvements?

Regards,
   Alex.
----
Sign up for my Solr resources newsletter at http://www.solr-start.com/



Re: "Contextual" sponsored results with Solr

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello,
If I understand you correctly, this is a frequently requested feature, but it requires
a really deep hack like
https://issues.apache.org/jira/browse/LUCENE-6066




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>