Posted to solr-user@lucene.apache.org by li...@yahoo.com.INVALID on 2015/10/11 14:38:04 UTC

Re: How to show some (paid) documents ahead of others (non-paid) - fantasy scenario

Hi, 
What if we write all paid results to a new, dedicated core (let's call it "PaidResultsCore") and call the non-paid results core "NonPaidResultsCore"?
When a user asks for "red pepper", we first run the query against "PaidResultsCore" and take the top 3 results, then run it against "NonPaidResultsCore" and take the top 9 results. Then we combine them and deliver a 12-result page to the user.

Could that be achieved, and if so, how?
Thank you,
Christian
 Christian Fotache Tel: 0728.297.207 Fax: 0351.411.570
      From: Upayavira <uv...@odoko.co.uk>
 To: solr-user@lucene.apache.org 
 Sent: Saturday, October 10, 2015 6:13 PM
 Subject: Re: How to show some documents ahead of others - requirements
   
I've seen a similar requirement to this recently.

Basically, a sorting requirement that is close to impossible to
implement as a scoring/boosting formula, because the *position* of the
result features in the score, and that's not something I believe can be
done right now.

The way we solved the issue in the similar case I referred to above was
by using a RerankQuery. That query class has a getTopDocsCollector()
function, which you can override, providing your own Collector.

If you then refer to your query (actually your query parser) with the
rerank query param in Solr, rq={!myRerankQuery}, it will trigger your
new collector. When its topDocs() method is called, the collector calls
topDocs on its parent query, gets a list of documents, orders them in
whatever way you require, and returns them in a non-score order.
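
To make the reordering step concrete, here is a minimal sketch in plain Python rather than the Solr collector API. It only shows the ordering logic such a collector might apply to an already ranked list, using the 3-paid-plus-9-non-paid page layout from the requirements quoted below. The function name reorder_page and the "paid" key on each document are assumptions made for illustration.

```python
from typing import Dict, List


def reorder_page(ranked: List[Dict], page: int,
                 page_size: int = 12, max_paid: int = 3) -> List[Dict]:
    """Return one page: up to max_paid paid docs first, then non-paid docs.

    `ranked` is assumed to be sorted by descending score already, and each
    doc is assumed to carry a boolean "paid" key (hypothetical field name).
    """
    paid = [d for d in ranked if d["paid"]]
    free = [d for d in ranked if not d["paid"]]

    paid_pos = free_pos = 0
    result: List[Dict] = []
    # Walk the pages in order so that the non-paid offset stays correct
    # even after the paid documents run out.
    for _ in range(page + 1):
        paid_slice = paid[paid_pos:paid_pos + max_paid]
        free_slice = free[free_pos:free_pos + page_size - len(paid_slice)]
        paid_pos += len(paid_slice)
        free_pos += len(free_slice)
        result = paid_slice + free_slice
    return result
```

Within each group the original score order is kept; once the paid documents are exhausted, the remaining slots on the page are filled with non-paid documents.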

Not sure I've made that very clear, but hope it helps a little.

Upayavira



On Sat, Oct 10, 2015, at 03:13 PM, liviuchristian@yahoo.com.INVALID
wrote:
> Hi Upayavira & Walter & everyone else
> 
> About the requirements:
> 1. I need to return no more than 3 paid results on a page of 12 results.
> 2. Paid results should be sorted like this: let's say a user is
> searching for "chocolate almonds cake". Now, let's say that 2000
> results match the query and about 10 of these are "paid results".
> I need to list the first 3 (1-2-3) of the paid results, in decreasing
> order of ranking, on the first page (maybe by improving the ranking of
> the paid results over the non-paid ones and listing the first 3 of
> them), and then list 9 non-paid results on the page in decreasing
> order of ranking.
> Then, on the second page, I want to list first the next 3 paid results
> (4-5-6) and so on.
> 
> Kind regards, Christian
>  Christian Fotache Tel: 0728.297.207 
> 
>      From: Upayavira <uv...@odoko.co.uk>
>  To: solr-user@lucene.apache.org 
>  Sent: Thursday, October 8, 2015 7:03 PM
>  Subject: Re: How to show some documents ahead of others
>    
> Hence the suggestion to group by the paid field - would give you two
> lists of the number you ask for.
> 
> What I'm trying to say is that the QueryElevationComponent might do it,
> but it is also relatively clunky, so a pure search solution might serve
> you better.
> 
> However, the thing we lack right now is a full take on the requirements,
> e.g. how should paid results be sorted, how many paid results do you
> show, etc, etc. Without these details we're all guessing.
> 
> Upayavira
> 
> 
> On Thu, Oct 8, 2015, at 04:45 PM, Walter Underwood wrote:
> > Sorting all paid above all unpaid will give bad results when there are
> > many matches. It will show 1000 paid items, including all the barely
> > relevant ones, before it shows the first highly relevant unpaid recipe.
> > What if that was the only correct result?
> > 
> > Two approaches that work:
> > 
> > 1. Boost paid items using the “boost” parameter in edismax. Adjust it to
> > be a tiebreaker between documents with similar score.
> > 
> > 2. Show two lists, one with the five most relevant paid, the next with
> > the five most relevant unpaid.
> > 
> > wunder
> > Walter Underwood
> > wunder@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> > 
> > 
> > > On Oct 8, 2015, at 7:39 AM, Alessandro Benedetti <be...@gmail.com> wrote:
> > > 
> > > Could you explain this in more detail: "as it doesn't
> > > allow any meaningful customization"?
> > > 
> > > Cheers
> > > 
> > > On 8 October 2015 at 15:27, Andrea Roggerone <andrearoggerone.osrc@gmail.com
> > >> wrote:
> > > 
> > >> Hi guys,
> > >> I don't think that sorting is a good solution in this case as it doesn't
> > >> allow any meaningful customization. I believe that the advised
> > >> QueryElevationComponent is one of the viable alternatives. Another one would
> > >> be to boost a particular field at query time, for instance paid. That
> > >> would allow you to assign different boosts to different values using a
> > >> function.
> > >> 
> > >> On Thu, Oct 8, 2015 at 1:48 PM, Upayavira <uv...@odoko.co.uk> wrote:
> > >> 
> > >>> Or just have a field in your index -
> > >>> 
> > >>> paid: true/false
> > >>> 
> > >>> Then sort=paid desc, score desc
> > >>> 
> > >>> (you may need to sort paid asc, not sure which way a boolean would sort)
> > >>> 
> > >>> Question is whether you want to show ALL paid posts, or just a set of
> > >>> them. For the latter you could use result grouping on the paid field.
> > >>> 
> > >>> Upayavira
> > >>> 
> > >>> On Thu, Oct 8, 2015, at 01:34 PM, NutchDev wrote:
> > >>>> Hi Christian,
> > >>>> 
> > >>>> You can take a look at Solr's  QueryElevationComponent
> > >>>> <https://wiki.apache.org/solr/QueryElevationComponent>  .
> > >>>> 
> > >>>> It will allow you to configure the top results for a given query
> > >>>> regardless of the normal Lucene scoring. You can also specify a list
> > >>>> of documents to exclude for a particular query.
> > >>>> 
> > >>>> 
> > >>>> 
> > >>>> 
> > >>>> 
> > >>>> --
> > >>>> View this message in context:
> > >>>> 
> > >>> 
> > >> http://lucene.472066.n3.nabble.com/How-to-show-some-documents-ahead-of-others-tp4233481p4233490.html
> > >>>> Sent from the Solr - User mailing list archive at Nabble.com.
> > >>> 
> > >> 
> > > 
> > > 
> > > 
> > > -- 
> > > --------------------------
> > > 
> > > Benedetti Alessandro
> > > Visiting card - http://about.me/alessandro_benedetti
> > > Blog - http://alexbenedetti.blogspot.co.uk
> > > 
> > > "Tyger, tyger burning bright
> > > In the forests of the night,
> > > What immortal hand or eye
> > > Could frame thy fearful symmetry?"
> > > 
> > > William Blake - Songs of Experience -1794 England
> > 
> 
>  

  

Re: How to show some (paid) documents ahead of others (non-paid) - fantasy scenario

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
What about Streaming Expressions? Could they be used here? Disclaimer:
I have not used them myself yet.

https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions

Regards,
   Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 11 October 2015 at 13:56, Upayavira <uv...@odoko.co.uk> wrote:
> I think Walter suggested the simplest: make two requests. When you've
> got both results back, you can stick them together to make results.
>
> At present, there is no method to do multiple actions within a single
> request.
>
> Upayavira
>

Re: How to show some (paid) documents ahead of others (non-paid) - fantasy scenario

Posted by Upayavira <uv...@odoko.co.uk>.
I think Walter suggested the simplest: make two requests. When you've
got both results back, you can stick them together to make results.

At present, there is no method to do multiple actions within a single
request.
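
A rough sketch of the two-request approach, in Python with only the standard library. The host and port, the helper names, and the use of the default select handler with wt=json are assumptions; the core names are the ones proposed earlier in the thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR = "http://localhost:8983/solr"  # assumed host and port


def select_url(core: str, query: str, start: int, rows: int) -> str:
    """Build a select URL for one core (core names are hypothetical)."""
    params = urlencode({"q": query, "start": start, "rows": rows, "wt": "json"})
    return f"{SOLR}/{core}/select?{params}"


def page_urls(query: str, page: int, paid_rows: int = 3, free_rows: int = 9):
    """Each core is paged independently: 3 paid and 9 non-paid per page."""
    return (
        select_url("PaidResultsCore", query, page * paid_rows, paid_rows),
        select_url("NonPaidResultsCore", query, page * free_rows, free_rows),
    )


def fetch_docs(url: str) -> list:
    """Run one request and return the docs from Solr's JSON response."""
    with urlopen(url) as resp:
        return json.load(resp)["response"]["docs"]


def build_page(query: str, page: int) -> list:
    """Paid results first, then non-paid, each in its own score order."""
    paid_url, free_url = page_urls(query, page)
    return fetch_docs(paid_url) + fetch_docs(free_url)
```

Note that with fixed offsets like these, a page comes back with fewer than 12 results once the paid core has nothing left for it; the client would have to ask the non-paid core for more rows if it wants to fill the page.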

Upayavira


Re: How to show some (paid) documents ahead of others (non-paid) - fantasy scenario

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
On Sun, Oct 11, 2015 at 3:38 PM, <li...@yahoo.com.invalid> wrote:

> Hi,
> What if we write all paid results in a new, dedicated, core... let's call
> it: "PaidResultsCore" and lets call the non-paid results core:
> "NonPaidResultsCore"
> When a user asks for "red pepper" we first perform the query upon
> "PaidResultsCore" and get the first ranking 3 results and then we perform
> the query upon "NonPaidResultsCore" and get the first ranking 9 results.
> Then we mix them all together and deliver a 12 results page to the user.
>

You can experiment with sending &shards=<paid core url>,<plain core url> or,
similarly, &collection=.. see
https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options
However, there is no precise control over relevance and merging. FWIW,
it might be a handy extension for SolrCloud.
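
For illustration, a single distributed request over both cores could be built like this in Python. This is only a sketch: the host, the helper name and the core names are assumptions, and the merged list comes back in plain score order, so it does not enforce the 3-paid-plus-9-non-paid layout.

```python
from urllib.parse import urlencode


def sharded_select_url(host: str, query: str, rows: int = 12) -> str:
    """Build one select request that fans out to both cores via `shards`.

    `host` is assumed to look like "localhost:8983"; the core names are
    the hypothetical ones used earlier in this thread.
    """
    cores = ("PaidResultsCore", "NonPaidResultsCore")
    # The shards parameter takes a comma-separated list of host/path
    # entries, written without the http:// scheme.
    shards = ",".join(f"{host}/solr/{core}" for core in cores)
    params = urlencode({"q": query, "rows": rows, "shards": shards, "wt": "json"})
    return f"http://{host}/solr/{cores[0]}/select?{params}"
```

Calling sharded_select_url("localhost:8983", "red pepper") gives a URL that can be sent to either core; the core that receives it merges the results from both.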


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>