Posted to solr-user@lucene.apache.org by solr user <so...@gmail.com> on 2012/04/26 08:10:05 UTC

Using Customized sorting in Solr

Hi,

We are planning to move the search for one of our listing-based portals from
the Sphinx search server to the Solr/Lucene search server, but we are facing a
challenge in porting the customized sorting used in our portal. We only
have the last 60 days of data live. The algorithm is as follows:

   1.  Put all listings into 54 buckets (date buckets covering the 60 days),
   i.e. buckets of 7 days, 1 day, 1 day, ...
   2.  For each date bucket, make 2 buckets (paid / free).
   3.  Within each paid / free bucket, cycle through the advertisers on a
   uniqueness basis,

                  i.e. inside a bucket the ordering should be the 1st listing
of each advertiser, then the 2nd listing of each advertiser, and so on.
                  In other words, within a *sub-bucket* the second listing of an
advertiser is displayed only after the first listing of every advertiser has
been displayed.

To take care of points 1 and 2, we created a field named bucket_index
at indexing time and sort the results by it,
but we have not found a way to create a sort field at index time, or
to come up with a sort function, for point 3. Please suggest whether there is a
way to do this in Solr.

Tia,

BC Rathore
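
The ordering described in the post above can be illustrated with a small
sketch. This is only an in-memory illustration of the desired result, not a
Solr feature or the poster's implementation. The tuple layout is hypothetical,
the input is assumed to already be in the order each advertiser's listings
should appear, and breaking ties between advertisers by advertiser ID is an
assumption the post does not specify.

```python
from collections import defaultdict


def round_robin_order(listings):
    """Order listings by bucket, then by each advertiser's n-th listing.

    `listings` is an iterable of (listing_id, bucket_index, advertiser_id)
    tuples (a hypothetical layout), already in per-advertiser display order.
    """
    seen = defaultdict(int)  # (bucket, advertiser) -> listings seen so far
    keyed = []
    for listing_id, bucket, advertiser in listings:
        rank = seen[(bucket, advertiser)]  # 0 = 1st listing, 1 = 2nd, ...
        seen[(bucket, advertiser)] += 1
        keyed.append((bucket, rank, advertiser, listing_id))
    # Sort by bucket, then rank, then advertiser (tie-break is an assumption).
    keyed.sort(key=lambda k: k[:3])
    return [listing_id for _, _, _, listing_id in keyed]


example = [
    ("L1", 0, "A"), ("L2", 0, "A"), ("L3", 0, "B"),
    ("L4", 1, "A"), ("L5", 0, "B"), ("L6", 0, "C"),
]
print(round_robin_order(example))
```

In this example the first listings of A, B and C in bucket 0 come first, then
the second listings of A and B, and only then the listing in bucket 1.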

Re: Using Customized sorting in Solr

Posted by Erick Erickson <er...@gmail.com>.
Consider writing a custom sort method or a custom function
that you use for sorting. Be _very_ careful that anything you
do here is very efficient; it'll be called a _lot_.

Best
Erick

On Mon, Apr 30, 2012 at 2:10 AM, solr user <so...@gmail.com> wrote:
> Hi,
>
> Any suggestions?
>
> Am I trying to do too much with Solr? Is there another search engine
> that should be used here?
>
> I am looking into the Solr codebase and planning to modify QueryComponent. Will
> this be the right approach?
>
> Regards,
>
> Shivam

Re: Using Customized sorting in Solr

Posted by solr user <so...@gmail.com>.
Hi,

Any suggestions?

Am I trying to do too much with Solr? Is there another search engine
that should be used here?

I am looking into the Solr codebase and planning to modify QueryComponent. Will
this be the right approach?

Regards,

Shivam

On Fri, Apr 27, 2012 at 10:48 AM, solr user <so...@gmail.com> wrote:

> Jan,
>
> Thanks for the response.
>
> I thought of using it, but it would be suboptimal in the scenario
> I have. I guess I have to explain the scenario better, so let me try again:
>
> 1. I have importance-based buckets in the system. This is implemented
> using a variable named bucket_count with integer values 0, 1, 2, 3, and I
> have to show results in order of bucket_count, i.e. results from the 0th bucket
> at the top, then results from the 1st bucket, and so on. That is done with an
> ascending sort on this variable.
> 2. Now *within these buckets* I need to ensure that the 1st listing of every
> advertiser comes at the top, then the 2nd listing from every advertiser, and so on.
>
> Now if I go with grouping on advertiserId and use
> group.offset, then I probably also need to do additional filtering on
> bucket_count. To explain it better, the pseudo-algorithm would be:
>
> 1. Query Solr with group.offset 0 and bucket count 0.
> 2. If step 1 returns more than zero results, increase the group offset and
> repeat step 1.
> 3. Otherwise, increase the bucket count, reset the group offset to zero, and
> start again from step 1.
>
> With this logic, in the worst case I need to query Solr (number of
> importance buckets) * (max number of listings by an advertiser) times, which could
> be a very high number of Solr queries for a single user query. Please suggest
> whether I can do this in a more optimal way. I am also open to making modifications
> to the Solr/Lucene code if needed.
>
> Regards,
> BC Rathore

Re: Using Customized sorting in Solr

Posted by solr user <so...@gmail.com>.
Jan,

Thanks for the response.

I thought of using it, but it would be suboptimal in the scenario
I have. I guess I have to explain the scenario better, so let me try again:

1. I have importance-based buckets in the system. This is implemented using
a variable named bucket_count with integer values 0, 1, 2, 3, and I have to
show results in order of bucket_count, i.e. results from the 0th bucket at the top,
then results from the 1st bucket, and so on. That is done with an ascending sort on
this variable.
2. Now *within these buckets* I need to ensure that the 1st listing of every
advertiser comes at the top, then the 2nd listing from every advertiser, and so on.

Now if I go with grouping on advertiserId and use group.offset,
then I probably also need to do additional filtering on bucket_count. To
explain it better, the pseudo-algorithm would be:

1. Query Solr with group.offset 0 and bucket count 0.
2. If step 1 returns more than zero results, increase the group offset and repeat
step 1.
3. Otherwise, increase the bucket count, reset the group offset to zero, and
start again from step 1.

With this logic, in the worst case I need to query Solr (number of
importance buckets) * (max number of listings by an advertiser) times, which could
be a very high number of Solr queries for a single user query. Please suggest
whether I can do this in a more optimal way. I am also open to making modifications
to the Solr/Lucene code if needed.

Regards,
BC Rathore
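
The query loop described in the message above can be sketched as follows, to
make the number of round trips concrete. This is a minimal sketch under stated
assumptions: `query_solr` is a hypothetical callable standing in for one Solr
request filtered to a bucket at a given group offset, `make_fake_solr` is an
in-memory stand-in invented for this illustration, and result paging is
ignored.

```python
def fetch_all(query_solr, num_buckets):
    """Run the bucket/offset loop; return (results, number of queries issued)."""
    results, queries = [], 0
    for bucket in range(num_buckets):
        offset = 0
        while True:
            batch = query_solr(bucket, offset)  # one round trip per call
            queries += 1
            if not batch:
                break  # step 3: nothing left here, move to the next bucket
            results.extend(batch)
            offset += 1  # step 2: ask for the next listing of every advertiser
    return results, queries


def make_fake_solr(index):
    """Build an in-memory stand-in for Solr (illustration only).

    `index` maps bucket_count -> {advertiserId: [listing ids]}.
    """
    def query_solr(bucket, offset):
        groups = index.get(bucket, {})
        # One listing per advertiser at this offset, advertisers in ID order.
        return [ids[offset] for _, ids in sorted(groups.items())
                if offset < len(ids)]
    return query_solr


index = {0: {"A": ["a1", "a2", "a3"], "B": ["b1"]}, 1: {"A": ["a4"]}}
results, queries = fetch_all(make_fake_solr(index), num_buckets=2)
print(results, queries)
```

In this sketch each bucket costs one query per listing of its most prolific
advertiser, plus one final query that comes back empty, which is why the count
grows with both the number of buckets and the longest advertiser run.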


On Fri, Apr 27, 2012 at 4:09 AM, Jan Høydahl <ja...@cominvent.com> wrote:

> Hi,
>
> How about trying grouping with paging?
> First you do:
> group=true&group.field=advertiserId&group.limit=1&group.offset=0&group.main=true&sort=something&group.sort=how-much-paid desc
>
> That gives you one listing per advertiser, sorted the way you like.
> Then, to grab the next batch of ads, use group.offset=1, and so on.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com

Re: Using Customized sorting in Solr

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

How about trying grouping with paging?
First you do:
group=true&group.field=advertiserId&group.limit=1&group.offset=0&group.main=true&sort=something&group.sort=how-much-paid desc

That gives you one listing per advertiser, sorted the way you like.
Then, to grab the next batch of ads, use group.offset=1, and so on.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
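
As a rough illustration of the request parameters in the message above, the
query string for successive offsets could be built as in the sketch below. The
parameter names and values are taken from the message; "something" and
"how-much-paid" are the message's own placeholders, and the helper name
`grouping_params` is hypothetical. The main query (q) and the request URL are
not part of the message and are left out.

```python
from urllib.parse import urlencode


def grouping_params(offset, sort="something", group_sort="how-much-paid desc"):
    """Return the URL-encoded grouping parameters for one batch of listings."""
    return urlencode({
        "group": "true",
        "group.field": "advertiserId",
        "group.limit": 1,        # one listing per advertiser
        "group.offset": offset,  # 0 = first listing, 1 = second, ...
        "group.main": "true",
        "sort": sort,
        "group.sort": group_sort,
    })


# First batch, then the next batch of ads.
for offset in (0, 1):
    print(grouping_params(offset))
```

Note that urlencode writes the space in the group.sort value as "+", which is
the usual form encoding for a space in a query string.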
