Posted to solr-user@lucene.apache.org by Sandeep Shetty <sa...@touchlocal.com> on 2007/09/26 14:14:53 UTC

custom sorting

> Hi Guys,
> 
> this question has been asked before, but i was unable to find an answer
> that's good for me, so i hope you guys can help again.
> i am working on a website where we need to sort the results by distance
> from the location entered by the user. I have indexed the lat and long
> info for each record in solr, and i can also get the lat and long of the
> location input by the user.
> Previously we were using lucene to do this: by using the
> SortComparatorSource we could sort the documents returned by distance
> nicely. we are now switching over to Solr because of the features it
> provides, however i am not able to see a way to do this in Solr.
> 
> If someone can point me in the right direction i would be very grateful!
> 
> Thanks in advance,
> Sandeep

This email is confidential and may also be privileged. If you are not the intended recipient please notify us immediately by telephoning +44 (0)20 7452 5300 or email postmaster@touchlocal.com. You should not copy it or use it for any purpose nor disclose its contents to any other person. Touch Local cannot accept liability for statements made which are clearly the sender's own and are not made on behalf of the firm.

Touch Local Limited
Registered Number: 2885607
VAT Number: GB896112114
Cardinal Tower, 12 Farringdon Road, London EC1M 3NN
+44 (0)20 7452 5300


Re: custom sorting

Posted by Mike Klaas <mi...@gmail.com>.
On 26-Sep-07, at 5:14 AM, Sandeep Shetty wrote:

>> Hi Guys,
>>
>> this question has been asked before, but i was unable to find an answer
>> that's good for me, so i hope you guys can help again.
>> i am working on a website where we need to sort the results by distance
>> from the location entered by the user. I have indexed the lat and long
>> info for each record in solr, and i can also get the lat and long of the
>> location input by the user.
>> Previously we were using lucene to do this: by using the
>> SortComparatorSource we could sort the documents returned by distance
>> nicely. we are now switching over to Solr because of the features it
>> provides, however i am not able to see a way to do this in Solr.
>>
>> If someone can point me in the right direction i would be very grateful!
>>
>> Thanks in advance,
>> Sandeep
>
> This email is confidential and may also be privileged. If you are  
> not the intended recipient please notify us immediately by  
> telephoning +44 (0)20 7452 5300 or email postmaster@touchlocal.com.  
> You should not copy it or use it for any purpose nor disclose its  
> contents to any other person. Touch Local cannot accept liability  
> for statements made which are clearly the sender's own and are not  
> made on behalf of the firm.

Sorry, I'm afraid the above email is already irrevocably publicly
archived.

-Mike

Re: custom sorting

Posted by Chris Hostetter <ho...@fucit.org>.
: > Using something like this, how would the custom SortComparatorSource
: > get a parameter from the request to use in sorting calculations?

in general: you wouldn't. you would have to specify all options as init
params for the FieldType -- which makes it pretty horrible for distance
calculations, and isn't something i considered when i posted that.

the only way i can think of that you can really solve the problem with a 
plugin at the moment (without some serious internal changes that yonik 
describes below) would be to use a dynamicField when you want geodistance 
sort, and encode the center lat/lon point in the field name, ala:

   sort=geodist_-124.75_93.45
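
A minimal sketch of the decoding step such a plugin would need, assuming a dynamicField pattern like "geodist_*" and two underscore-separated numbers after the prefix. The class name GeoDistFieldName and both method names are hypothetical, and the thread does not say which coordinate comes first, so the sketch just returns them in the order they appear:

```java
// Hypothetical helper: decodes a sort field name such as "geodist_-124.75_93.45"
// into the two numbers encoded after the prefix. Not part of Solr.
final class GeoDistFieldName {
    static final String PREFIX = "geodist_";

    /** Returns {first, second} in the order the numbers appear in the name. */
    static double[] parse(String fieldName) {
        if (!fieldName.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a geodist field: " + fieldName);
        }
        String[] parts = fieldName.substring(PREFIX.length()).split("_");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected two coordinates: " + fieldName);
        }
        return new double[] { Double.parseDouble(parts[0]), Double.parseDouble(parts[1]) };
    }

    /** Builds the field name a client would put in the sort parameter. */
    static String format(double first, double second) {
        return PREFIX + first + "_" + second;
    }
}
```

A custom FieldType could run something like parse() on the requested field name and hand the center point to its comparator; how that is wired into Solr is left out here.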

: or extend solr's sorting mechanisms to allow specifying a function to sort by.
: 
: sort="dist(10.4,20.2,geoloc) asc"

that would, in fact, kick ass.  even if there is a better solution for the
distance stuff, the idea of being able to specify a raw function as a sort
would be pretty sick. (NOTE: that's "sick" as in "so good it's amazing"
... since the last person i used that idiom with didn't understand and
thought i meant "bad")



-Hoss


Re: custom sorting

Posted by Narayanan Palasseri <pa...@gmail.com>.
Hi all,
Regarding this issue, we tried using a custom request handler which in turn
uses the custom comparator. But this has a memory leak, and we are almost
stuck at that point. As somebody mentioned, we are thinking of moving
towards a function query to achieve the same. Please let me know whether
anybody has faced a similar issue, or whether we are doing something wrong.
The additional code that we have added to the default handler is given
below.

    if ("myappRequestHandler".equalsIgnoreCase(requestHandler))
    {
        sort = getSortCriteria(new SimpleSortComparatorSourceImpl());
    }

Thanks and Regards
Narayanan


On 9/28/07, Yonik Seeley <yo...@apache.org> wrote:
>
> On 9/27/07, Erik Hatcher <er...@ehatchersolutions.com> wrote:
> > Using something like this, how would the custom SortComparatorSource
> > get a parameter from the request to use in sorting calculations?
>
> perhaps hook in via function query:
> dist(10.4,20.2,geoloc)
>
> And either manipulate the score with that and sort by score,
>
> q=+(foo bar)^0 dist(10.4,20.2,geoloc)
> sort=score asc
>
> or extend solr's sorting mechanisms to allow specifying a function to sort
> by.
>
> sort="dist(10.4,20.2,geoloc) asc"
>
> -Yonik
>

Re: custom sorting

Posted by Chris Hostetter <ho...@fucit.org>.
: If you went with the FunctionQuery approach for sorting by distance, would
: there be any way to use the output of the FunctionQuery to limit the
: documents to those within a certain radius?  Or is it just for boosting
: documents, not for filtering?

FunctionQueries don't restrict the set of documents at all, so you would 
need to combine it with a separate query that limits the documents ... the
simplest way would be as you say: by combining it with two range queries 
that would define a lat/lon "bounding box"
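
A rough sketch of how the two range clauses could be derived from a center point and a radius. The class BoundingBox and its methods are hypothetical, the field names "latitude" and "longitude" are borrowed from the schema example elsewhere in this thread, and the math ignores the poles and the date line:

```java
// Hypothetical helper: computes a lat/lon box around a point and prints it
// as two range clauses. Not part of Solr or Lucene.
final class BoundingBox {
    static final double EARTH_RADIUS_MILES = 3958.8;

    /** Returns {minLat, maxLat, minLon, maxLon} in degrees around a center point. */
    static double[] around(double lat, double lon, double radiusMiles) {
        double dLat = Math.toDegrees(radiusMiles / EARTH_RADIUS_MILES);
        // A degree of longitude shrinks with cos(latitude), so widen the box accordingly.
        double dLon = Math.toDegrees(
                radiusMiles / (EARTH_RADIUS_MILES * Math.cos(Math.toRadians(lat))));
        return new double[] { lat - dLat, lat + dLat, lon - dLon, lon + dLon };
    }

    /** Formats the box as two required range clauses (field names are assumptions). */
    static String toRangeQuery(double[] box) {
        return "+latitude:[" + box[0] + " TO " + box[1] + "]"
             + " +longitude:[" + box[2] + " TO " + box[3] + "]";
    }
}
```

The box is a superset of the circle, so documents in the corners are slightly farther away than the radius. Whether the range clauses behave as expected also depends on the field type used for the coordinates.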

: Also, even if you're just using it for boosting, is there a way to avoid
: running the expensive function on all docs in the index?  Could you somehow

that's the beauty of "skipTo" in query scoring ... BooleanQueries keep
track of the "next" document each clause can match (in order by docid), 
and tell all of the other queries to "skipTo" that doc and not bother 
trying to score any doc ids below that.
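
To illustrate the idea (not Lucene's actual implementation), here is a small sketch that intersects two sorted doc id lists by letting each side skip ahead to the other's current doc. SkipToDemo is a made-up class, and real scorers skip using index structures rather than the linear scan shown here:

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: two "clauses" are modelled as sorted arrays of doc ids.
final class SkipToDemo {
    /** Advances index i until docs[i] >= target, mimicking a scorer's skipTo. */
    static int skipTo(int[] docs, int i, int target) {
        while (i < docs.length && docs[i] < target) {
            i++;
        }
        return i;
    }

    /** Returns the doc ids present in both lists, never visiting ids that cannot match. */
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i = skipTo(a, i, b[j]);   // a is behind: jump to b's current doc
            } else {
                j = skipTo(b, j, a[i]);   // b is behind: jump to a's current doc
            }
        }
        return out;
    }
}
```

If one list stands for the docs inside the bounding box and the other for the function clause, the expensive function only needs a value for docs that survive the intersection.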



-Hoss


Re: custom sorting

Posted by Doug Daniels <dd...@rooftophq.com>.
If you went with the FunctionQuery approach for sorting by distance, would
there be any way to use the output of the FunctionQuery to limit the
documents to those within a certain radius?  Or is it just for boosting
documents, not for filtering?

Also, even if you're just using it for boosting, is there a way to avoid
running the expensive function on all docs in the index?  Could you somehow
nest bounding-box RangeQuery for latitude and longitude inside as
ValueSources?

Thanks,
Doug


hossman wrote:
> 
> 
> : leaks, etc.).  (Speaking of which, could anyone with more Lucene/Solr
> : experience than I comment on the performance characteristics of the
> : locallucene implementation mentioned on the list recently?  I've taken
> : a first look and it seems reasonable to me.)
> 
> i can't speak for anyone else, but i haven't had a chance to drill into it
> yet.
> 
> : Using a function query, as Yonik suggests above, is another approach.
> : But to get a true sort, you have to boost the original query to zero?
> 
> or a very close approximation thereof (0.000001 perhaps)
> 
> keep in mind: a "true" distance sort while easy to explain may not be as 
> useful as a sort by score where the distance is factored into the score 
> ... there have been some threads about this on the java-user list in the 
> past and it's been discussed that a really relevant result 2 miles away is 
> probably better than a mildly relevant result 1.5 miles away ... that's
> where a function query with well chosen boosts might serve you better.
> 
> : How does this impact the results returned by the original query?  Will
> : the requirements (and boosts) of the original (now nested) query
> : remain intact, only sorted by the function?  Also, is there any way to
> 
> it should ... but i won't swear to that.
> 
> : do this with the dismax handler?
> 
> a strict sort on the value of a function?  put the function in the bf
> param, don't bother with bq or pf params and change your qf params to all 
> have really small boosts.
> 
> 
> 
> -Hoss
> 
> 
> 



Re: custom sorting

Posted by Chris Hostetter <ho...@fucit.org>.
: leaks, etc.).  (Speaking of which, could anyone with more Lucene/Solr
: experience than I comment on the performance characteristics of the
: locallucene implementation mentioned on the list recently?  I've taken
: a first look and it seems reasonable to me.)

i can't speak for anyone else, but i haven't had a chance to drill into it
yet.

: Using a function query, as Yonik suggests above, is another approach.
: But to get a true sort, you have to boost the original query to zero?

or a very close approximation thereof (0.000001 perhaps)

keep in mind: a "true" distance sort while easy to explain may not be as 
useful as a sort by score where the distance is factored into the score 
... there have been some threads about this on the java-user list in the 
past and it's been discussed that a really relevant result 2 miles away is 
probably better than a mildly relevant result 1.5 miles away ... that's
where a function query with well chosen boosts might serve you better.
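
A toy calculation of that trade-off, under assumptions that are not from the thread: relevance is a number between 0 and 1, closeness is modelled as 1 / (1 + miles), and the two are simply added with a weight. BlendedScore and its parameters are hypothetical:

```java
// Hypothetical scoring blend, only to show how the ordering can change.
final class BlendedScore {
    /** Relevance plus a weighted closeness term that decays with distance. */
    static double score(double relevance, double miles, double distanceWeight) {
        double closeness = 1.0 / (1.0 + miles);
        return relevance + distanceWeight * closeness;
    }
}
```

With a weight of 0.5, a result with relevance 0.9 at 2 miles outscores one with relevance 0.3 at 1.5 miles. Scale both relevance values down by a factor of 0.000001, as in the near-zero boost mentioned above, and the closer result wins instead, which is effectively a plain distance sort.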

: How does this impact the results returned by the original query?  Will
: the requirements (and boosts) of the original (now nested) query
: remain intact, only sorted by the function?  Also, is there any way to

it should ... but i won't swear to that.

: do this with the dismax handler?

a strict sort on the value of a function?  put the function in the bf
param, don't bother with bq or pf params and change your qf params to all 
have really small boosts.



-Hoss


Re: custom sorting

Posted by Jon Pierce <jo...@gmail.com>.
Is the machinery in place to do this now (hook up a function query to
be used in sorting)?

I'm trying to figure out what's the best way to do a distance sort:
custom comparator or function query.

Using a custom comparator seems straightforward and reusable across
both the standard and dismax handlers.  But it also seems most likely
to impact performance (or at least require the most work/knowledge to
get right by minimizing calculations, caching, watching out for memory
leaks, etc.).  (Speaking of which, could anyone with more Lucene/Solr
experience than I comment on the performance characteristics of the
locallucene implementation mentioned on the list recently?  I've taken
a first look and it seems reasonable to me.)

Using a function query, as Yonik suggests above, is another approach.
But to get a true sort, you have to boost the original query to zero?
How does this impact the results returned by the original query?  Will
the requirements (and boosts) of the original (now nested) query
remain intact, only sorted by the function?  Also, is there any way to
do this with the dismax handler?

Thanks,
- Jon

On 9/27/07, Yonik Seeley <yo...@apache.org> wrote:
> On 9/27/07, Erik Hatcher <er...@ehatchersolutions.com> wrote:
> > Using something like this, how would the custom SortComparatorSource
> > get a parameter from the request to use in sorting calculations?
>
> perhaps hook in via function query:
>   dist(10.4,20.2,geoloc)
>
> And either manipulate the score with that and sort by score,
>
> q=+(foo bar)^0 dist(10.4,20.2,geoloc)
> sort=score asc
>
> or extend solr's sorting mechanisms to allow specifying a function to sort by.
>
> sort="dist(10.4,20.2,geoloc) asc"
>
> -Yonik
>

Re: custom sorting

Posted by Yonik Seeley <yo...@apache.org>.
On 9/27/07, Erik Hatcher <er...@ehatchersolutions.com> wrote:
> Using something like this, how would the custom SortComparatorSource
> get a parameter from the request to use in sorting calculations?

perhaps hook in via function query:
  dist(10.4,20.2,geoloc)

And either manipulate the score with that and sort by score,

q=+(foo bar)^0 dist(10.4,20.2,geoloc)
sort=score asc

or extend solr's sorting mechanisms to allow specifying a function to sort by.

sort="dist(10.4,20.2,geoloc) asc"

-Yonik
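
The dist(...) function above is a proposal, not something that exists in Solr as of this thread. As a sketch of what such a function might compute, here is a great-circle (haversine) distance plus a nearest-first sort in plain Java. DistanceSort, its method names, and the choice of kilometres are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration of a distance function and a sort on its value.
final class DistanceSort {
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in kilometres between two lat/lon points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Returns a copy of points ({lat, lon} pairs) ordered nearest-first from the center. */
    static List<double[]> sortByDistance(List<double[]> points, double lat, double lon) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(
                (double[] p) -> haversineKm(lat, lon, p[0], p[1])));
        return sorted;
    }
}
```

Sorting ascending on this value corresponds to the "asc" in the proposed sort parameter: the nearest documents come first.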

Re: custom sorting

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Sep 27, 2007, at 2:50 PM, Chris Hostetter wrote:
> to answer the broader question of using customized
> Lucene SortComparatorSource objects in solr -- it is in fact possible.
>
> In Solr, all decisions about how to sort are driven by FieldTypes.  You
> can subclass any of the FieldTypes that come with Solr and override just
> the getSortField method to use whatever sort logic you want and then use
> your new FieldType as you would any other plugin...
>
> http://wiki.apache.org/solr/SolrPlugins
>
> In the case where you have a custom SortComparatorSource that is not
> "field" specific (or uses data from more than one field) you would need to
> make your field type smart enough to let you configure (via the <fieldType>
> declaration in the schema) which fields (if any) to get its data from,
> and then create a marker field of that type, which you don't use to index
> or store any data, but you use to indicate when to trigger your custom
> sort logic, ie...
>
>
>     <fieldType name="distance" class="solr.YourField"
>                latFieldName="latitude" lonFieldName="longitude"
>                stored="false" indexed="false" />
>     ...
>     <field name="latitude" type="sint" indexed="true" stored="true" />
>     <field name="longitude" type="sint" indexed="true" stored="true" />
>     <field name="distance" type="distance" />
>
> ...and then use "sort=distance+asc" in your query

Using something like this, how would the custom SortComparatorSource  
get a parameter from the request to use in sorting calculations?

I haven't looked under the covers of the local-solr stuff that flew
by earlier, but it looks quite well done.  I think I can speak for many
who would love to have geo field types / sorting capability built
into Solr.

	Erik


Re: custom sorting

Posted by Chris Hostetter <ho...@fucit.org>.
: > Previously we were using lucene to do this. by using the
: > SortComparatorSource we could sort the documents returned by distance
: > nicely. we are now switching over to Solr because of the features it
: > provides, however i am not able to see a way to do this in Solr. 

Someone started another thread where they specifically discuss the
"Geographical distance searching" aspect of your question.

to answer the broader question of using customized
Lucene SortComparatorSource objects in solr -- it is in fact possible.

In Solr, all decisions about how to sort are driven by FieldTypes.  You
can subclass any of the FieldTypes that come with Solr and override just
the getSortField method to use whatever sort logic you want and then use
your new FieldType as you would any other plugin...

http://wiki.apache.org/solr/SolrPlugins

In the case where you have a custom SortComparatorSource that is not
"field" specific (or uses data from more than one field) you would need to
make your field type smart enough to let you configure (via the <fieldType>
declaration in the schema) which fields (if any) to get its data from,
and then create a marker field of that type, which you don't use to index
or store any data, but you use to indicate when to trigger your custom
sort logic, ie...


    <fieldType name="distance" class="solr.YourField"
               latFieldName="latitude" lonFieldName="longitude"
               stored="false" indexed="false" />
    ...
    <field name="latitude" type="sint" indexed="true" stored="true" />
    <field name="longitude" type="sint" indexed="true" stored="true" />
    <field name="distance" type="distance" />

...and then use "sort=distance+asc" in your query



-Hoss