Posted to solr-user@lucene.apache.org by Nicholas Ding <ni...@gmail.com> on 2013/05/13 19:12:23 UTC

How to improve performance of geodist()

Hi guys,

I'm using geodist() in a recip boost function, and I noticed that it hurts
response time. In a profiling session, the geodist() calculation took 30% of
CPU time.

I'm wondering whether there is an alternative to the Haversine function that
would reduce the CPU cost. I don't need very accurate floating-point values
when I use geodist() in the boost function.

Thanks
Nicholas

Re: How to improve performance of geodist()

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.
Hi Nicholas,

Given that boosting is generally an inherently fuzzy, inexact thing, you can
likely get away with a simpler calculation.  dist() can compute the
Euclidean distance (i.e. the Pythagorean theorem).  If your data is in just
one region of the world, you can project it onto a 2-D plane (a
so-called "projection") and use the Euclidean distance.  If your data is
everywhere, you may need several projections: index each projection's
coordinates in its own fields, then choose the best projected set of
coordinates based on your starting point.

~ David
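
The projection idea above can be sketched in Python. This is only an illustration, not David's or Solr's implementation: it assumes a simple equirectangular projection around a hypothetical reference latitude (REF_LAT), a mean Earth radius of 6371 km, and two made-up points. The projected x/y values are what you would index in separate numeric fields, so that a plain Euclidean distance such as dist(2, ...) could stand in for the Haversine calculation at query time.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km: the expensive reference calculation."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def project_km(lat, lon, ref_lat):
    """Equirectangular projection to planar (x, y) in km, valid near ref_lat."""
    x = EARTH_RADIUS_KM * math.radians(lon) * math.cos(math.radians(ref_lat))
    y = EARTH_RADIUS_KM * math.radians(lat)
    return x, y


def euclidean_km(p, q):
    """Pythagorean distance between two projected points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


REF_LAT = 40.7  # hypothetical centre latitude of the region being indexed

# Projection happens once, at index time; only the cheap distance runs per document.
point_a = project_km(40.7128, -74.0060, REF_LAT)
point_b = project_km(40.6413, -73.7781, REF_LAT)

approx = euclidean_km(point_a, point_b)
exact = haversine_km(40.7128, -74.0060, 40.6413, -73.7781)
print(f"euclidean={approx:.2f} km, haversine={exact:.2f} km")
```

The approximation degrades as points move away from the reference latitude, which is why data spread over the whole globe would need several projections.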



Re: How to improve performance of geodist()

Posted by Nicholas Ding <ni...@gmail.com>.
Yes, I did. But instead of sorting by geodist(), I use a function query to
boost by distance. That's why I noticed the heavy calculation during query
processing.

Example:
bf=recip(geodist(), 50, 5)

Basically, I think the boost function iterates over all the results and
calculates the distance for each one.
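
To show how a reciprocal boost falls off with distance, here is a small Python sketch. It assumes the four-argument form that Solr documents, recip(x,m,a,b) = a / (m*x + b). The constants below are illustrative values, not the ones from the example above, and the distances are made up.

```python
def recip(x, m, a, b):
    """Reciprocal function a / (m*x + b), mirroring Solr's documented recip(x,m,a,b)."""
    return a / (m * x + b)


# Hypothetical constants, chosen only for illustration.
M, A, B = 2.0, 200.0, 20.0

# The boost is evaluated once per matching document, so every match
# pays for one distance calculation plus this cheap arithmetic.
for distance_km in (0.0, 1.0, 10.0, 100.0):
    print(f"{distance_km:6.1f} km -> boost {recip(distance_km, M, A, B):.3f}")
```

Because the curve flattens quickly, small errors in the distance barely change the boost for far-away documents.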



On Mon, May 13, 2013 at 1:27 PM, Yonik Seeley <yo...@lucidworks.com> wrote:

> On Mon, May 13, 2013 at 1:12 PM, Nicholas Ding <ni...@gmail.com>
> wrote:
> > I'm using geodist() in a recip boost function. I noticed a performance
> > impact to the response time. I did a profiling session, the geodist()
> > calculation took 30% of CPU time.
>
> Are you also using an "fq" with geofilt to narrow down the number of
> documents that must be scored?
>
> -Yonik
> http://lucidworks.com
>

Re: How to improve performance of geodist()

Posted by Yonik Seeley <yo...@lucidworks.com>.
On Mon, May 13, 2013 at 1:12 PM, Nicholas Ding <ni...@gmail.com> wrote:
> I'm using geodist() in a recip boost function. I noticed a performance
> impact to the response time. I did a profiling session, the geodist()
> calculation took 30% of CPU time.

Are you also using an "fq" with geofilt to narrow down the number of
documents that must be scored?

-Yonik
http://lucidworks.com
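
The suggestion above, filtering with geofilt so that fewer documents reach the boost function, can be sketched as a set of request parameters built in Python. The field name (store), the point, the radius, the query text and the recip constants are all hypothetical; sfield, pt and d are the standard spatial parameters that both geofilt and geodist() read.

```python
from urllib.parse import urlencode

# All values are placeholders; adapt them to your own schema and data.
params = {
    "q": "pizza",                        # hypothetical user query
    "defType": "edismax",                # a query parser that supports bf
    "sfield": "store",                   # hypothetical spatial field name
    "pt": "45.15,-93.85",                # hypothetical centre point (lat,lon)
    "d": "50",                           # filter radius in km
    "fq": "{!geofilt}",                  # only documents within d km are scored
    "bf": "recip(geodist(),2,200,20)",   # distance boost on the surviving documents
}

query_string = urlencode(params)
print(query_string)
```

With the filter in place, geodist() only runs for documents inside the radius rather than for every match of the main query.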