Posted to solr-dev@lucene.apache.org by Alex Neth <al...@liivid.com> on 2008/04/27 21:31:28 UTC

Terrible performance for range searches

I am trying to switch from Ferret to Solr.  My searches were  
performing quite well on Ferret (<100ms), but I have some reasons for  
switching.

I am now experiencing terrible performance when doing range searches.   
I have seen posts that granularity should be reduced, but this is not  
an option in my case.

I have around 120K records with 8 digit range integer fields.  When I  
search for a small range, e.g. [0 100] the search is slow, but  
usable.  When I use a larger range, e.g. [0 99999999] the search can  
take over 60 seconds.  Searches without ranges are fast.

Is there a way to get good performance in this case?  This is a  
critical function for my application and I will not be able to use  
Solr unless I can find a way to make this work.

Thanks for any advice.


Re: Terrible performance for range searches

Posted by Chris Hostetter <ho...@fucit.org>.
: > That seems quite slow for 120K docs.  Is this on a warmed-up Solr?  (I.e.,
: > is the index in the OS's disk cache?)
: 
: I'm not sure if it's in the cache or not.  How could I make sure that it will
: be in the cache?

Your index is in the OS's disk cache if you've done some queries against 
it before you do your timing tests (and if your box has plenty of free 
memory for the OS to use as a disk cache).
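
A minimal sketch of that timing approach, assuming a caller-supplied
run_query function (hypothetical; it stands in for whatever client code
sends the request to Solr). The warm-up runs are discarded so the timed
runs see an index that has already been read from disk. Note that repeating
the same query also warms Solr's own caches, not just the OS disk cache.

```python
import time


def time_query(run_query, query, warmup_runs=3, timed_runs=5):
    """Run `query` a few times untimed, then time it.

    run_query is a hypothetical callable that sends the query to Solr.
    Returns (best, average) wall-clock seconds over the timed runs.
    """
    # Untimed runs: these pull the index files into the OS disk cache.
    for _ in range(warmup_runs):
        run_query(query)

    # Timed runs against the warmed index.
    timings = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        run_query(query)
        timings.append(time.perf_counter() - start)

    return min(timings), sum(timings) / len(timings)


# Stand-in for a real Solr call, used here only to show the call pattern.
calls = []
best, average = time_query(calls.append, "geo_lon:[5752427 TO 5781706]")
print(f"best={best:.6f}s average={average:.6f}s over {len(calls)} calls")
```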

: I did reduce the size of most of the caches because the memory situation was
: unmanageable.

The Solr filterCache is one of the best ways to get good search speed when 
you tend to reuse common filter queries (like your range queries) ... it's 
a space vs speed issue ... if you give it more RAM, you can get better 
performance in the common case.
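
As a rough illustration of reusing filter queries, the sketch below builds
a request query string that puts each range in its own fq parameter rather
than in the main q, so each range can be cached and reused on its own. The
function name is hypothetical, the field names and values are taken from
later in this thread, and the ranges are written in Solr's [low TO high]
syntax.

```python
from urllib.parse import urlencode


def build_select_query(main_query, range_filters):
    """Return a URL-encoded query string for a Solr select request.

    Each (field, low, high) tuple becomes a separate fq parameter, so the
    set of documents matching that range can be cached independently of
    the main query.
    """
    params = [("q", main_query)]
    for field, low, high in range_filters:
        params.append(("fq", f"{field}:[{low} TO {high}]"))
    return urlencode(params)


query_string = build_select_query(
    "*:*",
    [("geo_lon", 5752427, 5781706), ("geo_lat", 13750592, 13763600)],
)
print(query_string)
```

Whether this helps depends on how often the same ranges recur; a range
that changes on every request will not get cache hits.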

PS...
http://people.apache.org/~hossman/#solr-dev

Please Use "solr-user@lucene" Not "solr-dev@lucene"

Your question is better suited for the solr-user@lucene mailing list ...
not the solr-dev@lucene list.  solr-dev is for discussing development of
the internals of the Solr application ... it is *not* the appropriate
place to ask questions about how to use Solr (or write Solr plugins) 
when developing your own applications.



-Hoss


Re: Terrible performance for range searches

Posted by Ryan McKinley <ry...@gmail.com>.
>
> The distribution is pretty small as they are lat/lon coordinates  
> that have been mapped to positive integers.  Most of them are near  
> each other and the queries are in a relatively small range, e.g.  
> (geo_lon:[5752427 5781706] AND geo_lat:[13750592 13763600]).
>

I know this is not the question you asked... but for geographic stuff,  
you may want to consider looking at local lucene:
http://sourceforge.net/projects/locallucene/

This indexes lat/lon with a bunch of tiers.  I have had good luck with  
it so far.
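
To give a feel for what indexing with tiers can mean, here is a simplified
sketch. It is not local lucene's actual scheme; the grid layout, function
names, and tier range are assumptions made for illustration. Each tier
divides the world into a finer grid, and a point is indexed under one cell
id per tier, so a bounding-box search can match a handful of cell terms
instead of scanning a numeric range.

```python
def tier_cell(lat, lon, tier):
    """Return the id of the grid cell containing (lat, lon) at one tier.

    Tier t splits the world into 2**t by 2**t equal cells (an assumed
    layout, chosen for simplicity).
    """
    cells = 2 ** tier
    # Clamp so that lat=90 or lon=180 falls in the last cell.
    x = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    y = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    return f"{tier}_{x}_{y}"


def tier_terms(lat, lon, tiers=range(4, 16)):
    """Return one cell id per tier; these would be indexed as terms."""
    return [tier_cell(lat, lon, t) for t in tiers]


print(tier_terms(37.77, -122.42, tiers=range(4, 8)))
```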

ryan

Re: Terrible performance for range searches

Posted by Alex Neth <al...@liivid.com>.

On Apr 28, 2008, at 11:56 AM, Mike Klaas wrote:

> On 27-Apr-08, at 12:31 PM, Alex Neth wrote:
>
>> I am trying to switch from Ferret to Solr.  My searches were  
>> performing quite well on Ferret (<100ms), but I have some reasons  
>> for switching.
>>
>> I am now experiencing terrible performance when doing range  
>> searches.  I have seen posts that granularity should be reduced,  
>> but this is not an option in my case.
>>
>> I have around 120K records with 8 digit range integer fields.  When  
>> I search for a small range, e.g. [0 100] the search is slow, but  
>> usable.  When I use a larger range, e.g. [0 99999999] the search  
>> can take over 60 seconds.  Searches without ranges are fast.
>
> That seems quite slow for 120K docs.  Is this on a warmed-up Solr?   
> (I.e., is the index in the OS's disk cache?)

I'm not sure if it's in the cache or not.  How could I make sure that  
it will be in the cache?

I did reduce the size of most of the caches because the memory  
situation was unmanageable.

>
>
>> Is there a way to get good performance in this case?  This is a  
>> critical function for my application and I will not be able to use  
>> Solr unless I can find a way to make this work.
>
> There is always a way <g>.  What is the distribution of values and  
> queries?  How are you storing the field?

The field is indexed as a range integer and is an integer in the MySQL  
database.  I'm not sure what you mean by "how are you storing the  
field," but that's the best I can answer.

The distribution is pretty small as they are lat/lon coordinates that  
have been mapped to positive integers.  Most of them are near each  
other and the queries are in a relatively small range, e.g. (geo_lon: 
[5752427 5781706] AND geo_lat:[13750592 13763600]).

>
>
> cheers,
> -Mike
>
>


Re: Terrible performance for range searches

Posted by Mike Klaas <mi...@gmail.com>.
On 27-Apr-08, at 12:31 PM, Alex Neth wrote:

> I am trying to switch from Ferret to Solr.  My searches were  
> performing quite well on Ferret (<100ms), but I have some reasons  
> for switching.
>
> I am now experiencing terrible performance when doing range  
> searches.  I have seen posts that granularity should be reduced, but  
> this is not an option in my case.
>
> I have around 120K records with 8 digit range integer fields.  When  
> I search for a small range, e.g. [0 100] the search is slow, but  
> usable.  When I use a larger range, e.g. [0 99999999] the search can  
> take over 60 seconds.  Searches without ranges are fast.

That seems quite slow for 120K docs.  Is this on a warmed-up Solr?   
(I.e., is the index in the OS's disk cache?)

> Is there a way to get good performance in this case?  This is a  
> critical function for my application and I will not be able to use  
> Solr unless I can find a way to make this work.

There is always a way <g>.  What is the distribution of values and  
queries?  How are you storing the field?

cheers,
-Mike