Posted to solr-user@lucene.apache.org by Tri Nguyen <tr...@yahoo.com> on 2011/01/01 03:06:42 UTC

solr benchmarks

Hi,
 
I remember going through some page that had graphs of response times based on index size for solr.
 
Anyone know of such pages?
 
Internally, we have some requirements for response times and I'm trying to figure out when to shard the index.
 
Thanks,
 
Tri

Re: solr benchmarks

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Sat, 2011-01-01 at 03:06 +0100, Tri Nguyen wrote:
> I remember going through some page that had graphs of response times based on index size for solr.
>  
> Anyone know of such pages?

Sorry, no. Some small scale tests with our corpus showed that response
times for the raw searches grew less than proportionally with index
size: Doubling the index size did not double the response time. On the
other hand, faceting time was proportional to the index size. As
always, your mileage will vary.
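
One way to check whether response time grows proportionally with index size is to fit a slope on a log-log scale. The following is a minimal sketch, not something from the tests described above: the function name and the sample numbers are made up for illustration. A slope near 1 suggests proportional growth (as reported for faceting), and a slope below 1 suggests less-than-proportional growth (as reported for raw searches).

```python
import math

def scaling_exponent(measurements):
    """Least-squares slope of log(time) against log(index size).

    measurements: list of (index_size, response_time) pairs, both > 0.
    A slope near 1 means time grows proportionally with index size;
    a slope below 1 means it grows less than proportionally.
    """
    xs = [math.log(size) for size, _ in measurements]
    ys = [math.log(t) for _, t in measurements]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Made-up numbers for illustration only (index size in GB, time in ms).
raw_search = [(1, 20), (2, 28), (4, 40), (8, 57)]
faceting = [(1, 50), (2, 100), (4, 200), (8, 400)]

print(scaling_exponent(raw_search))
print(scaling_exponent(faceting))
```

With real measurements from your own corpus, the same fit gives a rough idea of how far the index can grow before a response time requirement is broken.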

> Internally, we have some requirements for response times and I'm trying to figure out when to shard the index.

If you discover that your searches are primarily IO-bound, which is
often the case, and if you're still using spinning disks, I highly
recommend that you upgrade to SSDs. They are very cheap compared to
RAM, you don't need to change your code or workflow, and they work
beautifully with Lucene/SOLR: They gave us a 2-4 times speedup compared
to 2 * 15,000 RPM hard disks in RAID 1. Compared to holding the index
fully in RAM (with a 14GB index), they gave us 80% of the performance on
a dual core machine; more CPU cores might benefit more from the RAM
solution.


Re: solr benchmarks

Posted by François Schiettecatte <fs...@gmail.com>.
I would shard the index so that each shard is no larger than the memory of the machine it sits on. That way your entire index will be in memory all the time. When I was at Feedster (I wrote the search engine), my rule of thumb was to have 14GB of index on a 16GB machine.
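
The rule of thumb above can be turned into a rough shard count. This is only a sketch under stated assumptions: the function name and parameters are hypothetical, the 14/16 ratio is taken from the rule of thumb, and the remaining headroom is assumed here to be for the OS and the search process.

```python
import math

def shards_needed(total_index_gb, machine_ram_gb, index_fraction=14 / 16):
    """Number of shards so that each shard fits in one machine's memory.

    index_fraction is the share of RAM given to the index; 14/16 mirrors
    the 14GB-on-a-16GB-machine rule of thumb (an assumption, tune it).
    """
    max_shard_gb = machine_ram_gb * index_fraction
    return math.ceil(total_index_gb / max_shard_gb)

# Example: a 50GB index spread over 16GB machines.
print(shards_needed(50, 16))
```

For a 50GB index on 16GB machines, each shard may hold up to 14GB, so the index needs four shards.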

François



Re: solr benchmarks

Posted by dc tech <dc...@gmail.com>.
Tri:
What is the volume of content (# of documents) and index size you are
expecting? What about the document complexity in terms of # of fields, what
you are storing in the index, the complexity of the queries, etc.?

We have used SOLR with 10m documents with 1-3 second response times on the
front end. This is with minimal tuning, 4-5 facet fields, large blobs of
content in the index, JRuby on Rails, complex queries, and low load
conditions (hence the caches are probably not warmed much).

We have an external search application almost fully powered by SOLR (except
for the web crawl), and the response time is typically less than 1 second
with about 100k documents. Solr time is probably 100-200 ms of this.
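
To separate Solr time from the rest of the front-end response time, one option is to compare the measured round trip with the QTime value that Solr reports in the responseHeader of a JSON response (wt=json). The sketch below assumes the caller has already timed the whole request; the function name, the sample body, and the 900 ms figure are hypothetical.

```python
import json

def split_latency(total_ms, solr_response_body):
    """Split a measured round trip into Solr time and everything else.

    total_ms: wall-clock time measured by the caller for the whole request.
    solr_response_body: the JSON text Solr returned (wt=json).
    """
    header = json.loads(solr_response_body)["responseHeader"]
    solr_ms = header["QTime"]  # milliseconds Solr reports for the query
    return {"solr_ms": solr_ms, "other_ms": total_ms - solr_ms}

# Hypothetical response body and timing, for illustration only.
sample_body = (
    '{"responseHeader": {"status": 0, "QTime": 150},'
    ' "response": {"numFound": 0, "docs": []}}'
)
breakdown = split_latency(900, sample_body)
print(breakdown)
```

If most of the time lands in other_ms, sharding the index is unlikely to help much, and the front end or network is the better place to look.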

My sense is that SOLR is as fast as it gets and scales very, very well. On
the user group, I have seen references to people using SOLR for 100m
documents or more. It would be useful to get your use case(s).






Re: solr benchmarks

Posted by Jak Akdemir <ja...@gmail.com>.
Hi,
You can find benchmark results but these are not directly based on "index
size vs. response time"
http://wiki.apache.org/solr/SolrPerformanceData
