Posted to solr-user@lucene.apache.org by Rajiv2 <ra...@gmail.com> on 2008/10/08 17:56:46 UTC

Need help with Solr Performance

Hi, I need some recommendations for some issues I'm having with Solr search
performance.

Here is my index/hardware config:
- CentOS on 8 quad-core Xeon processors @ 3.16 GHz
- 32 GB RAM
- Tomcat and Java 1.6
- Solr 1.3
- ~15 million documents
- Index size on disk is about 22 GB
- there are 25 fields in the index
- I'm using dismax with 9 query fields and 3 phrase fields being searched; the
other fields are used for faceting
- I usually have at least 1 filter query
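
For reference, a request of the shape described above might look roughly like
the sketch below (assuming the stock "dismax" request handler; the field names,
boosts, and filter are placeholders, not the actual configuration, and the line
breaks are only for readability):

  http://localhost:8983/solr/select?qt=dismax
      &q=cleaning+services
      &qf=name^1.5+description+category^0.9        (9 fields in practice)
      &pf=name^1.5+description+category            (3 phrase fields)
      &fq=state:CA
      &facet=true&facet.field=category&facet.field=city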

Currently I'm getting query times between 6 and 12 seconds, and I'm looking to
speed this up to sub .25 sec (250ms) without sharding. Can anyone recommend
any techniques? I've already looked at the Solr wiki and done most of the
optimizations there. Here are some steps I'm considering:

- Stop using dismax, use the standard single-field query instead, and cram
everything into one catch-all field (see the schema sketch below). That would
speed up searching but might reduce relevancy.
- Keep the index as lean as possible
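
A minimal sketch of the first option (a single catch-all field populated via
copyField in schema.xml; field names here are illustrative, not the real
schema):

  <field name="text_all" type="text" indexed="true" stored="false"
         multiValued="true"/>

  <copyField source="listing_name" dest="text_all"/>
  <copyField source="services"     dest="text_all"/>
  <!-- ... one copyField per field currently listed in qf ... -->

Querying text_all with the standard request handler turns each term into a
single TermQuery rather than a DisjunctionMaxQuery across many fields, which
is cheaper but gives up the per-field boosts.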

thanks,
Rajiv



Re: Need help with Solr Performance

Posted by Mark Miller <ma...@gmail.com>.
I don't think you can search a 15 million doc index with any kind of
query complexity beyond a low-frequency query term in under .25 seconds
unless it's a cached hit (in which case it still might not *quite* make
it under .25 every time either, I'd think). Would love to be proven wrong
though <g> You have quite a beastly server there.
> Currently, I'm getting query times between 6-12 seconds, I'm looking to
> speed this up to sub .25 sec (250ms) without sharding. 
>
>   


Re: Need help with Solr Performance

Posted by Ryan McKinley <ry...@gmail.com>.
On Oct 8, 2008, at 6:11 PM, Rajiv2 wrote:

>
> With faceting, qtime is around 200ms higher.

if your target time is 250ms, this alone will need some work... but let's
ignore that for now...


>
>
> qtime for a standard query on the default search field is less than  
> 100ms.
> Usually around 60ms.
> qtime for id:xxxx is around 16ms.
>

if you break your dismax query into multiple standard queries, what  
are the times?

that is, if your dismax qf is:  a^2 b^3 c^4

what are your times for each field queried on its own:
  a:text
  b:text
  c:text

It may be that one of the fields in your dismax query is much slower
to query... and then you could focus on how that field is indexed...
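
Concretely, that might look like one request per qf/pf field (field names here
are just examples), comparing the QTime of each:

  /select?q=listing_name:(cleaning+services)&debugQuery=true
  /select?q=services:(cleaning+services)&debugQuery=true
  /select?q=complete_listing:(cleaning+services)&debugQuery=true
  ... and so on for each field in qf and pf ...

A field whose standalone query is dramatically slower than the rest is the one
worth digging into (its analysis, size, and term distribution).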



Re: Need help with Solr Performance

Posted by Rajiv2 <ra...@gmail.com>.
With faceting, qtime is around 200ms higher.

qtime for a standard query on the default search field is less than 100ms.
Usually around 60ms.
qtime for id:xxxx is around 16ms.




ryantxu wrote:
> 
>> -
>> <lst name="process">
>> <double name="time">6727.0</double>
>> -
>> <lst name="org.apache.solr.handler.component.QueryComponent">
>> <double name="time">6457.0</double>
>> </lst>
>> -
>> <lst name="org.apache.solr.handler.component.FacetComponent">
>> <double name="time">0.0</double>
>> </lst>
>> -
> 
> So I take it this is with faceting turned off...
> 
> what are your timing results for a simple (non-dismax) query, perhaps:
> "id:XXXX"
> 
> ryan
> 
> 
> 



Re: Need help with Solr Performance

Posted by Ryan McKinley <ry...@gmail.com>.
> -
> <lst name="process">
> <double name="time">6727.0</double>
> -
> <lst name="org.apache.solr.handler.component.QueryComponent">
> <double name="time">6457.0</double>
> </lst>
> -
> <lst name="org.apache.solr.handler.component.FacetComponent">
> <double name="time">0.0</double>
> </lst>
> -

So I take it this is with faceting turned off...

what are your timing results for a simple (non-dismax) query, perhaps:
"id:XXXX"

ryan


Re: Need help with Solr Performance

Posted by Rajiv2 <ra...@gmail.com>.
Yes, I'm using 1.3.

Here are the contents of the debug section; I'm only pasting the first explain
since the whole thing is very long.

<lst name="debug">
<str name="rawquerystring">cleaning services</str>
<str name="querystring">cleaning services</str>
-
<str name="parsedquery">
+((DisjunctionMaxQuery((about_us:cleaning^0.7 | zip:cleaning^0.7 |
(insurances:cleaners insurances:cleaning insurances:clean) |
(services:cleaners services:cleaning services:clean) |
(complete_listing:cleaners complete_listing:cleaning complete_listing:clean)
| ((payments:cleaners payments:cleaning payments:clean)^0.7) |
(segment_name:cleaners segment_name:cleaning segment_name:clean) |
((biz_attributes:cleaners biz_attributes:cleaning biz_attributes:clean)^0.9)
| listing_name:cleaning^1.5)~0.01)
DisjunctionMaxQuery((about_us:services^0.7 | zip:services^0.7 |
insurances:services | services:services | complete_listing:services |
payments:services^0.7 | segment_name:services | biz_attributes:services^0.9
| listing_name:services^1.5)~0.01))~2)
DisjunctionMaxQuery((about_us:"cleaning services"~100^0.7 |
services:"(cleaners cleaning clean) services"~100 | insurances:"(cleaners
cleaning clean) services"~100 | complete_listing:"(cleaners cleaning clean)
services"~100^1.3 | payments:"(cleaners cleaning clean) services"~100 |
segment_name:"(cleaners cleaning clean) services"~100 |
biz_attributes:"(cleaners cleaning clean) services"~100^0.9 |
listing_name:"cleaning services"~100^1.5)~0.01)
</str>
-
<str name="parsedquery_toString">
+(((about_us:cleaning^0.7 | zip:cleaning^0.7 | (insurances:cleaners
insurances:cleaning insurances:clean) | (services:cleaners services:cleaning
services:clean) | (complete_listing:cleaners complete_listing:cleaning
complete_listing:clean) | ((payments:cleaners payments:cleaning
payments:clean)^0.7) | (segment_name:cleaners segment_name:cleaning
segment_name:clean) | ((biz_attributes:cleaners biz_attributes:cleaning
biz_attributes:clean)^0.9) | listing_name:cleaning^1.5)~0.01
(about_us:services^0.7 | zip:services^0.7 | insurances:services |
services:services | complete_listing:services | payments:services^0.7 |
segment_name:services | biz_attributes:services^0.9 |
listing_name:services^1.5)~0.01)~2) (about_us:"cleaning services"~100^0.7 |
services:"(cleaners cleaning clean) services"~100 | insurances:"(cleaners
cleaning clean) services"~100 | complete_listing:"(cleaners cleaning clean)
services"~100^1.3 | payments:"(cleaners cleaning clean) services"~100 |
segment_name:"(cleaners cleaning clean) services"~100 |
biz_attributes:"(cleaners cleaning clean) services"~100^0.9 |
listing_name:"cleaning services"~100^1.5)~0.01
</str>
-
<lst name="explain">
-
<str name="2219">

14.29027 = (MATCH) sum of:
  0.4004236 = (MATCH) sum of:
    0.3046103 = (MATCH) max plus 0.01 times others of:
      0.11388241 = (MATCH) weight(about_us:cleaning^0.7 in 2219), product
of:
        0.06378449 = queryWeight(about_us:cleaning^0.7), product of:
          0.7 = boost
          7.1416993 = idf(docFreq=32263, numDocs=14997686)
          0.012758967 = queryNorm
        1.7854248 = (MATCH) fieldWeight(about_us:cleaning in 2219), product
of:
          2.0 = tf(termFreq(about_us:cleaning)=4)
          7.1416993 = idf(docFreq=32263, numDocs=14997686)
          0.125 = fieldNorm(field=about_us, doc=2219)
      0.25111166 = (MATCH) sum of:
        0.25111166 = (MATCH) weight(services:cleaning in 2219), product of:
          0.07772631 = queryWeight(services:cleaning), product of:
            6.091897 = idf(docFreq=92180, numDocs=14997686)
            0.012758967 = queryNorm
          3.2307162 = (MATCH) fieldWeight(services:cleaning in 2219),
product of:
            1.4142135 = tf(termFreq(services:cleaning)=2)
            6.091897 = idf(docFreq=92180, numDocs=14997686)
            0.375 = fieldNorm(field=services, doc=2219)
      0.29627118 = (MATCH) sum of:
        0.053804386 = (MATCH) weight(complete_listing:cleaners in 2219),
product of:
          0.091120705 = queryWeight(complete_listing:cleaners), product of:
            7.1416993 = idf(docFreq=32263, numDocs=14997686)
            0.012758967 = queryNorm
          0.5904738 = (MATCH) fieldWeight(complete_listing:cleaners in
2219), product of:
            5.2915025 = tf(termFreq(complete_listing:cleaners)=28)
            7.1416993 = idf(docFreq=32263, numDocs=14997686)
            0.015625 = fieldNorm(field=complete_listing, doc=2219)
        0.13884431 = (MATCH) weight(complete_listing:cleaning in 2219),
product of:
          0.07437885 = queryWeight(complete_listing:cleaning), product of:
            5.8295355 = idf(docFreq=119834, numDocs=14997686)
            0.012758967 = queryNorm
          1.8667177 = (MATCH) fieldWeight(complete_listing:cleaning in
2219), product of:
            20.493902 = tf(termFreq(complete_listing:cleaning)=420)
            5.8295355 = idf(docFreq=119834, numDocs=14997686)
            0.015625 = fieldNorm(field=complete_listing, doc=2219)
        0.103622474 = (MATCH) weight(complete_listing:clean in 2219),
product of:
          0.089417025 = queryWeight(complete_listing:clean), product of:
            7.0081716 = idf(docFreq=36872, numDocs=14997686)
            0.012758967 = queryNorm
          1.1588674 = (MATCH) fieldWeight(complete_listing:clean in 2219),
product of:
            10.583005 = tf(termFreq(complete_listing:clean)=112)
            7.0081716 = idf(docFreq=36872, numDocs=14997686)
            0.015625 = fieldNorm(field=complete_listing, doc=2219)
      0.08230729 = (MATCH) sum of:
        0.08230729 = (MATCH) weight(segment_name:cleaning in 2219), product
of:
          0.040990844 = queryWeight(segment_name:cleaning), product of:
            3.212709 = idf(docFreq=1640804, numDocs=14997686)
            0.012758967 = queryNorm
          2.0079432 = (MATCH) fieldWeight(segment_name:cleaning in 2219),
product of:
            1.0 = tf(termFreq(segment_name:cleaning)=1)
            3.212709 = idf(docFreq=1640804, numDocs=14997686)
            0.625 = fieldNorm(field=segment_name, doc=2219)
      0.1985048 = (MATCH) sum of:
        0.1985048 = (MATCH) weight(biz_attributes:cleaning in 2219), product
of:
          0.08377869 = queryWeight(biz_attributes:cleaning), product of:
            7.2958446 = idf(docFreq=27654, numDocs=14997686)
            0.0114830695 = queryNorm
          2.369395 = (MATCH) fieldWeight(biz_attributes:cleaning in 2219),
product of:
            1.7320508 = tf(termFreq(biz_attributes:cleaning)=3)
            7.2958446 = idf(docFreq=27654, numDocs=14997686)
            0.1875 = fieldNorm(field=biz_attributes, doc=2219)
      0.18810701 = (MATCH) weight(listing_name:cleaning^1.5 in 2219),
product of:
        0.084853716 = queryWeight(listing_name:cleaning^1.5), product of:
          1.5 = boost
          4.4336777 = idf(docFreq=483945, numDocs=14997686)
          0.012758967 = queryNorm
        2.2168388 = (MATCH) fieldWeight(listing_name:cleaning in 2219),
product of:
          1.0 = tf(termFreq(listing_name:cleaning)=1)
          4.4336777 = idf(docFreq=483945, numDocs=14997686)
          0.5 = fieldNorm(field=listing_name, doc=2219)
    0.09581329 = (MATCH) max plus 0.01 times others of:
      0.054453246 = (MATCH) weight(complete_listing:services in 2219),
product of:
        0.06481942 = queryWeight(complete_listing:services), product of:
          5.080303 = idf(docFreq=253495, numDocs=14997686)
          0.012758967 = queryNorm
        0.84007615 = (MATCH) fieldWeight(complete_listing:services in 2219),
product of:
          10.583005 = tf(termFreq(complete_listing:services)=112)
          5.080303 = idf(docFreq=253495, numDocs=14997686)
          0.015625 = fieldNorm(field=complete_listing, doc=2219)
      0.095268756 = (MATCH) weight(segment_name:services in 2219), product
of:
        0.04410045 = queryWeight(segment_name:services), product of:
          3.4564278 = idf(docFreq=1285911, numDocs=14997686)
          0.012758967 = queryNorm
        2.1602674 = (MATCH) fieldWeight(segment_name:services in 2219),
product of:
          1.0 = tf(termFreq(segment_name:services)=1)
          3.4564278 = idf(docFreq=1285911, numDocs=14997686)
          0.625 = fieldNorm(field=segment_name, doc=2219)
  13.889846 = (MATCH) max plus 0.01 times others of:
    1.2711473 = (MATCH) weight(complete_listing:"(cleaners cleaning clean)
services"~100^1.3 in 2219), product of:
      0.41565678 = queryWeight(complete_listing:"(cleaners cleaning clean)
services"~100^1.3), product of:
        1.3 = boost
        25.05971 = idf(complete_listing:"(cleaners cleaning clean)
services"~100^1.3)
        0.012758967 = queryNorm
      3.0581656 = (MATCH) fieldWeight(complete_listing:"(cleaners cleaning
clean) services"~100^1.3 in 2219), product of:
        7.81025 = tf(phraseFreq=61.0)
        25.05971 = idf(complete_listing:"(cleaners cleaning clean)
services"~100^1.3)
        0.015625 = fieldNorm(field=complete_listing, doc=2219)
    13.877134 = (MATCH) weight(segment_name:"(cleaners cleaning clean)
services"~100 in 2219), product of:
      0.53225243 = queryWeight(segment_name:"(cleaners cleaning clean)
services"~100), product of:
        41.71595 = idf(segment_name:"(cleaners cleaning clean)
services"~100)
        0.012758967 = queryNorm
      26.072468 = (MATCH) fieldWeight(segment_name:"(cleaners cleaning
clean) services"~100 in 2219), product of:
        1.0 = tf(phraseFreq=1.0)
        41.71595 = idf(segment_name:"(cleaners cleaning clean)
services"~100)
        0.625 = fieldNorm(field=segment_name, doc=2219)
</str>
... + 10 others

</lst>
<str name="QParser">DismaxQParser</str>
<null name="altquerystring"/>
<null name="boostfuncs"/>
-
<lst name="timing">
<double name="time">6730.0</double>
-
<lst name="prepare">
<double name="time">3.0</double>
-
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">2.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
-
<lst name="process">
<double name="time">6727.0</double>
-
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">6457.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
-
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">270.0</double>
</lst>
</lst>
</lst>
</lst>


Re: Need help with Solr Performance

Posted by Ryan McKinley <ry...@gmail.com>.
On Oct 8, 2008, at 4:03 PM, Rajiv2 wrote:

>
>> and query times without faceting are... ?
>
>> solr's built in faceting is "simple" and has its limits.  15M is
>> higher than i've seen good faceting performance out of, particularly
>> multivalued fields.
>>
>> 	Erik
>
> Hi, My facet fields are multi valued and w/o faceting the query time  
> is
> about 200ms faster.
>

are you using 1.3?

turn on debugQuery=true and check the "timing" block; that will show
you how much time is spent in each component...
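
For example, something like:

  /select?qt=dismax&q=cleaning+services&debugQuery=true

and the response will include a block of the form (abbreviated):

  <lst name="debug">
    ...
    <lst name="timing">
      <double name="time">...</double>
      <lst name="prepare"> ...per-component times... </lst>
      <lst name="process"> ...per-component times... </lst>
    </lst>
  </lst>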

can you send along the contents of <lst name="debug">?

ryan





Re: Need help with Solr Performance

Posted by Rajiv2 <ra...@gmail.com>.
>and query times without faceting are... ?

>solr's built in faceting is "simple" and has its limits.  15M is  
>higher than i've seen good faceting performance out of, particularly  
>multivalued fields.
>
>	Erik

Hi, my facet fields are multivalued, and without faceting the query time is
about 200ms faster.




Re: Need help with Solr Performance

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Oct 8, 2008, at 3:30 PM, Rajiv2 wrote:

>
>
>> what is your actual query?
>> Are you doing faceting / highlighting / or anything else?
>
> I am doing faceting on 5 fields, no highlighting or anything else,  
> debugging
> is also off. A basic query that I'm doing using dismax is 'cleaning
> services' over 15 million local business records.

and query times without faceting are... ?

Solr's built-in faceting is "simple" and has its limits.  15M docs is
larger than I've seen good faceting performance from, particularly with
multivalued fields.
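
(A quick way to measure that cost is to run the same request twice, identical
except for faceting, and compare QTime; as far as I know, 1.3 facets a
multivalued field by enumerating its terms and intersecting filterCache
DocSets, so wide multivalued fields are the expensive case.)

  /select?qt=dismax&q=cleaning+services&facet=false
  /select?qt=dismax&q=cleaning+services&facet=true
          &facet.field=field1&facet.field=field2    (one facet.field per facet field)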

	Erik


Re: Need help with Solr Performance

Posted by Rajiv2 <ra...@gmail.com>.

>what is your actual query?
>Are you doing faceting / highlighting / or anything else?

I am doing faceting on 5 fields; no highlighting or anything else, and
debugging is also off. A typical dismax query I'm running is 'cleaning
services' over 15 million local business records.




Advice needed on master-slave configuration

Posted by William Pierce <ev...@hotmail.com>.
Folks:

I have two instances of Solr running, one on the master (U) and the other on
the slave (Q).  Q is used for queries only, while U is where updates/deletes
are done.  I am running on Windows, so unfortunately I cannot use the
distribution scripts.

Every N hours, when changes are committed and the index on U is updated, I
want to copy the files from the master to the slave.  Do I need to halt
the Solr server on Q while the index is being updated?  If not, how do I
copy the files into the data folder while the server is running?  Any
pointers would be greatly appreciated!

Thanks!

- Bill 


Re: Need help with Solr Performance

Posted by Ryan McKinley <ry...@gmail.com>.
what is your actual query?
Are you doing faceting / highlighting / or anything else?


On Oct 8, 2008, at 2:17 PM, Rajiv2 wrote:

>
> Hi, thanks for responding so quickly,
>
>> 6-12 seconds seems really long and 15 million docs is nothing on a
>> machine like this.  Are you sure the issue is in Solr?  How are you
>> measuring the 6-12 seconds?
>
> I'm looking at the <QTime> value in the Solr response.
>
>> Assuming it is Solr...
>
>> How often are you indexing?  How often do you commit and get new
>> searchers?  What's your JVM heap size?  Are you warming?  Is your
>> index optimized?  Did you turn off the compound file system?
>
> This is basically a test that I'm doing and it's not in production  
> yet, so I
> did a one time index and I haven't committed any new documents.
> - JVM heap size is 12 GB
> - I am autowarming
> - Index is optimized
> - useCompoundFile is false
>
>> You said you've "done most of the optimizations", can you be  
>> specific?
>
> - I have a minimum # of stored fields, 5 out of 25.
> - My index is optimized
> - HashDocSet is set to around 75000
> - I've setup autowarming queries
> - Haven't warmed sort fields because I'm not doing any sorting
> - Not using any solid state drives
> - Using filters instead of queries for filtering.
>
>>
>>
>> thanks,
>> Rajiv
>>
>
> --------------------------
> Grant Ingersoll
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ


Re: Need help with Solr Performance

Posted by Walter Underwood <wu...@netflix.com>.
One other question: are you using real query logs or a set of
unique queries? With real query logs, the caches will warm up
after a while (tens of minutes) and performance will improve.

With a set of unique queries, you are mostly measuring Solr
cache misses. For us, that is about 4X slower, and we have
a small index.

wunder

On 10/8/08 11:17 AM, "Rajiv2" <ra...@gmail.com> wrote:

> 
> Hi, thanks for responding so quickly,
> 
>> 6-12 seconds seems really long and 15 million docs is nothing on a
>> machine like this.  Are you sure the issue is in Solr?  How are you
>> measuring the 6-12 seconds?
> 
> I'm looking at the <QTime> value in the Solr response.
> 
>> Assuming it is Solr...
> 
>> How often are you indexing?  How often do you commit and get new
>> searchers?  What's your JVM heap size?  Are you warming?  Is your
>> index optimized?  Did you turn off the compound file system?
> 
> This is basically a test that I'm doing and it's not in production yet, so I
> did a one time index and I haven't committed any new documents.
> - JVM heap size is 12 GB
> - I am autowarming
> - Index is optimized
> - useCompoundFile is false
> 
>> You said you've "done most of the optimizations", can you be specific?
> 
> - I have a minimum # of stored fields, 5 out of 25.
> - My index is optimized
> - HashDocSet is set to around 75000
> - I've setup autowarming queries
> - Haven't warmed sort fields because I'm not doing any sorting
> - Not using any solid state drives
> - Using filters instead of queries for filtering.
> 
>> 
>> 
>> thanks,
>> Rajiv
>> 
> 
> --------------------------
> Grant Ingersoll
> 
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ


Re: Need help with Solr Performance

Posted by Rajiv2 <ra...@gmail.com>.
Hi, thanks for responding so quickly,

>6-12 seconds seems really long and 15 million docs is nothing on a  
>machine like this.  Are you sure the issue is in Solr?  How are you  
>measuring the 6-12 seconds?

I'm looking at the <QTime> value in the Solr response.

>Assuming it is Solr...

>How often are you indexing?  How often do you commit and get new  
>searchers?  What's your JVM heap size?  Are you warming?  Is your  
>index optimized?  Did you turn off the compound file system?

This is basically a test I'm doing and it's not in production yet, so I
did a one-time index build and haven't committed any new documents since.
- JVM heap size is 12 GB
- I am autowarming
- Index is optimized
- useCompoundFile is false

>You said you've "done most of the optimizations", can you be specific?

- I keep stored fields to a minimum: 5 out of 25 are stored.
- My index is optimized
- HashDocSet is set to around 75000
- I've set up autowarming queries
- I haven't warmed sort fields because I'm not doing any sorting
- I'm not using any solid-state drives
- I'm using filter queries (fq) instead of main-query clauses for filtering.
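
For reference, those settings live in solrconfig.xml and look roughly like this
(only the values called out above are real; the rest is illustrative):

  <!-- solrconfig.xml (Solr 1.3) -->
  <indexDefaults>
    ...
    <useCompoundFile>false</useCompoundFile>
  </indexDefaults>

  <query>
    <HashDocSet maxSize="75000" loadFactor="0.75"/>

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst><str name="q">cleaning services</str><str name="qt">dismax</str></lst>
        <!-- ... a handful of representative warming queries ... -->
      </arr>
    </listener>
  </query>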

>
>
> thanks,
> Rajiv
>

> --------------------------
> Grant Ingersoll
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ













Re: Need help with Solr Performance

Posted by Grant Ingersoll <gs...@apache.org>.
On Oct 8, 2008, at 11:56 AM, Rajiv2 wrote:

>
> Hi, I need some recommendations w/ some issues I'm having w/ solr  
> search
> performance.
>
> Here is my index/hardware config:
> - CentOS on 8 quad core xeon processors @ 3.16 Ghz
> - 32 GB RAM
> - Tomcat and JAVA 1.6
> - Solr 1.3
> ~15 million documents .
> - Index size on disk is about 22 GB
> - 8 quad core xeon processors @ 3.16 Ghz
> - there 25 fields in the index
> - I'm using dismax w/ 9 query fields and 3 phrase fields being  
> searched,  -
> the other fields are used for faceting
> - I usually have at least 1 filter query
>
> Currently, I'm getting query times between 6-12 seconds, I'm looking  
> to
> speed this up to sub .25 sec (250ms) without sharding. Can anyone  
> recommend
> any techniques? I've already looked at the Solr wiki and done most  
> of the
> optimizations. Here are some steps I'm considering using:
>
> - Stop using dismax and use the standard 1 field query and cram  
> everything
> into 1 field. That will speed up searching but might reduce relevancy.
> - Keep the index as lean as possible

6-12 seconds seems really long and 15 million docs is nothing on a  
machine like this.  Are you sure the issue is in Solr?  How are you  
measuring the 6-12 seconds?

Assuming it is Solr...

How often are you indexing?  How often do you commit and get new  
searchers?  What's your JVM heap size?  Are you warming?  Is your  
index optimized?  Did you turn off the compound file system?
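
(For reference, the knobs behind those questions are roughly: the autoCommit
block and useCompoundFile in solrconfig.xml, -Xmx/-Xms on the JVM, and the
newSearcher/firstSearcher warming listeners. A sketch with made-up values, not
a recommendation:)

  <!-- solrconfig.xml -->
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>10000</maxDocs>
      <maxTime>60000</maxTime>   <!-- ms -->
    </autoCommit>
  </updateHandler>

and on the JVM side, something like:

  JAVA_OPTS="-server -Xms4g -Xmx4g"   (for Tomcat)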

You said you've "done most of the optimizations", can you be specific?



>
>
> thanks,
> Rajiv
>

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









Re: Need help with Solr Performance

Posted by Chris Hostetter <ho...@fucit.org>.
: considering you're doing faceting on quite a few fields, the filterCache 
: is somewhat important.

Sorry ... I overlooked the bit where QueryComponent was taking 6.x seconds
... in general, knowing what the cache hit rates look like is
crucial to understanding the performance, but as Ryan mentioned, figuring
out which parts of your query are slow is clearly the first step.


-Hoss


Re: Need help with Solr Performance

Posted by Chris Hostetter <ho...@fucit.org>.
Maybe I missed it, but skimming this thread I haven't seen any indication
of how you configured the various caches in solrconfig.xml ... or any
indication of what kinds of cache hit/miss/eviction stats you see from
stats.jsp after running any tests.

Considering you're doing faceting on quite a few fields, the filterCache
is somewhat important.
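
For reference, a sketch of those cache blocks (sizes here are made up; the
right numbers depend on the hit/evictions stats reported at
/solr/admin/stats.jsp after a realistic query run; with term-enumerated
faceting over multivalued fields, the filterCache generally wants room for
roughly one entry per unique term across the facet fields, plus your fq
filters):

  <!-- solrconfig.xml -->
  <filterCache      class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
  <queryResultCache class="solr.LRUCache" size="512"   initialSize="512"  autowarmCount="256"/>
  <documentCache    class="solr.LRUCache" size="512"   initialSize="512"  autowarmCount="0"/>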



-Hoss