Posted to solr-user@lucene.apache.org by stockii <st...@googlemail.com> on 2011/03/08 16:52:37 UTC

Getting many double values from Solr -- timeout

Hello.

I have 34,000,000 documents in my index, and each doc has a field with a
double value. I want the sum of this field over the search results. I tested
the StatsComponent, but it was not usable for this. So instead I fetch all
the values directly from Solr, out of the index, and compute the sum in PHP.

That works fine, but when a user's search matches really many documents
(~30,000), my script takes longer than 30 seconds and PHP aborts it.


How can I tune Solr so that fetching these double values from the index is
much faster?
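For context, a server-side sum over the matched documents is exactly what the StatsComponent's stats.field parameter computes, which avoids transferring every value to PHP. A minimal sketch of both approaches (the host, core, query, and the field name "amount" are placeholder assumptions, not from the thread):

```python
from urllib.parse import urlencode

# Approach 1: let Solr compute the sum server-side via the StatsComponent.
# rows=0 because no documents are needed, only the stats block.
params = {
    "q": "user_id:12345",        # placeholder query
    "rows": 0,
    "stats": "true",
    "stats.field": "amount",     # placeholder double field
    "wt": "json",
}
url = "http://localhost:8983/solr/core1/select?" + urlencode(params)
print(url)

# Approach 2 (what the post describes): fetch the raw values and sum
# client-side. Summing one page of a JSON response looks like this:
def sum_field(response, field):
    return sum(doc.get(field, 0.0) for doc in response["response"]["docs"])

sample = {"response": {"docs": [{"amount": 1.5}, {"amount": 2.5}, {}]}}
print(sum_field(sample, "amount"))  # 4.0
```

The client-side variant is the one that times out here, since it must serialize and transfer ~30,000 stored field values per request.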

-----
------------------------------- System ----------------------------------------

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 4GB Xmx
- Solr2 for Update-Request  - delta every 2 Minutes - 4GB Xmx
--
View this message in context: http://lucene.472066.n3.nabble.com/getting-much-double-Values-from-solr-timeout-tp2650981p2650981.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Getting many double values from Solr -- timeout

Posted by Jan Høydahl <ja...@cominvent.com>.
You have a large index with tough performance requirements on one server.
I would analyze your system to see whether it has any bottlenecks.
Watch out for auto-warming taking so long that it does not finish before the next commit()
Watch out for too-frequent commits
Monitor memory usage (JConsole or similar) to check whether the right amount of RAM is allocated to each JVM.
How large is your index in terms of GB? It may very well be that you need even more RAM in the server so the OS can cache more of the index files in memory.

Try stopping the update JVM and letting only the search JVM run. This frees RAM for the OS; then see whether performance increases.
Next, try an optimize() and see whether that makes a difference.

I'm not familiar with the implementation details of StatsComponent. But if your stats query is still slow after freeing RAM and running optimize(), I would file a JIRA issue and attach some detailed response XMLs produced with debugQuery=true&echoParams=all, to document exactly how you use it and how it performs. It may be possible to optimize the code.
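A sketch of such a debug request, with the two parameters Jan mentions added to the stats query (the host, core, and the field name "amount" are placeholder assumptions):

```python
from urllib.parse import urlencode

# Reproduce the slow stats query with debugging enabled: the response then
# carries timing/explain information in its "debug" section, and echoParams=all
# echoes the full effective parameter set (defaults included) in the header,
# which is what you would attach to a JIRA issue.
params = {
    "q": "*:*",
    "rows": 0,
    "stats": "true",
    "stats.field": "amount",   # placeholder double field
    "debugQuery": "true",
    "echoParams": "all",
}
url = "http://localhost:8983/solr/core1/select?" + urlencode(params)
print(url)
```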

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 9. mars 2011, at 11.39, stockii wrote:

> I am using NRT, and the caches are not always warmed; I think this is
> probably the problem!?


Re: Getting many double values from Solr -- timeout

Posted by stockii <st...@googlemail.com>.
I am using NRT, and the caches are not always warmed; I think this is
probably the problem!?

-----
------------------------------- System ----------------------------------------

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx

Re: Getting many double values from Solr -- timeout

Posted by stockii <st...@googlemail.com>.
> Are you using shards or have everything in same index?
- Shards == distributed search over several cores? => Yes, but not always;
in general, no.

> What problem did you experience with the StatsComponent?
- If I use stats on my 34-million-document index, no matter how many docs
are found, the sum takes a VERY long time.

> How did you use it?
- Like in the wiki; I think the StatsComponent is not so dynamically usable!?


> I think the right approach will be to optimize StatsComponent to do quick sum()
- How can I optimize this? Change the code of StatsComponent and build a
new Solr?
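One note on the "not so dynamically usable" point: the StatsComponent is computed over exactly the documents matched by q and fq, so a per-request, dynamic sum is possible without changing Solr's code. A minimal sketch (host, core, field name, and the filter values are placeholder assumptions):

```python
from urllib.parse import urlencode

# Build a stats request whose sum is restricted dynamically by the query
# and by any number of fq filters, just like a normal search request.
def stats_sum_url(base, query, filters, field):
    params = [("q", query), ("rows", "0"),
              ("stats", "true"), ("stats.field", field)]
    params += [("fq", f) for f in filters]   # fq may repeat
    return base + "?" + urlencode(params)

url = stats_sum_url("http://localhost:8983/solr/core1/select",
                    "type:payment",                 # placeholder query
                    ["user_id:12345", "status:ok"], # placeholder filters
                    "amount")                       # placeholder field
print(url)
```

Each distinct filter combination produces its own stats result, so the "dynamic" per-user sums described in the thread map directly onto fq parameters.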

-----
------------------------------- System ----------------------------------------

One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
- Solr2 for Update-Request  - delta every Minute - 4GB Xmx

Re: Getting many double values from Solr -- timeout

Posted by Jan Høydahl <ja...@cominvent.com>.
Are you using shards, or is everything in the same index?

What problem did you experience with the StatsComponent? How did you use it? I think the right approach will be to optimize StatsComponent to do a quick sum().

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 8. mars 2011, at 16.52, stockii wrote:

> Hello.
> 
> I have 34,000,000 documents in my index, and each doc has a field with a
> double value. I want the sum of this field over the search results. I tested
> the StatsComponent, but it was not usable for this. So instead I fetch all
> the values directly from Solr, out of the index, and compute the sum in PHP.
> 
> That works fine, but when a user's search matches really many documents
> (~30,000), my script takes longer than 30 seconds and PHP aborts it.
> 
> How can I tune Solr so that fetching these double values from the index is
> much faster?