Posted to solr-user@lucene.apache.org by sausarkar <sa...@ebay.com> on 2012/12/06 02:35:00 UTC

SolrCloud - Query performance degrades with multiple servers

We are using SolrCloud and trying to configure it for testing purposes. We
are seeing that the average query time increases when we have more than
one node in the SolrCloud cluster. We have a single-shard, 12 GB index.

Example:
1 node: average query time *~28 msec*, load 140 queries/second
3 nodes: average query time *~110 msec*, load 420 queries/second,
distributed equally across the three servers, so essentially 140 qps on
each node.

Is there any inter-node communication going on for queries? Is there any
SolrCloud setting for tuning queries in a cloud config with multiple
nodes? Please help.



--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

Posted by Erick Erickson <er...@gmail.com>.
15M docs may still comfortably fit in a single shard!
I've seen up to 300M docs fit on a shard. Then
again I've seen 10M docs make things unacceptably
slow.

You simply cannot extrapolate from 10K to
5M reliably. Put all 5M docs on the stand-alone
servers and test _that_. Whenever I see numbers
like 30K qps (assuming this is queries, not number
of docs indexed) I wonder if you're using the
same query over and over and hitting the query
result cache rather than doing any actual
searches.

But to answer your question (again). Sharding adds
overhead. There's no way to make that overhead
magically disappear. What you measure is what
you can expect, and you must measure.

Best,
Erick

On Tue, Jul 19, 2016 at 8:32 AM, Susheel Kumar <su...@gmail.com> wrote:
> You may want to utilise Document routing (_route_) option to have your
> query serve faster but above you are trying to compare apple with oranges
> meaning your performance tests numbers have to be based on either your
> actual numbers like 3-5 million docs per shard or sufficient enough to see
> advantage of using sharding.  10K is nothing for your performance tests and
> will not give you anything.
>
> Otherwise as Eric mentioned don't shard  and add replica's if there is no
> need to distribute/divide data into shards.
>
>
> See
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
>
> https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options
>
>
> Thanks,
> Susheel
>
> On Tue, Jul 19, 2016 at 1:41 AM, kasimjinwala <ji...@gmail.com>
> wrote:
>
>> This is just for performance testing we have taken 10K records per shard.
>> In
>> live scenario it would be 30L-50L per shard. I want to search document from
>> all shards, it will slow down and take too long time.
>>
>> I know in case of solr Cloud, it will query all shard node and then return
>> result. Is there any way to search document in all shard with best
>> performance(qps)
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4287763.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

Posted by Susheel Kumar <su...@gmail.com>.
You may want to use the document routing (_route_) option to have your
queries served faster, but above you are comparing apples with oranges.
Your performance test numbers have to be based either on your actual
numbers, like 3-5 million docs per shard, or on an index large enough to
show the advantage of sharding. 10K is nothing for your performance tests
and will not tell you anything.

Otherwise, as Erick mentioned, don't shard; add replicas if there is no
need to distribute/divide data into shards.
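
To make the three query variants discussed in this thread concrete, here is a small sketch that only builds the request URLs. The host, port and collection name are assumptions for illustration, and using the shard name as the _route_ value assumes the implicit router described by the original poster; check the reference pages linked in this message before relying on it.

```python
from urllib.parse import urlencode

# Assumed host and collection name, for illustration only.
BASE = "http://localhost:8983/solr/collection1/select"


def query_url(q, **extra):
    """Build a select URL for query `q` plus any extra request parameters."""
    params = {"q": q, "rows": 10}
    params.update(extra)
    return BASE + "?" + urlencode(params)


# No restriction: the receiving node fans the query out to every shard.
all_shards = query_url("*:*")
# Restrict the request to one named shard with the shards parameter.
one_shard = query_url("*:*", shards="shard1")
# Route the request with _route_ (assumed here to take the shard name).
routed = query_url("*:*", _route_="shard1")

if __name__ == "__main__":
    for url in (all_shards, one_shard, routed):
        print(url)
```

Only the first form pays the full fan-out cost, which is why it should be benchmarked against a realistically sized index.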


See
https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud

https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options


Thanks,
Susheel

On Tue, Jul 19, 2016 at 1:41 AM, kasimjinwala <ji...@gmail.com>
wrote:

> This is just for performance testing we have taken 10K records per shard.
> In
> live scenario it would be 30L-50L per shard. I want to search document from
> all shards, it will slow down and take too long time.
>
> I know in case of solr Cloud, it will query all shard node and then return
> result. Is there any way to search document in all shard with best
> performance(qps)
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4287763.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

Posted by kasimjinwala <ji...@gmail.com>.
This is just for performance testing; we have taken 10K records per shard. In
the live scenario it would be 30L-50L (3-5 million) per shard. I want to search
documents across all shards, but that slows down and takes too long.

I know that SolrCloud queries every shard node and then returns the
result. Is there any way to search documents in all shards with the best
performance (qps)?



--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4287763.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

Posted by Erick Erickson <er...@gmail.com>.
+1 to Susheel's question. Sharding inevitably adds
overhead. Roughly each shard is queried
for its top N docs (10 if, say, rows=10). The
doc ID and sort criteria (score by default) are returned
to the node that originally got the request. That node
then sorts the lists into the real top 10 to return to
the user. Then the node handling the request re-queries
the shards for the contents of those docs.
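
The two phases described above can be sketched with a toy in-memory model. This is only an illustration of the scatter-gather idea, not Solr's implementation; the shard contents, scores and function names are all made up for the example.

```python
import heapq

# Toy "shards": each maps doc ID -> (score, stored fields). Hypothetical data.
SHARDS = {
    "shard1": {"a": (0.9, {"title": "A"}), "b": (0.2, {"title": "B"})},
    "shard2": {"c": (0.7, {"title": "C"}), "d": (0.5, {"title": "D"})},
    "shard3": {"e": (0.8, {"title": "E"}), "f": (0.1, {"title": "F"})},
}


def shard_top_n(shard, rows):
    """Phase 1 on one shard: return only (score, doc_id) for its top `rows` docs."""
    docs = SHARDS[shard]
    return heapq.nlargest(rows, ((score, doc_id) for doc_id, (score, _) in docs.items()))


def shard_fetch(shard, doc_ids):
    """Phase 2 on one shard: return the stored fields for the requested IDs."""
    return {doc_id: SHARDS[shard][doc_id][1] for doc_id in doc_ids}


def distributed_search(rows):
    # Phase 1: ask every shard for its local top `rows` (IDs and scores only).
    candidates = []
    for shard in SHARDS:
        for score, doc_id in shard_top_n(shard, rows):
            candidates.append((score, doc_id, shard))
    # Merge the per-shard lists into the real global top `rows`.
    winners = heapq.nlargest(rows, candidates)
    # Phase 2: re-query only the shards that own a winning document.
    ids_by_shard = {}
    for _, doc_id, shard in winners:
        ids_by_shard.setdefault(shard, []).append(doc_id)
    fields = {}
    for shard, ids in ids_by_shard.items():
        fields.update(shard_fetch(shard, ids))
    return [(doc_id, score, fields[doc_id]) for score, doc_id, _ in winners]


if __name__ == "__main__":
    for hit in distributed_search(rows=3):
        print(hit)
```

Even in this toy version, one user request turns into one sub-request per shard in phase 1 plus further sub-requests in phase 2, which is the overhead that a single-shard query avoids.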

Sharding is a way to handle very large data sets. The
general recommendation is to shard _only_ when you
have too many documents to get good query perf
from a single shard.

If you need to increase QPS, add _replicas_, not shards.
Only go to sharding when you have too many documents
to fit on your hardware.

Best,
Erick

On Mon, Jul 18, 2016 at 6:31 AM, Susheel Kumar <su...@gmail.com> wrote:
> Hello,
>
> Question:  Do you really need sharding/can live without sharding since you
> mentioned only 10K records in one shard. What's your index/document size?
>
> Thanks,
> Susheel
>
> On Mon, Jul 18, 2016 at 2:08 AM, kasimjinwala <ji...@gmail.com>
> wrote:
>
>> currently I am using solrCloud 5.0 and I am facing query performance issue
>> while using 3 implicit shards, each shard contain around 10K records.
>> when I am specifying shards parameter(*shards=shard1*) in query it gives
>> 30K-35K qps. but while removing shards parameter from query it give
>> *1000-1500qps*. performance decreases drastically.
>>
>> please provide comment or suggestion to solve above issue
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4287600.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

Posted by Susheel Kumar <su...@gmail.com>.
Hello,

Question:  Do you really need sharding/can live without sharding since you
mentioned only 10K records in one shard. What's your index/document size?

Thanks,
Susheel

On Mon, Jul 18, 2016 at 2:08 AM, kasimjinwala <ji...@gmail.com>
wrote:

> currently I am using solrCloud 5.0 and I am facing query performance issue
> while using 3 implicit shards, each shard contain around 10K records.
> when I am specifying shards parameter(*shards=shard1*) in query it gives
> 30K-35K qps. but while removing shards parameter from query it give
> *1000-1500qps*. performance decreases drastically.
>
> please provide comment or suggestion to solve above issue
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4287600.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: SolrCloud - Query performance degrades with multiple servers(Shards)

Posted by kasimjinwala <ji...@gmail.com>.
Currently I am using SolrCloud 5.0 and I am facing a query performance issue
while using 3 implicit shards; each shard contains around 10K records.
When I specify the shards parameter (*shards=shard1*) in the query, it gives
30K-35K qps, but when I remove the shards parameter from the query it gives
*1000-1500 qps*. Performance decreases drastically.

Please provide comments or suggestions to solve the above issue.



--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4287600.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Shawn Heisey <so...@elyograg.org>.
On 1/9/2013 7:01 PM, sausarkar wrote:
> Hi Yonik,
>
> Could you merge this feature into the 4.0 branch? We tried to use 4.1; it did
> solve the CPU spike, but we ran into other issues. As we are on a very tight
> schedule, it would be very beneficial if you could merge this feature into the
> 4.0 branch.

4.1 *is* the next release after 4.0.  At this point, with 4.1 close to 
release, there will not be a 4.0.1.

Thanks,
Shawn


Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
Hi Yonik,

Could you merge this feature into the 4.0 branch? We tried to use 4.1; it did
solve the CPU spike, but we ran into other issues. As we are on a very tight
schedule, it would be very beneficial if you could merge this feature into the
4.0 branch.

Let me know.

Thanks



--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4032088.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Yonik Seeley <yo...@lucidworks.com>.
On Wed, Dec 12, 2012 at 5:03 PM, sausarkar <sa...@ebay.com> wrote:
> We still could replicate the issue in 4.1 branch i.e. queries going to one
> server (numShards=1) is being distributed among all the servers which is
> creating CPU spikes in all the servers in the cloud. Do you think this
> behavior is as expected or will be fixed in the 4.1 release?

In 4.0 we defaulted to distrib=true for queries, so it's natural to
see more than one log line per request, and to see log lines in 2
different servers for a single distributed request, even if there is
only one shard (that's different from seeing a query going to *all*
replicas).  See the bottom of my previous message for an explanation.

I just now merged back a bunch of changes from trunk to 4x, one of
which was the short circuiting when a request goes to an active
replica and that replica can satisfy the request without going
distributed.

-Yonik
http://lucidworks.com

Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
We could still replicate the issue on the 4.1 branch, i.e. queries going to one
server (numShards=1) are being distributed among all the servers, which
creates CPU spikes on all the servers in the cloud. Do you think this
behavior is expected, or will it be fixed in the 4.1 release?



--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4026521.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Yonik Seeley <yo...@lucidworks.com>.
OK, I tried to reproduce it on trunk, and I can't (i.e. everything is
looking fine).

rm -rf example/solr/zoo_data
cp -rp example example2
cp -rp example example3

cd example
java -Dbootstrap_confdir=./solr/collection1/conf \
  -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar

cd example2
java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

cd example3
java -Djetty.port=8900 -DzkHost=localhost:9983 -jar start.jar

#index some docs
cd example/exampledocs
./post.sh *xml

Looking at http://localhost:8983/solr/#/~cloud?view=tree
I can see the clusterstate.json is now

{"collection1":{
    "properties":{"router":"compositeId"},
    "shard1":{
      "range":"80000000-7fffffff",
      "replicas":{
        "192.168.1.109:8983_solr_collection1":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"collection1",
          "collection":"collection1",
          "node_name":"192.168.1.109:8983_solr",
          "base_url":"http://192.168.1.109:8983/solr",
          "leader":"true"},
        "192.168.1.109:7574_solr_collection1":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"collection1",
          "collection":"collection1",
          "node_name":"192.168.1.109:7574_solr",
          "base_url":"http://192.168.1.109:7574/solr"},
        "192.168.1.109:8900_solr_collection1":{
          "shard":"shard1",
          "roles":null,
          "state":"active",
          "core":"collection1",
          "collection":"collection1",
          "node_name":"192.168.1.109:8900_solr",
          "base_url":"http://192.168.1.109:8900/solr"}}}}}

curl "http://localhost:8983/solr/query?q=*:*"

Single log line in example:
Dec 11, 2012 4:14:48 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=/solr path=/query params={q=*:*} hits=0
status=0 QTime=4

curl "http://localhost:7574/solr/query?q=*:*"

Single log line in example2:
Dec 11, 2012 4:15:59 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=/solr path=/query params={q=*:*} hits=0
status=0 QTime=3


curl "http://localhost:8983/solr/query?q=*:*&shortCircuit=false"

3 log lines....
in example2:
Dec 11, 2012 4:18:48 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=/solr path=/query
params={q=*:*&shortCircuit=false} hits=32 status=0 QTime=42

in example:
Dec 11, 2012 4:18:48 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=/solr path=/select
params={distrib=false&wt=javabin&version=2&rows=10&NOW=1355260728098&shard.url=192.168.1.109:8983/solr/collection1/|192.168.1.109:7574/solr/collection1/|192.168.1.109:8900/solr/collection1/&fl=id,score&df=text&start=0&q=*:*&isShard=true&fsv=true&shortCircuit=false}
hits=32 status=0 QTime=10
Dec 11, 2012 4:18:48 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=/solr path=/select
params={df=text&shard.url=192.168.1.109:8983/solr/collection1/|192.168.1.109:7574/solr/collection1/|192.168.1.109:8900/solr/collection1/&NOW=1355260728098&q=*:*&ids=SP2514N,GB18030TEST,apple,F8V7067-APL-KIT,adata,6H500F0,MA147LL/A,ati,IW-02,asus&distrib=false&isShard=true&wt=javabin&shortCircuit=false&version=2}
status=0 QTime=6



Note that before I introduced short-circuiting as part of document
routing ( https://issues.apache.org/jira/browse/SOLR-2592 ),
you would see 3 log lines for each distributed request: 2 for the two
sub-queries that are the standard phases of distributed search, and one for
the top-level request that encompasses the two sub-requests.
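
As a rough illustration of the short-circuit condition (the request lands on an active replica that can answer without going distributed), here is a sketch that checks it against the clusterstate.json layout shown in this message. It is not the code from SOLR-2592; the function name and the single-shard rule are assumptions made for the example.

```python
# Trimmed copy of the clusterstate.json layout shown in this message.
CLUSTER_STATE = {
    "collection1": {
        "properties": {"router": "compositeId"},
        "shard1": {
            "range": "80000000-7fffffff",
            "replicas": {
                "192.168.1.109:8983_solr_collection1": {
                    "state": "active",
                    "node_name": "192.168.1.109:8983_solr",
                },
                "192.168.1.109:7574_solr_collection1": {
                    "state": "active",
                    "node_name": "192.168.1.109:7574_solr",
                },
            },
        },
    }
}


def can_short_circuit(cluster_state, collection, local_node):
    """True if `local_node` hosts an active replica of the collection's only shard.

    Hypothetical helper: a simplified model, not Solr's actual logic.
    """
    coll = cluster_state[collection]
    # In this layout the shard entries sit next to the "properties" key.
    shards = {name: s for name, s in coll.items() if name != "properties"}
    if len(shards) != 1:
        return False  # more than one shard: a distributed request is needed
    (shard,) = shards.values()
    return any(
        replica["node_name"] == local_node and replica["state"] == "active"
        for replica in shard["replicas"].values()
    )


if __name__ == "__main__":
    print(can_short_circuit(CLUSTER_STATE, "collection1", "192.168.1.109:7574_solr"))
```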

-Yonik
http://lucidworks.com

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Mark Miller <ma...@gmail.com>.
I'm still looking into this - didn't have a lot of luck seeing it with
a test and am going to look at it manually.

I'm hoping 4.1 by xmas!

We will see though...need to get others on board.

- Mark

On Tue, Dec 11, 2012 at 2:40 PM, sausarkar <sa...@ebay.com> wrote:
> Do you know when will 4.1 be released or will there be a 4.0.1 release with
> bug fixes from 4.0?
>
> Thanks
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4026139.html
> Sent from the Solr - User mailing list archive at Nabble.com.



-- 
- Mark

Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
Do you know when 4.1 will be released, or whether there will be a 4.0.1
release with bug fixes for 4.0?

Thanks



--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4026139.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Mark Miller <ma...@gmail.com>.
I missed this bug report! https://issues.apache.org/jira/browse/SOLR-3912

Will fix this very shortly. It's a problem with numShards=1.

- Mark

On Sun, Dec 9, 2012 at 4:21 PM, sausarkar <sa...@ebay.com> wrote:
> Thank you very much will wait for the results from your tests.
>
> From: "Mark Miller-3 [via Lucene]" <ml...@n3.nabble.com>>
> Date: Saturday, December 8, 2012 11:08 PM
> To: "Sarkar, Sauvik" <sa...@ebay.com>>
> Subject: Re: SolrCloud - Query performance degrades with multiple servers
>
> If that's true, we will fix it for 4.1. I can look closer tomorrow.
>
> Mark
>
> Sent from my iPhone
>
> On Dec 9, 2012, at 2:04 AM, sausarkar <[hidden email]> wrote:
>
>> Spoke too early it seems that SolrCloud is still distributing queries to all
>> the servers even if numShards=1 We are seeing POST request to all servers in
>> the cluster, please let me know what is the solution. Here is an example:
>> (the variable isShard should be false in our case as single shard, please
>> help)
>>
>> POST /solr/core0/select HTTP/1.1
>> Content-Charset: UTF-8
>> Content-Type: application/x-www-form-urlencoded; charset=UTF-8
>> User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
>> Content-Length: 991
>> Host: server1
>> Connection: Keep-Alive
>>
>> lowercaseOperators=true&mm=70%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2
>>
>>
>> Re: SolrCloud - Query performance degrades with multiple servers
>> Dec 06, 2012; 6:29pm — by   Mark Miller-3
>>
>> On Dec 6, 2012, at 5:08 PM, sausarkar <[hidden email]> wrote:
>>
>>> We solved the issue by explicitly adding numShards=1 argument to the solr
>>> start up script. Is this a bug?
>>
>> Sounds like it…perhaps related to SOLR-3971…not sure though.
>>
>> - Mark
>>
>>
>>
>>
>> --
>> View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4025455.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4025573.html
> Sent from the Solr - User mailing list archive at Nabble.com.



-- 
- Mark

Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
Thank you very much; we will wait for the results from your tests.

From: "Mark Miller-3 [via Lucene]" <ml...@n3.nabble.com>
Date: Saturday, December 8, 2012 11:08 PM
To: "Sarkar, Sauvik" <sa...@ebay.com>
Subject: Re: SolrCloud - Query performance degrades with multiple servers

If that's true, we will fix it for 4.1. I can look closer tomorrow.

Mark

Sent from my iPhone

On Dec 9, 2012, at 2:04 AM, sausarkar <[hidden email]> wrote:

> Spoke too early it seems that SolrCloud is still distributing queries to all
> the servers even if numShards=1 We are seeing POST request to all servers in
> the cluster, please let me know what is the solution. Here is an example:
> (the variable isShard should be false in our case as single shard, please
> help)
>
> POST /solr/core0/select HTTP/1.1
> Content-Charset: UTF-8
> Content-Type: application/x-www-form-urlencoded; charset=UTF-8
> User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
> Content-Length: 991
> Host: server1
> Connection: Keep-Alive
>
> lowercaseOperators=true&mm=70%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2
>
>
> Re: SolrCloud - Query performance degrades with multiple servers
> Dec 06, 2012; 6:29pm — by   Mark Miller-3
>
> On Dec 6, 2012, at 5:08 PM, sausarkar <[hidden email]> wrote:
>
>> We solved the issue by explicitly adding numShards=1 argument to the solr
>> start up script. Is this a bug?
>
> Sounds like it…perhaps related to SOLR-3971…not sure though.
>
> - Mark
>
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4025455.html
> Sent from the Solr - User mailing list archive at Nabble.com.






--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4025573.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Mark Miller <ma...@gmail.com>.
If that's true, we will fix it for 4.1. I can look closer tomorrow. 

Mark

Sent from my iPhone

On Dec 9, 2012, at 2:04 AM, sausarkar <sa...@ebay.com> wrote:

> Spoke too early it seems that SolrCloud is still distributing queries to all
> the servers even if numShards=1 We are seeing POST request to all servers in
> the cluster, please let me know what is the solution. Here is an example:
> (the variable isShard should be false in our case as single shard, please
> help)
> 
> POST /solr/core0/select HTTP/1.1
> Content-Charset: UTF-8
> Content-Type: application/x-www-form-urlencoded; charset=UTF-8
> User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
> Content-Length: 991
> Host: server1
> Connection: Keep-Alive
> 
> lowercaseOperators=true&mm=70%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2
> 
> 
> Re: SolrCloud - Query performance degrades with multiple servers
> Dec 06, 2012; 6:29pm — by   Mark Miller-3
> 
> On Dec 6, 2012, at 5:08 PM, sausarkar <[hidden email]> wrote: 
> 
>> We solved the issue by explicitly adding numShards=1 argument to the solr 
>> start up script. Is this a bug?
> 
> Sounds like it…perhaps related to SOLR-3971…not sure though. 
> 
> - Mark
> 
> 
> 
> 
> --
> View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4025455.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
Spoke too early: it seems that SolrCloud is still distributing queries to all
the servers even with numShards=1. We are seeing POST requests to all servers in
the cluster; please let me know what the solution is. Here is an example
(the variable isShard should be false in our case, as we have a single shard;
please help):

POST /solr/core0/select HTTP/1.1
Content-Charset: UTF-8
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
Content-Length: 991
Host: server1
Connection: Keep-Alive

lowercaseOperators=true&mm=70%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&isShard=true&shard.url=server1:9090/solr/core0/|server2:9090/solr/core0/|server3:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2


Re: SolrCloud - Query performance degrades with multiple servers
Dec 06, 2012; 6:29pm — by   Mark Miller-3

On Dec 6, 2012, at 5:08 PM, sausarkar <[hidden email]> wrote: 

> We solved the issue by explicitly adding numShards=1 argument to the solr 
> start up script. Is this a bug? 

Sounds like it…perhaps related to SOLR-3971…not sure though. 

- Mark




--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4025455.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Mark Miller <ma...@gmail.com>.
On Dec 6, 2012, at 5:08 PM, sausarkar <sa...@ebay.com> wrote:

> We solved the issue by explicitly adding numShards=1 argument to the solr
> start up script. Is this a bug?

Sounds like it…perhaps related to SOLR-3971…not sure though.

- Mark

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Yonik Seeley <yo...@lucidworks.com>.
On Thu, Dec 6, 2012 at 8:08 PM, sausarkar <sa...@ebay.com> wrote:
> Ok we think we found out the issue here. When solrcloud is started without
> specifying numShards argument solrcloud starts with a single shard but still
> thinks that there are multiple shards, so it forwards every single query to
> all the nodes in the cloud.

Yeah, if you don't specify numShards when creating the collection, you
go into custom sharding mode.
When bringing up new cores, you need to say what shard it represents
(and by default solr will assume it's a new shard).

-Yonik
http://lucidworks.com

Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
OK, we think we found the issue here. When SolrCloud is started without
the numShards argument, it starts with a single shard but still
thinks that there are multiple shards, so it forwards every single query to
all the nodes in the cloud. We ran tcpdump on a node that queries were
not targeting and found that it was receiving POST requests from the node
where the queries started:
&start=0&fsv=true&distrib=false&isShard=true&shard.url=serve1.com

We solved the issue by explicitly adding the numShards=1 argument to the Solr
startup script. Is this a bug?




--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4024986.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
I also ran a test with load directed at a single server in the cloud
and checked the CPU usage of the other servers. It seems that even when
no load is directed at those servers, there is a CPU spike each minute. Did you
also do this test on SolrCloud? Any observations or suggestions?


In Reply To
Re: SolrCloud - Query performance degrades with multiple servers
Dec 05, 2012; 7:59pm — by   Mark Miller-3
This is just the std scatter gather distrib search stuff solr has been using
since around 1.4. 

There is some overhead to that, but generally not much. I've measured it at
around 30-50ms for a 100 machines, each with 10 million docs a few years
ago. 

So…that doesn't help you much…but FYI… 

- Mark 

On Dec 5, 2012, at 5:35 PM, sausarkar <[hidden email]> wrote: 

> We are using SolrCloud and trying to configure it for testing purposes, we 
> are seeing that the average query time is increasing if we have more than 
> one node in the SolrCloud cluster. We have a single shard 12 gigs 
> index.Example:1 node, average query time *~28 msec* , load 140 
> queries/second3 nodes, average query time *~110 msec*, load 420 
> queries/second distributed equally on three servers so essentially 140 qps 
> on each node.Is there any inter node communication going on for queries,
> is 
> there any setting on the Solrcloud for query tuning for a  cloud config
> with 
> multiple nodes.Please help. 




RE: SolrCloud - Query performance degrades with multiple servers

Posted by Michael Ryan <mr...@moreover.com>.
As you add nodes, the average response time of the slowest node will likely increase. For example, consider an extreme case where you have something like 1 million nodes - you're practically guaranteed that one of them is going to be doing something like a stop-the-world garbage collection. So even if 999,999 return in 10ms, there's going to be that one slowpoke that takes 1000ms, and it doesn't matter how fast the other 999,999 are.
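A rough way to put numbers on this is sketched below. It assumes every query has to wait on all nodes and that each node is stalled independently with some small probability; both the independence assumption and the 0.1% figure are illustrative, not measured. Under those assumptions the chance that at least one of N nodes is slow is 1 - (1 - p)^N.

```python
def prob_any_slow(num_nodes: int, p_slow: float) -> float:
    """Probability that at least one of num_nodes is slow for a given query,
    assuming each node is slow independently with probability p_slow."""
    return 1.0 - (1.0 - p_slow) ** num_nodes


if __name__ == "__main__":
    # p_slow = 0.001 is a made-up figure, used only to show the trend.
    for n in (1, 3, 100, 1_000_000):
        print(n, prob_any_slow(n, 0.001))
```

The probability grows with the node count and approaches 1 for very large clusters, which is the "one slowpoke" effect described above.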

-Michael

-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Wednesday, December 05, 2012 11:00 PM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud - Query performance degrades with multiple servers

This is just the std scatter gather distrib search stuff solr has been using since around 1.4.

There is some overhead to that, but generally not much. A few years ago I measured it at around 30-50ms for 100 machines, each with 10 million docs.

So...that doesn't help you much...but FYI...

- Mark

On Dec 5, 2012, at 5:35 PM, sausarkar <sa...@ebay.com> wrote:

> We are using SolrCloud and trying to configure it for testing purposes, we
> are seeing that the average query time is increasing if we have more than
> one node in the SolrCloud cluster. We have a single shard 12 gigs index.
> Example:
> 1 node, average query time *~28 msec*, load 140 queries/second
> 3 nodes, average query time *~110 msec*, load 420 queries/second
> distributed equally on three servers so essentially 140 qps on each node.
> Is there any inter node communication going on for queries, is there any
> setting on the Solrcloud for query tuning for a cloud config with multiple
> nodes. Please help.


Re: SolrCloud - Query performance degrades with multiple servers

Posted by sausarkar <sa...@ebay.com>.
We measured it: for just 3 nodes the overhead is around 100ms. We also noticed
that CPU spikes to 100% and some queries get blocked. This happens only when
the cloud has multiple nodes; it does not happen on a single node. All the
nodes have exactly the same configuration, JVM settings and hardware.

Any clues why this is happening?
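
One way to narrow this down is to time the same query against one core with and without distributed search. The sketch below assumes Solr's `distrib=false` request parameter, which asks the receiving core to answer from its local index only. The host, port and core name in the usage line are hypothetical. The helper only builds URLs and times an arbitrary callable, so any HTTP client can be plugged in.

```python
import time
from urllib.parse import urlencode


def select_url(base_url: str, query: str, distrib: bool) -> str:
    """Build a /select URL; distrib=False asks the core to answer locally."""
    params = {
        "q": query,
        "wt": "json",
        "distrib": "true" if distrib else "false",
    }
    return f"{base_url}/select?{urlencode(params)}"


def time_call(fn, repeats: int = 10) -> float:
    """Return the average wall-clock seconds per call of fn()."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Hypothetical node address and core name.
    base = "http://localhost:8983/solr/collection1"
    print(select_url(base, "title:solr", distrib=False))
    print(select_url(base, "title:solr", distrib=True))
```

If the local-only timing matches the single-node numbers while the distributed one does not, the extra time is being spent in the fan-out rather than in the search itself.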



--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4024941.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

Posted by Mark Miller <ma...@gmail.com>.
This is just the std scatter gather distrib search stuff solr has been using since around 1.4.

There is some overhead to that, but generally not much. A few years ago I measured it at around 30-50ms for 100 machines, each with 10 million docs.
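
For readers unfamiliar with the pattern, here is a minimal sketch of scatter-gather, not Solr's actual code. The shards are hypothetical in-memory callables that return (score, doc_id) pairs and ignore the query text. The coordinator sends the query to every shard in parallel, waits for all of them, and merges the partial results into one top list.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def make_shard(hits):
    """Hypothetical shard: holds precomputed (score, doc_id) hits."""
    def search(query, rows):
        # A real shard would run the query against its local index.
        return heapq.nlargest(rows, hits)
    return search


def scatter_gather(shards, query, rows):
    """Query every shard in parallel, then merge into the global top rows."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # Scatter: one request per shard. list() blocks until all have answered.
        partials = list(pool.map(lambda shard: shard(query, rows), shards))
    # Gather: merge the per-shard top lists by descending score.
    merged = (hit for part in partials for hit in part)
    return heapq.nlargest(rows, merged)


if __name__ == "__main__":
    shards = [
        make_shard([(0.9, "a"), (0.1, "b")]),
        make_shard([(0.5, "c"), (0.7, "d")]),
    ]
    print(scatter_gather(shards, "anything", rows=2))
```

Because the coordinator waits for every shard, the response time is at least that of the slowest shard plus the merge step, which is where the overhead comes from.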

So…that doesn't help you much…but FYI…

- Mark

On Dec 5, 2012, at 5:35 PM, sausarkar <sa...@ebay.com> wrote:

> We are using SolrCloud and trying to configure it for testing purposes, we
> are seeing that the average query time is increasing if we have more than
> one node in the SolrCloud cluster. We have a single shard 12 gigs index.
> Example:
> 1 node, average query time *~28 msec*, load 140 queries/second
> 3 nodes, average query time *~110 msec*, load 420 queries/second
> distributed equally on three servers so essentially 140 qps on each node.
> Is there any inter node communication going on for queries, is there any
> setting on the Solrcloud for query tuning for a cloud config with multiple
> nodes. Please help.


Re: SolrCloud - Query performance degrades with multiple servers

Posted by Mark Miller <ma...@gmail.com>.
Ryan, my new best friend! Please, file JIRA issue(s) for these items!

I'm sure you will get some feedback.

- Mark

On Dec 6, 2012, at 5:09 PM, Ryan Zezeski <rz...@gmail.com> wrote:

> There are some gains to be made in Solr's distributed search code.  A few
> weeks ago I spent time profiling dist search using dtrace/btrace and
> found some areas for improvement.  I planned on writing up some blog posts
> and providing patches but I'll list them off now in case others have input.
> 
> 1) Disable the http client stale check.  It is known to cause latency
> issues.  Doing this gave me a 4x increase in perf.
> 
> 2) Disable Nagle. Many tiny packets are not being sent (to my knowledge),
> so don't wait.
> 
> 3) Use a single TermEnum for all external id->lucene id lookups.  This
> seemed to reduce total bytes read according to dtrace.
> 
> 4) Building off #3, cache a certain number of external id->lucene id
> mappings, avoiding the TermEnum altogether.
> 
> 5) If fl=id is present then don't run the 2nd phase of the dist search.
> 
> I'm still very new to Solr so there could be issues with any of the patches
> I propose above that I'm not aware of.  Would love to hear input.
> 
> -Z
> 
> On Wed, Dec 5, 2012 at 8:35 PM, sausarkar <sa...@ebay.com> wrote:
> 
>> We are using SolrCloud and trying to configure it for testing purposes, we
>> are seeing that the average query time is increasing if we have more than
>> one node in the SolrCloud cluster. We have a single shard 12 gigs index.
>> Example:
>> 1 node, average query time *~28 msec*, load 140 queries/second
>> 3 nodes, average query time *~110 msec*, load 420 queries/second
>> distributed equally on three servers so essentially 140 qps on each node.
>> Is there any inter node communication going on for queries, is there any
>> setting on the Solrcloud for query tuning for a cloud config with multiple
>> nodes. Please help.


Re: SolrCloud - Query performance degrades with multiple servers

Posted by Ryan Zezeski <rz...@gmail.com>.
There are some gains to be made in Solr's distributed search code.  A few
weeks ago I spent time profiling dist search using dtrace/btrace and
found some areas for improvement.  I planned on writing up some blog posts
and providing patches but I'll list them off now in case others have input.

1) Disable the http client stale check.  It is known to cause latency
issues.  Doing this gave me a 4x increase in perf.

2) Disable Nagle. Many tiny packets are not being sent (to my knowledge),
so don't wait.

3) Use a single TermEnum for all external id->lucene id lookups.  This
seemed to reduce total bytes read according to dtrace.

4) Building off #3, cache a certain number of external id->lucene id
mappings, avoiding the TermEnum altogether.

5) If fl=id is present then don't run the 2nd phase of the dist search.
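
As an illustration of item 2, disabling Nagle's algorithm means setting the `TCP_NODELAY` option on the socket. The sketch below shows that option at the plain socket level in Python. It is not Solr's HTTP client configuration, which lives in Java, and the function name is made up for the example.

```python
import socket


def open_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = open_nodelay_socket()
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```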

I'm still very new to Solr so there could be issues with any of the patches
I propose above that I'm not aware of.  Would love to hear input.
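
Item 5 in the list above can be sketched as follows, assuming (as I understand the distributed flow) that the first phase already returns document ids with scores and the second phase only fetches stored fields. The `Shard` class and `distributed_search` function are hypothetical in-memory stand-ins, not Solr classes.

```python
class Shard:
    """Hypothetical in-memory shard: doc_id -> (score, stored_fields)."""

    def __init__(self, docs):
        self.docs = docs
        self.fetch_calls = 0  # counts phase-2 requests, for illustration only

    def top_ids(self, query, rows):
        # Phase 1: return only (score, doc_id) pairs. The query is ignored here.
        hits = [(score, doc_id) for doc_id, (score, _) in self.docs.items()]
        return sorted(hits, reverse=True)[:rows]

    def fetch(self, doc_ids, fields):
        # Phase 2: return the requested stored fields for docs this shard owns.
        self.fetch_calls += 1
        return {
            d: {f: self.docs[d][1][f] for f in fields}
            for d in doc_ids
            if d in self.docs
        }


def distributed_search(shards, query, rows, fields):
    # Phase 1: gather candidate ids from every shard and keep the global top rows.
    candidates = []
    for shard in shards:
        candidates.extend(shard.top_ids(query, rows))
    top = sorted(candidates, reverse=True)[:rows]

    # Shortcut: the ids are already known, so the second round trip is skipped.
    if fields == ["id"]:
        return [{"id": doc_id} for _, doc_id in top]

    # Phase 2: fetch stored fields for the winning documents.
    wanted = [doc_id for _, doc_id in top]
    stored = {}
    for shard in shards:
        stored.update(shard.fetch(wanted, fields))
    return [stored[doc_id] for doc_id in wanted]
```

With `fields == ["id"]` no shard ever sees a `fetch` call, which saves one network round trip per shard per query.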

-Z

On Wed, Dec 5, 2012 at 8:35 PM, sausarkar <sa...@ebay.com> wrote:

> We are using SolrCloud and trying to configure it for testing purposes, we
> are seeing that the average query time is increasing if we have more than
> one node in the SolrCloud cluster. We have a single shard 12 gigs index.
> Example:
> 1 node, average query time *~28 msec*, load 140 queries/second
> 3 nodes, average query time *~110 msec*, load 420 queries/second
> distributed equally on three servers so essentially 140 qps on each node.
> Is there any inter node communication going on for queries, is there any
> setting on the Solrcloud for query tuning for a cloud config with multiple
> nodes. Please help.