You are viewing a plain text version of this content. The canonical link for it is here.

Posted to solr-user@lucene.apache.org by Frederik Kraus <fr...@gmail.com> on 2011/05/13 12:08:28 UTC

Huge performance drop in distributed search w/ shards on the same server/container

Hi, 

I'm having some serious problems scaling the following setup:

48 CPU / Tomcat / ...

localhost/shard1
...
localhost/shard10

When using all 10 shards in the query the req/s drop down to about 300 without fully utilizing cpu (60% idle) or ram (disk i/o is zero - everything fits into the ram)

When only quering one shard I get about 5k-6k req/s 

Are there any known limits and/or work-arounds?

Thanks,

Fred.

Re: Huge performance drop in distributed search w/ shards on the same server/container

Posted by Johannes Goll <jo...@gmail.com>.

I increased the maximum POST size and headerBufferSize to 10MB ; lowThreads
to 50, maxThreads to 100000 and lowResourceMaxIdleTime=15000. We tried
tomcat 6 using the following Connnector settings :

<Connector port="8989"
       protocol="HTTP/1.1"
       redirectPort="8443"
       URIEncoding="UTF-8"
       maxThreads="10000"
       maxKeepAliveRequests="200"
       maxHttpHeaderSize="10485760"
       />

I am getting the same exception as for jetty

SEVERE: org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: java.net.SocketException:
Connection reset

This seem to point towards a Solr specific issue (solrj.SolrServerException
during individual shard searches).  I monitored the CPU utilization
executing sequential distributed searches and noticed that in the beginning
all CPUs are getting used for a short period of time (multiple lines for
shard searches are shown in the log with isShard=true arguments), then all
CPU except one become idle and the request is being processed by this one
CPU for the longest period of time.

I also noticed in the logs that while most of the individual shard searches
(isShard=true) have low QTimes (5-10), a minority has extreme QTimes
(104402-105126). All shards are fairly similar in size and content (1.2 M
documents) and the StatsComponent is being used
[stats=true&stats.field=weight&stats.facet=library_id]. Here library_id
equals the shard/core name.

Is there an internal timeout for gathering shard results or other fixed
resource limitation ?

Johannes

2011/6/13 Yonik Seeley <yo...@lucidimagination.com>

> On Sun, Jun 12, 2011 at 9:10 PM, Johannes Goll <jo...@gmail.com>
> wrote:
> > However, sporadically, Jetty 6.1.2X (shipped with  Solr 3.1.)
> > sporadically throws Socket connect exceptions when executing distributed
> > searches.
>
> Are you using the exact jetty.xml that shipped with the solr example
> server,
> or did you make any modifications?
>
> -Yonik
> http://www.lucidimagination.com
>

-- 
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878

Re: Huge performance drop in distributed search w/ shards on the same server/container

Posted by Yonik Seeley <yo...@lucidimagination.com>.

On Sun, Jun 12, 2011 at 9:10 PM, Johannes Goll <jo...@gmail.com> wrote:
> However, sporadically, Jetty 6.1.2X (shipped with  Solr 3.1.)
> sporadically throws Socket connect exceptions when executing distributed
> searches.

Are you using the exact jetty.xml that shipped with the solr example server,
or did you make any modifications?

-Yonik
http://www.lucidimagination.com

Re: Huge performance drop in distributed search w/ shards on the same server/container

Posted by Johannes Goll <jo...@gmail.com>.

Hi Fred,

we are having similar issues of scaling Solr 3.1 distributed searches on a
single box with 18 cores. We use the StatsComponent which seems to be mainly
CPU bound. Using distributed searches resulted in a 9 fold decrease in
response time. However, sporadically, Jetty 6.1.2X (shipped with  Solr 3.1.)
sporadically throws Socket connect exceptions when executing distributed
searches. Our next step is to switch from from jetty to tomcat.  Did you
find a solution for improving the CPU utilization and requests per second
for your system?

Johannes


2011/5/26 pravesh <su...@yahoo.com>

> Do you really require multi-shards? Single core/shard will do for even
> millions of documents and the search will be faster than searching on
> multi-shards.
>
> Consider multi-shard when you cannot scale-up on a single
> shard/machine(e.g,
> CPU,RAM etc. becomes major block).
>
> Also read through the SOLR distributed search wiki to check on all tuning
> up
> required at application server(Tomcat) end, like maxHTTP request settings.
> For a single request in a multi-shard setup internal HTTP requests are made
> through all queried shards, so, make sure you set this parameter higher.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Huge-performance-drop-in-distributed-search-w-shards-on-the-same-server-container-tp2938421p2988464.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Johannes Goll
211 Curry Ford Lane
Gaithersburg, Maryland 20878

Re: Huge performance drop in distributed search w/ shards on the same server/container

Posted by pravesh <su...@yahoo.com>.

Do you really require multi-shards? Single core/shard will do for even
millions of documents and the search will be faster than searching on
multi-shards.

Consider multi-shard when you cannot scale-up on a single shard/machine(e.g,
CPU,RAM etc. becomes major block). 

Also read through the SOLR distributed search wiki to check on all tuning up
required at application server(Tomcat) end, like maxHTTP request settings.
For a single request in a multi-shard setup internal HTTP requests are made
through all queried shards, so, make sure you set this parameter higher.

--
View this message in context: http://lucene.472066.n3.nabble.com/Huge-performance-drop-in-distributed-search-w-shards-on-the-same-server-container-tp2938421p2988464.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Huge performance drop in distributed search w/ shards on the same server/container

Posted by Frederik Kraus <fr...@gmail.com>.

Any ideas?
On Freitag, 13. Mai 2011 at 13:19, Frederik Kraus wrote: 
> One Tomcat with multicore. I have a list of about 2mio "real" queries that I'm firing at the cluster with jmeter. Reason for splitting up the index in rather small parts is that the maximum response time of 1 sec cannot be exceeded for any of those queries.
> 
> 
> 
> On Freitag, 13. Mai 2011 at 12:57, Grant Ingersoll wrote:
> > Is that 10 different Tomcat instances or are you using multicore? How are you testing?
> > 
> > On May 13, 2011, at 6:08 AM, Frederik Kraus wrote:
> > 
> > > Hi, 
> > > 
> > > I'm having some serious problems scaling the following setup:
> > > 
> > > 48 CPU / Tomcat / ...
> > > 
> > > localhost/shard1
> > > ...
> > > localhost/shard10
> > > 
> > > When using all 10 shards in the query the req/s drop down to about 300 without fully utilizing cpu (60% idle) or ram (disk i/o is zero - everything fits into the ram)
> > > 
> > > When only quering one shard I get about 5k-6k req/s 
> > > 
> > > Are there any known limits and/or work-arounds?
> > > 
> > > Thanks,
> > > 
> > > Fred.
> > 
> > --------------------------------------------
> > Grant Ingersoll
> > Join the LUCENE REVOLUTION
> > Lucene & Solr User Conference
> > May 25-26, San Francisco
> > www.lucenerevolution.org
> > 
>

Re: Huge performance drop in distributed search w/ shards on the same server/container

Posted by Frederik Kraus <fr...@gmail.com>.

One Tomcat with multicore. I have a list of about 2mio "real" queries that I'm firing at the cluster with jmeter. Reason for splitting up the index in rather small parts is that the maximum response time of 1 sec cannot be exceeded for any of those queries.



On Freitag, 13. Mai 2011 at 12:57, Grant Ingersoll wrote: 
> Is that 10 different Tomcat instances or are you using multicore? How are you testing?
> 
> On May 13, 2011, at 6:08 AM, Frederik Kraus wrote:
> 
> > Hi, 
> > 
> > I'm having some serious problems scaling the following setup:
> > 
> > 48 CPU / Tomcat / ...
> > 
> > localhost/shard1
> > ...
> > localhost/shard10
> > 
> > When using all 10 shards in the query the req/s drop down to about 300 without fully utilizing cpu (60% idle) or ram (disk i/o is zero - everything fits into the ram)
> > 
> > When only quering one shard I get about 5k-6k req/s 
> > 
> > Are there any known limits and/or work-arounds?
> > 
> > Thanks,
> > 
> > Fred.
> 
> --------------------------------------------
> Grant Ingersoll
> Join the LUCENE REVOLUTION
> Lucene & Solr User Conference
> May 25-26, San Francisco
> www.lucenerevolution.org
>

Re: Huge performance drop in distributed search w/ shards on the same server/container

Posted by Grant Ingersoll <gs...@apache.org>.

Is that 10 different Tomcat instances or are you using multicore?  How are you testing?

On May 13, 2011, at 6:08 AM, Frederik Kraus wrote:

> Hi, 
> 
> I'm having some serious problems scaling the following setup:
> 
> 48 CPU / Tomcat / ...
> 
> localhost/shard1
> ...
> localhost/shard10
> 
> When using all 10 shards in the query the req/s drop down to about 300 without fully utilizing cpu (60% idle) or ram (disk i/o is zero - everything fits into the ram)
> 
> When only quering one shard I get about 5k-6k req/s 
> 
> Are there any known limits and/or work-arounds?
> 
> Thanks,
> 
> Fred.

--------------------------------------------
Grant Ingersoll
Join the LUCENE REVOLUTION
Lucene & Solr User Conference
May 25-26, San Francisco
www.lucenerevolution.org