Posted to solr-user@lucene.apache.org by Peter Karich <pe...@yahoo.de> on 2010/08/04 09:50:50 UTC

Re: Migrating from Lucene 2.9.1 to Solr 1.4.0 - Performance issues under heavy load

Ophir,

this sounds a bit strange:

> CommonsHttpSolrServer.java, line 416 takes about 95% of the application's total search time

Is this only for heavy load?

Some other things:

 * with Lucene, you accessed the indices with a MultiSearcher over the LAN, right?
 * did you look into the logs of the servers - is there something
wrong/delayed?
 * did you enable gzip compression for your servers, or even the binary
writer/parser for your Solr clients? (see the snippet and the sketch below)

CommonsHttpSolrServer server = ...
server.setRequestWriter(new BinaryRequestWriter());
server.setParser(new BinaryResponseParser());
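
If you go the compression route, a minimal client-side sketch (assuming the
stock SolrJ 1.4 client; the shard URL is a placeholder, and the servlet
container still has to be configured to actually gzip its responses):

CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://shard1:8080/solr");
// ask the server for gzip/deflate responses via the Accept-Encoding header
server.setAllowCompression(true);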

Regards,
Peter.

> [I posted this yesterday on the lucene-user mailing list and was advised to
> post it here instead. Excuse me for spamming.]
>
> Hi,
>
> I'm currently involved in a project of migrating from Lucene 2.9.1 to Solr
> 1.4.0.
> During stress testing, I encountered this performance problem:
> While actual search times in our shards (which are now running Solr) have
> not changed, the total time it takes for a query has increased dramatically.
> During this performance test, we of course do not modify the indexes.
> Our application is sending Solr select queries concurrently to the 8 shards,
> using CommonsHttpSolrServer.
> I added some timing debug messages, and found that
> CommonsHttpSolrServer.java, line 416 takes about 95% of the application's
> total search time:
> int statusCode = _httpClient.executeMethod(method);
>
> Just to clarify: looking at the access logs of the Solr shards, TTLB for a
> query might be around 5 ms (on all shards), but httpClient.executeMethod() for
> the same query can be much higher - say, 50 ms.
> On average, queries take about 12 ms under light load, but around 22 ms under
> heavy load.
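
As a rough illustration of this kind of client-side measurement (the query and
the solrServer reference are placeholders; this times the whole SolrJ round
trip, i.e. executeMethod() plus response parsing, while QTime is what the shard
itself reports):

SolrQuery q = new SolrQuery("field:value");            // placeholder query
long start = System.nanoTime();
QueryResponse rsp = solrServer.query(q);               // goes through executeMethod()
long tookMs = (System.nanoTime() - start) / 1000000L;
System.out.println("round trip: " + tookMs + " ms, shard QTime: " + rsp.getQTime());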
>
> Another route we tried to pursue was adding the "shards=shard1,shard2,…"
> parameter to the query (letting Solr distribute the request instead of doing
> this ourselves), but this doesn't seem to work due to an NPE caused by
> QueryComponent.returnFields(), line 553:
> if (returnScores && sdoc.score != null) {
>
> where sdoc is null. I saw there is a null check on trunk, but since we're
> currently using Solr 1.4.0's ready-made WAR file, I didn't see an easy way
> around this.
> Note: we're using a custom query component which extends QueryComponent, but
> debugging this, I saw nothing wrong with the results at this point in the
> code.
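
For completeness, a minimal sketch of handing distribution over to Solr via the
shards parameter (host names, core paths, and the frontEndServer reference are
placeholders; the values take no http:// prefix, and this is the route that hit
the NPE described above):

SolrQuery q = new SolrQuery("field:value");
q.set("shards", "host1:8080/solr/core0,host2:8080/solr/core0");
QueryResponse rsp = frontEndServer.query(q);   // the receiving core fans the query out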
>
> Our previous code used HTTP in a different manner:
> For each request, we created a new
> sun.net.www.protocol.http.HttpURLConnection, and called its getInputStream()
> method.
> Under the same load as the new application, the old application does not
> encounter the delays mentioned above.
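
For comparison, a rough sketch of that older per-request style (URL and query
are placeholders; java.net.HttpURLConnection is the public API behind the
sun.net.www implementation class):

URL url = new URL("http://shard1:8080/solr/select?q=field:value&wt=xml");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
InputStream in = conn.getInputStream();   // connects and starts reading the response
// ... parse the response ...
in.close();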
>
> Our current code is initializing CommonsHttpSolrServer for each shard this
> way:
>     MultiThreadedHttpConnectionManager httpConnectionManager = new
> MultiThreadedHttpConnectionManager();
>     httpConnectionManager.getParams().setTcpNoDelay(true);
>     httpConnectionManager.getParams().setMaxTotalConnections(1024);
>     httpConnectionManager.getParams().setStaleCheckingEnabled(false);
>     HttpClient httpClient = new HttpClient();
>     HttpClientParams params = new HttpClientParams();
>     params.setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
>     params.setAuthenticationPreemptive(false);
>     params.setContentCharset(StringConstants.UTF8);
>     httpClient.setParams(params);
>     httpClient.setHttpConnectionManager(httpConnectionManager);
>
> and passing the new HttpClient to the Solr Server:
> solrServer = new CommonsHttpSolrServer(coreUrl, httpClient);
>
> We tried two different ways - one with a single
> MultiThreadedHttpConnectionManager and HttpClient for all the SolrServer's,
> and the other with a new MultiThreadedHttpConnectionManager and HttpClient
> for each SolrServer.
> Both tries yielded similar performance results.
> We also tried giving setMaxTotalConnections() a much higher connection limit
> (1,000,000) - it didn't have an effect.
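
One knob the snippet above does not touch is the per-host limit: Commons
HttpClient 3.x's MultiThreadedHttpConnectionManager allows only 2 concurrent
connections per host by default, regardless of maxTotalConnections, which can
serialize requests to each shard under load. A sketch of raising it (the value
128 is arbitrary):

MultiThreadedHttpConnectionManager httpConnectionManager =
    new MultiThreadedHttpConnectionManager();
httpConnectionManager.getParams().setMaxTotalConnections(1024);
// default is 2 connections per host; raise it so concurrent queries to the
// same shard don't queue up inside the connection manager
httpConnectionManager.getParams().setDefaultMaxConnectionsPerHost(128);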
>
> One last thing - to answer Lance's question about this being an "apples to
> apples" comparison (in lucene-user thread) - yes, our main goal in this
> project is to do things as close to the previous version as possible.
> This way we can monitor that behavior (both quality and performance) remains
> similar, release this version, and then move forward to improve things.
> Of course, there are some changes, but I believe we are indeed measuring the
> complete flow on both apps, and that both apps are returning the same fields
> via HTTP.
>
> Would love to hear what you think about this. TIA,
> Ophir
>
>   


-- 
http://karussell.wordpress.com/


Re: Is there a better solution for Solr server-side load balancing?

Posted by Peter Karich <pe...@yahoo.de>.
>> The default Solr solution is client-side load balancing.
>> Is there a solution that provides server-side load balancing?
>>
> No. Most of us stick an HTTP load balancer in front of multiple Solr servers.

E.g. mod_jk is a very easy (maybe too simple?) solution for a load balancer,
and it also offers failover functionality.

It is as simple as:

worker.loadbalancer.balance_workers=worker1,worker2,worker3,...

and the failover:

worker.worker1.redirect=worker2


Re: Is there a better solution for Solr server-side load balancing?

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
2010/8/4 Chengyang <at...@163.com>

> The default Solr solution is client-side load balancing.
> Is there a solution that provides server-side load balancing?
>
>
No. Most of us stick an HTTP load balancer in front of multiple Solr servers.

-- 
Regards,
Shalin Shekhar Mangar.

Re: Is there a better solution for Solr server-side load balancing?

Posted by Andrei Savu <sa...@gmail.com>.
Check this article [1], which explains how to set up haproxy to do load
balancing. The steps are the same even if you are not using Drupal. By
using this approach you can easily add more replicas without changing
the application configuration files.

You should also check out SolrCloud [2], which does automatic load
balancing and failover for queries. This branch is still under
development.

[1] http://davehall.com.au/blog/dave/2010/03/13/solr-replication-load-balancing-haproxy-and-drupal
[2] http://wiki.apache.org/solr/SolrCloud

2010/8/4 Chengyang <at...@163.com>:
> The default Solr solution is client-side load balancing.
> Is there a solution that provides server-side load balancing?

-- 
Indekspot -- http://www.indekspot.com -- Managed Hosting for Apache Solr

Is there a better solution for Solr server-side load balancing?

Posted by Chengyang <at...@163.com>.
The default Solr solution is client-side load balancing.
Is there a solution that provides server-side load balancing?


Re: Migrating from Lucene 2.9.1 to Solr 1.4.0 - Performance issues under heavy load

Posted by Ophir Adiv <fi...@gmail.com>.
On Wed, Aug 4, 2010 at 10:50 AM, Peter Karich <pe...@yahoo.de> wrote:

> Ophir,
>
> this sounds a bit strange:
>
> > CommonsHttpSolrServer.java, line 416 takes about 95% of the application's
> > total search time
>
> Is this only for heavy load?
>
>
I think this makes sense, since the hard work is done by Solr - once the
application gets the search results from the shards, it does a bit of
manipulation on them (combining, filtering, ...), but these are easy tasks.

> Some other things:


>  * with lucene you accessed the indices with MultiSearcher in a LAN, right?
>

No, each shard ran under a different Tomcat instance, and each shard was
accessed via HTTP calls (the same way we're trying to work now with Solr).


>  * did you look into the logs of the servers, is there something
> wrong/delayed?
>

Everything seems peachy... the logs are clean of errors, warnings, and the like.


>  * did you enable gzip compression for your servers or even the binary
> writer/parser for your solr clients?
>
>
We're running our application (and Solr) under Tomcat. We do not enable
compression (the configuration remained similar to our old application's).
We tried using XMLResponseParser instead of BinaryResponseParser - it hardly
affected run times.

Thanks for the ideas,
Ophir
