Posted to solr-user@lucene.apache.org by Shawn Heisey <so...@elyograg.org> on 2012/02/03 21:12:02 UTC

Setting solrj server connection timeout

Is the following a reasonable approach to setting a connection timeout 
with SolrJ?

         queryCore.getHttpClient().getHttpConnectionManager().getParams()
                 .setConnectionTimeout(15000);

Right now I have all my solr server objects sharing a single HttpClient 
that gets created using the multithreaded connection manager, where I 
set the timeout for all of them.  Now I will be letting each server 
object create its own HttpClient object, and using the above statement 
to set the timeout on each one individually.  It'll use up a bunch more 
memory, as there are 56 server objects, but maybe it'll work better.  
The total of 56 objects comes about from 7 shards, a build core and a 
live core per shard, two complete index chains, and for each of those, 
one server object for access to CoreAdmin and another for the index 
(7 x 2 x 2 x 2 = 56).

The impetus for this, as it's possible I'm stating an XY problem: 
Currently I have an occasional problem where SolrJ connections throw an 
exception.  When it happens, nothing is logged in Solr.  My code is 
smart enough to notice the problem, send an email alert, and simply try 
again at the top of the next minute.  The simple explanation is that 
this is a Linux networking problem, but I never had any problem like 
this when I was using Perl with LWP to keep my index up to date.  I sent 
a message to the list some time ago on this exception, but I never got a 
response that helped me figure it out.

Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Connection reset
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:480)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:246)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:276)
    at com.newscom.idxbuild.solr.Core.getCount(Core.java:325)
    ... 3 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
    at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
    at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
    at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
    at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
    at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
    ... 7 more


Thanks,
Shawn


Re: Setting solrj server connection timeout

Posted by Shawn Heisey <so...@elyograg.org>.
On 2/16/2012 6:28 PM, Mark Miller wrote:
> I'm not sure that timeout will help you here - I believe it's the timeout on
> 'creating' the connection.
>
> Try setting the socket timeout (setSoTimeout) - that should let you try
> sooner.
>
> It looks like perhaps the server is timing out and closing the connection.
>
> I guess all you can do is timeout reasonably (if it takes too long to wait
> for the exception) and retry.

When the exception happens, it is happening within the same 
second as the beginning of the update cycle, which involves a lot of 
other things happening (such as talking to a database) before it even 
gets around to talking to Solr.  I do not have millisecond timestamps, 
but from what little I can tell, it's a handful of milliseconds from 
when SolrJ starts the request until the exception is logged.  It happens 
relatively rarely - no more than once every few days, usually less often 
than that.  I cannot reproduce it at will.  Nobody is doing any work on 
either Solr or the network when it happens.  Nothing is logged in the 
Solr server log or syslog at the OS level; the only mention of anything 
bad going on is in the log of my SolrJ application.

I never had this problem when my build system was written in Perl, using 
LWP to make HTTP requests with URLs that I constructed myself.  The Perl 
system ran on CentOS 5 with Xen virtualization; now I'm running CentOS 6 
on the bare metal.  I'm using a bonded interface (for failover, not load 
balancing) composed of two NICs plugged into separate switches.  When 
it was virtualized, the Xen host was also using an identically 
configured bonded interface, bridged to the guests, which used eth0.

The last time the error happened, which was on Feb 15th at 2:04 PM MST, 
the query that failed was 'did:(289800299 OR 289800157)', a very simple 
query against a tlong field.  The application tests for the existence of 
the did values that it is trying to delete before it issues the delete 
request.
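
The existence check described above amounts to an OR query over the did 
values.  A minimal sketch of assembling that query string follows.  The 
names DidQuery and existenceQuery are hypothetical, and the thread does 
not show how the application actually builds its query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DidQuery {
    // Builds a query such as did:(1 OR 2) from a field name and a list of ids.
    // Hypothetical helper, not taken from the application discussed here.
    static String existenceQuery(String field, List<Long> ids) {
        if (ids.isEmpty()) {
            throw new IllegalArgumentException("at least one id is required");
        }
        String joined = ids.stream()
                .map(id -> Long.toString(id))
                .collect(Collectors.joining(" OR "));
        return field + ":(" + joined + ")";
    }

    public static void main(String[] args) {
        // Prints: did:(289800299 OR 289800157)
        System.out.println(existenceQuery("did", Arrays.asList(289800299L, 289800157L)));
    }
}
```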

I'm willing to look deeper into possible networking issues, but I am 
skeptical about that being the problem, and because there are no log 
messages to investigate, I have no idea how to proceed.  The application 
runs on one of four Solr servers.  Sometimes the error even happens when 
connecting to Solr on the same server it's running on, which takes the 
gigabit switches out of the equation.  If it's an actual networking 
problem, it's either in the hardware (Dell PowerEdge 2950 III, built-in 
NICs) or the CentOS 6 kernel.

At this point, I am thinking it's one of the following problems, in 
order of decreasing probability: 1) I am using SolrJ incorrectly. 2) 
There is a SolrJ problem that only appears under specific circumstances 
that happen to exist in my setup. 3) My hardware or OS software has an 
extremely intermittent problem.

What other info can I provide?

Thanks,
Shawn


Re: Setting solrj server connection timeout

Posted by Mark Miller <ma...@gmail.com>.
I'm not sure that timeout will help you here - I believe it's the timeout on
'creating' the connection.

Try setting the socket timeout (setSoTimeout) - that should let you try
sooner.
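
To illustrate the distinction: a connection timeout only bounds how long
establishing the TCP connection may take, while the socket timeout
(SO_TIMEOUT) bounds how long a read may block once connected.  The sketch
below uses plain java.net sockets rather than SolrJ or commons-httpclient,
so it shows the concept and not the SolrJ API.  The class name TimeoutDemo,
the method readTimesOut, and the silent local server are made up for the
example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class TimeoutDemo {
    // Connects to a local server that never sends a byte and reports
    // whether the read gave up because of the socket (read) timeout.
    static boolean readTimesOut(int connectTimeoutMs, int soTimeoutMs) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // Connection timeout: applies only while the connection is being set up.
            client.connect(new InetSocketAddress(loopback, silentServer.getLocalPort()),
                    connectTimeoutMs);
            // Socket timeout: applies to each blocking read after that.
            client.setSoTimeout(soTimeoutMs);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or end-of-stream arrived before the timeout
            } catch (SocketTimeoutException e) {
                return true;  // the read blocked longer than soTimeoutMs
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(15000, 200));
    }
}
```

Here the 15000 ms connection timeout never comes into play because the
connection is established immediately; only the 200 ms socket timeout
limits how long the client waits for a response.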

It looks like perhaps the server is timing out and closing the connection.

I guess all you can do is timeout reasonably (if it takes too long to wait
for the exception) and retry.
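
A bounded retry along those lines might look like the sketch below.  It is
only an illustration: RetryDemo and withRetry are hypothetical names, and
the helper retries on IOException (which includes java.net.SocketException).
In the stack trace from this thread the SocketException arrives wrapped in a
SolrServerException, so real SolrJ code would probably have to inspect the
cause before deciding to retry.

```java
import java.io.IOException;
import java.net.SocketException;
import java.util.concurrent.Callable;

class RetryDemo {
    // Runs the request up to maxAttempts times, pausing between attempts.
    // Only IOException triggers a retry; anything else propagates at once.
    static <T> T withRetry(Callable<T> request, int maxAttempts, long pauseMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs);
                }
            }
        }
        throw last; // every attempt failed with an IOException
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request: fails twice with a reset, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new SocketException("Connection reset");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```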


-- 
- Mark

http://www.lucidimagination.com

Re: Setting solrj server connection timeout

Posted by Shawn Heisey <so...@elyograg.org>.

No response in quite some time, so I'm bringing it up again.  I brought 
up the Exception issue before, and though I did get some responses, I 
didn't feel that I got an answer.

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201112.mbox/%3C4EEAF6E5.9030107@elyograg.org%3E

Thanks,
Shawn