Posted to solr-user@lucene.apache.org by neoman <ha...@gmail.com> on 2013/09/12 16:18:42 UTC
Solr cloud shard goes down after SocketException in another shard
Exception in shard1 (solr01-prod) primary
<09/12/13 13:56:46:635|http-bio-8080-exec-66|ERROR|apache.solr.servlet.SolrDispatchFilter|null:ClientAbortException: java.net.SocketException: Broken pipe
    at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406)
    at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342)
    at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431)
    at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419)
    at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:91)
    at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:214)
    at org.apache.solr.common.util.FastOutputStream.write(FastOutputStream.java:95)
    at org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:470)
    at org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:545)
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:232)
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149)
    at org.apache.solr.common.util.JavaBinCodec.writeSolrDocument(JavaBinCodec.java:320)
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:257)
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149)
    at org.apache.solr.common.util.JavaBinCodec.writeArray(JavaBinCodec.java:427)
    at org.apache.solr.common.util.JavaBinCodec.writeSolrDocumentList(JavaBinCodec.java:356)
Exception in shard1 (solr08-prod) secondary
<09/12/13 13:56:46:729|http-bio-8080-exec-50|ERROR|apache.solr.core.SolrCore|org.apache.solr.common.SolrException: ClusterState says we are the leader (http://solr08-prod:8080/solr/aq-core), but locally we don't think so. Request came from http://solr03-prod.phneaz:8080/solr/aq-core/
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:381)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:243)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
Our configuration
Solr 4.4, Tomcat 7, 3 shards
Thanks for your help
--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-cloud-shard-goes-down-after-SocketException-in-another-shard-tp4089576.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: Solr cloud shard goes down after SocketException in another shard
Posted by Greg Walters <gw...@sherpaanalytics.com>.
Neoman,
Make sure that solr08-prod (or whichever node is the elected leader at the time) isn't going through a stop-the-world garbage collection long enough for its ZooKeeper connection to time out. I've seen that in my cluster when I didn't have parallel GC enabled and the "zkClientTimeout" in my solr.xml was too low.
Thanks,
Greg
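To make that comparison concrete, here is a minimal sketch in Python; it is not part of the original thread. It assumes you have already pulled stop-the-world pause durations (in milliseconds) out of your JVM's GC log by some means. The function name and the sample numbers are hypothetical. It only flags the clear-cut cases where a pause is at least as long as the timeout; shorter pauses may already be risky in practice.

```python
def pauses_over_timeout(pause_ms, zk_client_timeout_ms):
    """Return the GC pauses (ms) that are at least as long as the ZooKeeper client timeout."""
    return [p for p in pause_ms if p >= zk_client_timeout_ms]


# Hypothetical pause durations in milliseconds; these are NOT from the thread.
sample_pauses = [120, 850, 47000, 300, 61000]

# 45000 ms and 60000 ms are the two timeout values mentioned in this thread.
print(pauses_over_timeout(sample_pauses, 45000))
print(pauses_over_timeout(sample_pauses, 60000))
```

If the list comes back non-empty for the leader's log around the time of the error, a long GC pause is a plausible cause worth investigating before anything else.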
RE: Solr cloud shard goes down after SocketException in another shard
Posted by Greg Walters <gw...@sherpaanalytics.com>.
Neoman,
I've got ours set at 45 seconds:
<int name="zkClientTimeout">${zkClientTimeout:45000}</int>
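The `${zkClientTimeout:45000}` form means: use the `zkClientTimeout` system property if it is set (for example with `-DzkClientTimeout=60000` on the JVM command line), otherwise fall back to 45000 ms. A minimal Python sketch of that lookup rule, for illustration only: `resolve` is a hypothetical helper, not Solr code, and it handles just a single placeholder that spans the whole string.

```python
import re

# Matches a whole-string placeholder of the form ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"^\$\{([^:}]+)(?::([^}]*))?\}$")


def resolve(value, properties):
    """Resolve one ${name:default} placeholder against a dict of properties."""
    match = _PLACEHOLDER.match(value)
    if match is None:
        return value  # not a placeholder; return it unchanged
    name, default = match.group(1), match.group(2)
    if name in properties:
        return properties[name]  # an explicitly set property wins
    if default is None:
        raise KeyError(f"no value or default for property {name!r}")
    return default


# No property set: the default after the colon is used.
print(resolve("${zkClientTimeout:45000}", {}))
# Property set (as -DzkClientTimeout=60000 would do): it overrides the default.
print(resolve("${zkClientTimeout:45000}", {"zkClientTimeout": "60000"}))
```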
Re: Solr cloud shard goes down after SocketException in another shard
Posted by neoman <ha...@gmail.com>.
Thanks, Greg. Currently we have 60 seconds (we reduced it recently). I may
have to reduce it again. Can you please share your timeout value?