Posted to solr-user@lucene.apache.org by Modassar Ather <mo...@gmail.com> on 2014/10/28 07:11:31 UTC

Log message "zkClient has disconnected".

Hi,

I am getting the following INFO log messages many times during indexing.
The indexing process reads records from a database and, using multiple
threads, sends them for indexing in batches.
There are four shards and one embedded ZooKeeper on one of the shards.

org.apache.zookeeper.ClientCnxn$SendThread run
INFO: Client session timed out, have not heard from server in 9276ms for
sessionid <id>, closing socket connection and attempting reconnect
org.apache.solr.common.cloud.ConnectionManager process
INFO: Watcher org.apache.solr.common.cloud.ConnectionManager@3debc153
name:ZooKeeperConnection Watcher:<host>:<port> got event WatchedEvent
state:Disconnected type:None path:null path:null type:None
org.apache.solr.common.cloud.ConnectionManager process
INFO: zkClient has disconnected

Kindly help me understand the possible cause of the ZooKeeper
disconnection.
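
To see how often these timeouts occur and how long the client went without
hearing from the server, the log could be scanned with something like the
Python sketch below. It only assumes the message wording shown above; the
function name and the sample list are made up for illustration and are not
part of Solr or ZooKeeper.

```python
import re

# Matches the ZooKeeper client message quoted above and captures the
# reported silence in milliseconds.
TIMEOUT_RE = re.compile(r"have not heard from server in (\d+)ms")


def silence_durations_ms(log_lines):
    """Return the reported silence (in ms) for each session-timeout message."""
    durations = []
    for line in log_lines:
        match = TIMEOUT_RE.search(line)
        if match:
            durations.append(int(match.group(1)))
    return durations


# Sample lines copied from the log excerpt above (illustration only).
sample = [
    "INFO: Client session timed out, have not heard from server in 9276ms for",
    "INFO: zkClient has disconnected",
]
print(silence_durations_ms(sample))  # prints [9276]
```

Comparing the extracted values against the configured zkClientTimeout may
help show how close the client is running to the limit.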

Thanks,
Modassar

Re: Log message "zkClient has disconnected".

Posted by Modassar Ather <mo...@gmail.com>.
Thanks, Shawn, for your response and the link on GC tuning.

Regards,
Modassar

On Tue, Oct 28, 2014 at 7:01 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 10/28/2014 1:48 AM, Modassar Ather wrote:
> > These Solrcloud instances are 8-core machines with a RAM of 24 GB each
> > assigned to tomcat. The Indexer machine starts with -Xmx16g.
> > All these machines are connected to the same switch.
>
> If you have not tuned your garbage collection, a 16GB heap will be
> enough to create garbage collection pauses that are long enough to
> exceed a 15 second zkClientTimeout, which is the setting that is
> commonly seen in example configs.  I was seeing pauses longer than 12
> seconds with ConcurrentMarkSweep enabled on an 8GB heap, before I tuned
> the GC.  With a 16GB heap, it would even be possible to exceed a 30
> second timeout, which is the default in later releases.
>
> After I tuned the CMS collector, my GC pauses are no longer long enough
> to cause problems.  These are GC settings that have worked for me and
> for others:
>
> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning
>
> Thanks,
> Shawn
>
>

Re: Log message "zkClient has disconnected".

Posted by Mark Miller <ma...@gmail.com>.

> On Oct 28, 2014, at 9:31 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> exceed a 15 second zkClientTimeout

Which is too low even with good GC settings. Anyone whose config still uses 15 or 10 seconds should raise it to at least 30.

- Mark

http://about.me/markrmiller

Re: Log message "zkClient has disconnected".

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/28/2014 1:48 AM, Modassar Ather wrote:
> These Solrcloud instances are 8-core machines with a RAM of 24 GB each
> assigned to tomcat. The Indexer machine starts with -Xmx16g.
> All these machines are connected to the same switch.

If you have not tuned your garbage collection, a 16GB heap will be
enough to create garbage collection pauses that are long enough to
exceed a 15 second zkClientTimeout, which is the setting that is
commonly seen in example configs.  I was seeing pauses longer than 12
seconds with ConcurrentMarkSweep enabled on an 8GB heap, before I tuned
the GC.  With a 16GB heap, it would even be possible to exceed a 30
second timeout, which is the default in later releases.
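
As a rough illustration of that comparison, here is a minimal Python sketch
that checks a list of pause durations against a given timeout. The function
name and the pause values are hypothetical, and the check is a
simplification: it only compares each pause with the full timeout and does
not model the real ZooKeeper client logic.

```python
def pauses_exceeding_timeout(pause_seconds, zk_client_timeout_seconds):
    """Return the pauses that last at least as long as the given timeout."""
    return [p for p in pause_seconds if p >= zk_client_timeout_seconds]


# Hypothetical pause durations in seconds, for illustration only.
observed_pauses = [0.4, 2.5, 12.3, 17.8]

print(pauses_exceeding_timeout(observed_pauses, 15))  # prints [17.8]
print(pauses_exceeding_timeout(observed_pauses, 30))  # prints []
```

With these made-up numbers, one pause would outlast a 15 second timeout
while none would outlast a 30 second one.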

After I tuned the CMS collector, my GC pauses are no longer long enough
to cause problems.  These are GC settings that have worked for me and
for others:

http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

Thanks,
Shawn


Re: Log message "zkClient has disconnected".

Posted by Modassar Ather <mo...@gmail.com>.
Hi Will,
Thanks for your response.

These SolrCloud instances are 8-core machines, each with 24 GB of RAM
assigned to Tomcat. The indexer machine starts with -Xmx16g.
All of these machines are connected to the same switch.

The batch size is 5000 documents, and there are 8 threads, each of which
adds batches of 5000 documents to SolrCloud.
I tried a bigger batch size, but that caused an OutOfMemory error.
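
For reference, splitting the record stream into smaller batches, as Will
suggests, might look like the Python sketch below. This is not the actual
indexer code: the function name and the batch size of 1000 are assumptions
chosen for illustration, and the step that sends each batch to SolrCloud is
left out.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 5000 stand-in records split into batches of 1000 (illustrative sizes).
sizes = [len(batch) for batch in batched(range(5000), 1000)]
print(sizes)  # prints [1000, 1000, 1000, 1000, 1000]
```

Because the generator holds only one batch in memory at a time, a smaller
batch size should lower the memory needed per indexing thread.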

I can see that the SolrCloud instances are not running out of memory or
running low on memory. CPU utilization is also around 50% on each core.

The indexer, on the other hand, is using the maximum of its assigned memory
(-Xmx16g) but is not running out of memory.

Thanks,
Modassar

On Tue, Oct 28, 2014 at 12:18 PM, Will Martin <wm...@gmail.com> wrote:

> Modassar:
>
> Can you share your hw setup?  And what size are your batches? Can you make
> them smaller; it doesn't mean your throughput will necessarily suffer.
> Re
> Will

RE: Log message "zkClient has disconnected".

Posted by Will Martin <wm...@gmail.com>.
Modassar:

Can you share your hardware setup? And what size are your batches? Can you make them smaller? That does not necessarily mean your throughput will suffer.
Re
Will

