Posted to dev@zookeeper.apache.org by "Rahul Yadav (JIRA)" <ji...@apache.org> on 2016/04/06 22:22:25 UTC

[jira] [Commented] (ZOOKEEPER-885) Zookeeper drops connections under moderate IO load

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229018#comment-15229018 ] 

Rahul Yadav commented on ZOOKEEPER-885:
---------------------------------------

Hi

We are also facing the same problem in our 3-node production cluster in one of our data centers.

Setup details:
3-node ZooKeeper cluster running 3.3.6

Used with:
6-node Kafka cluster running 0.8.0
~10 mirror makers for cross-DC replication across ~900 partitions (each mirror maker creates that many znode entries for offset storage)
~10 consumers consuming data across ~900 partitions (same as above)

zoo.cfg:
==================
tickTime=2000

dataDir=/my/data/dir
dataLogDir=/my/data
#both data and data log have dedicated devices

clientPort=2181
initLimit=5
syncLimit=2
maxClientCnxns=200

server.1=myzookeeper01.host.name:2888:3888
server.2=myzookeeper02.host.name:2888:3888
server.3=myzookeeper03.host.name:2888:3888
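
For reference, a quick way to spot-check each server's latency and connection load around the hour boundary (assuming the default clientPort above) is the stat four-letter command; the exact output fields vary slightly between versions:

{noformat}
# Sketch only: query one server's stats over the client port (hostname taken from the config above)
echo stat | nc myzookeeper01.host.name 2181
# The output includes the connected clients, Latency min/avg/max, Outstanding requests and Node count
{noformat}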


Observations:
1. It is happening mostly at the beginning of an hour, i.e. at the 2nd or 3rd second.
2. Some of the client connections (mirror makers/brokers/consumers) get dropped, with the client reporting that the server has been unresponsive for 4000ms (see the note after this list). For example:

[06/04/2016:12:00:03 PDT] [INFO] [org.apache.zookeeper.ClientCnxn main-SendThread(<zookeeperHost>:2181)]: Client session timed out, have not heard from server in 4000ms for sessionid 0x153c64c806416a8, closing socket connection and attempting reconnect

3. This leads to EndOfStreamException combined with CancelledKeyException entries in the ZooKeeper server logs.
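
(A note on the 4000ms figure above: as far as we understand, the ZooKeeper Java client treats the connection as dead once it has not heard from the server for roughly two thirds of the negotiated session timeout, so 4000ms is consistent with a 6000ms session timeout, which is the Kafka 0.8 default. The fragment below is only an illustration of where that timeout is set on the Kafka side; the values are placeholders, not our production config.)

{noformat}
# Illustrative Kafka 0.8 consumer/broker ZooKeeper settings (placeholder values)
zookeeper.connect=myzookeeper01.host.name:2181,myzookeeper02.host.name:2181,myzookeeper03.host.name:2181
zookeeper.session.timeout.ms=6000
zookeeper.connection.timeout.ms=6000
{noformat}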

From the discussion in the thread above, we looked at the following things:
1. We looked at the number of established connections to each ZooKeeper server every second for a few hours (see the polling sketch after this list). This was done to identify whether there was a spike in the number of new connections at the start of an hour, which could fill up the ZooKeeper request queue and hence stop it from responding to pings.

There was no spike at the hour boundaries.

2. We looked at the disk usage on all the ZooKeeper servers to verify whether there was an I/O spike at the hour boundaries which could be delaying fsync and hence making the server unresponsive (also covered in the polling sketch after this list).

No such thing was observed. Data in/out rate remained normal throughout the hour.

3. We looked at the ZooKeeper logs to see whether anything different was happening at the end of the previous hour or the start of the current hour that could explain the unresponsiveness.

No luck there as well.

4. We are using a RollingFileAppender for the ZooKeeper logs that performs hourly rolling, and the archive directory shares the same disk as the dataDir. To verify whether this was the cause, we changed the rolling frequency to once per day (see the log4j fragment after this list) and still observed EndOfStreamExceptions at the start of every hour.

5. We looked at the GC output using -verbose:gc (flags shown after this list) and the GC times are almost constant (~0.13 seconds) throughout the day.

6. We are observing this behavior only in one of our data centers, in spite of more data flowing in and out of the unaffected data center.
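
For reference, below are rough sketches of what we did for some of the items above; hostnames, device names and paths are placeholders rather than our exact production values.

Per-second polling used for items 1 and 2 (connection count and disk utilisation):

{noformat}
#!/bin/bash
# Sketch: poll established client connections and data-device utilisation once per second.
# ZKPORT matches clientPort in zoo.cfg; DEV is assumed to be the device backing dataDir/dataLogDir.
ZKPORT=2181
DEV=sda
while true; do
  ts=$(date '+%F %T')
  # item 1: established connections on the ZooKeeper client port
  conns=$(netstat -tan 2>/dev/null | grep ":${ZKPORT} " | grep -c ESTABLISHED)
  # item 2: latest extended-stats line for the data device from a short iostat sample
  io=$(iostat -dxk "$DEV" 1 2 | awk -v d="$DEV" '$1 == d { line = $0 } END { print line }')
  echo "${ts} established=${conns} ${io}"
  sleep 1
done
{noformat}

Log rolling change for item 4 (hourly to daily), assuming the stock ROLLINGFILE appender name from conf/log4j.properties and an archive path on a separate disk:

{noformat}
log4j.appender.ROLLINGFILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.ROLLINGFILE.File=/var/log/zookeeper/zookeeper.log
log4j.appender.ROLLINGFILE.DatePattern='.'yyyy-MM-dd
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n
{noformat}

GC logging flags for item 5, exported via JVMFLAGS before starting zkServer.sh (the gc.log path is a placeholder):

{noformat}
export JVMFLAGS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/zookeeper/gc.log"
{noformat}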

Worries:

1. Two weeks ago, we had 4 Kafka controller switches in a single day for the associated Kafka cluster, all of them at the beginning of an hour. This pushed the cluster into an unhealthy state where a significant number of partitions went offline. The controller state is stored in ZooKeeper, and we suspect this unresponsiveness to be the cause of the switches.

2. Since it's happening consistently at the beginning of the hour, we want to narrow down the cause and address it before we deploy even more clients. If the suggestion is to expand the ensemble, we will be happy to do that, but we need some figures for devOps to justify the expansion.

Kindly advise what else we should look at.
Do let me know if you need any other info to understand the problem better.

Thanx

> Zookeeper drops connections under moderate IO load
> --------------------------------------------------
>
>                 Key: ZOOKEEPER-885
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.2, 3.3.1
>         Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>            Reporter: Alexandre Hardy
>            Priority: Critical
>             Fix For: 3.5.2, 3.6.0
>
>         Attachments: WatcherTest.java, benchmark.csv, tracezklogs.tar.gz, tracezklogs.tar.gz, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching exactly one node will fail to maintain the connection when the machine is subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on dedicated machines with 45 clients connected, watching exactly one node. The clients would disconnect after moderate load was added to each of the zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the connection.
> Very few other processes were running; the machines were set up to test the connection instability we have experienced. Clients performed no other read or mutation operations.
> Although the documentation states that minimal competing IO load should be present on the zookeeper server, it seems reasonable that moderate IO should not cause problems in this case.


