Posted to user@zookeeper.apache.org by Jared Cantwell <ja...@gmail.com> on 2013/03/29 21:59:55 UTC

Continuous snapshots and out of sync follower

We ran into an issue where one follower in a five-node configuration was
significantly out of sync with the rest of the nodes. Running 'stat'
showed the stale server at zxid 0x900c44edd, while the other servers were
at zxid 0x900c4679b.
I manually ran 'sync /' from the CLI, but that had no effect. While this
was happening, snapshots were being created very frequently (500+ in just a
few hours, without many transactions). The logs for the snapshots are below.
Eventually, restarting the server (without cleaning out the database on
disk) resolved the issue, but we're now trying to understand what happened
and how to prevent it.
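
To put a number on how far behind the stale follower was, here is a minimal
sketch that splits the two zxids quoted above into epoch and counter (a zxid
is a 64-bit value with the epoch in the high 32 bits and a per-epoch counter
in the low 32 bits). The function names split_zxid and zxid_lag are made up
for this illustration and are not part of ZooKeeper.

```python
from typing import Optional, Tuple


def split_zxid(zxid: int) -> Tuple[int, int]:
    """Return (epoch, counter) for a 64-bit zxid."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def zxid_lag(ahead: int, behind: int) -> Optional[int]:
    """Counter difference between two zxids, or None if the epochs differ.

    Hypothetical helper: the counters are only comparable within one epoch.
    """
    ahead_epoch, ahead_counter = split_zxid(ahead)
    behind_epoch, behind_counter = split_zxid(behind)
    if ahead_epoch != behind_epoch:
        return None
    return ahead_counter - behind_counter


if __name__ == "__main__":
    healthy = 0x900c4679b  # zxid reported by the up-to-date servers
    stale = 0x900c44edd    # zxid reported by the stale follower
    print(split_zxid(healthy))       # (9, 12871579)
    print(zxid_lag(healthy, stale))  # 6334
```

Both values are in epoch 9, and the stale follower was 6334 transactions
behind at the moment 'stat' was run.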

At the moment we are using an earlier version of 3.5.0 (revision 1398005).
Has anyone seen this before?

...................
Mar 28 23:33:38 zookeeper - INFO  [SyncThread:7:FileTxnLog@199] - Creating new log file: log.90016c3db
Mar 28 23:33:46 zookeeper - INFO  [Snapshot Thread:FileTxnSnapLog@270] - Snapshotting: 0x90016e292 to /data/zookeeper/10.5.3.61/version-2/snapshot.90016e292
Mar 28 23:33:46 zookeeper - INFO  [SyncThread:7:FileTxnLog@199] - Creating new log file: log.90016e294
Mar 28 23:33:54 zookeeper - INFO  [Snapshot Thread:FileTxnSnapLog@270] - Snapshotting: 0x90016feb4 to /data/zookeeper/10.5.3.61/version-2/snapshot.90016feb4
Mar 28 23:33:54 zookeeper - INFO  [SyncThread:7:FileTxnLog@199] - Creating new log file: log.90016feb5
Mar 28 23:34:04 zookeeper - INFO  [Snapshot Thread:FileTxnSnapLog@270] - Snapshotting: 0x9001721db to /data/zookeeper/10.5.3.61/version-2/snapshot.9001721db
.......................
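
As a rough way to see how many zxids elapse between consecutive snapshots in
an excerpt like the one above, the following sketch pulls the hex zxid out of
each "Snapshotting:" line and prints the differences. The helper names are
hypothetical, and the subtraction assumes all snapshots fall in the same
epoch, as they do in this excerpt.

```python
import re
from typing import List

# Matches the zxid in log lines of the form "Snapshotting: 0x<hex> to <path>".
SNAPSHOT_RE = re.compile(r"Snapshotting: 0x([0-9a-fA-F]+)")


def snapshot_zxids(log_text: str) -> List[int]:
    """Return the snapshot zxids in the order they appear in the log text."""
    return [int(match, 16) for match in SNAPSHOT_RE.findall(log_text)]


def snapshot_gaps(zxids: List[int]) -> List[int]:
    """Differences between consecutive snapshot zxids (same epoch assumed)."""
    return [later - earlier for earlier, later in zip(zxids, zxids[1:])]


if __name__ == "__main__":
    excerpt = """
    Snapshotting: 0x90016e292 to /data/zookeeper/
    Snapshotting: 0x90016feb4 to /data/zookeeper/
    Snapshotting: 0x9001721db to /data/zookeeper/
    """
    print(snapshot_gaps(snapshot_zxids(excerpt)))  # [7202, 8999]
```

Because the pattern only needs the "Snapshotting: 0x..." part of each entry,
it works whether or not the log lines have been wrapped by the mail client.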

~Jared