You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@zookeeper.apache.org by Merlin Morgenstern <me...@gmail.com> on 2016/01/15 17:27:29 UTC

Zookeeper will not join the cluster after removing database post corruption

I am running a 3 node zookeeper ensemble on ubuntu 14.04. After noticing
that node1 was not running, I had to remove the snapshots of this node and
now the server did start, but refuses to join the cluster.

This is the log from zookeeper:

2016-01-15 17:07:28,187 - INFO  [main:QuorumPeerConfig@101] - Reading
configuration from: /etc/zookeeper/conf/zoo.cfg
2016-01-15 17:07:28,189 - INFO  [main:QuorumPeerConfig@334] -
Defaulting to majority quorums
2016-01-15 17:07:28,191 - INFO  [main:DatadirCleanupManager@78] -
autopurge.snapRetainCount set to 3
2016-01-15 17:07:28,191 - INFO  [main:DatadirCleanupManager@79] -
autopurge.purgeInterval set to 0
2016-01-15 17:07:28,192 - INFO  [main:DatadirCleanupManager@101] -
Purge task is not scheduled.
2016-01-15 17:07:28,198 - INFO  [main:QuorumPeerMain@127] - Starting quorum peer
2016-01-15 17:07:28,204 - INFO  [main:NIOServerCnxnFactory@94] -
binding to port 0.0.0.0/0.0.0.0:2181
2016-01-15 17:07:28,213 - INFO  [main:QuorumPeer@913] - tickTime set to 2000
2016-01-15 17:07:28,213 - INFO  [main:QuorumPeer@933] -
minSessionTimeout set to -1
2016-01-15 17:07:28,213 - INFO  [main:QuorumPeer@944] -
maxSessionTimeout set to -1
2016-01-15 17:07:28,213 - INFO  [main:QuorumPeer@959] - initLimit set to 10
2016-01-15 17:07:28,220 - INFO  [main:QuorumPeer@429] - currentEpoch
not found! Creating with a reasonable default of 0. This should only
happen when you are upgrading your installation
2016-01-15 17:07:28,223 - INFO  [main:QuorumPeer@444] - acceptedEpoch
not found! Creating with a reasonable default of 0. This should only
happen when you are upgrading your installation
2016-01-15 17:07:28,228 - INFO
[Thread-1:QuorumCnxManager$Listener@486] - My election bind port:
0.0.0.0/0.0.0.0:3888
2016-01-15 17:07:28,233 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING
2016-01-15 17:07:28,234 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@740] - New
election. My id =  1, proposed zxid=0x0
2016-01-15 17:07:28,235 - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 1
(n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid),
0x0 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,237 - INFO
[WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server
identifier, so dropping the connection: (2, 1)
2016-01-15 17:07:28,237 - INFO
[WorkerSender[myid=1]:QuorumCnxManager@190] - Have smaller server
identifier, so dropping the connection: (3, 1)
2016-01-15 17:07:28,238 - INFO
[zk1/10.0.0.101:3888:QuorumCnxManager$Listener@493] - Received
connection request /10.0.0.103:48431
2016-01-15 17:07:28,239 - INFO
[zk1/10.0.0.101:3888:QuorumCnxManager$Listener@493] - Received
connection request /10.0.0.102:46486
2016-01-15 17:07:28,239 - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3
(n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), LEADING (n.state),
3 (n.sid), 0x15 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,240 - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3
(n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), FOLLOWING
(n.state), 2 (n.sid), 0x15 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,240 - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3
(n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), LEADING (n.state),
3 (n.sid), 0x15 (n.peerEPoch), LOOKING (my state)
2016-01-15 17:07:28,240 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumPeer@738] - FOLLOWING
2016-01-15 17:07:28,241 - INFO
[WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 3
(n.leader), 0x1500000072 (n.zxid), 0x20 (n.round), FOLLOWING
(n.state), 2 (n.sid), 0x15 (n.peerEPoch), FOLLOWING (my state)
2016-01-15 17:07:28,242 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@85] - TCP NoDelay set
to: true
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:zookeeper.version=3.4.5--1, built on 06/10/2013 17:26 GMT
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:host.name=fx1.findix.com
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:java.version=1.8.0_66
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:java.vendor=Oracle Corporation
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:java.home=/usr/lib/jvm/java-8-oracle/jre
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:java.class.path=/etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:java.io.tmpdir=/tmp
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:java.compiler=<NA>
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:os.name=Linux
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:os.arch=amd64
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:os.version=3.19.0-43-generic
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:user.name=zookeeper
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:user.home=/var/lib/zookeeper
2016-01-15 17:07:28,246 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server
environment:user.dir=/
2016-01-15 17:07:28,247 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] -
Created server with tickTime 2000 minSessionTimeout 4000
maxSessionTimeout 40000 datadir /var/lib/zookeeper/version-2 snapdir
/var/lib/zookeeper/version-2
2016-01-15 17:07:28,247 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@63] - FOLLOWING -
LEADER ELECTION TOOK - 13
2016-01-15 17:07:28,253 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Learner@325] - Getting a
snapshot from leader
2016-01-15 17:07:28,262 - INFO
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] -
Snapshotting: 0x16000009b6 to
/var/lib/zookeeper/version-2/snapshot.16000009b6
2016-01-15 17:07:48,724 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] -
Accepted socket connection from /10.0.0.101:58156
2016-01-15 17:07:48,761 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@821] -
Processing stat command from /10.0.0.101:58156
2016-01-15 17:07:48,763 - INFO
[Thread-2:NIOServerCnxn$StatCommand@655] - Stat command output
2016-01-15 17:07:48,763 - INFO  [Thread-2:NIOServerCnxn@1001] - Closed
socket connection for client /10.0.0.101:58156 (no session established
for client)
2016-01-15 17:08:02,003 - WARN
[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@118] - Got zxid
0x16000009b7 expected 0x1
2016-01-15 17:08:02,003 - INFO  [SyncThread:1:FileTxnLog@199] -
Creating new log file: log.16000009b7

The relevant entry seems to be: Got zxid 0x16000009b7 expected 0x1

The configuration of the node was OK I believe, so it must be something
else that is wrong.