Posted to dev@zookeeper.apache.org by "Lasaro Camargos (JIRA)" <ji...@apache.org> on 2016/12/22 17:03:58 UTC

[jira] [Created] (ZOOKEEPER-2653) epoch files do not match snapshots and logs

Lasaro Camargos created ZOOKEEPER-2653:
------------------------------------------

             Summary: epoch files do not match snapshots and logs
                 Key: ZOOKEEPER-2653
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2653
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.4.9
         Environment: Linux  3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux (centos)
            Reporter: Lasaro Camargos


Hi all.
After shutting ZooKeeper down and upgrading to CentOS 7, ZooKeeper would not start and failed with the following exception:

Removing file: Dec 19, 2016 10:55:08 PM /hedvig/hpod/log/version-2/log.300ee0308
Removing file: Dec 19, 2016 7:11:23 PM  /hedvig/hpod/data/version-2/snapshot.300ee0307
java.lang.RuntimeException: Unable to run quorum server
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558)
        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
        at com.hedvig.hpod.service.PodnetService$1.run(PodnetService.java:2262)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: The current epoch, 3, is older than the last zxid, 17179871862
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:539)
        ... 4 more
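
For readers decoding the message above: a zxid is a 64-bit value whose high 32 bits hold the epoch and whose low 32 bits hold a counter. The sketch below splits the reported zxid by hand. `ZxidDecode` is a hypothetical helper name, not a ZooKeeper class, and the comparison in `main` is only an assumption about what the check in `loadDataBase` amounts to.

```java
// Minimal sketch: split a zxid into its epoch and counter halves.
// ZxidDecode is a hypothetical helper, not part of ZooKeeper.
final class ZxidDecode {
    // The high 32 bits of a zxid hold the epoch.
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    // The low 32 bits hold the per-epoch transaction counter.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static void main(String[] args) {
        long lastZxid = 17179871862L; // value from the IOException above
        long currentEpoch = 3L;       // value from the IOException above

        // prints: zxid=0x400000a76 epoch=4 counter=2678
        System.out.printf("zxid=0x%x epoch=%d counter=%d%n",
                lastZxid, epochOf(lastZxid), counterOf(lastZxid));

        // Assumed shape of the failing check: the epoch embedded in the
        // last zxid (4) is newer than the stored current epoch (3).
        System.out.println("mismatch=" + (epochOf(lastZxid) > currentEpoch));
    }
}
```

Under that reading, the last zxid belongs to epoch 4 while the stored current epoch is still 3.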

All logs are empty, and the following snapshot and commit log files exist:

find .
.
./log
./log/version-2
./log/version-2/log.40000010a
./log/version-2/log.300ef712b
./log/version-2/log.300f0659e
./log/version-2/.ignore
./data
./data/version-2
./data/version-2/snapshot.400000109
./data/version-2/currentEpoch
./data/version-2/acceptedEpoch
./data/version-2/snapshot.300ef712a
./data/version-2/snapshot.300f0659d
./data/myid.bak
./data/myid
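
The suffix of each snapshot and log file name is a zxid written in hexadecimal, so the epoch each file belongs to can be read off the name. The following is a rough sketch of that check for the listing above. `DataDirEpochs` and its methods are hypothetical names introduced for illustration, and the file names are copied from the listing.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch: derive the epoch from snapshot/log file names.
// DataDirEpochs is a hypothetical helper, not part of ZooKeeper.
final class DataDirEpochs {
    // Parse the hex zxid after the last '.', e.g. "snapshot.400000109".
    static long zxidFromFileName(String name) {
        int dot = name.lastIndexOf('.');
        return Long.parseLong(name.substring(dot + 1), 16);
    }

    // The epoch is the high 32 bits of the zxid.
    static long epochOfFile(String name) {
        return zxidFromFileName(name) >> 32;
    }

    public static void main(String[] args) {
        // File names taken from the listing above.
        List<String> names = Arrays.asList(
                "log.40000010a", "log.300ef712b", "log.300f0659e",
                "snapshot.400000109", "snapshot.300ef712a", "snapshot.300f0659d");
        for (String name : names) {
            System.out.printf("%s -> epoch %d%n", name, epochOfFile(name));
        }
    }
}
```

By this reading, `snapshot.400000109` and `log.40000010a` belong to epoch 4, while the other files belong to epoch 3.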


On other nodes we had the same exception, but no commit logs were deleted:
java.lang.RuntimeException: Unable to run quorum server
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558)
        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
        at com.hedvig.hpod.service.PodnetService$1.run(PodnetService.java:2262)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: The current epoch, 3, is older than the last zxid, 17179871862
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:539)

./log
./log/version-2
./log/version-2/log.300f06cfc
./log/version-2/log.300f03890
./log/version-2/.ignore
./data
./data/version-2
./data/version-2/snapshot.300f06cfb
./data/version-2/snapshot.300f06f10
./data/version-2/currentEpoch
./data/version-2/acceptedEpoch
./data/version-2/snapshot.300f0388f
./data/myid.bak
./data/myid


./log
./log/version-2
./log/version-2/log.300f06dbf
./log/version-2/log.300ed96fc
./log/version-2/log.300ef1048
./log/version-2/.ignore
./data
./data/version-2
./data/version-2/snapshot.300f06dbe
./data/version-2/currentEpoch
./data/version-2/acceptedEpoch
./data/version-2/snapshot.300ed96fb
./data/version-2/snapshot.300ef1048
./data/myid.bak
./data/myid

The symptoms look like ZOOKEEPER-1549, but we are running 3.4.9 here.

Any ideas?


