Posted to dev@zookeeper.apache.org by "suja s (JIRA)" <ji...@apache.org> on 2012/12/27 07:58:12 UTC

[jira] [Created] (ZOOKEEPER-1612) Zookeeper unable to recover and start once datadir disk is full and disk space cleared

suja s created ZOOKEEPER-1612:
---------------------------------

             Summary: Zookeeper unable to recover and start once datadir disk is full and disk space cleared
                 Key: ZOOKEEPER-1612
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1612
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.4.3
            Reporter: suja s


Once the disk holding the ZooKeeper data directory (dataDir) becomes full, the server process shuts down.
{noformat}
2012-12-14 13:22:26,959 [myid:2] - ERROR [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@276] - Severe unrecoverable error, exiting
java.io.IOException: No space left on device
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:282)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
	at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:56)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
	at org.apache.jute.BinaryOutputArchive.writeBuffer(BinaryOutputArchive.java:119)
	at org.apache.zookeeper.server.DataNode.serialize(DataNode.java:168)
	at org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)
	at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1115)
	at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1130)
	at org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:1130)
	at org.apache.zookeeper.server.DataTree.serialize(DataTree.java:1179)
	at org.apache.zookeeper.server.util.SerializeUtils.serializeSnapshot(SerializeUtils.java:138)
	at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:213)
	at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:230)
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.save(FileTxnSnapLog.java:242)
	at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:274)
	at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:407)
	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:82)
	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:759)
{noformat}

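Later, disk space is freed and ZooKeeper is started again. Startup fails because the server cannot load its database from disk: the trace below shows an EOFException raised in FileTxnLog while deserializing a file header, right after the snapshot is read. (Since the load from disk fails, the server is not able to join its peers in the quorum and get a snapshot diff.)
{noformat}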
2012-12-14 16:20:31,489 [myid:2] - INFO  [main:FileSnap@83] - Reading snapshot ../dataDir/version-2/snapshot.1000000042
2012-12-14 16:20:31,564 [myid:2] - ERROR [main:QuorumPeer@472] - Unable to load database on disk
java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
	at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:504)
	at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
	at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:436)
	at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:428)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:152)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
2012-12-14 16:20:31,566 [myid:2] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:473)
	at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:428)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:152)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
	at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
	at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:504)
	at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
{noformat}
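
The EOFException above is thrown while a file header is being read in FileTxnLog, which suggests that a transaction log file was left empty or cut short when the disk filled up. As a rough diagnostic aid, the sketch below lists log.* files in a version-2 directory that are too short to hold a complete header. It is not part of ZooKeeper: the class name TruncatedTxnLogFinder and the method findTruncatedLogs are hypothetical, and the 16-byte header size is an assumption based on the header being two ints followed by a long. It only reports suspect files and does not modify anything.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not a ZooKeeper class.
public class TruncatedTxnLogFinder {

    // Assumption: the log file header is int magic + int version + long dbid = 16 bytes.
    static final long HEADER_BYTES = 16;

    // Returns the log.* files in versionDir that are shorter than a full header, sorted by path.
    static List<Path> findTruncatedLogs(Path versionDir) throws IOException {
        List<Path> suspects = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(versionDir, "log.*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && Files.size(p) < HEADER_BYTES) {
                    suspects.add(p);
                }
            }
        }
        Collections.sort(suspects);
        return suspects;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TruncatedTxnLogFinder <dataDir>/version-2");
            return;
        }
        for (Path p : findTruncatedLogs(Paths.get(args[0]))) {
            System.out.println("suspect (shorter than header): " + p + " size=" + Files.size(p));
        }
    }
}
```

Running it against the version-2 directory of the affected server would show whether any transaction log is too short to contain a header. Whether such a file can be moved aside safely depends on the state of the rest of the quorum, so the sketch deliberately stops at reporting.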




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira