You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Harald Musum (JIRA)" <ji...@apache.org> on 2014/06/10 09:27:02 UTC

[jira] [Created] (ZOOKEEPER-1936) Server exits when unable to create data directory due to race

Harald Musum created ZOOKEEPER-1936:
---------------------------------------

             Summary: Server exits when unable to create data directory due to race 
                 Key: ZOOKEEPER-1936
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1936
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.4.5
            Reporter: Harald Musum
            Priority: Minor


We sometime see issues with ZooKeeper server not starting and seeing this error in the log:


[2014-05-27 09:29:48.248] ERROR   : -               
.org.apache.zookeeper.server.ZooKeeperServerMain    Unexpected exception,
exiting abnormally\nexception=\njava.io.IOException: Unable to create data
directory /home/y/var/zookeeper/version-2\n\tat
org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85)\n\tat
org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)\n\tat
org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)\n\tat
org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)\n\tat
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)\n\tat
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)\n\t
[...]

Stack trace from JVM gives this:

"PurgeTask" daemon prio=10 tid=0x000000000201d000 nid=0x1727 runnable
[0x00007f55d7dc7000]
   java.lang.Thread.State: RUNNABLE
    at java.io.UnixFileSystem.createDirectory(Native Method)
    at java.io.File.mkdir(File.java:1310)
    at java.io.File.mkdirs(File.java:1337)
    at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:84)
    at org.apache.zookeeper.server.PurgeTxnLog.purge(PurgeTxnLog.java:68)
    at
org.apache.zookeeper.server.DatadirCleanupManager$PurgeTask.run(DatadirCleanupManager.java:140)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)

"zookeeper server" prio=10 tid=0x00000000027df800 nid=0x1715 runnable
[0x00007f55d7ed8000]
   java.lang.Thread.State: RUNNABLE
    at java.io.UnixFileSystem.createDirectory(Native Method)
    at java.io.File.mkdir(File.java:1310)
    at java.io.File.mkdirs(File.java:1337)
    at
org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:84)
    at
org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:103)
    at
org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
    at
org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
    at
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
    at
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
[...]

So it seems that when autopurge is used (as it is in our case), it might happen at the same time as starting the server itself. In FileTxnSnapLog() it will check if the directory exists and create it if not. These two tasks do this at the same time, and mkdir fails and server exits the JVM.





--
This message was sent by Atlassian JIRA
(v6.2#6252)