Posted to dev@zookeeper.apache.org by "Samer Al-Kiswany (JIRA)" <ji...@apache.org> on 2014/08/22 20:18:12 UTC

[jira] [Created] (ZOOKEEPER-2018) ZooKeeper node fails to boot if writes are reordered

Samer Al-Kiswany created ZOOKEEPER-2018:
-------------------------------------------

             Summary: ZooKeeper node fails to boot if writes are reordered
                 Key: ZOOKEEPER-2018
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2018
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.4.6
            Reporter: Samer Al-Kiswany


After studying the steps ZooKeeper takes to update its logs, we found the following bug. It may manifest on file systems with writeback buffering, where writes issued between two sync calls can reach the disk out of order.

If you run the ZooKeeper client script (zkCli.sh) with the following commands:
VALUE="8KB value"  # 8KB in size
create /dir1 $VALUE
create /dir1/dir2 $VALUE

the strace output generated at the ZooKeeper node is:
mkdir(v)
create(v/log)
append(v/log)
trunc(v/log)
…
fdatasync(v/log)
write(v/log)    # 1
write(v/log)    # 2
write(v/log)    # 3
fdatasync(v/log)

The last four calls (three writes followed by an fdatasync) belong to the second create command, which creates /dir1/dir2.

If the last write (#3) reaches the disk before the second write (#2), and the system crashes before #2 reaches the disk, the ZooKeeper node will not boot.
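
To illustrate the crash state described above, here is a minimal Python sketch that enumerates which of the three writes may have reached the disk before a crash. It is a simplified model, not ZooKeeper code: the offsets, sizes, and byte contents in PENDING_WRITES are made up for illustration, and has_hole is a hypothetical stand-in for whatever check fails at boot, which this report does not specify. The model assumes that any subset of the writes issued after the last fdatasync can be durable at crash time, and that unwritten regions read back as zeros.

```python
from itertools import combinations

# Hypothetical layout: three consecutive appends issued after the last
# fdatasync. Offsets and contents are illustrative only and do not reflect
# ZooKeeper's real transaction log format.
PENDING_WRITES = [
    (1, 0, b"A" * 4),  # write #1
    (2, 4, b"B" * 4),  # write #2
    (3, 8, b"C" * 4),  # write #3
]


def crash_states(pending):
    """Yield (persisted_ids, image) for every subset of writes that may be on disk."""
    for k in range(len(pending) + 1):
        for subset in combinations(pending, k):
            size = max((off + len(data) for _, off, data in subset), default=0)
            image = bytearray(size)  # regions never written read back as zeros
            for _, off, data in subset:
                image[off:off + len(data)] = data
            yield tuple(i for i, _, _ in subset), bytes(image)


def has_hole(image):
    """True if the image contains a zero-filled gap (assumed sign of a lost write)."""
    return b"\x00" in image


if __name__ == "__main__":
    for ids, image in crash_states(PENDING_WRITES):
        status = "gap in log" if has_hole(image) else "contiguous"
        print(f"persisted writes {ids}: {status}")
```

In this model, the state where #1 and #3 are persisted but #2 is not leaves a zero-filled gap in the middle of the log, which corresponds to the reordering scenario in this report.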



--
This message was sent by Atlassian JIRA
(v6.2#6252)