Posted to dev@zookeeper.apache.org by "Norbert Kalmar (JIRA)" <ji...@apache.org> on 2019/03/26 09:25:00 UTC

[jira] [Updated] (ZOOKEEPER-3333) Detect if txnlogs and / or snapshots is deleted under a running ZK instance

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Norbert Kalmar updated ZOOKEEPER-3333:
--------------------------------------
    Affects Version/s:     (was: 3.6.0)

> Detect if txnlogs and / or snapshots is deleted under a running ZK instance
> ---------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3333
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3333
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.5.5, 3.4.14
>            Reporter: Norbert Kalmar
>            Priority: Major
>
> ZK does not notice if txnlogs are deleted from its dataDir; it just keeps running, writing txns to the buffer. Then, when ZK is restarted, it loses all data.
> To reproduce:
> I ran a 3-node ZK ensemble, deleted the dataDir of just one instance, then wrote some data. It turns out that it does not write the transaction to disk. ZK keeps everything in memory until it “feels like” it’s time to persist it to disk. So it doesn’t even notice the file is deleted, and when it tries to flush, I imagine it just fails and keeps the data in the buffer.
> So anyway, I restarted the instance, and it got the snapshot + latest txn logs from the other nodes, as expected. It also wrote them to dataDir, so now every node had the dataDir again.
> So deleting from one node is fine (again, as expected, they will sync after a restart).
> Then I deleted the dataDir of all 3 nodes under running instances. Until the restart, it worked fine (of course my buffer was filling up; I did not test up to the point where it overflowed).
> But after restart, I got a fresh new ZK with all my znodes gone.
> For starters, I think ZK should detect if the file it is appending to is removed.
> What should ZK do? At least log a warning message. The question is: should it try to create a new file? Or try to get it from other nodes? Or just fail instantly? Or restart itself and see if it can sync?
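
As an illustration of the "detect and at least warn" idea in the description, here is a minimal Java sketch. It is not ZooKeeper code: the class `DeletedLogCheck`, its methods, and the file name `log.1` are hypothetical names made up for this example. It assumes a POSIX filesystem such as Linux, where a write to an already-open file usually still succeeds after its path has been unlinked, which may be why the deletion goes unnoticed. The sketch checks whether the path still exists before each append and prints a warning if it does not.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not part of ZooKeeper): an append-only log writer
 *  that notices when its backing file has been removed from disk. */
class DeletedLogCheck implements AutoCloseable {
    private final Path path;
    private final OutputStream out;

    DeletedLogCheck(Path path) throws IOException {
        this.path = path;
        // Create the file if needed and keep it open for appending.
        this.out = Files.newOutputStream(path,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Returns true while the path we opened is still present on disk. */
    boolean backingFileExists() {
        return Files.exists(path);
    }

    /** Appends one record, warning first if the file has disappeared. */
    void append(String record) throws IOException {
        if (!backingFileExists()) {
            // Assumption: a warning is the minimum reaction; a real server
            // might instead recreate the file, resync, or shut down.
            System.err.println("WARN: log file " + path + " was removed while open");
        }
        out.write((record + "\n").getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("zk-datadir-demo");
        Path log = dir.resolve("log.1");
        try (DeletedLogCheck writer = new DeletedLogCheck(log)) {
            writer.append("txn-1");
            Files.delete(log);      // simulate deleting the file under a running instance
            writer.append("txn-2"); // triggers the warning; the write itself may still succeed
        }
        Files.delete(dir);          // clean up the temporary directory
    }
}
```

An existence check like this is cheap but only catches deletion of the path; it would not notice the file being replaced by a different one with the same name, so comparing file identity would be a stricter variant.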



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)