Posted to dev@zookeeper.apache.org by "Joe Wang (JIRA)" <ji...@apache.org> on 2016/09/29 18:57:20 UTC

[jira] [Updated] (ZOOKEEPER-2605) Snapshot generation fills up disk space due to high volume of requests.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Wang updated ZOOKEEPER-2605:
--------------------------------
    Description: 
Not sure if this is a bug or just a consequence of a design decision.

Recently we had an issue where faulty clients were issuing create requests at an abnormally high rate, which caused ZooKeeper to generate more snapshots than our cron job could clean up. This filled up the disk on our ZooKeeper hosts and brought the cluster down.

Is there a reason why ZooKeeper uses a write-ahead log instead of only flushing successful transactions to disk? If only successful transactions were flushed and counted towards snapCount, then even a client spamming requests to create a node that already exists wouldn't cause a flood of snapshots to be persisted to disk.
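
To make the concern concrete, here is a simplified sketch of the kind of counting I mean. The class and method names are made up for illustration; this is not ZooKeeper's actual SyncRequestProcessor code, just my understanding of the behaviour we observed:

{code:java}
// Simplified sketch of snapCount-style snapshot rolling. Names are
// illustrative only, not ZooKeeper's real API.
import java.util.concurrent.atomic.AtomicInteger;

public class SnapshotTrigger {
    private final int snapCount;                  // e.g. the snapCount setting
    private final AtomicInteger loggedTxns = new AtomicInteger();

    public SnapshotTrigger(int snapCount) {
        this.snapCount = snapCount;
    }

    // Called for every transaction appended to the write-ahead log.
    public void onTxnLogged(boolean succeeded) {
        // If every logged transaction counts, even ones that fail
        // (e.g. a create on a path that already exists), then a client
        // spamming duplicate creates still advances the counter and
        // forces frequent snapshots.
        if (loggedTxns.incrementAndGet() >= snapCount) {
            loggedTxns.set(0);
            takeSnapshot();
        }
        // What I'm suggesting: only count when succeeded == true, so a
        // flood of failing requests never triggers extra snapshots.
    }

    private void takeSnapshot() {
        // Stand-in for serializing the in-memory tree to a new snapshot file.
        System.out.println("rolling a new snapshot file");
    }
}
{code}

With the counting moved behind a success check, the write-ahead log could still record every proposal for recovery, but snapshot frequency would track actual state changes rather than raw request volume.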

  was:
Not sure if this is a bug or just a consequence of a design decision.

Recently we had an issue where faulty clients were issuing create requests at an abnormally high rate, which caused ZooKeeper to generate more snapshots than our cron job could clean up. This filled up the disk on our ZooKeeper hosts and brought the cluster down.

Is there a reason why ZooKeeper uses a write-ahead log instead of only flushing successful transactions to disk?


> Snapshot generation fills up disk space due to high volume of requests.
> -----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2605
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2605
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.5
>            Reporter: Joe Wang
>            Priority: Minor
>
> Not sure if this is a bug or just a consequence of a design decision.
> Recently we had an issue where faulty clients were issuing create requests at an abnormally high rate, which caused ZooKeeper to generate more snapshots than our cron job could clean up. This filled up the disk on our ZooKeeper hosts and brought the cluster down.
> Is there a reason why ZooKeeper uses a write-ahead log instead of only flushing successful transactions to disk? If only successful transactions were flushed and counted towards snapCount, then even a client spamming requests to create a node that already exists wouldn't cause a flood of snapshots to be persisted to disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)