You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@zookeeper.apache.org by "Benjamin Reed (JIRA)" <ji...@apache.org> on 2009/11/17 21:58:39 UTC

[jira] Created: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

ZooKeeper can revert to old data when a snapshot is created outside of normal processing
----------------------------------------------------------------------------------------

Key: ZOOKEEPER-582
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
Project: Zookeeper
Issue Type: Bug
Components: server
Affects Versions: 3.2.1
Reporter: Benjamin Reed
Fix For: 3.2.2

when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.

so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.

normal operation of zookeeper will never result in this situation.

this does not effect standalone zookeeper.

a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-582:
------------------------------------

    Hadoop Flags: [Reviewed]

+1 great job mahadev!

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582_3.1.patch

latest patch for the 3.1 branch. I ran the tests and they pass on this branch as well.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582.patch

this patch fixes the issue. Ill test out the patch tomm.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment:     (was: ZOOKEEPER-582_3.1.patch)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582_3.1.patch

looks like am really tired. this time i think its the correct file!

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

      Resolution: Fixed
    Release Note: fixed bug in zookeeper that can lead zookeeper to revert to old data when a snapshot is created without a corresponding log.
          Status: Resolved  (was: Patch Available)

I just committed this to 3.1, 3.2 and trunk. thanks ben!

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582.patch

this patch includes the patch and the test for trunk. ill upload combined patches for 3.1 and 3.2 branch.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-582:
-----------------------------------

             Priority: Blocker  (was: Major)
    Affects Version/s: 3.1.1
        Fix Version/s: 3.1.2

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Status: Open  (was: Patch Available)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment:     (was: ZOOKEEPER-582_3.1.patch)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582.patch

addressed ben's comments in this patch.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780327#action_12780327 ] 

Mahadev konar commented on ZOOKEEPER-582:
-----------------------------------------

for 1) good catch.. i missed that

for 2) good point....ill fix that .... 

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780317#action_12780317 ] 

Hadoop QA commented on ZOOKEEPER-582:
-------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425536/ZOOKEEPER-582.patch
  against trunk revision 882313.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/68/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/68/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/68/console

This message is automatically generated.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780177#action_12780177 ] 

Hadoop QA commented on ZOOKEEPER-582:
-------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425503/ZOOKEEPER-582.patch
  against trunk revision 881882.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/67/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/67/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/67/console

This message is automatically generated.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582_3.1.patch

attached the wrong patch for 3.1 .. attaching again.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780948#action_12780948 ] 

Hudson commented on ZOOKEEPER-582:
----------------------------------

Integrated in ZooKeeper-trunk #546 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/546/])
    . ZooKeeper can revert to old data when a snapshot is created outside of normal processing (ben reed and mahadev via mahadev)


> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780737#action_12780737 ] 

Hadoop QA commented on ZOOKEEPER-582:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425653/ZOOKEEPER-582.patch
  against trunk revision 882313.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/71/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/71/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/71/console

This message is automatically generated.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Status: Patch Available  (was: Open)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582.patch

this patch fixes the issue with FLE test. Ill upload the other patches for 3.1 and 3.2 as soon as hudson is done running this.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582.patch

looks like my eclipse settings added tabs to the indenation. fixed it in this patch.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582_3.2.patch

a patch for 3.2 branch. 

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Status: Patch Available  (was: Open)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-582:
------------------------------------

    Status: Open  (was: Patch Available)

looks good mahadev just two things:

1) (minor) in getLastLoggedZxid() you should be useing maxLogZxid instead of calling getLastLoggedZxid() again.

2) when doing the sanity check with the leaders zxid you should be checking epochs not zxids. it is possible for a follower to see something later and have to truncate from the same epoch, put a follower should never see a later epoch.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-582:
------------------------------------

    Attachment: test.patch

this patch reproduces the problems outlined in this issue.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-582:
-----------------------------------

    Fix Version/s: 3.3.0

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Status: Open  (was: Patch Available)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582_3.1.patch

a patch for 3.1 branch.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779149#action_12779149 ] 

Patrick Hunt commented on ZOOKEEPER-582:
----------------------------------------

As Ben mentioned we will never see this situation during normal operation of ZK.

The case where we did see this was a result of a user running the migration tool that we provide to upgrade from version 2 to version 3 of ZooKeeper. The tool migrates the data by writing a single snapshot file where the zxid is maintained (it does not write a log file). As a result of the scenario Ben mentioned (snap with no associated log file) this could cause this bug to occur. If you have run the migration tool, documented here:
http://hadoop.apache.org/zookeeper/docs/r3.0.0/releasenotes.html#migration_data
you can verify whether or not you have this situation by looking at your ZooKeeper datadirectory

Here's an example

-rw-r--r--  1 root search 67108880 Nov 17 19:31 log.300022b61
-rw-r--r--  1 root search 67108880 Nov 17 19:38 log.3000292d0
-rw-r--r--  1 root search  3646608 Nov  5 12:13 snapshot.1db5df6e2d6
-rw-r--r--  1 root search  3616579 Nov 17 19:31 snapshot.3000292c9
-rw-r--r--  1 root search  3616708 Nov 17 19:38 snapshot.300038d32

where the files are of the form <file>.<epoch><xid> 
epoch and xid both being 4 byte values represented as hex

Notice that the snapshot.1db5df6e2d6 has epoch of 0x1db, while the other
files have epoch of 0x3, this is the scenario described in the description of this
JIRA. (there is no log file associated with epoch 0x1db)

If you see this in your datadir - a snapshot with an epoch where there are no log files with
this same epoch, then this bug pertains.  If you see snapshots of a particular epoch
and log files with the same epoch then this bug does NOT pertain.


> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780323#action_12780323 ] 

Hadoop QA commented on ZOOKEEPER-582:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425538/ZOOKEEPER-582.patch
  against trunk revision 882313.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/69/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/69/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/69/console

This message is automatically generated.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Attachment: ZOOKEEPER-582_3.2.patch

latest patch for 3.2 branch. I ran the tests and they pass.

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.1.2, 3.2.2, 3.3.0
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.2.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-582:
--------------------------------------

    Assignee: Mahadev konar

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.1, 3.2.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Status: Patch Available  (was: Open)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Assignee: Mahadev konar
>            Priority: Blocker
>             Fix For: 3.2.2, 3.3.0, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (ZOOKEEPER-582) ZooKeeper can revert to old data when a snapshot is created outside of normal processing

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-582:
------------------------------------

    Status: Patch Available  (was: Open)

> ZooKeeper can revert to old data when a snapshot is created outside of normal processing
> ----------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-582
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-582
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.1, 3.1.1
>            Reporter: Benjamin Reed
>            Priority: Blocker
>             Fix For: 3.2.2, 3.1.2
>
>         Attachments: test.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582.patch, ZOOKEEPER-582_3.1.patch, ZOOKEEPER-582_3.2.patch
>
>
> when zookeeper starts up it will restore the most recent state (latest zxid) it finds in the data directory. unfortunately, in the quorum version of zookeeper updates are logged using an epoch based on the latest log file in a directory. if there is a snapshot with a higher epoch than the log files, the zookeeper server will start logging using an epoch one higher than the highest log file.
> so if a data directory has a snapshot with an epoch of 27 and there are no log files, zookeeper will start logging changes using epoch 1. if the cluster restarts the state will be restored from the snapshot with the epoch of 27, which in effect, restores old data.
> normal operation of zookeeper will never result in this situation.
> this does not effect standalone zookeeper.
> a fix should make sure to use an epoch one higher than the current state, whether it comes from the snapshot or log, and should include a sanity check to make sure that a follower never connects to a leader that has a lower epoch than its own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.