Posted to dev@zookeeper.apache.org by "Mahadev konar (JIRA)" <ji...@apache.org> on 2009/11/24 23:51:39 UTC

[jira] Created: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.
--------------------------------------------------------------------------------------------------------------------

                 Key: ZOOKEEPER-596
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-596
             Project: Zookeeper
          Issue Type: Bug
    Affects Versions: 3.2.1
            Reporter: Mahadev konar
            Assignee: Mahadev konar
             Fix For: 3.3.0


It is possible that the last logged zxid reported by a server during leader election is not the last zxid up to which that server can actually load data. A transaction or snapshot may be corrupted, in which case the server does not have valid data up to its last logged zxid. We need to make sure that whatever zxid a server reports as its last logged zxid, it is able to load data up to that zxid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782223#action_12782223 ] 

Mahadev konar commented on ZOOKEEPER-596:
-----------------------------------------

To elaborate on the problem, this is what currently happens:
- Servers read the last logged zxid from the latest log or snapshot and use it in leader election.
- It is quite possible that something in the logs (some transaction lower than the one reported in leader election) is corrupt, so the server does not have sane data up to the zxid it reported.
- This could leave leader election spinning in a loop if the server elected leader cannot actually read its data up to the reported transaction id.

The solution is to have the servers load all their data before they start leader election and only then send the last logged zxid. This way a server can be sure it has valid data up to the zxid it reports in leader election.
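
As an illustration only (this is not code from the patch), the difference between the two approaches can be sketched in Python. The LogEntry record, its CRC check, and both function names are hypothetical stand-ins for the real transaction log; stopping at the first bad record is an assumption made to keep the sketch small.

```python
import zlib
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LogEntry:
    """Hypothetical log record: a zxid, a payload, and a stored checksum."""
    zxid: int
    payload: bytes
    checksum: int

    def is_valid(self) -> bool:
        # A record is considered readable only if its checksum matches.
        return zlib.crc32(self.payload) == self.checksum


def last_logged_zxid(entries: Sequence[LogEntry]) -> int:
    """Old behaviour: trust the zxid of the final record without reading the rest."""
    return entries[-1].zxid if entries else 0


def last_loadable_zxid(entries: Sequence[LogEntry]) -> int:
    """Proposed behaviour: replay the log first and report only what actually loaded."""
    last_good = 0
    for entry in entries:
        if not entry.is_valid():
            break  # assumption: nothing past a corrupt record is usable
        last_good = entry.zxid
    return last_good


def make_entry(zxid: int, payload: bytes, corrupt: bool = False) -> LogEntry:
    """Helper for the example: optionally store a deliberately wrong checksum."""
    crc = zlib.crc32(payload)
    return LogEntry(zxid, payload, crc ^ 1 if corrupt else crc)


log = [make_entry(1, b"a"), make_entry(2, b"b", corrupt=True), make_entry(3, b"c")]
print(last_logged_zxid(log))    # 3: the zxid the server would have advertised
print(last_loadable_zxid(log))  # 1: the zxid it can actually serve data up to
```

With the second record corrupted, a vote based on the first function would advertise a zxid the server cannot back with data, which is the situation described in the bullets above.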





[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Status: Patch Available  (was: Open)



[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792481#action_12792481 ] 

Hudson commented on ZOOKEEPER-596:
----------------------------------

Integrated in ZooKeeper-trunk #634 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/634/])
    . The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted. (mahadev)




[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791546#action_12791546 ] 

Hadoop QA commented on ZOOKEEPER-596:
-------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428210/ZOOKEEPER-596.patch
  against trunk revision 891368.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 12 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/28/console

This message is automatically generated.



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Attachment: ZOOKEEPER-596.patch

An updated patch with access methods for accessing ZKDatabase from ZooKeeperServer. It was accessing members directly, which I think is a bad idea.



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this.



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Status: Open  (was: Patch Available)



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Attachment: ZOOKEEPER-596.patch

A preliminary patch. It does not include tests; I will still be adding tests and cleaning it up.

This patch adds:

- a new class, ZKDatabase, that becomes a top-level member of QuorumPeer and is passed to all the ZooKeeper servers created during the life of a QuorumPeer
- ZKDatabase includes all the APIs needed to modify, use, and load the ZK database

Making ZKDatabase a top-level member of QuorumPeer allows it to be shared across the different ZooKeeper server instances (leader/learner/observer) of a single QuorumPeer.

I will be adding javadocs and cleaning up the patch shortly.
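
A rough Python sketch of the sharing arrangement described in this comment follows. It is not taken from the patch: the class bodies, the load() method, and the make_server() helper are hypothetical, and the database is reduced to a single counter so that the ownership structure is easy to see.

```python
from typing import Iterable


class ZKDatabase:
    """Stand-in for the in-memory database; it only tracks the last applied zxid."""

    def __init__(self) -> None:
        self.last_processed_zxid = 0

    def load(self, zxids_on_disk: Iterable[int]) -> int:
        # Replay every transaction found on disk and remember the last one applied.
        for zxid in zxids_on_disk:
            self.last_processed_zxid = zxid
        return self.last_processed_zxid


class Server:
    """Stand-in for a leader, learner, or observer server instance."""

    def __init__(self, role: str, db: ZKDatabase) -> None:
        self.role = role
        self._db = db  # borrowed from the peer, never created here

    def get_zk_database(self) -> ZKDatabase:
        return self._db


class QuorumPeer:
    """Owns the single database instance for its whole lifetime."""

    def __init__(self) -> None:
        self._db = ZKDatabase()

    def get_zk_database(self) -> ZKDatabase:
        return self._db

    def start(self, zxids_on_disk: Iterable[int]) -> int:
        # Load once, before any election; the result is what the peer would vote with.
        return self._db.load(zxids_on_disk)

    def make_server(self, role: str) -> Server:
        # Every role change hands out the same database object.
        return Server(role, self._db)


peer = QuorumPeer()
vote_zxid = peer.start([1, 2, 3])
follower = peer.make_server("follower")
leader = peer.make_server("leader")
print(follower.get_zk_database() is leader.get_zk_database())  # True
```

Because the peer, not the individual server, owns the database, a change of role does not require reloading data from disk in this sketch.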



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Attachment: ZOOKEEPER-596.patch

This patch adds a test case for using the memory-based ZKDatabase most of the time. The test checks that a server with a corrupted database cannot join the cluster. I am still thinking about whether we should start with an empty database in such a case or just shut down and let the admin figure it out; shutting down means that if the disk is corrupt, an admin can take care of it. For now, I have left the QuorumPeer to exit if it finds that its database is corrupt, and it is up to the admin to sanitize the database (by deleting the database and starting afresh on that node).
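
The exit-on-corruption policy described above could look roughly like the following Python sketch. The exception type, the (zxid, ok) record format, and both function names are hypothetical and only illustrate the decision; they are not the patch's API.

```python
from typing import Iterable, Tuple, Union


class CorruptDatabaseError(Exception):
    """Raised when a record on disk fails validation during loading."""


def load_or_fail(records: Iterable[Tuple[int, bool]]) -> int:
    """Replay (zxid, ok) pairs; refuse to continue past a bad record."""
    last = 0
    for zxid, ok in records:
        if not ok:
            raise CorruptDatabaseError(
                f"record {zxid} failed validation; last good zxid was {last}"
            )
        last = zxid
    return last


def start_peer(records: Iterable[Tuple[int, bool]]) -> Tuple[str, Union[int, str]]:
    """Decide whether the peer may take part in leader election."""
    try:
        zxid = load_or_fail(records)
    except CorruptDatabaseError as err:
        # Current choice in this sketch: stop and leave cleanup to the admin,
        # rather than silently starting from an empty database.
        return ("exit", str(err))
    return ("join_election", zxid)


print(start_peer([(1, True), (2, True)]))       # ('join_election', 2)
print(start_peer([(1, True), (2, False)])[0])   # exit
```

The alternative under consideration, starting from an empty database, would replace the "exit" branch with a reset and a vote of zxid 0.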

this patch includes :
- ZOOKEEPER-629





[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Attachment: ZOOKEEPER-596.patch

looks like the patch got stale.... uploading a new patch that applies to the trunk.






[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Status: Open  (was: Patch Available)



[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Attachment: ZOOKEEPER-596.patch

Patch that addresses Ben's comments. Also removed some unnecessary logging from the tests.



[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792142#action_12792142 ] 

Benjamin Reed commented on ZOOKEEPER-596:
-----------------------------------------

+1 (assuming it passes test)



[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791611#action_12791611 ] 

Hadoop QA commented on ZOOKEEPER-596:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428223/ZOOKEEPER-596.patch
  against trunk revision 891368.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/30/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/30/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/30/console

This message is automatically generated.



[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792127#action_12792127 ] 

Benjamin Reed commented on ZOOKEEPER-596:
-----------------------------------------

Looks really good. Just a couple of comments:

1) I think zkDb should be private.
2) You have get/setZkb() in QuorumPeer. You should make it consistent with ZooKeeper and use get/setZKDatabase.
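
A minimal sketch of what the two review comments above amount to: the database field is kept private and exposed only through accessors named get/setZKDatabase. SketchQuorumPeer and SketchDatabase are hypothetical stand-ins used for illustration, not the classes touched by the patch.

```java
// Hypothetical stand-in for the in-memory database object; not the real class.
class SketchDatabase {
}

// Hypothetical stand-in for the peer class discussed in the review.
class SketchQuorumPeer {
    // Comment 1: the field is private, so callers cannot reach it directly.
    private SketchDatabase zkDb;

    // Comment 2: accessor names follow the get/setZKDatabase convention.
    public SketchDatabase getZKDatabase() {
        return zkDb;
    }

    public void setZKDatabase(SketchDatabase database) {
        this.zkDb = database;
    }
}
```

With this shape, other classes go through getZKDatabase() and setZKDatabase() rather than touching the field, which is the consistency the review asks for.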


[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791566#action_12791566 ] 

Hadoop QA commented on ZOOKEEPER-596:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428212/ZOOKEEPER-596.patch
  against trunk revision 891368.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/29/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/29/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/29/console

This message is automatically generated.


[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792253#action_12792253 ] 

Hadoop QA commented on ZOOKEEPER-596:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428346/ZOOKEEPER-596.patch
  against trunk revision 892001.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/93/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/93/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/93/console

This message is automatically generated.


[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Status: Open  (was: Patch Available)


[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Attachment: ZOOKEEPER-596.patch

After 3 failed attempts, hopefully JIRA will upload this file.

A cleaned-up patch with comments/javadoc. I am still adding tests and trying to fix some tests that are failing.


[jira] Updated: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-596:
------------------------------------

    Status: Patch Available  (was: Open)
