Posted to dev@zookeeper.apache.org by "Mahadev konar (JIRA)" <ji...@apache.org> on 2009/11/25 00:03:39 UTC

[jira] Commented: (ZOOKEEPER-596) The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782223#action_12782223 ] 

Mahadev konar commented on ZOOKEEPER-596:
-----------------------------------------

To elaborate on the problem, this is what currently happens:
- Servers read the last logged zxid from the last log or snapshot and use it in leader election.
- It is quite possible that something in the logs (some transaction lower than the one reported in leader election) is corrupt, so the server does not have sane data up to the zxid it reported.
- This could lead to leader election spinning in a loop if the server elected leader cannot actually read its data up to the reported transaction id.

The solution is to have the servers load all of their data before they start leader election, and only then send the last logged zxid. This way a server can be sure that it has valid data up to the last zxid it actually reports in leader election.
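
A rough sketch of the idea, not the actual ZooKeeper code: replay the log on top of the snapshot, stop at the first entry that fails validation, and report only the zxid reached so far. Everything named here (LastLoadableZxid, TxnEntry, entry, lastLoadableZxid, and the use of a CRC32 checksum per entry) is made up for illustration and assumes each log entry carries a checksum recorded at write time.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

public class LastLoadableZxid {

    /** Hypothetical log entry: zxid, payload, and the checksum stored when it was written. */
    static final class TxnEntry {
        final long zxid;
        final byte[] payload;
        final long storedChecksum;

        TxnEntry(long zxid, byte[] payload, long storedChecksum) {
            this.zxid = zxid;
            this.payload = payload;
            this.storedChecksum = storedChecksum;
        }
    }

    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    /** Test helper: builds an entry and optionally damages the payload after checksumming. */
    static TxnEntry entry(long zxid, String payload, boolean corrupt) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        long sum = checksum(bytes);
        if (corrupt) {
            bytes[0] ^= 0x1; // flip one bit so the stored checksum no longer matches
        }
        return new TxnEntry(zxid, bytes, sum);
    }

    /**
     * Replays the log in order, starting from the snapshot's zxid, and returns
     * the zxid of the last entry that validated. Replay stops at the first
     * corrupt entry, because later entries cannot be applied on top of a gap.
     */
    static long lastLoadableZxid(long snapshotZxid, List<TxnEntry> log) {
        long last = snapshotZxid;
        for (TxnEntry e : log) {
            if (e.zxid <= last) {
                continue; // already covered by the snapshot
            }
            if (checksum(e.payload) != e.storedChecksum) {
                break; // corrupt entry: nothing after this point is usable
            }
            last = e.zxid;
        }
        return last;
    }

    public static void main(String[] args) {
        List<TxnEntry> log = Arrays.asList(
                entry(101, "create /a", false),
                entry(102, "create /b", true),   // corrupted on disk
                entry(103, "create /c", false));

        // What the server would report today: the zxid of the last entry in the log.
        System.out.println("last logged zxid:   " + log.get(log.size() - 1).zxid);
        // What it should report: the last zxid it could actually load.
        System.out.println("last loadable zxid: " + lastLoadableZxid(100L, log));
    }
}
```

With a snapshot at zxid 100 and entry 102 corrupted, the last entry in the log is 103 but the server can only load up to 101, so 101 is the value it should send in leader election.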



> The last logged zxid calculated by zookeeper servers could cause problems in leader election if data gets corrupted.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-596
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-596
>             Project: Zookeeper
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>            Reporter: Mahadev konar
>            Assignee: Mahadev konar
>             Fix For: 3.3.0
>
>
> It is possible that the last logged zxid reported by a server during leader election is not the last zxid up to which that server can actually load data. It is quite possible that some transaction or snapshot gets corrupted and the servers do not have valid data up to the last logged zxid. We need to make sure that servers are able to load data up to whatever they report as their last logged zxid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.