Posted to dev@zookeeper.apache.org by "Germán Blanco (JIRA)" <ji...@apache.org> on 2013/10/02 12:57:23 UTC

[jira] [Updated] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Germán Blanco updated ZOOKEEPER-1777:
-------------------------------------

    Attachment: snaps.tar

Snapshots of the three members of the ZooKeeper ensemble.
The 8 missing nodes in "the follower that is not ok" were created at the end of epoch 1:
 < cZxid = 0x0000010000007d
 ...
 < cZxid = 0x000001000000a9
 while the complete list is:
 ...
 cZxid = 0x0000010000007b
 cZxid = 0x0000010000007d
 ...
 cZxid = 0x000001000000a9
 cZxid = 0x00000200000004
 ...
 4 of the 6 ephemeral owners of these nodes made modifications during epoch 2, which makes me think that this problem is not related to session expiration, but more likely to synchronization after leader election.
 Even though some of the missing znodes were modified in epoch 2, "the follower that is not ok" did not take this as a sign that something was wrong and, e.g., restart and synchronize via snapshot.
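
 For readers decoding the cZxid values above: a zxid is a 64-bit number whose high 32 bits hold the epoch and whose low 32 bits hold a counter within that epoch. The following is only a minimal sketch of that split, applied to the values quoted in this comment; the helper name split_zxid is hypothetical and not part of ZooKeeper.

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter).

    Hypothetical helper for illustration: the epoch is taken from the
    high 32 bits and the counter from the low 32 bits.
    """
    epoch = zxid >> 32
    counter = zxid & 0xFFFFFFFF
    return epoch, counter


# cZxid values quoted in the comment above.
quoted = [
    "0x0000010000007b",
    "0x0000010000007d",
    "0x000001000000a9",
    "0x00000200000004",
]

for text in quoted:
    epoch, counter = split_zxid(int(text, 16))
    print(f"{text}: epoch={epoch} counter={counter}")
```

 Read this way, the missing range 0x0000010000007d to 0x000001000000a9 lies entirely in epoch 1, and 0x00000200000004 is the first listed value in epoch 2.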

> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1777
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.5
>         Environment: Linux, Java 1.7
>            Reporter: Germán Blanco
>            Assignee: Germán Blanco
>             Fix For: 3.4.6, 3.5.0
>
>         Attachments: snaps.tar
>
>
> In a 3-server ensemble, one of the followers does not see some of the ephemeral nodes that are present in the leader and the other follower.
> The 8 missing nodes in "the follower that is not ok" were created at the end of epoch 1; the ensemble is running in epoch 2.


