Posted to dev@zookeeper.apache.org by "Germán Blanco (JIRA)" <ji...@apache.org> on 2013/10/02 18:16:43 UTC

[jira] [Commented] (ZOOKEEPER-1777) Missing ephemeral nodes in one of the members of the ensemble

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784125#comment-13784125 ] 

Germán Blanco commented on ZOOKEEPER-1777:
------------------------------------------

I have been able to reproduce a similar situation when one of the servers loses its data.
Say we have servers A, B and C.
1 - A, B and C form an ensemble and reach zxid 0x0000010000007b.
2 - C is stopped.
3 - A and B continue until transaction 0x000001000000a9.
4 - A is stopped. B is stopped and loses all data.
5 - B and C are restarted and form an ensemble starting from zxid 0x0000010000007b. They build a different history up to 0x00000200000004.
6 - A is restarted, joins the new ensemble, receives a DIFF and continues working.
7 - The transactions in A from 0x0000010000007d to 0x000001000000a9 may be irrelevant. In any case, they do not contain the creation of the znodes that are part of the history of B and C.
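
As a reading aid for the zxids in the steps above, here is a minimal sketch that splits a zxid into its epoch and counter. It assumes the usual ZooKeeper layout (epoch in the high 32 bits, counter in the low 32 bits); the helper name split_zxid is only illustrative and is not part of ZooKeeper.

```python
# Split a zxid into (epoch, counter), assuming the usual layout:
# high 32 bits = epoch, low 32 bits = counter within that epoch.
def split_zxid(zxid):
    return zxid >> 32, zxid & 0xFFFFFFFF

# The zxids mentioned in the scenario.
scenario = [
    ("step 1, A/B/C", 0x0000010000007b),
    ("step 3, A/B", 0x000001000000a9),
    ("step 5, B/C", 0x00000200000004),
]

for label, zxid in scenario:
    epoch, counter = split_zxid(zxid)
    print("%-14s epoch=%d counter=0x%x" % (label, epoch, counter))
```

Read this way, A's last transaction is still in epoch 1, while B and C have already moved on to epoch 2.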
Since losing data for one server is a possibility in my case, I am considering forcing synchronization via snapshot as the solution for this.
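
To make the idea concrete, here is a toy model of the scenario. It is a sketch under stated assumptions, not ZooKeeper's actual synchronization logic: histories are modelled as plain sets of zxids, counters are assumed to start at 1 in each epoch, and choose_sync is a hypothetical policy that falls back to a snapshot whenever the follower's last zxid is unknown to the leader.

```python
# Toy model only: names and policy are hypothetical, not ZooKeeper code.
def make_zxid(epoch, counter):
    # Assumes epoch in the high 32 bits, counter in the low 32 bits.
    return (epoch << 32) | counter

def choose_sync(follower_last_zxid, leader_history):
    # Hypothetical policy: send a DIFF only if the leader knows the
    # follower's last transaction; otherwise force a snapshot.
    return "DIFF" if follower_last_zxid in leader_history else "SNAP"

# History of B and C after step 5: epoch 1 up to 0x7b, then epoch 2 up to 4.
leader_history = {make_zxid(1, c) for c in range(1, 0x7b + 1)}
leader_history |= {make_zxid(2, c) for c in range(1, 4 + 1)}

a_last = 0x000001000000a9  # A's last zxid after step 3
c_last = 0x0000010000007b  # C's last zxid when it was stopped in step 2

print("A:", choose_sync(a_last, leader_history))
print("C:", choose_sync(c_last, leader_history))
```

In this model A's last zxid is not part of the history built by B and C, so A would be resynchronized from a snapshot instead of receiving a DIFF on top of its divergent log.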
Any help or opinions would be much appreciated.

> Missing ephemeral nodes in one of the members of the ensemble
> -------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1777
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1777
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.5
>         Environment: Linux, Java 1.7
>            Reporter: Germán Blanco
>            Assignee: Germán Blanco
>             Fix For: 3.4.6, 3.5.0
>
>         Attachments: snaps.tar
>
>
> In a 3-server ensemble, one of the followers doesn't see some of the ephemeral nodes that are present in the leader and the other follower.
> The 8 nodes missing from "the follower that is not ok" were created at the end of epoch 1; the ensemble is now running in epoch 2.


