Posted to issues@zookeeper.apache.org by "Mate Szalay-Beko (Jira)" <ji...@apache.org> on 2020/02/20 09:20:00 UTC

[jira] [Assigned] (ZOOKEEPER-3725) Zookeeper fails to establish quorum with 2 servers using 3.5.6

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mate Szalay-Beko reassigned ZOOKEEPER-3725:
-------------------------------------------

    Assignee: Mate Szalay-Beko

> Zookeeper fails to establish quorum with 2 servers using 3.5.6
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3725
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3725
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.5.6
>            Reporter: Antoine DESSAIGNE
>            Assignee: Mate Szalay-Beko
>            Priority: Major
>         Attachments: failure-3.5.6.txt, success-3.4.14.txt, success-3.5.6.txt
>
>
> Hello everyone,
> We noticed that ZooKeeper 3.5.6 regularly fails to establish quorum on a new deployment (approximately 50% of the time).
> We reduced the reproduction steps to the bare minimum. Consider the following docker-compose.yml file:
> {noformat}
> version: '2'
> services:
>   orchestrator1.cameltest.int:
>     image: zookeeper:3.5.6
>     environment:
>       ZOO_MY_ID: 1
>       ZOO_SERVERS: server.1=0.0.0.0:2888:3888 server.2=orchestrator2.cameltest.int:2888:3888
>   orchestrator2.cameltest.int:
>     image: zookeeper:3.5.6
>     environment:
>       ZOO_MY_ID: 2
>       ZOO_SERVERS: server.1=orchestrator1.cameltest.int:2888:3888 server.2=0.0.0.0:2888:3888
> {noformat}
> When launching a brand-new cluster from this file (with {{docker-compose up}}, no previous data), quorum fails about half of the time with 3.5.6 and never with 3.4.14.
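
One way to tell a failed run from a successful one without reading the full logs is to ask each server for its role. The following is a minimal sketch, not part of the original report. It assumes the `srvr` four-letter command is allowed on the servers and that the client port (2181 by default) is reachable from wherever the script runs; the compose file above does not publish any ports, so the host and port arguments are assumptions. The helper names `query_srvr`, `parse_mode` and `quorum_formed` are hypothetical.

```python
import socket
from typing import Iterable, Optional


def query_srvr(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send the `srvr` four-letter command and return the raw text reply.

    Host and port are assumptions: the compose file in the report does not
    publish the client port, so this must run where the port is reachable.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_reply: str) -> Optional[str]:
    """Return the value of the `Mode:` line, or None if the reply has none."""
    for line in srvr_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def quorum_formed(replies: Iterable[str]) -> bool:
    """True if exactly one server reports leader and all others follower."""
    modes = [parse_mode(reply) for reply in replies]
    return modes.count("leader") == 1 and all(
        mode in ("leader", "follower") for mode in modes
    )
```

A server that has not joined a quorum replies without a `Mode:` line, so `parse_mode` returns `None` for it and `quorum_formed` returns `False`. Running `quorum_formed(query_srvr(h) for h in hosts)` after each `docker-compose up` would give a rough success rate per version.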
> You'll find attached 3 logs:
> * a failed run using 3.5.6
> * a successful run using 3.5.6
> * a successful run using 3.4.14
> I don't think it's a docker/docker-compose issue, since the same setup works with 3.4.14 on the same server.
> I'll try each intermediate release to narrow down the version that introduced the problem.
> Unfortunately, I don't know my way around the ZooKeeper code yet. What can I do to help? Thanks!
> PS: Yes, it's strange to have 2 servers, as both are required for the ensemble to work, but it's the smallest repro case.
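
The remark that both servers are required follows from majority-quorum arithmetic. A small sketch, assuming the default majority quorum (no weights or groups configured); the function names are illustrative only:

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can be down while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


# A 2-server ensemble needs both servers and tolerates no failure;
# a 3-server ensemble also needs 2 servers but tolerates one failure.
for n in (2, 3):
    print(n, quorum_size(n), tolerated_failures(n))
```

With two servers the quorum size equals the ensemble size, so any server that fails to connect to its peer during startup prevents the ensemble from forming, which is why this is the smallest setup that exposes the problem.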



--
This message was sent by Atlassian Jira
(v8.3.4#803005)