Posted to issues@zookeeper.apache.org by "Antoine DESSAIGNE (Jira)" <ji...@apache.org> on 2020/02/13 08:37:00 UTC

[jira] [Created] (ZOOKEEPER-3725) Zookeeper fails to establish quorum with 2 servers using 3.5.6

Antoine DESSAIGNE created ZOOKEEPER-3725:
--------------------------------------------

             Summary: Zookeeper fails to establish quorum with 2 servers using 3.5.6
                 Key: ZOOKEEPER-3725
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3725
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.5.6
            Reporter: Antoine DESSAIGNE
         Attachments: failure-3.5.6.txt, success-3.4.14.txt, success-3.5.6.txt

Hello everyone,

We noticed that ZooKeeper 3.5.6 regularly fails to establish quorum on a new deployment (approximately 50% of the time).

We reduced the reproduction steps to the bare minimum. Consider the following docker-compose.yml file:
{noformat}
version: '2'
services:
  orchestrator1.cameltest.int:
    image: zookeeper:3.5.6
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888 server.2=orchestrator2.cameltest.int:2888:3888
  orchestrator2.cameltest.int:
    image: zookeeper:3.5.6
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=orchestrator1.cameltest.int:2888:3888 server.2=0.0.0.0:2888:3888
{noformat}
When launched with {{docker-compose up}}, it fails about half of the time with 3.5.6 and never with 3.4.14.
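
To put a number on "half of the time", the launch can be repeated in a loop. The following is only a minimal sketch, not something we ran for this report. It assumes that {{zkServer.sh}} is on the PATH inside the {{zookeeper}} image, that its {{status}} output contains a {{Mode: ...}} line when the server has joined a quorum, and that 30 seconds is enough time for leader election. The helper names, the settle time and the trial count are all hypothetical.

```python
import re
import subprocess
import time
from typing import List, Optional

# Service names taken from the docker-compose.yml above.
SERVICES = ["orchestrator1.cameltest.int", "orchestrator2.cameltest.int"]


def parse_mode(status_output: str) -> Optional[str]:
    """Return the server mode from a 'Mode: ...' line, or None if there is none."""
    match = re.search(r"^Mode:\s*(\w+)", status_output, re.MULTILINE)
    return match.group(1).lower() if match else None


def quorum_established(modes: List[Optional[str]]) -> bool:
    """True when exactly one server is leader and all the others are followers."""
    return modes.count("leader") == 1 and modes.count("follower") == len(modes) - 1


def run_trial(settle_seconds: int = 30) -> bool:
    """Start the ensemble once, check every server's mode, then tear it down."""
    subprocess.run(["docker-compose", "up", "-d"], check=True)
    try:
        # Assumption: this wait is long enough for leader election to finish.
        time.sleep(settle_seconds)
        modes = []
        for service in SERVICES:
            # Assumption: zkServer.sh is on the PATH inside the container.
            result = subprocess.run(
                ["docker-compose", "exec", "-T", service, "zkServer.sh", "status"],
                capture_output=True,
                text=True,
            )
            modes.append(parse_mode(result.stdout + result.stderr))
        return quorum_established(modes)
    finally:
        subprocess.run(["docker-compose", "down"], check=True)


if __name__ == "__main__":
    trials = 20  # arbitrary sample size
    failures = sum(not run_trial() for _ in range(trials))
    print(f"{failures}/{trials} launches failed to establish quorum")
```

Run from the directory containing the docker-compose.yml file, this would print how many launches ended without one leader and one follower. Changing the image tag makes it possible to compare versions with the same script.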

You'll find 3 logs attached:
* a failure using 3.5.6
* a success using 3.5.6
* a success using 3.4.14

I don't think it's related to a docker/docker-compose issue, as it works with 3.4.14 on the same server.

I'll try to check each intermediate release to narrow down the version where the failure first appears.

Unfortunately, I don't yet know my way around the ZooKeeper code. What can I do to help? Thanks!


