Posted to user@zookeeper.apache.org by Lee Carroll <le...@googlemail.com> on 2017/06/05 09:27:49 UTC

zookeeper ensemble fails to start on aws ec2 start up

When the EC2 instances boot, the ZooKeeper processes start but fail to
agree on a leader: they have no connectivity to each other (see the logs
below). However, after stopping and restarting ZooKeeper, the ensemble has
connectivity and all is well.

My init.d script has chkconfig: 2345 99 9

What is the best practice for starting ZooKeeper on EC2?

Logs from EC2 start-up follow (the processes sit and spin with no
connectivity). On issuing sudo service zookeeper restart, everything is
hunky-dory and a leader is chosen.

myid = 1
2017-06-05 05:06:29,401 [myid:1] - WARN  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 3 at election address ec2xxx.compute-1.amazonaws.com/xx.xxx.xxx.xxx:3888
java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2017-06-05 05:06:29,401 [myid:1] - INFO  [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 60000

myid=2
2017-06-05 05:08:50,276 [myid:2] - WARN  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 3 at election address ec2-xxx.compute-1.amazonaws.com/xx.xxx.xxx.xxx:3888
java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2017-06-05 05:08:50,277 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 60000

myid=3
2017-06-05 05:09:48,057 [myid:3] - WARN  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot open channel to 1 at election address ec2-xxx.compute-1.amazonaws.com/xxx.xxx.xxx.xxx:3888
java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:402)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:762)
2017-06-05 05:09:48,058 [myid:3] - INFO  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 60000

cheers lee c
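
For anyone comparing logs like the ones above across several servers, it can help to reduce each WARN entry to "which server could not reach which peer". The following is only a rough sketch: the regular expression is written against the log lines quoted in this message, and the function and constant names are made up for illustration. It collapses whitespace first, so it tolerates the line wrapping that mail clients introduce.

```python
import re

# Pattern written against the WARN entries quoted in this thread
# (hypothetical helper, not part of ZooKeeper).
FAILED_CHANNEL = re.compile(
    r"\[myid:(\d+)\] - WARN \S+ - Cannot open channel to (\d+) "
    r"at election address (\S+):(\d+)"
)


def failed_channels(log_text):
    """Return (source_id, target_id, address, port) for each failed election connection."""
    # Collapse all whitespace so wrapped log lines become one searchable string.
    flat = " ".join(log_text.split())
    return [
        (int(src), int(dst), addr, int(port))
        for src, dst, addr, port in FAILED_CHANNEL.findall(flat)
    ]


if __name__ == "__main__":
    sample = """
    2017-06-05 05:06:29,401 [myid:1] - WARN
     [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@382] - Cannot
    open channel to 3 at election address
    ec2xxx.compute-1.amazonaws.com/xx.xxx.xxx.xxx:3888
    java.net.SocketTimeoutException: connect timed out
    """
    for src, dst, addr, port in failed_channels(sample):
        print(f"server {src} could not reach server {dst} at {addr}:{port}")
```

Applied to the three excerpts above, this would report 1 failing to reach 3, 2 failing to reach 3, and 3 failing to reach 1, all on the election port 3888.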

Re: zookeeper ensemble fails to start on aws ec2 start up

Posted by Lee Carroll <le...@googlemail.com>.
Hi Ben,
Thanks for replying. Yes, the ports are open in the security group. The
ensemble works after a manual restart and the servers can talk on each
port, so I think that means the config is OK?

In the config I'm using the servers' elastic load balanced addresses (but
the local address for the server the config file is on).
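
For reference, ZooKeeper lists ensemble members in its config as server.N=host:port:port lines, where the first port carries quorum traffic and the second is the election port. The sketch below shows one way to pull those entries out of a config file to see which addresses each node will dial. The hostnames and the 10.0.0.11 address are made-up placeholders, not the real config from this thread, and only the basic host:port:port form is handled.

```python
def parse_servers(cfg_text):
    """Map server id -> (host, quorum_port, election_port) from zoo.cfg-style text."""
    servers = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Only the server.N=... entries matter here; skip everything else.
        if not line.startswith("server.") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        server_id = int(key.strip()[len("server."):])
        host, quorum_port, election_port = value.strip().split(":")[:3]
        servers[server_id] = (host, int(quorum_port), int(election_port))
    return servers


if __name__ == "__main__":
    # Hypothetical config as seen on server 1: its own entry uses a local
    # address, the other entries use externally visible names.
    example_cfg = """
    tickTime=2000
    clientPort=2181
    server.1=10.0.0.11:2888:3888
    server.2=zk2.example.com:2888:3888
    server.3=zk3.example.com:2888:3888
    """
    for sid, (host, quorum, election) in sorted(parse_servers(example_cfg).items()):
        print(f"server.{sid}: host={host} quorum={quorum} election={election}")
```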



On 5 June 2017 at 18:00, Ben Sherman <be...@gmail.com> wrote:

> Can you check to see that all the ports defined in your zookeeper.conf are
> open as listed in the security groups on the servers? By default you need
> connectivity between the servers for ports 2888 and 3888, as well as 2181
> open to all clients.

Re: zookeeper ensemble fails to start on aws ec2 start up

Posted by Ben Sherman <be...@gmail.com>.
Can you check to see that all the ports defined in your zookeeper.conf are
open as listed in the security groups on the servers? By default you need
connectivity between the servers for ports 2888 and 3888, as well as 2181
open to all clients.
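
One quick way to run that check from each server is a plain TCP connect test against the other members. The following is a minimal sketch using only the Python standard library; the function names are made up for illustration, and any hostnames passed in are whatever appears in your own config. Note that a False result only says the connection did not succeed within the timeout; it does not distinguish a blocked port from a process that is not listening yet.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False


def check_peers(hosts, ports=(2181, 2888, 3888), timeout=3.0):
    """Return {(host, port): reachable} for every host/port combination."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

For example, check_peers(["zk2.example.com", "zk3.example.com"]) (placeholder names) run on server 1 right after boot, and again after a manual restart, would show whether the same ports behave differently in the two situations.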
