Posted to dev@nifi.apache.org by Joe Gresock <jg...@gmail.com> on 2016/11/18 15:18:51 UTC

Zookeeper error

I'm upgrading a test 0.x nifi cluster to 1.x using the latest in master as
of today.

I was able to start the 3-node cluster successfully once, but after restarting it,
the following error is spammed in nifi-app.log.

I'm not sure where to start debugging this, and I'm puzzled why it would
work once and then start giving me errors on the second restart.  Has
anyone run into this error?

2016-11-18 15:07:18,178 INFO [main] org.eclipse.jetty.server.Server Started @83426ms
2016-11-18 15:07:18,883 INFO [main] org.apache.nifi.web.server.JettyServer Loading Flow...
2016-11-18 15:07:18,889 INFO [main] org.apache.nifi.io.socket.SocketListener Now listening for connections from nodes on port 9001
2016-11-18 15:07:19,117 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: ip-172-31-33-34.ec2.internal:8443
2016-11-18 15:07:25,781 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
2016-11-18 15:07:25,782 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
2016-11-18 15:07:34,685 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
2016-11-18 15:07:34,685 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
2016-11-18 15:07:34,696 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
2016-11-18 15:07:34,698 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@671a652a Connection State changed to SUSPENDED

2016-11-18 15:07:34,699 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) [curator-framework-2.11.0.jar:na]
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857) [curator-framework-2.11.0.jar:na]
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) [curator-framework-2.11.0.jar:na]
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) [curator-framework-2.11.0.jar:na]
        at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) [curator-framework-2.11.0.jar:na]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_111]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]


-- 
I know what it is to be in need, and I know what it is to have plenty.  I
have learned the secret of being content in any and every situation,
whether well fed or hungry, whether living in plenty or in want.  I can do
all this through him who gives me strength.    *-Philippians 4:12-13*

Re: Zookeeper error

Posted by Joe Gresock <jg...@gmail.com>.
I think the root cause may have been that my flow.xml.gz files were out of
sync: I had scp'ed the same file to all servers, but apparently something
with the controller service IDs caused them to diverge as soon as the nodes
started up.  I'm not sure why that affected ZooKeeper, but as soon as I
deleted all but one of the flow files, the problem went away.
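
For anyone hitting the same symptom, a quick way to check whether the flows have
diverged is to compare the uncompressed flow.xml.gz contents across the nodes
before restarting. A minimal sketch, assuming three nodes reachable over SSH and
NiFi installed under /opt/nifi (hostnames and paths are placeholders):

# compare a fingerprint of the flow on each node
for host in nifi-node1 nifi-node2 nifi-node3; do
  echo -n "$host: "
  ssh "$host" 'gunzip -c /opt/nifi/conf/flow.xml.gz | md5sum'
done

# if they differ, keep the copy you trust and remove the rest so the
# elected cluster coordinator's flow is inherited on startup
ssh nifi-node2 'rm /opt/nifi/conf/flow.xml.gz'
ssh nifi-node3 'rm /opt/nifi/conf/flow.xml.gz'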


Re: Zookeeper error

Posted by Mark Payne <ma...@hotmail.com>.
Joe,

Assuming that you're using an embedded ZooKeeper server, it is not surprising that you saw a lot of
ERROR-level messages about dropped ZK connections. Since you had only 1 of 3 NiFi nodes up,
you had only 1 of 3 ZK servers, so there was no quorum and the node was continually trying to connect
to servers that were not available. Once the other nodes were started, you should be okay.
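
For reference, the ensemble that the embedded ZooKeeper joins is defined by the server.N
entries in conf/zookeeper.properties on each node, and a majority of those servers (2 of 3
in this case) must be running before ZK can elect a leader and serve requests. A rough
sketch of that file, with placeholder hostnames:

clientPort=2181
initLimit=10
syncLimit=5
dataDir=./state/zookeeper
server.1=nifi-node1:2888:3888
server.2=nifi-node2:2888:3888
server.3=nifi-node3:2888:3888

Each node also typically carries its own id (1, 2, or 3) in ./state/zookeeper/myid.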

The log messages that you are seeing there with the odd-looking ports are, I believe, the ephemeral ports
that the client is using for its outgoing connections. These should not need to be opened up in your VMs
(assuming that you're not blocking outbound ports). The last message there indicates that a session was
established with a timeout of 4000 milliseconds, so I don't believe there's any problem with ports being
blocked.
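
If you want to confirm that, one quick check is to list the established connections to the
ZK client port on one of the nodes; something along these lines on Linux (iproute2 or
net-tools assumed):

ss -tn | grep ':2181'
# or: netstat -tn | grep ':2181'
# the high local port numbers (e.g. 47224) are the ephemeral client ends of
# outgoing connections and need no inbound firewall rules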

However, once the nodes have all started up, they shouldn't have problems connecting to each other. Can
you grep your logs for "changed from"? NiFi logs at an INFO level every time the connection
status of a node in the cluster changes. This may shed some light as to why the nodes
were not connecting to the cluster.
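
For example, on each node:

grep "changed from" logs/nifi-app.log*
# each match records a node's connection status changing from one state to another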

Thanks
-Mark




Re: Zookeeper error

Posted by Jeff <jt...@gmail.com>.
Joe,

I'm glad you were able to get the nodes to reconnect, but I'm interested to
know how it got into a state where it couldn't start up previously.  If you
can reproduce the scenario, and provide the full logs and your NiFi
configuration, we can investigate what caused it to get into that state.


Re: Zookeeper error

Posted by Joe Gresock <jg...@gmail.com>.
I waited the 5 minutes of the election process, and then several minutes
beyond that.

Incidentally, when I cleared the state (except zookeeper/myid) from all
the nodes, deleted the flow.xml.gz from all but one of the nodes, and
then restarted the whole cluster, it came back.
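
For anyone searching the archives later, the sequence that worked was roughly the
following; paths are relative to the NiFi install directory and are illustrative
only, and the 5-minute window mentioned above corresponds to the
nifi.cluster.flow.election.max.wait.time property (5 mins by default):

# on every node: stop NiFi and clear local state, keeping only the
# embedded ZooKeeper id file
./bin/nifi.sh stop
mv state/zookeeper/myid /tmp/myid
rm -rf state/*
mkdir -p state/zookeeper && mv /tmp/myid state/zookeeper/myid

# on all nodes except one: remove the local flow copy so the flow held by
# the elected cluster coordinator is inherited on reconnect
rm conf/flow.xml.gz

# then start every node
./bin/nifi.sh start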


Re: Zookeeper error

Posted by Jeff <jt...@gmail.com>.
Hello Joe,

Just out of curiosity, how long did you let NiFi run while waiting for the
nodes to connect?


Re: Zookeeper error

Posted by Joe Gresock <jg...@gmail.com>.
Despite starting up, the nodes now cannot connect to each other, so they're
all listed as Disconnected in the UI.  I see this in the logs:

2016-11-18 15:50:19,080 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new
session at /172.31.33.34:47224
2016-11-18 15:50:19,081 INFO [CommitProcessor:2]
o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bf9
with negotiated timeout 4000 for client /172.31.33.34:47224
2016-11-18 15:50:19,185 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new
session at /172.31.33.34:47228
2016-11-18 15:50:19,186 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new
session at /172.31.33.34:47230
2016-11-18 15:50:19,187 INFO [CommitProcessor:2]
o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfa
with negotiated timeout 4000 for client /172.31.33.34:47228
2016-11-18 15:50:19,187 INFO [CommitProcessor:2]
o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfb
with negotiated timeout 4000 for client /172.31.33.34:47230
2016-11-18 15:50:19,292 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
o.a.zookeeper.server.ZooKeeperServer Client attempting to establish new
session at /172.31.33.34:47234
2016-11-18 15:50:19,293 INFO [CommitProcessor:2]
o.a.zookeeper.server.ZooKeeperServer Established session 0x258781845940bfc
with negotiated timeout 4000 for client /172.31.33.34:47234


However, I definitely did not open any ports similar to 47234 on my NiFi
VMs.  Is there a certain set of ports that need to be open between the
servers?  My understanding was that only 2888, 3888, and 2181 were
necessary for ZooKeeper.
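
For context, the inbound rules I would expect to need between the nodes are just the
ZooKeeper client/quorum/election ports plus the NiFi ports visible in the logs above; a
rough sketch assuming firewalld (adjust for your distro):

firewall-cmd --permanent --add-port=2181/tcp   # ZooKeeper client port
firewall-cmd --permanent --add-port=2888/tcp   # ZooKeeper quorum port
firewall-cmd --permanent --add-port=3888/tcp   # ZooKeeper leader election port
firewall-cmd --permanent --add-port=9001/tcp   # NiFi cluster node protocol port (from the log above)
firewall-cmd --permanent --add-port=8443/tcp   # NiFi HTTPS/UI port
firewall-cmd --reload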


Re: Zookeeper error

Posted by Joe Gresock <jg...@gmail.com>.
It appears that if you try to start up just one node in a cluster with
multiple ZK hosts specified in zookeeper.properties, you get this error
spammed at an incredible rate in your logs.  When I started up all 3 nodes
at once, the error did not appear.
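
(For completeness, the multiple hosts show up in two places: the server.N entries in
zookeeper.properties define the ensemble itself, while nifi.properties carries the
client-side settings, roughly like this, with placeholder hostnames:)

nifi.state.management.embedded.zookeeper.start=true
nifi.zookeeper.connect.string=nifi-node1:2181,nifi-node2:2181,nifi-node3:2181
nifi.cluster.is.node=true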
