Posted to issues@nifi.apache.org by "Joseph Percivall (JIRA)" <ji...@apache.org> on 2016/07/26 17:56:20 UTC

[jira] [Created] (NIFI-2406) Rare start-up problems resulting in all nodes disconnected

Joseph Percivall created NIFI-2406:
--------------------------------------

             Summary: Rare start-up problems resulting in all nodes disconnected
                 Key: NIFI-2406
                 URL: https://issues.apache.org/jira/browse/NIFI-2406
             Project: Apache NiFi
          Issue Type: Bug
            Reporter: Joseph Percivall


While testing PR 678[1], I hit a case where all the nodes were in a disconnected state: each node was still heartbeating but never became connected.

Also in the logs there were ~1000 lines of:

2016-07-26 11:38:07,841 INFO [Leader Election Notification Thread-1] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@24fae8c6 This node has been elected Leader for Role 'Cluster Coordinator'

This message is only logged here[2], in a callback invoked by ZK. There were also many log messages like:

2016-07-26 11:54:07,910 WARN [Clustering Tasks Thread-1] o.a.n.c.c.node.NodeClusterCoordinator Failed to determine which node is elected active Cluster Coordinator: ZooKeeper reports the address as localhost:6001, but there is no node with this address

I believe this is a problem with ZK/NiFi that existed before this PR and is not directly related to the PR being reviewed. I will attach a tar of the three nodes' logs.
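
For anyone going through the logs, a rough sketch like the one below may help gauge how often each of the two messages quoted above appears per node. It is not part of NiFi. The function name summarize and the regular expressions are my own, written only against the two sample lines above, and the log file paths are whatever is passed on the command line.

```python
import re
import sys
from collections import Counter

# Patterns written against the two log messages quoted in this report.
# They are assumptions based on those samples, not an official log format.
ELECTED = re.compile(r"This node has been elected Leader for Role '([^']+)'")
UNKNOWN_COORDINATOR = re.compile(
    r"ZooKeeper reports the address as (\S+), but there is no node with this address"
)


def summarize(lines):
    """Count election notifications per role and unresolved coordinator addresses."""
    elected = Counter()
    unknown = Counter()
    for line in lines:
        match = ELECTED.search(line)
        if match:
            elected[match.group(1)] += 1
            continue
        match = UNKNOWN_COORDINATOR.search(line)
        if match:
            unknown[match.group(1)] += 1
    return elected, unknown


if __name__ == "__main__":
    # Each argument is a path to one node's log file (hypothetical paths).
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            elected, unknown = summarize(handle)
        print(path, dict(elected), dict(unknown))
```

Running it over each node's log gives a per-node count of "elected Leader" notifications by role and of coordinator addresses that ZooKeeper reported but no node matched, which should make it easier to see whether all three nodes show the same pattern.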

[1] https://github.com/apache/nifi/pull/678
[2] https://github.com/apache/nifi/blame/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/leader/election/CuratorLeaderElectionManager.java#L220-L220



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)