Posted to users@nifi.apache.org by Hendrik Ruijter <He...@verisure.com> on 2022/02/08 15:25:01 UTC
NIFI-8204 in NiFi 1.15.3?
Hello,
We have a 5-node NiFi 1.15.3 cluster with ZooKeeper, and we think we are seeing NIFI-8204.
The nodes were not able to reconnect, and the cluster broke up without recovering. As far as I can tell, all scheduled pipelines executed on the primary node/cluster coordinator.
Could this happen when Microsoft Azure suddenly drops the entire VNET for a period of time?
----->The primary errors in the log, starting at 2/7/2022 10:01:01 PM
2/7/2022 10:01:01 PM 00:36,983 WARN [Heartbeat Monitor Thread-1] o.a.n.c.c.node.NodeClusterCoordinator Requesting that nifi-node-3:8443 reconnect to the cluster due to: Node has a Revision Update Count of 11167 but local value is only 11166. Node appears not to have the appropriate set of Component Revisions
2/7/2022 10:01:01 PM 00:37,084 WARN [Heartbeat Monitor Thread-1] o.a.n.c.c.node.NodeClusterCoordinator Requesting that nifi-node-1:8443 reconnect to the cluster due to: Node has a Revision Update Count of 11167 but local value is only 11166. Node appears not to have the appropriate set of Component Revisions
2/7/2022 10:01:01 PM 00:37,193 WARN [Heartbeat Monitor Thread-1] o.a.n.c.c.node.NodeClusterCoordinator Requesting that nifi-node-4:8443 reconnect to the cluster due to: Node has a Revision Update Count of 11167 but local value is only 11166. Node appears not to have the appropriate set of Component Revisions
2/7/2022 10:01:01 PM 00:37,348 WARN [Heartbeat Monitor Thread-1] o.a.n.c.c.node.NodeClusterCoordinator Requesting that nifi-node-5:8443 reconnect to the cluster due to: Node has a Revision Update Count of 11167 but local value is only 11166. Node appears not to have the appropriate set of Component Revisions
2/7/2022 10:01:09 PM 00:28,221 WARN [Clustering Tasks Thread-3] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message due to: javax.net.ssl.SSLException: Read timed out
2/7/2022 10:01:09 PM 00:39,263 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/7/2022 10:01:09 PM 00:39,264 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-1:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/7/2022 10:01:27 PM 00:28,850 WARN [Clustering Tasks Thread-3] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message due to: javax.net.ssl.SSLException: Read timed out
2/7/2022 10:01:27 PM 00:39,223 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-4:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/7/2022 10:01:27 PM 00:39,223 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/7/2022 10:01:30 PM 00:29,979 WARN [Process Cluster Protocol Request-27] o.a.n.c.p.impl.SocketProtocolListener Failed processing protocol message from nifi-node-1 due to readHandshakeRecord
2/7/2022 10:01:37 PM 00:39,188 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-5:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/7/2022 10:01:36 PM 00:39,187 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/7/2022 10:01:30 PM 00:38,895 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-3:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/7/2022 10:01:30 PM 00:38,894 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
----->Secondary errors in the log? All on the primary node/cluster coordinator, where the pipelines executed. There were many during the night; the last one was at 2/8/2022 7:50:04 AM
2/8/2022 7:50:04 AM 50:00,599 ERROR [Variable Registry Update Thread] o.a.nifi.web.api.ProcessGroupResource Failed to update variable registry
----->The same primary errors again, for the first time since 2/7/2022 10:01:01 PM
2/8/2022 7:50:32 AM 50:23,351 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/8/2022 7:50:32 AM 50:23,352 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-3:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/8/2022 7:50:38 AM 50:26,280 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/8/2022 7:50:38 AM 50:26,280 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-5:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/8/2022 7:51:16 AM 50:19,737 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/8/2022 7:51:16 AM 50:19,737 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-1:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/8/2022 7:51:28 AM 50:29,514 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
2/8/2022 7:51:28 AM 50:29,515 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-4:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster
----->Restarted all nodes and the cluster is alive and kicking! Cool!
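In case it helps anyone compare the two episodes, here is a minimal sketch of how the disconnect events could be tallied per node from a log export. It assumes the coordinator message format seen in the excerpts above ("Event Reported for <host>:<port> -- Node disconnected from cluster"); the names `DISCONNECT` and `count_disconnects` are hypothetical and not part of NiFi.

```python
import re
from collections import Counter

# Assumption: disconnect events look like the NodeClusterCoordinator lines quoted above.
DISCONNECT = re.compile(r"Event Reported for (\S+?):\d+ -- Node disconnected from cluster")


def count_disconnects(lines):
    """Return a Counter mapping each node host name to its number of disconnect events."""
    counts = Counter()
    for line in lines:
        match = DISCONNECT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines copied from the excerpts above: two disconnects and one heartbeat warning.
sample = [
    "2/7/2022 10:01:09 PM 00:39,264 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-1:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster",
    "2/8/2022 7:51:16 AM 50:19,737 ERROR [Reconnect to Cluster] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for nifi-node-1:8443 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster",
    "2/7/2022 10:01:09 PM 00:28,221 WARN [Clustering Tasks Thread-3] o.apache.nifi.controller.FlowController Failed to send heartbeat due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'HEARTBEAT' protocol message due to: javax.net.ssl.SSLException: Read timed out",
]

for node, n in sorted(count_disconnects(sample).items()):
    print(node, n)
```

To run it against a real export, pass an open file instead of `sample`; the path depends on the installation.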