Posted to issues@nifi.apache.org by "Mark Payne (Jira)" <ji...@apache.org> on 2021/02/05 18:21:00 UTC
[jira] [Updated] (NIFI-7866) When cluster coordinator dies, other
nodes may have trouble rejoining cluster
[ https://issues.apache.org/jira/browse/NIFI-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Payne updated NIFI-7866:
-----------------------------
Fix Version/s: 1.13.0
Assignee: Mark Payne
Status: Patch Available (was: Open)
> When cluster coordinator dies, other nodes may have trouble rejoining cluster
> -----------------------------------------------------------------------------
>
> Key: NIFI-7866
> URL: https://issues.apache.org/jira/browse/NIFI-7866
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 1.13.0
>
>
> When the cluster coordinator is lost, the remaining nodes must begin communicating with the newly elected cluster coordinator. This is handled by the StandardFlowService.
> When the `handleReconnectionRequest` method is called and the provided request does not contain the dataflow, the node is expected to connect to the cluster coordinator and request the dataflow:
> {code:java}
> private void handleReconnectionRequest(final ReconnectionRequestMessage request) {
>     try {
>         logger.info("Processing reconnection request from cluster coordinator.");
>         // reconnect
>         ConnectionResponse connectionResponse = new ConnectionResponse(getNodeId(), request.getDataFlow(),
>             request.getInstanceId(), request.getNodeConnectionStatuses(), request.getComponentRevisions());
>         if (connectionResponse.getDataFlow() == null) {
>             logger.info("Received a Reconnection Request that contained no DataFlow. Will attempt to connect to cluster using local flow.");
>             connectionResponse = connect(false, false, createDataFlowFromController());
>         }
>         loadFromConnectionResponse(connectionResponse);
> ... {code}
> However, if the call above to `connect(false, false, createDataFlowFromController())` returns null (which is a valid case), that null value is passed along to `loadFromConnectionResponse`. This method expects a non-null connectionResponse and throws a NullPointerException, resulting in the following stack trace (based on NiFi 1.11.4):
> {code:java}
> 2020-09-29 10:18:53,324 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.NullPointerException
> org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.NullPointerException
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1035)
>     at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:668)
>     at org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:109)
>     at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:415)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException: null
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:989)
>     ... 4 common frames omitted
> {code}
> This results in the node not reconnecting to the cluster.
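
For illustration only, the failure mode described above can be reduced to a small standalone sketch. This is not the NiFi code and not necessarily the patch attached to this issue: `ReconnectSketch`, `Response`, `ReconnectFailedException`, and `resolveResponse` are hypothetical stand-ins. It shows one possible guard, assuming the fallback connect attempt may legitimately return null: check the result before handing it to the loading step, and fail with a descriptive error rather than a NullPointerException.

```java
import java.util.function.Supplier;

// Hypothetical, simplified stand-ins; none of these are NiFi classes.
public class ReconnectSketch {

    /** Minimal stand-in for a connection response that may or may not carry a dataflow. */
    static final class Response {
        final String dataFlow; // null when the request carried no flow

        Response(String dataFlow) {
            this.dataFlow = dataFlow;
        }
    }

    /** Raised when no usable response could be obtained (hypothetical type). */
    static final class ReconnectFailedException extends RuntimeException {
        ReconnectFailedException(String message) {
            super(message);
        }
    }

    /**
     * Picks the response to load. If the initial response has no dataflow,
     * falls back to connectAttempt, which is assumed to be allowed to return null.
     */
    static Response resolveResponse(Response initial, Supplier<Response> connectAttempt) {
        Response response = initial;
        if (response.dataFlow == null) {
            response = connectAttempt.get();
        }
        if (response == null) {
            // Guard: report the failed connection instead of passing null downstream.
            throw new ReconnectFailedException("Could not connect to the cluster: no response received");
        }
        return response;
    }

    public static void main(String[] args) {
        // Fallback succeeds: the response from the connect attempt is used.
        Response ok = resolveResponse(new Response(null), () -> new Response("local-flow"));
        System.out.println("Loaded flow: " + ok.dataFlow);

        // Fallback returns null: the guard turns it into a descriptive failure.
        try {
            resolveResponse(new Response(null), () -> null);
        } catch (ReconnectFailedException e) {
            System.out.println("Handled: " + e.getMessage());
        }
    }
}
```

Whether the right reaction to a null result is to fail fast, as sketched here, or to retry the connection is a design decision this sketch does not settle.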
--
This message was sent by Atlassian Jira
(v8.3.4#803005)