Posted to users@nifi.apache.org by David Early via users <us...@nifi.apache.org> on 2022/12/09 18:22:35 UTC

Issue with removal and re-add of a cluster node

Hi all,

I have a major issue and am not sure what to do about it.

We have a 3-node cluster.  I was working on a one-off load for some data we
were doing out of sequence, and it resulted in a build-up of flowfiles in a
queue.  To prevent a backpressure situation, I cleared one of the holding
queues, which held about 69k flowfiles.
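
(I did the clear from the UI; for reference, I believe the same thing can be
scripted as a drop request against the REST API, roughly like this.  The
connection id is the one from the log below; the token and self-signed-cert
handling here are assumptions on my part:)

import time
import requests

NIFI = "https://prod-stsc2-3:8443/nifi-api"    # node 3's API, host from the log below
CONN = "8b0ee741-0183-1000-0000-000068704c93"  # connection whose queue I cleared
HDRS = {"Authorization": "Bearer ..."}         # access token (assumption)

# Submit an asynchronous drop request for everything queued on the connection.
resp = requests.post(f"{NIFI}/flowfile-queues/{CONN}/drop-requests",
                     headers=HDRS, verify=False)
resp.raise_for_status()
drop = resp.json()["dropRequest"]

# Poll the drop request until it reports finished, then clean it up.
while not drop["finished"]:
    time.sleep(1)
    drop = requests.get(f"{NIFI}/flowfile-queues/{CONN}/drop-requests/{drop['id']}",
                        headers=HDRS, verify=False).json()["dropRequest"]
requests.delete(f"{NIFI}/flowfile-queues/{CONN}/drop-requests/{drop['id']}",
                headers=HDRS, verify=False)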

During the clear operation, the UI on the node I was using (node 3 in this
case) returned an error stating that the node was no longer part of the
cluster.  It's not clear why that happened.

This, by itself, is not really an issue.  Looking at the logs (at the bottom
of this note), you can see the flowfile drop and the immediate transition of
node 3 to the CONNECTING state.  Subsequently, an error occurred:
"*Disconnecting node due to Failed to properly handle Reconnection request
due to
org.apache.nifi.controller.serialization.FlowSynchronizationException:
Failed to connect node to cluster because local flow controller partially
updated. Administrator should disconnect node and review flow for
corruption*".

When I attempted to re-add the node from the UI, it repeated this error.

I compared users.xml and authorizations.xml on all three nodes; they are
textually identical and have matching md5sums on all three (issues with
users.xml and authorizations.xml were listed online as the usual suspects).
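
(For reference, the comparison was essentially the following, run against
copies of the files pulled from each node; the local paths here are just
examples:)

import hashlib
from pathlib import Path

# Copies of authorizations.xml pulled from each node (example paths).
copies = {
    "node1": Path("node1/authorizations.xml"),
    "node2": Path("node2/authorizations.xml"),
    "node3": Path("node3/authorizations.xml"),
}

digests = {node: hashlib.md5(p.read_bytes()).hexdigest() for node, p in copies.items()}
for node, digest in digests.items():
    print(node, digest)
print("identical" if len(set(digests.values())) == 1 else "MISMATCH")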

I then offloaded the node via the UI to make sure I didn't have anything
stuck in a queue on node 3, hoping that would allow the node to rejoin.
After offloading, I attempted to reconnect, and what happened next gave me a
heart attack: node 3 now showed as connected, but in the UI (accessed via
node 1), ALL PROCESSORS WERE SHOWN AS STOPPED.

A quick review showed that traffic was actually flowing (View status
history showed flowfiles moving, and some of our observation queues showed
traffic on nodes 1 and 2).  Removing node 3 from the cluster restored the UI
to show everything running, but adding it back in showed everything as
stopped.

I tried to start some processors while node 3 was connected; I could start
individual processors, but I could not do a "global" start by right-clicking
on the canvas and selecting "start".  I set up a sample processor to
generate a file on all 3 nodes, and it did generate a new flowfile on node
3.  All of that worked fine.

We have 400+ processors that I would need to hand-start, and I am very
nervous about the cluster deciding to make node 3 the primary, which would
affect some timed processes that we run on the primary node.  As long as I
don't restart the HTTP input feed, I COULD restart all the processors, but
this seems like the wrong approach.
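
For what it's worth, if I do end up scripting it, I assume a bulk start
could be done against the REST API along these lines (the root-group lookup,
token handling, and cert handling here are assumptions on my part, and it
would start everything under the root group, including things we keep
stopped on purpose):

import requests

NIFI = "https://prod-stsc2-1:8443/nifi-api"   # any currently connected node
HDRS = {"Authorization": "Bearer ..."}        # access token (assumption)

# Look up the root process group id, then schedule it (and everything under
# it) to run.
root_id = requests.get(f"{NIFI}/flow/process-groups/root",
                       headers=HDRS, verify=False).json()["processGroupFlow"]["id"]
resp = requests.put(f"{NIFI}/flow/process-groups/{root_id}",
                    json={"id": root_id, "state": "RUNNING"},
                    headers=HDRS, verify=False)
resp.raise_for_status()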

Does anyone have any idea what I did wrong and how to fix it?  The errors
shown in the log below happened before any offloading, but I wondered
whether the offloading caused part of this issue.

Is there anything I can do to re-add the node without having to restart all
the processors manually?

Should I clean up the node and add it as a "new" node and let it completely
sync?

Thanks for any insight!


Dave


-------------------------------
Log:
-------------------------------
*2022-12-08 22:26:20,706 INFO [Drop FlowFiles for Connection
8b0ee741-0183-1000-0000-000068704c93] o.a.n.c.queue.SwappablePriorityQueue
Successfully dropped 69615 FlowFiles (35496003 bytes) from Connection with
ID 8b0ee741-0183-1000-0000-000068704c93 on behalf of user@org.com
<us...@org.com>*
2022-12-08 22:26:20,707 INFO [Process Cluster Protocol Request-29]
o.a.n.c.c.node.NodeClusterCoordinator Status of prod-stsc2-3:8443 changed
from NodeConnectionStatus[nodeId=prod-stsc2-3:8443, state=CONNECTED,
updateId=108] to NodeConnectionStatus[nodeId=prod-stsc2-3:8443,
state=CONNECTING, updateId=114]
2022-12-08 22:26:20,707 INFO [Process Cluster Protocol Request-29]
o.a.n.c.p.impl.SocketProtocolListener Finished processing request
070fe65c-4a77-41d0-9d7f-8f08ede6ac71 (type=NODE_STATUS_CHANGE, length=1217
bytes) from prod-stsc2-1.internal.cloudapp.net in 10 seconds, 842 millis
2022-12-08 22:26:20,750 INFO [Reconnect to Cluster]
o.a.nifi.controller.StandardFlowService Setting Flow Controller's Node ID:
prod-stsc2-3:8443
2022-12-08 22:26:20,751 INFO [Reconnect to Cluster]
o.a.n.c.s.VersionedFlowSynchronizer Synchronizing FlowController with
proposed flow: Controller Already Synchronized = true
2022-12-08 22:26:21,298 INFO [NiFi Web Server-1481911]
o.a.c.f.imps.CuratorFrameworkImpl Starting
2022-12-08 22:26:21,298 INFO [NiFi Web Server-1481911]
org.apache.zookeeper.ClientCnxnSocket jute.maxbuffer value is 4194304 Bytes
2022-12-08 22:26:21,304 INFO [NiFi Web Server-1481911]
o.a.c.f.imps.CuratorFrameworkImpl Default schema
2022-12-08 22:26:21,314 INFO [NiFi Web Server-1481911-EventThread]
o.a.c.f.state.ConnectionStateManager State change: CONNECTED
2022-12-08 22:26:21,322 INFO [NiFi Web Server-1481911-EventThread]
o.a.c.framework.imps.EnsembleTracker New config event received:
{server.1=prod-zkpr-1:2888:3888:participant;0.0.0.0:2181, version=0,
server.3=prod-zkpr-3:2888:3888:participant;0.0.0.0:2181,
server.2=prod-zkpr-2:2888:3888:participant;0.0.0.0:2181}
2022-12-08 22:26:21,322 INFO [NiFi Web Server-1481911-EventThread]
o.a.c.framework.imps.EnsembleTracker New config event received:
{server.1=prod-zkpr-1:2888:3888:participant;0.0.0.0:2181, version=0,
server.3=prod-zkpr-3:2888:3888:participant;0.0.0.0:2181,
server.2=prod-zkpr-2:2888:3888:participant;0.0.0.0:2181}
2022-12-08 22:26:21,323 INFO [Curator-Framework-0]
o.a.c.f.imps.CuratorFrameworkImpl backgroundOperationsLoop exiting
2022-12-08 22:26:21,590 INFO [Reconnect to Cluster]
o.a.n.p.FlowConfigurationArchiveManager Removing old archive file
./conf/archive/20221206T064055+0000_flow.json.gz to reduce storage usage.
currentSize=522676591
2022-12-08 22:26:21,593 INFO [Reconnect to Cluster]
o.a.n.c.s.VersionedFlowSynchronizer Successfully created backup of existing
flow to
/data/nifi/nifi-1.16.3/./conf/archive/20221208T222621+0000_flow.json.gz
before inheriting dataflow
2022-12-08 22:26:21,594 INFO [Reconnect to Cluster]
o.a.n.c.s.VersionedFlowSynchronizer In order to inherit proposed dataflow,
will stop any components that will be affected by the update
2022-12-08 22:26:21,594 INFO [Reconnect to Cluster]
o.a.n.c.s.AffectedComponentSet Stopping the following components:
AffectedComponentSet[inputPorts=[], outputPorts=[], remoteInputPorts=[],
remoteOutputPorts=[],
processors=[PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420],
PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3],
PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89],
PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804],
PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f],
PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059],
PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a],
PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158],
PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e],
PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5]], controllerServices=[],
reportingTasks=[]]
2022-12-08 22:26:21,594 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420]
2022-12-08 22:26:21,594 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-9]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-53]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-16]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-91]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-65]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-81]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-48]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-43]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-72]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e] to run
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Stopping
PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5]
2022-12-08 22:26:21,595 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Stopping processor:
PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5]
2022-12-08 22:26:21,595 INFO [Timer-Driven Process Thread-96]
o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling
PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5] to run
2022-12-08 22:26:21,596 INFO [Reconnect to Cluster]
o.a.n.c.s.AffectedComponentSet Waiting for all required Processors and
Reporting Tasks to stop...
2022-12-08 22:26:21,606 INFO [Reconnect to Cluster]
o.a.n.c.s.AffectedComponentSet Successfully stopped all components in 12
milliseconds
...
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.c.s.StandardProcessScheduler Starting
PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5]
2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.n.controller.StandardProcessorNode Starting
PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5]
*2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
o.a.nifi.controller.StandardFlowService Disconnecting node due to Failed to
properly handle Reconnection request due to
org.apache.nifi.controller.serialization.FlowSynchronizationException:
Failed to connect node to cluster because local flow controller partially
updated. Administrator should disconnect node and review flow for
corruption.*
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420] to run with 1 threads
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-78]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3] to run with 1 threads
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89] to run with 1 threads
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f] to run with 1 threads
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-11]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059] to run with 1 threads
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a] to run with 1 threads
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-11]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158] to run with 1 threads
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e] to run with 1 threads
*2022-12-08 22:26:21,666 INFO [Reconnect to Cluster]
o.apache.nifi.controller.FlowController Will no longer send heartbeats*
2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
o.apache.nifi.controller.FlowController FlowController will stop sending
heartbeats to Cluster Coordinator
2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
o.apache.nifi.controller.FlowController Cluster State changed from
Clustered to Not Clustered
2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-78]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804] to run with 1 threads
2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
o.a.n.c.l.e.CuratorLeaderElectionManager Cannot unregister Leader Election
Role 'Primary Node' because that role is not registered
2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
o.a.n.c.l.e.CuratorLeaderElectionManager Cannot unregister Leader Election
Role 'Cluster Coordinator' because that role is not registered
2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
o.a.nifi.controller.StandardFlowService Node disconnected due to Failed to
properly handle Reconnection request due to
org.apache.nifi.controller.serialization.FlowSynchronizationException:
Failed to connect node to cluster because local flow controller partially
updated. Administrator should disconnect node and review flow for
corruption.
2022-12-08 22:26:21,668 INFO [Timer-Driven Process Thread-11]
o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5] to run with 1 threads
2022-12-08 22:26:21,689 ERROR [Reconnect to Cluster]
o.a.nifi.controller.StandardFlowService Handling reconnection request
failed due to:
org.apache.nifi.controller.serialization.FlowSynchronizationException:
Failed to connect node to cluster because local flow controller partially
updated. Administrator should disconnect node and review flow for
corruption.
org.apache.nifi.controller.serialization.FlowSynchronizationException:
Failed to connect node to cluster because local flow controller partially
updated. Administrator should disconnect node and review flow for
corruption.
        at
org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1061)
        at
org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:669)
        at
org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:108)
        at
org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:400)
        at java.lang.Thread.run(Thread.java:750)
Caused by:
org.apache.nifi.controller.serialization.FlowSynchronizationException:
java.lang.IllegalStateException: Cannot set AnnotationData while processor
is running
        at
org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:370)
        at
org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.sync(VersionedFlowSynchronizer.java:188)
        at
org.apache.nifi.controller.serialization.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:43)
        at
org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1524)
        at
org.apache.nifi.persistence.StandardFlowConfigurationDAO.load(StandardFlowConfigurationDAO.java:107)
        at
org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:819)
        at
org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1030)
        ... 4 common frames omitted
Caused by: java.lang.IllegalStateException: Cannot set AnnotationData while
processor is running
        at
org.apache.nifi.controller.StandardProcessorNode.setAnnotationData(StandardProcessorNode.java:1297)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.updateProcessor(StandardProcessGroupSynchronizer.java:1545)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronizeProcessors(StandardProcessGroupSynchronizer.java:846)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:387)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronizeChildGroups(StandardProcessGroupSynchronizer.java:445)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:381)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronizeChildGroups(StandardProcessGroupSynchronizer.java:445)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:381)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.lambda$synchronize$0(StandardProcessGroupSynchronizer.java:227)
        at
org.apache.nifi.controller.flow.AbstractFlowManager.withParameterContextResolution(AbstractFlowManager.java:464)
        at
org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:225)
        at
org.apache.nifi.groups.StandardProcessGroup.synchronizeFlow(StandardProcessGroup.java:3823)
        at
org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:361)
        ... 10 common frames omitted

Re: Issue with removal and re-add of a cluster node

Posted by Mark Payne <ma...@hotmail.com>.
David,

I think you’re running into https://issues.apache.org/jira/browse/NIFI-10453, which was fixed in 1.19.

It results in the "Cannot set AnnotationData while processor is running" error.

Recommend upgrading to 1.19. In the meantime, though, you should be okay to
shut down node 3, delete conf/flow.xml.gz and conf/flow.json.gz, and restart.
The node will then rejoin the cluster and inherit whatever the cluster's flow is.
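
Roughly, on node 3, something like the following (a sketch, not tested here;
the install dir matches the /data/nifi/nifi-1.16.3 path in your log, and
moving the files aside rather than deleting them leaves you a fallback):

import shutil
import subprocess
from pathlib import Path

NIFI_HOME = Path("/data/nifi/nifi-1.16.3")   # install dir, per the log above

# Stop NiFi on node 3.
subprocess.run([str(NIFI_HOME / "bin" / "nifi.sh"), "stop"], check=True)

# Move the local flow definitions aside so the node inherits the cluster's
# flow when it reconnects.
for name in ("flow.xml.gz", "flow.json.gz"):
    flow = NIFI_HOME / "conf" / name
    if flow.exists():
        shutil.move(str(flow), str(flow) + ".bak")

# Start NiFi again; it should reconnect and pull the current cluster flow.
subprocess.run([str(NIFI_HOME / "bin" / "nifi.sh"), "start"], check=True)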

Thanks
-Mark


Re: Issue with removal and re-add of a cluster node

Posted by David Early via users <us...@nifi.apache.org>.
Forgot my version: 1.16.3

Dave

> PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a]
> 2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.n.controller.StandardProcessorNode Starting
> PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a]
> 2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.n.c.s.StandardProcessScheduler Starting
> PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158]
> 2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.n.controller.StandardProcessorNode Starting
> PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158]
> 2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.n.c.s.StandardProcessScheduler Starting
> PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e]
> 2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.n.controller.StandardProcessorNode Starting
> PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e]
> 2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.n.c.s.StandardProcessScheduler Starting
> PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5]
> 2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.n.controller.StandardProcessorNode Starting
> PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5]
> *2022-12-08 22:26:21,665 INFO [Reconnect to Cluster]
> o.a.nifi.controller.StandardFlowService Disconnecting node due to Failed to
> properly handle Reconnection request due to
> org.apache.nifi.controller.serialization.FlowSynchronizationException:
> Failed to connect node to cluster because local flow controller partially
> updated. Administrator should disconnect node and review flow for
> corruption.*
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=6ebadc8f-142f-3f4c-81b9-0c2c55474420] to run with 1 threads
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-78]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutSFTP[id=1d317c8d-6340-3075-b80e-d14efee639d3] to run with 1 threads
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=4ac49804-9fe1-3d67-9e54-7487a9b92e89] to run with 1 threads
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=f558b7d4-ef81-3e7b-99d9-0cde20a5c96f] to run with 1 threads
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-11]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=a8063e92-1dad-3414-bff8-93ad33e0a059] to run with 1 threads
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutSFTP[id=232e3309-2ddd-1203-b575-0df2da962d8a] to run with 1 threads
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-11]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=98dc80cd-b210-3371-8775-6f50b6750158] to run with 1 threads
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-56]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=da59713c-5387-3de9-bc44-748a22b1ff5e] to run with 1 threads
> *2022-12-08 22:26:21,666 INFO [Reconnect to Cluster]
> o.apache.nifi.controller.FlowController Will no longer send heartbeats*
> 2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
> o.apache.nifi.controller.FlowController FlowController will stop sending
> heartbeats to Cluster Coordinator
> 2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
> o.apache.nifi.controller.FlowController Cluster State changed from
> Clustered to Not Clustered
> 2022-12-08 22:26:21,666 INFO [Timer-Driven Process Thread-78]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=9d573e65-2c6d-3925-bdaa-afa2162ce804] to run with 1 threads
> 2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
> o.a.n.c.l.e.CuratorLeaderElectionManager Cannot unregister Leader Election
> Role 'Primary Node' because that role is not registered
> 2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
> o.a.n.c.l.e.CuratorLeaderElectionManager Cannot unregister Leader Election
> Role 'Cluster Coordinator' because that role is not registered
> 2022-12-08 22:26:21,667 INFO [Reconnect to Cluster]
> o.a.nifi.controller.StandardFlowService Node disconnected due to Failed to
> properly handle Reconnection request due to
> org.apache.nifi.controller.serialization.FlowSynchronizationException:
> Failed to connect node to cluster because local flow controller partially
> updated. Administrator should disconnect node and review flow for
> corruption.
> 2022-12-08 22:26:21,668 INFO [Timer-Driven Process Thread-11]
> o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled
> PutEmail[id=d7e8d506-8f6a-3e50-8e52-06bcad6cd4e5] to run with 1 threads
> 2022-12-08 22:26:21,689 ERROR [Reconnect to Cluster]
> o.a.nifi.controller.StandardFlowService Handling reconnection request
> failed due to:
> org.apache.nifi.controller.serialization.FlowSynchronizationException:
> Failed to connect node to cluster because local flow controller partially
> updated. Administrator should disconnect node and review flow for
> corruption.
> org.apache.nifi.controller.serialization.FlowSynchronizationException:
> Failed to connect node to cluster because local flow controller partially
> updated. Administrator should disconnect node and review flow for
> corruption.
>         at
> org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1061)
>         at
> org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:669)
>         at
> org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:108)
>         at
> org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:400)
>         at java.lang.Thread.run(Thread.java:750)
> Caused by:
> org.apache.nifi.controller.serialization.FlowSynchronizationException:
> java.lang.IllegalStateException: Cannot set AnnotationData while processor
> is running
>         at
> org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:370)
>         at
> org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.sync(VersionedFlowSynchronizer.java:188)
>         at
> org.apache.nifi.controller.serialization.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:43)
>         at
> org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1524)
>         at
> org.apache.nifi.persistence.StandardFlowConfigurationDAO.load(StandardFlowConfigurationDAO.java:107)
>         at
> org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:819)
>         at
> org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1030)
>         ... 4 common frames omitted
> Caused by: java.lang.IllegalStateException: Cannot set AnnotationData
> while processor is running
>         at
> org.apache.nifi.controller.StandardProcessorNode.setAnnotationData(StandardProcessorNode.java:1297)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.updateProcessor(StandardProcessGroupSynchronizer.java:1545)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronizeProcessors(StandardProcessGroupSynchronizer.java:846)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:387)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronizeChildGroups(StandardProcessGroupSynchronizer.java:445)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:381)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronizeChildGroups(StandardProcessGroupSynchronizer.java:445)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:381)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.lambda$synchronize$0(StandardProcessGroupSynchronizer.java:227)
>         at
> org.apache.nifi.controller.flow.AbstractFlowManager.withParameterContextResolution(AbstractFlowManager.java:464)
>         at
> org.apache.nifi.groups.StandardProcessGroupSynchronizer.synchronize(StandardProcessGroupSynchronizer.java:225)
>         at
> org.apache.nifi.groups.StandardProcessGroup.synchronizeFlow(StandardProcessGroup.java:3823)
>         at
> org.apache.nifi.controller.serialization.VersionedFlowSynchronizer.synchronizeFlow(VersionedFlowSynchronizer.java:361)
>         ... 10 common frames omitted
>
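The root cause buried in that trace is the IllegalStateException thrown from StandardProcessorNode.setAnnotationData: the flow synchronizer tried to push an updated processor configuration after the same processor had already been rescheduled to run, which is what leaves the flow controller "partially updated" and aborts the reconnection. As a rough illustration only (a hypothetical, simplified sketch, not NiFi's actual implementation), the failure mode is the usual "no configuration changes while running" guard:

// Hypothetical, simplified sketch -- not NiFi source code. It only illustrates
// the kind of guard that produces the IllegalStateException in the trace above:
// a component rejects configuration changes while it is scheduled to run, so a
// flow synchronization that reaches it after it has been restarted fails.
public class AnnotationDataGuardSketch {

    private volatile boolean running;
    private volatile String annotationData;

    public synchronized void start() {
        running = true;
    }

    public synchronized void stop() {
        running = false;
    }

    // Configuration is only accepted while the component is stopped.
    public synchronized void setAnnotationData(final String data) {
        if (running) {
            throw new IllegalStateException("Cannot set AnnotationData while processor is running");
        }
        this.annotationData = data;
    }

    public static void main(final String[] args) {
        final AnnotationDataGuardSketch processor = new AnnotationDataGuardSketch();
        processor.setAnnotationData("accepted: processor is stopped");
        processor.start();
        // Hitting this call on a running processor is what aborts the
        // reconnection and leaves the flow "partially updated".
        processor.setAnnotationData("rejected: processor is running"); // throws
    }
}

That matches the timestamps in the log: everything is stopped at 22:26:21,595, the components are reported stopped 12 ms later, the processors are started again at 22:26:21,665, and the synchronization error is raised in the same millisecond.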


-- 
David Early, Ph.D.
david.early@grokstream.com
720-470-7460 Cell