Posted to dev@nifi.apache.org by Mark Bean <ma...@gmail.com> on 2017/04/11 17:19:10 UTC

Shutdown of one Node in Cluster

I have a 3-node Cluster with each Node hosting the embedded ZooKeeper. When
one Node is shut down (and that Node is not the Cluster Coordinator), the
Cluster becomes unavailable. The UI indicates "Action cannot be performed
because there is currently no Cluster Coordinator elected. The request
should be tried again after a moment, after a Cluster Coordinator has been
automatically elected."

The app.log indicates "ConnectionStateManager State change: SUSPENDED".
In addition, there is an endless stream of "CuratorFrameworkImpl Background retry
gave up" messages; the surviving Nodes are unable to keep the Cluster
functioning.

I would have thought that with 2 of 3 Nodes surviving there wouldn't be a
problem. Moreover, since the Node that was shut down was neither the Cluster
Coordinator nor the Primary Node, no Cluster state changes were required.

nifi.cluster.flow.election.max.wait.time=2 mins
nifi.cluster.flow.election.max.candidates=

The same behavior was observed when max.candidates was set to 2.

NiFi 1.1.2
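
For context, a 3-node cluster with embedded ZooKeeper along these lines would typically carry properties similar to the following on each node; the hostnames below are placeholders for illustration, not values from the report above.

# nifi.properties (each node; hostnames are hypothetical)
nifi.state.management.embedded.zookeeper.start=true
nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
nifi.zookeeper.connect.string=nifi1.example.com:2181,nifi2.example.com:2181,nifi3.example.com:2181
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi1.example.com

# conf/zookeeper.properties (the server list is identical on all three nodes)
clientPort=2181
server.1=nifi1.example.com:2888:3888
server.2=nifi2.example.com:2888:3888
server.3=nifi3.example.com:2888:3888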

Re: Shutdown of one Node in Cluster

Posted by Jeremy Farbota <jf...@payoff.com>.
To echo Max... I had similar issues using the embedded ZooKeeper. I
switched to my cluster's ZooKeeper and have not seen this issue since.

-- 
*Jeremy Farbota*
Software Engineer, Data
Payoff, Inc.

jfarbota@payoff.com
(217) 898-8110

Re: Shutdown of one Node in Cluster

Posted by Mark Payne <ma...@hotmail.com>.
Excellent! Glad it's all working now. And thanks for the follow-up to let us know!

-Mark


Re: Shutdown of one Node in Cluster

Posted by Mark Bean <ma...@gmail.com>.
Mark,

I believe you're right. Yesterday, I corrected a typo in the
nifi.properties file related to the FQDN. I thought it affected only the
site-to-site property (nifi.remote.input.host). However, when I
intentionally introduced a typo into one of the three ZK servers in
nifi.zookeeper.connect.string today, I was able to reproduce the symptoms.
I'm sure that must have been it. Without the typo, all is working well.

Thanks,
Mark
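
To make the misconfiguration concrete (the hostnames here are placeholders, not the actual values from this cluster), the kind of typo being described would look something like this in nifi.properties:

# broken: the second host is misspelled and can never be resolved
nifi.zookeeper.connect.string=nifi1.example.com:2181,nfi2.example.com:2181,nifi3.example.com:2181

# corrected: all three ZooKeeper servers spelled correctly
nifi.zookeeper.connect.string=nifi1.example.com:2181,nifi2.example.com:2181,nifi3.example.com:2181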


Re: Shutdown of one Node in Cluster

Posted by Mark Payne <ma...@hotmail.com>.
Mark,

I haven't seen this behavior personally, so I can't be sure why exactly it would change state
to SUSPENDED and then not re-connect. In your nifi.properties, do you have the
"nifi.zookeeper.connect.string" property set up to point to all 3 of the nodes as well? If so, it should
be able to connect to one of the other two nodes listed.

Thanks
-Mark



Re: Shutdown of one Node in Cluster

Posted by Mark Bean <ma...@gmail.com>.
Ok, will keep the standalone ZooKeeper in mind.

Back to the original issue: any idea why ZooKeeper went to a SUSPENDED state,
making the cluster unavailable?



Re: Shutdown of one Node in Cluster

Posted by Mark Payne <ma...@hotmail.com>.
Mark,

Yes, 2 out of 3 should be sufficient. For testing purposes, a single ZooKeeper instance
is fine as well. For production, though, I would not recommend using an embedded
ZooKeeper at all; use a standalone ZooKeeper instead. ZooKeeper tends not to be
very happy when running on a box that is already under heavy resource load, so if
your cluster starts getting busy, you'll see far more stable performance from a standalone
ZooKeeper.
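
For anyone making that switch, a rough sketch of the standalone setup (all hostnames and paths below are placeholders): run a separate ZooKeeper ensemble with a zoo.cfg along these lines, then disable the embedded ZooKeeper in nifi.properties and point the connect string at the external servers.

# zoo.cfg on each standalone ZooKeeper server
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888

# nifi.properties on each NiFi node
nifi.state.management.embedded.zookeeper.start=false
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181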




Re: Shutdown of one Node in Cluster

Posted by Mark Bean <ma...@gmail.com>.
All 3 nodes are running embedded ZooKeeper. And the Admin Guide states that
"ZooKeeper requires a majority of nodes be active in order to function".
So, I assumed 2/3 being active was ok. Perhaps not.

Related: can a Cluster be set up with only 1 ZooKeeper node? Clearly, in
production, one would not want to do this. But when testing, this should be
acceptable, yes?
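
For reference on the majority rule: ZooKeeper needs a quorum of floor(N/2) + 1 servers. With N = 3 that is 2, so losing a single node should still leave a working ensemble, while losing two does not. With N = 1 the quorum is 1, which is why a single ZooKeeper node is workable for testing even though it provides no fault tolerance.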




Re: Shutdown of one Node in Cluster

Posted by Mark Payne <ma...@hotmail.com>.
Mark,

Are all of your nodes running an embedded ZooKeeper, or only 1 or 2 of them?

Thanks
-Mark
