Posted to user@zookeeper.apache.org by Tecno Brain <ce...@gmail.com> on 2018/04/18 22:32:40 UTC

Zookeeper own leader election

Hi,
  I have a cluster of five ZooKeeper nodes.

  I have an application deployed on two other servers that runs a leader
election process using the Curator recipe (
https://curator.apache.org/curator-recipes/leader-election.html)

  Our DevOps team has been running a Chaos Monkey-style test, and they
complained that my application triggered a change in leadership when they
restarted two of the ZooKeeper nodes (the leader node and one other node).

  I find this normal, but they claim that the application should let the
ZooKeeper nodes elect their own new leader, and that my application should
not change leadership, because the current leader did not fail; the failure
was in the ZooKeeper cluster.

  So, my questions are:
    - If the ZooKeeper leader node fails, are all sessions lost?
    - What parameters control how quickly the ZooKeeper nodes elect a new
leader?
    - Can my application use longer timeouts before giving up leadership
than the ZooKeeper nodes do?

My application currently runs an "expensive" task when taking leadership,
so we want to minimize leadership changes, especially when the change is
caused not by an application failure but by instability in the ZooKeeper
cluster.

I want to understand ZooKeeper's own leader election process so that I can
either modify the Curator recipe or have a solid argument for why what I am
being asked to do is not possible.
Any pointers are welcome.

-J

Re: Zookeeper own leader election

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
If the ZooKeeper leader/master instance is restarted, Apache Curator clients will receive a Disconnected event and change their state to SUSPENDED. Apache Curator's recommendation is to exit your leaders at that point. If you're using the default listeners for LeaderSelector in Apache Curator, that's what will happen. This is normal. You could change to exit leadership only on LOST. However, see:

https://cwiki.apache.org/confluence/display/CURATOR/TN12 
https://cwiki.apache.org/confluence/display/CURATOR/TN14
http://curator.apache.org/errors.html
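
For illustration, here is a minimal, self-contained sketch of that "exit only on LOST" policy, with an optional grace period for SUSPENDED. The enum mirrors Curator's ConnectionState values, but the class and method names here are made up for the sketch; this is not Curator's API.

```java
public class LeadershipPolicy {
    // Illustrative copy of the relevant Curator connection states
    // (not the real org.apache.curator.framework.state.ConnectionState).
    enum ConnectionState { CONNECTED, RECONNECTED, SUSPENDED, LOST }

    private final long suspendedGraceMs; // how long we tolerate SUSPENDED
    private long suspendedSince = -1;    // -1 means "not currently suspended"

    LeadershipPolicy(long suspendedGraceMs) {
        this.suspendedGraceMs = suspendedGraceMs;
    }

    /** Returns true when the application should relinquish leadership. */
    boolean shouldResign(ConnectionState state, long nowMs) {
        switch (state) {
            case LOST:
                // Session expired: the leadership znode is gone, resign now.
                return true;
            case SUSPENDED:
                // Connection dropped but the session may still be alive;
                // start (or continue) a grace period instead of resigning.
                if (suspendedSince < 0) suspendedSince = nowMs;
                return nowMs - suspendedSince > suspendedGraceMs;
            default:
                // CONNECTED / RECONNECTED before the session expired.
                suspendedSince = -1;
                return false;
        }
    }

    public static void main(String[] args) {
        LeadershipPolicy policy = new LeadershipPolicy(5000);
        System.out.println(policy.shouldResign(ConnectionState.SUSPENDED, 0));    // false: grace period running
        System.out.println(policy.shouldResign(ConnectionState.SUSPENDED, 6000)); // true: grace period exceeded
    }
}
```

Note the caveat from TN12/TN14: while SUSPENDED you cannot know whether you are still the leader, so anything done during the grace period must be safe to have briefly run on two nodes at once.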

-Jordan

> On Apr 19, 2018, at 10:51 AM, Tecno Brain <ce...@gmail.com> wrote:
> 
> Hi Jordan,
> 
> Correct, I know that the internal leader election has nothing to do with
> the leader election of my application through Curator.
> 
> What we are observing is that restarting (or killing) 1 or 2 servers in a
> ZooKeeper ensemble of 5 nodes triggers a leader election in my application.
> Our expectation is that this should not occur, since the ZooKeeper ensemble
> still has quorum.
> Is that the correct expectation?


Re: Zookeeper own leader election

Posted by Tecno Brain <ce...@gmail.com>.
Hi Jordan,

Correct, I know that the internal leader election has nothing to do with
the leader election of my application through Curator.

What we are observing is that restarting (or killing) 1 or 2 servers in a
ZooKeeper ensemble of 5 nodes triggers a leader election in my application.
Our expectation is that this should not occur, since the ZooKeeper ensemble
still has quorum.
Is that the correct expectation?

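As a sanity check on the quorum reasoning above, the majority arithmetic can be sketched like this (a plain illustration, not ZooKeeper's actual code):

```java
public class QuorumMath {
    // Majority quorum: the ensemble stays available while a strict
    // majority of servers is up.
    static int quorum(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorum(ensembleSize);
    }

    public static void main(String[] args) {
        System.out.println(quorum(5));            // 3 servers needed for quorum
        System.out.println(toleratedFailures(5)); // 2 failures tolerated
    }
}
```

So restarting 2 of 5 servers does keep the ensemble available. But if one of the restarted servers was the internal leader, the ensemble still runs its internal election, and any client whose connection went through a restarted server sees a Disconnected, which Curator surfaces as SUSPENDED.
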
On Wed, Apr 18, 2018 at 6:08 PM, Jordan Zimmerman <
jordan@jordanzimmerman.com> wrote:

> The term "leader election" has two meanings here. The kind of leader
> election that your application uses with Apache Curator is different from
> the internal leader election that ZooKeeper does amongst its nodes. For
> clarity, the internal leader election should probably be renamed to "master
> election" or something. In a ZooKeeper ensemble one instance is always
> chosen as the leader/master. All writes, etc. go through this master. If
> this master instance goes down (due to crash, restart, chaos monkey, etc.)
> then the ensemble must choose a new leader/master. This is simply how
> ZooKeeper works.
>
> >    - If the Zookeeper leader node fails, are all sessions lost?
>
> No. Sessions are stored in the ZK database like any other transaction. When a
> new ZK leader/master is elected, existing sessions continue. In fact, session
> timers are effectively reset, because the new leader/master restarts session
> timing at time "0".
>
> >    - What parameters control how quickly the zookeeper nodes elect a new
> > leader?
>
> I believe "initLimit" is the most important one here (others can correct
> me).
>
> >    - Can I have longer timeouts in my application before giving up
> > leadership than that of the zookeeper nodes?
>
> I don't totally understand this question. The internal leader/master
> election has nothing whatever to do with Apache Curator leaders.
>
> -Jordan

Re: Zookeeper own leader election

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
The term "leader election" has two meanings here. The kind of leader election that your application uses with Apache Curator is different from the internal leader election that ZooKeeper does amongst its nodes. For clarity, the internal leader election should probably be renamed to "master election" or something. In a ZooKeeper ensemble one instance is always chosen as the leader/master. All writes, etc. go through this master. If this master instance goes down (due to crash, restart, chaos monkey, etc.) then the ensemble must choose a new leader/master. This is simply how ZooKeeper works.

>    - If the Zookeeper leader node fails, are all sessions lost?

No. Sessions are stored in the ZK database like any other transaction. When a new ZK leader/master is elected, existing sessions continue. In fact, session timers are effectively reset, because the new leader/master restarts session timing at time "0".

>    - What parameters control how quickly the zookeeper nodes elect a new
> leader?

I believe "initLimit" is the most important one here (others can correct me).
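
For reference, the relevant settings live in zoo.cfg; the values below are common examples, not recommendations:

```properties
# tickTime: the basic ZooKeeper time unit in milliseconds; most other
# timeouts are expressed as multiples of it.
tickTime=2000
# initLimit: how many ticks followers may take to connect and sync to
# the leader; this also bounds leader election / recovery time.
initLimit=10
# syncLimit: how many ticks followers may lag behind the leader before
# being dropped from the ensemble.
syncLimit=5
```

Since initLimit and syncLimit are in ticks, tickTime=2000 with initLimit=10 allows 20 seconds for followers to sync with a new leader.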

>    - Can I have longer timeouts in my application before giving up
> leadership than that of the zookeeper nodes?

I don't totally understand this question. The internal leader/master election has nothing whatever to do with Apache Curator leaders.

-Jordan
