Posted to user@zookeeper.apache.org by Simon <co...@gmx.ch> on 2015/09/12 23:47:35 UTC

disconnected events and session expiration

Hi

I am trying to get a better understanding of Zookeeper and how it should be used. Let’s talk about the lock recipe (http://zookeeper.apache.org/doc/r3.4.6/recipes.html#sc_recipes_Locks). 
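
For concreteness, a compressed sketch of that recipe's acquire step with the plain ZooKeeper Java client (the class name and lock path handling are illustrative, and the wait-on-predecessor loop that the recipe describes is omitted):

import java.util.Collections;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class LockRecipeSketch {
    // Create an ephemeral sequential znode under the lock directory and check
    // whether it has the lowest sequence number; if not, the recipe says to set
    // a watch on the znode with the next-lowest number and wait.
    static boolean tryAcquire(ZooKeeper zk, String lockDir)
            throws KeeperException, InterruptedException {
        String path = zk.create(lockDir + "/lock-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        String ourNode = path.substring(path.lastIndexOf('/') + 1);

        List<String> children = zk.getChildren(lockDir, false);
        Collections.sort(children);
        return children.get(0).equals(ourNode);
    }
}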

- X acquires the lock
- X does some long-running work (longer than the session timeout)
- X gets partitioned away from the quorum while it was doing some work
- after some time (determined by the timeout passed to ZK) Y will acquire the lock

In that situation both X and Y are holding the lock (unless X is acting properly). If I understand the documentation correctly (http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html#ch_zkSessions), X would receive a disconnected event in that situation (but not an expired event unless it successfully reconnects). So, X should stop the work it is doing until it gets reconnected. How much time does X have to stop that work? I.e. how long does it take from the disconnected event being sent to X until the ephemeral node used for the lock expires? Having two clients inside a critical section protected by a lock would not be a good idea.

Regards,
Simon

Re: disconnected events and session expiration

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
You should make your session timeout long enough to cover the amount of time during which you cannot tolerate another process becoming the lock owner. In a partition, your ephemeral node is not deleted until the session times out. So, if you need 20 seconds to gracefully shut down, make your session timeout at least 20 seconds. If you can't tolerate ANY timeout then you are really trying to get around CAP and need to rethink your system.

====================
Jordan Zimmerman

> On Sep 13, 2015, at 6:28 AM, Simon <co...@gmx.ch> wrote:
> 
> That didn’t really answer any of my questions. 
> 
> If I own a lock, I am entitled to do some work exclusively. No one else should be doing that work. If I get disconnected or the session times out I have to stop working. Somebody else will take over the work in a short time. If I understood the programmers guide correctly, the expired event will not be delivered to me until I reconnect. Correct? So, I have to use the disconnected event to initiate a graceful stop. Stopping work might take some time, e.g. because I am doing a REST service call that takes up to 20s. Let’s say doing the call twice leads to data corruption in the backend service (e.g. HTTP POST, which is non idempotent). So, ideally, if I am still running, I should try my best to complete normally. If the state of the work units is kept in ZK, I cannot update the state anyway. If I store it in some other datastore, I might be able to update the state or not (depending on how the network has been partitioned).
> 
> The more I think about it, the harder it seems to get this stuff working reliably. What if my node crashes? I cannot complete my work normally. So, whoever takes over my work will try to redo it anyways. Either the receiver is made idempotent (which is not always possible) or the new work owner needs to be aware of the aborted task and be extra cautious, e.g. by checking whether the work unit has completed or not. It seems to me that making the “crash” case the default (i.e. “crash” the worker thread whenever a disconnected event is received) is the best solution. Then I am forced to make the crash case robust. Guess that’s what some people call “crash-only design”.
> 
> Simon

Re: disconnected events and session expiration

Posted by Simon <co...@gmx.ch>.
Thanks for all the pointers! The time needed to read through all that material was well invested. Now I have to digest all that information. Especially the idea of using sequence numbers to reject invalid requests looks promising.

-Simon

> On 13 Sep 2015, at 17:36 , Ivan Kelly <iv...@apache.org> wrote:
> 
>> 
>> If I own a lock, I am entitled to do some work exclusively. No one else
>> should be doing that work. If I get disconnected or the session times out I
>> have to stop working.
> 
> Locks in ZooKeeper are advisory. ZooKeeper will tell you that you have a
> lock, but by the time that message arrives you aren't guaranteed that this
> is still the case. The same is true in the disconnection case. You may have
> lost the lock well before you receive the disconnection event. What happens
> if you are doing non-idempotent work during this period? The only
> guaranteed solution is to add some sort of lock support in the medium that
> is under lock. You can probably also get by just by setting timeouts high
> enough, especially since it doesn't sound like the solution requires low
> latency.
> 
> We actually discussed this exact problem a couple of months ago on this
> list.
> http://mail-archives.apache.org/mod_mbox/zookeeper-user/201507.mbox/%3C1436982861611-7581277.post%40n2.nabble.com%3E
> I also wrote a blogpost on this same topic,
> https://medium.com/@ivankelly/reliable-table-writer-locks-for-hbase-731024295215
> 
> -Ivan


Re: disconnected events and session expiration

Posted by Ivan Kelly <iv...@apache.org>.
>
> If I own a lock, I am entitled to do some work exclusively. No one else
> should be doing that work. If I get disconnected or the session times out I
> have to stop working.

Locks in ZooKeeper are advisory. ZooKeeper will tell you that you have a
lock, but by the time that message arrives you aren't guaranteed that this
is still the case. The same is true in the disconnection case. You may have
lost the lock well before you receive the disconnection event. What happens
if you are doing non-idempotent work during this period? The only
guaranteed solution is to add some sort of lock support in the medium that
is under lock. You can probably also get by just by setting timeouts high
enough, especially since it doesn't sound like the solution requires low
latency.
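
To make that concrete, here is a minimal sketch of such a check on the resource side (the class and method names are illustrative, not from any existing library): the lock holder passes a monotonically increasing fencing token with every request, e.g. the sequence number or czxid of the znode it created under the lock path, and the store rejects anything older than the highest token it has already seen.

public class FencedStore {
    // Highest fencing token accepted so far; a real store would persist this.
    private long highestToken = Long.MIN_VALUE;

    // Returns true if the write was applied, false if it came from a stale
    // lock holder whose session has already expired.
    public synchronized boolean write(long fencingToken, byte[] payload) {
        if (fencingToken < highestToken) {
            return false;            // older holder: reject instead of corrupting state
        }
        highestToken = fencingToken; // remember the newest holder we have seen
        doWrite(payload);            // apply the mutation only after the check passes
        return true;
    }

    private void doWrite(byte[] payload) {
        // storage-specific mutation goes here
    }
}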

We actually discussed this exact problem a couple of months ago on this
list.
http://mail-archives.apache.org/mod_mbox/zookeeper-user/201507.mbox/%3C1436982861611-7581277.post%40n2.nabble.com%3E
I also wrote a blogpost on this same topic,
https://medium.com/@ivankelly/reliable-table-writer-locks-for-hbase-731024295215

-Ivan



Re: disconnected events and session expiration

Posted by Simon <co...@gmx.ch>.
That didn’t really answer any of my questions. 

If I own a lock, I am entitled to do some work exclusively. No one else should be doing that work. If I get disconnected or the session times out I have to stop working. Somebody else will take over the work in a short time. If I understood the programmers guide correctly, the expired event will not be delivered to me until I reconnect. Correct? So, I have to use the disconnected event to initiate a graceful stop. Stopping work might take some time, e.g. because I am doing a REST service call that takes up to 20s. Let’s say doing the call twice leads to data corruption in the backend service (e.g. HTTP POST, which is non idempotent). So, ideally, if I am still running, I should try my best to complete normally. If the state of the work units is kept in ZK, I cannot update the state anyway. If I store it in some other datastore, I might be able to update the state or not (depending on how the network has been partitioned).

The more I think about it, the harder it seems to get this stuff working reliably. What if my node crashes? I cannot complete my work normally. So, whoever takes over my work will try to redo it anyways. Either the receiver is made idempotent (which is not always possible) or the new work owner needs to be aware of the aborted task and be extra cautious, e.g. by checking whether the work unit has completed or not. It seems to me that making the “crash” case the default (i.e. “crash” the worker thread whenever a disconnected event is received) is the best solution. Then I am forced to make the crash case robust. Guess that’s what some people call “crash-only design”.
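
A minimal sketch of that "crash the worker on a disconnected event" approach with the plain ZooKeeper client (the class name is illustrative, and the work is assumed to tolerate being interrupted):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.KeeperState;

public class CrashOnDisconnect implements Watcher {
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private volatile Future<?> currentWork;

    // The protected work runs here; it must be written so that interruption
    // leaves it in a state the next lock owner can detect and clean up.
    public void startWork(Runnable work) {
        currentWork = pool.submit(work);
    }

    @Override
    public void process(WatchedEvent event) {
        KeeperState state = event.getState();
        if (state == KeeperState.Disconnected || state == KeeperState.Expired) {
            Future<?> work = currentWork;
            if (work != null) {
                work.cancel(true); // interrupt the worker: treat disconnection like a crash
            }
        }
    }
}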

Simon



> On 13 Sep 2015, at 03:19 , Jordan Zimmerman <jo...@jordanzimmerman.com> wrote:
> 
> I used to advise that people treat Disconnected the same as session loss as it’s safer. But, you can also set a timer when Disconnected is received and when your session timeout elapses you can then consider session loss (note, use the negotiated value from the ZK handle). FYI - version 3.0.0 of Apache Curator will have an option to choose this alternate method.
> 
> -Jordan


Re: disconnected events and session expiration

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
I used to advise that people treat Disconnected the same as session loss as it’s safer. But, you can also set a timer when Disconnected is received and when your session timeout elapses you can then consider session loss (note, use the negotiated value from the ZK handle). FYI - version 3.0.0 of Apache Curator will have an option to choose this alternate method.
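
A minimal sketch of that timer approach with the plain ZooKeeper client (Curator 3.0.0's actual option may look different; the class name and callback below are illustrative):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;

public class DisconnectTimer implements Watcher {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final long negotiatedTimeoutMs; // pass zk.getSessionTimeout() once connected
    private final Runnable onSessionLoss;   // e.g. stop the work protected by the lock
    private volatile ScheduledFuture<?> pending;

    public DisconnectTimer(long negotiatedTimeoutMs, Runnable onSessionLoss) {
        this.negotiatedTimeoutMs = negotiatedTimeoutMs;
        this.onSessionLoss = onSessionLoss;
    }

    @Override
    public void process(WatchedEvent event) {
        switch (event.getState()) {
            case Disconnected:
                // Start the clock: if we are not reconnected within the negotiated
                // session timeout, assume the session (and the ephemeral node) is gone.
                pending = scheduler.schedule(onSessionLoss, negotiatedTimeoutMs, TimeUnit.MILLISECONDS);
                break;
            case SyncConnected:
                // Reconnected in time: the session survived, so cancel the pending timer.
                if (pending != null) pending.cancel(false);
                break;
            case Expired:
                onSessionLoss.run();
                break;
            default:
                break;
        }
    }
}

It can be installed with zk.register(...) once the ZooKeeper handle is created, passing zk.getSessionTimeout() so the negotiated value rather than the requested one is used.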

-Jordan


