Posted to user@curator.apache.org by Benson Qiu <qi...@gmail.com> on 2017/03/22 17:30:56 UTC

How does CuratorFramework handle connection losses?

Hi,

Several questions:

1. The CuratorFramework documentation
<http://curator.apache.org/curator-framework/> says that you "should share one
CuratorFramework per ZooKeeper cluster in your application". I create an
instance and call CuratorFramework#start() on application startup and reuse
the same instance throughout the lifetime of my application, but I never
call CuratorFramework#close(). Is this bad practice? What happens if my
application is periodically killed and restarted?

2. If I acquire an InterProcessMutex and my application is killed before I
call InterProcessMutex#release(), what happens? Based on my experiments
with TestingServer, it seems that after DEFAULT_SESSION_TIMEOUT_MS
<https://github.com/apache/curator/blob/022de3921a120c6f86cc6e21442327cc04b66cd2/curator-framework/src/main/java/org/apache/curator/framework/CuratorFrameworkFactory.java#L51>,
other applications are able to acquire the InterProcessMutex with the same
lock path. So there might be temporary starvation, but no deadlock. Is my
understanding correct?

3. I did a quick experiment where I unplugged my ethernet cable (losing the
connection to the remote ZK cluster), waited several minutes, and then
plugged it back in. I observed from ConnectionStateListener that the state
changed to SUSPENDED, then LOST, and, once the cable was plugged back in,
RECONNECTED. How long does it take for each state change to happen? Even if
I lose the connection for a long period of time, can I trust that
CuratorFramework will always handle reconnecting?

Any help, even if it's on a subset of these questions, would be really
appreciated!

Thanks,
Benson

Re: How does CuratorFramework handle connection losses?

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.

Great! Thanks

> On Mar 22, 2017, at 5:31 PM, Cameron McKenzie <mc...@gmail.com> wrote:
> 
> https://cwiki.apache.org/confluence/display/CURATOR/TN12
> 
> On Thu, Mar 23, 2017 at 9:17 AM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
> Ok, I will add a tech note to the wiki.
> cheers
> 
> On Thu, Mar 23, 2017 at 9:16 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
> My vote is the wiki - that way we can update it, etc.
> 
> -JZ
> 
> 
>> On Mar 22, 2017, at 5:15 PM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
>> 
>> Tech note on the wiki? Or in the details section on curator.apache.org <http://curator.apache.org/>?
>> 
>> On Thu, Mar 23, 2017 at 8:56 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>> Yeah, sorry, I meant point 3. People ask about connection handling all the time.
>> 
>>> On Mar 22, 2017, at 4:55 PM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
>>> 
>>> Which bit in particular?
>>> 
>>> Point 3 perhaps? I think that point 1 and 2 are probably already covered?
>>> 
>>> On Thu, Mar 23, 2017 at 8:47 AM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:
>>> This would make a nice tech note on the wiki if anyone's up to it.
>>> 
>>> -Jordan
>>> 
>>>> On Mar 22, 2017, at 4:13 PM, Cameron McKenzie <cammckenzie@apache.org> wrote:
>>>> 
>>>> 1.) Calling close() just cleans up any resources associated with the CuratorFramework (ZooKeeper connections, etc.). If your application exits without calling close(), this will not cause any issues.
>>>> 
>>>> 2.) An InterProcessMutex is implemented using an ephemeral node in ZooKeeper. If your client dies without releasing the mutex, this ephemeral node will be removed after the session times out. So, yes, after your specified session timeout other clients will be able to acquire the mutex.
>>>> 
>>>> 3.) SUSPENDED occurs as soon as the loss of the connection to ZK is detected. The timing of the LOST event depends on which version of Curator you're using. In Curator 2.x, LOST will occur once all of the retries have been exhausted (based on your specified retry policy). In Curator 3.x, Curator simulates server-side session loss by starting a timer upon receiving the SUSPENDED event, and then publishes a LOST event once the session timeout has been reached.
>>>> 
>>>> The RECONNECTED event will occur once a connection has been reestablished to ZK. You can rely on Curator reconnecting when it is possible to do so.
>>>> cheers
>>>> 
>>>> On Thu, Mar 23, 2017 at 4:30 AM, Benson Qiu <qiu.benson@gmail.com> wrote:
>>>> Hi,
>>>> 
>>>> Several questions:
>>>> 
>>>> 1. The CuratorFramework documentation <http://curator.apache.org/curator-framework/> says that you "should share one CuratorFramework per ZooKeeper cluster in your application". I create an instance and call CuratorFramework#start() on application startup and reuse the same instance throughout the lifetime of my application, but I never call CuratorFramework#close(). Is this bad practice? What happens if my application is periodically killed and restarted?
>>>> 
>>>> 2. If I acquire an InterProcessMutex and my application is killed before I call InterProcessMutex#release(), what happens? Based on my experiments with TestingServer, it seems that after DEFAULT_SESSION_TIMEOUT_MS <https://github.com/apache/curator/blob/022de3921a120c6f86cc6e21442327cc04b66cd2/curator-framework/src/main/java/org/apache/curator/framework/CuratorFrameworkFactory.java#L51>, other applications are able to acquire the InterProcessMutex with the same lock path. So there might be temporary starvation, but no deadlock. Is my understanding correct?
>>>> 
>>>> 3. I did a quick experiment where I unplugged my ethernet cable (losing the connection to the remote ZK cluster), waited several minutes, and then plugged it back in. I observed from ConnectionStateListener that the state changed to SUSPENDED, then LOST, and, once the cable was plugged back in, RECONNECTED. How long does it take for each state change to happen? Even if I lose the connection for a long period of time, can I trust that CuratorFramework will always handle reconnecting?
>>>> 
>>>> Any help, even if it's on a subset of these questions, would be really appreciated!
>>>> 
>>>> Thanks,
>>>> Benson
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 
>
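
For anyone skimming the thread, the Curator 3.x timing described in point 3 of Cameron's reply (quoted above) can be sketched as a small toy model. This is not Curator code and does not touch ZooKeeper: ConnectionTimeline, Event and curator3Timeline are hypothetical names made up for illustration, the 60-second session timeout is just an example value, and the rule that no LOST is published when the connection comes back before the timer expires is an assumption read into the reply rather than something it states.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the Curator 3.x event timing described in the thread; not Curator code. */
final class ConnectionTimeline {
    enum State { SUSPENDED, LOST, RECONNECTED }

    /** A state change plus the time (in ms) at which the model publishes it. */
    static final class Event {
        final State state;
        final long atMs;

        Event(State state, long atMs) {
            this.state = state;
            this.atMs = atMs;
        }

        @Override
        public String toString() {
            return state + "@" + atMs + "ms";
        }
    }

    /**
     * @param detectedAtMs     when the connection loss is detected (SUSPENDED is published here)
     * @param sessionTimeoutMs the configured session timeout
     * @param reconnectAtMs    when the connection is re-established
     */
    static List<Event> curator3Timeline(long detectedAtMs, long sessionTimeoutMs, long reconnectAtMs) {
        if (reconnectAtMs < detectedAtMs) {
            throw new IllegalArgumentException("reconnect cannot precede the connection loss");
        }
        List<Event> events = new ArrayList<>();
        events.add(new Event(State.SUSPENDED, detectedAtMs));

        // The timer starts at SUSPENDED; LOST is published once the session timeout elapses.
        long lostAtMs = detectedAtMs + sessionTimeoutMs;
        if (lostAtMs < reconnectAtMs) {
            // Assumption: LOST only fires if the connection is still down at that point.
            events.add(new Event(State.LOST, lostAtMs));
        }

        events.add(new Event(State.RECONNECTED, reconnectAtMs));
        return events;
    }

    public static void main(String[] args) {
        // Cable unplugged for three minutes with a 60 s session timeout.
        System.out.println(curator3Timeline(0L, 60_000L, 180_000L));
        // prints [SUSPENDED@0ms, LOST@60000ms, RECONNECTED@180000ms]
    }
}
```

In a real application these states arrive through the ConnectionStateListener mentioned in the original question; the model only shows the order and timing one would expect to see there. For Curator 2.x the LOST timing would instead depend on the retry policy, which this sketch does not model.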

Re: How does CuratorFramework handle connection losses?

Posted by Benson Qiu <qi...@gmail.com>.

Thanks Cameron for the quick reply - really appreciate it!!


Re: How does CuratorFramework handle connection losses?

Posted by Cameron McKenzie <mc...@gmail.com>.

https://cwiki.apache.org/confluence/display/CURATOR/TN12


Re: How does CuratorFramework handle connection losses?

Posted by Cameron McKenzie <mc...@gmail.com>.

Ok, I will add a tech note to the wiki.
cheers


Re: How does CuratorFramework handle connection losses?

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.

My vote is the wiki - that way we can update it, etc.

-JZ



Re: How does CuratorFramework handle connection losses?

Posted by Cameron McKenzie <mc...@gmail.com>.

Tech note on the wiki? Or in the details section on curator.apache.org?


Re: How does CuratorFramework handle connection losses?

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.

Yeah, sorry, I meant point 3. People ask about connection handling all the time.


Re: How does CuratorFramework handle connection losses?

Posted by Cameron McKenzie <mc...@gmail.com>.

Which bit in particular?

Point 3 perhaps? I think that points 1 and 2 are probably already covered?

On Thu, Mar 23, 2017 at 8:47 AM, Jordan Zimmerman <
jordan@jordanzimmerman.com> wrote:

> This would make a nice tech note on the wiki if anyone's up to it.
>
> -Jordan
>
> On Mar 22, 2017, at 4:13 PM, Cameron McKenzie <ca...@apache.org>
> wrote:
>
> 1.) Calling close() just cleans up any resources associated with the
> CuratorFramework (ZooKeeper connections, etc.). If your application exits
> without calling close(), this will not cause any issues.
>
> 2.) An InterProcessMutex is implemented using an ephemeral node in
> ZooKeeper. If your client dies without releasing the mutex, this
> ephemeral node will be removed after the session times out. So, yes, after
> your specified session timeout other clients will be able to acquire the
> mutex.
>
> 3.) SUSPENDED occurs as soon as the loss of the connection to ZK is detected.
> The timing of the LOST event depends on which version of Curator you're using.
> In Curator 2.x, LOST will occur once all of the retries have been exhausted
> (based on your specified retry policy). In Curator 3.x, Curator simulates
> server-side session loss by starting a timer upon receiving the SUSPENDED
> event, and then publishes a LOST event once the session timeout has been
> reached.
>
> The RECONNECTED event will occur once a connection has been reestablished
> to ZK. You can rely on Curator reconnecting when it is possible to do so.
> cheers
>
> On Thu, Mar 23, 2017 at 4:30 AM, Benson Qiu <qi...@gmail.com> wrote:
>
>> Hi,
>>
>> Several questions:
>>
>> 1. The CuratorFramework documentation
>> <http://curator.apache.org/curator-framework/> says that "should share
>> one CuratorFramework per ZooKeeper cluster in your application". I create
>> an instance and call CuratorFramework#start() on application startup and
>> reuse the same instance throughout the lifetime of my application, but I
>> never call CuratorFramework#close(). Is this bad practice? What happens if
>> my application periodically killed and restarted?
>>
>> 2. If I acquire an InterProcessMutex and my application is killed before
>> I call InterProcessMutex#release(), what happens? Based on my experiments
>> with TestingServer, it seems that after DEFAULT_SESSION_TIMEOUT_MS
>> <https://github.com/apache/curator/blob/022de3921a120c6f86cc6e21442327cc04b66cd2/curator-framework/src/main/java/org/apache/curator/framework/CuratorFrameworkFactory.java#L51>,
>> other applications are able to acquire the InterProcessMutex with the same
>> lock path. So there might be temporary starvation, but no deadlock. Is my
>> understanding correct?
>>
>> 3. I did a quick experiment where I pulled out my ethernet cable (lost
>> connection to the remote ZK cluster), waited several minutes, and then
>> inserted my ethernet cable in again. I observed from
>> ConnectionStateListener that the state will change to SUSPENDED, then LOST,
>> and when the ethernet cable is inserted again, RECONNECTED. How long does
>> it take for each state change to happen? Even if I lose connection for a
>> long period of time, can I trust that CuratorFramework will always handle
>> reconnecting?
>>
>> Any help, even if it's on a subset of these questions, would be really
>> appreciated!
>>
>> Thanks,
>> Benson
>>
>
>
>

Re: How does CuratorFramework handle connection losses?

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.

This would make a nice tech note on the wiki if anyone's up to it.

-Jordan

Re: How does CuratorFramework handle connection losses?

Posted by Cameron McKenzie <ca...@apache.org>.

1.) Calling close() just cleans up the resources associated with the
CuratorFramework (ZooKeeper connections, etc.). If your application exits
without calling close(), this will not cause any issues.

2.) An InterProcessMutex is implemented using an ephemeral node in
ZooKeeper. If your client dies without releasing the mutex, this
ephemeral node will be removed once the session times out. So yes, after
your specified session timeout, other clients will be able to acquire the
mutex.
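
To make the timing in point 2 concrete, here is a minimal sketch in plain
Java. It is a toy model, not Curator or ZooKeeper code: the class
EphemeralLockModel and its methods are hypothetical names invented for
illustration, and the 60,000 ms timeout mentioned below is only an example
value. It models just the rule described above, namely that the lock node
disappears once its owner's session has gone a full session timeout without
being heard from.

```java
// Toy model only (hypothetical names): NOT Curator or ZooKeeper code.
// It mimics one rule: the lock node is ephemeral, so it vanishes when the
// owning session has been silent for sessionTimeoutMs.
final class EphemeralLockModel {
    private final long sessionTimeoutMs;
    private String owner;      // null means the lock node does not exist
    private long lastHeardMs;  // last time the owner's session was heard from

    EphemeralLockModel(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    // A live client keeps its session alive by pinging the server.
    void heartbeat(String client, long nowMs) {
        expireIfSilent(nowMs);
        if (client.equals(owner)) {
            lastHeardMs = nowMs;
        }
    }

    // Returns true if 'client' got the lock at time nowMs.
    boolean tryAcquire(String client, long nowMs) {
        expireIfSilent(nowMs);
        if (owner != null) {
            return false; // someone still holds it (non-reentrant in this model)
        }
        owner = client;
        lastHeardMs = nowMs;
        return true;
    }

    // Explicit release, the normal path when the holder is still alive.
    void release(String client) {
        if (client.equals(owner)) {
            owner = null;
        }
    }

    // Session expiry: drop the ephemeral node of a silent owner.
    private void expireIfSilent(long nowMs) {
        if (owner != null && nowMs - lastHeardMs >= sessionTimeoutMs) {
            owner = null;
        }
    }
}
```

With a 60,000 ms timeout, if client A acquires at t=0 and is then killed,
tryAcquire("B", 59_999) returns false and tryAcquire("B", 60_000) returns
true in this model: temporary starvation, but no deadlock.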

3.) SUSPENDED occurs as soon as the loss of the connection to ZK is
detected. The LOST event differs depending on which version of Curator
you're using. In Curator 2.x, LOST will occur once all of the retries have
been exhausted (based on your specified retry policy). In Curator 3.x,
Curator simulates server-side session loss by starting a timer upon
receiving the SUSPENDED event, and then publishes a LOST event once the
session timeout has been reached.

The RECONNECTED event will occur once a connection has been reestablished
to ZK. You can rely on Curator reconnecting when it is possible to do so.
cheers
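
The timing rules in point 3 can be sketched as a small timeline model. This
is plain Java written for illustration, not Curator's implementation:
ConnectionTimeline, statesFor, and both threshold helpers are hypothetical
names. The 2.x helper assumes a retry policy with a fixed sleep between
retries and ignores the time each attempt itself takes, so treat its result
as a rough estimate only.

```java
// Illustrative model only (hypothetical names): NOT Curator's implementation.
final class ConnectionTimeline {
    // Curator 2.x, per the description above: LOST once the retry policy is
    // exhausted. Assumes a fixed sleep between retries; ignores attempt time.
    static long lostThreshold2x(int maxRetries, long sleepBetweenRetriesMs) {
        return maxRetries * sleepBetweenRetriesMs;
    }

    // Curator 3.x, per the description above: a timer starts at SUSPENDED and
    // LOST is published when the session timeout has elapsed.
    static long lostThreshold3x(long sessionTimeoutMs) {
        return sessionTimeoutMs;
    }

    // States a listener would see for one outage lasting outageMs, given the
    // delay between SUSPENDED and LOST.
    static java.util.List<String> statesFor(long outageMs, long lostThresholdMs) {
        java.util.List<String> states = new java.util.ArrayList<>();
        states.add("SUSPENDED");        // reported when the loss is detected
        if (outageMs >= lostThresholdMs) {
            states.add("LOST");         // outage outlasted the threshold
        }
        states.add("RECONNECTED");      // connection reestablished
        return states;
    }
}
```

Under the 3.x rule with a 60,000 ms session timeout (an example value), a
5,000 ms blip gives SUSPENDED then RECONNECTED in this model, while an outage
of several minutes, such as 180,000 ms, gives SUSPENDED, LOST, RECONNECTED,
which matches the sequence seen in the unplugged-cable experiment.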
