Posted to user@zookeeper.apache.org by John Lindwall <jo...@gmail.com> on 2019/08/01 00:04:21 UTC

Ephemeral znodes not getting removed

ZooKeeper 3.4.6-1569965

In our environment we seem to have a situation where ephemeral znodes 
are not getting removed after the zookeeper session has been 
terminated.  We can see examples of znodes that were created 3-4 days 
ago that still exist, though the zk sessions bound to those znodes 
should no longer exist.

Note that we've had this cluster running for about 4 years and have not 
seen this problem until recently.

1. I am wondering if there are any known issues that would affect our 
zookeeper version that may cause this behavior?
2. Is it possible our servers are simply in a "bad state" and a simple 
reboot might clean things up?
3. Any tips on diagnosing this?

We noticed this issue from 2011 but that seems to have been fixed in our 
branch.

https://issues.apache.org/jira/browse/ZOOKEEPER-1208

Thanks,
John Lindwall

Re: Ephemeral znodes not getting removed

Posted by John Lindwall <jo...@gmail.com>.
Thanks for the response! My direct access to this zk cluster is 
limited.  I'll see about getting a copy of the logs to examine.  I'll 
also try to coordinate your experiment of creating a znode on each 
server in turn and checking the cluster-wide view of that data.  If the 
"global view" turns out to be inconsistent, what would be the next 
step?

I did receive output from each cluster node containing the results of 
these four-letter words: dump, cons, mntr, and stat.  For one of the 
ephemerals in question we could see a record of it in the "dump" output 
of one of the 3 cluster nodes (the leader) but not in the dump output 
of the other 2 nodes.  Oddly, the session id associated with that 
ephemeral znode does not appear in the "cons" output of any cluster 
member.  So this appears to be an ephemeral that has survived the 
termination of its associated zk session.
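
The manual comparison described above (session ids that own ephemerals in "dump" versus session ids listed in "cons") could be scripted roughly as follows. This is only a sketch: the parsing assumes the 3.4-style text layout of the "dump" and "cons" replies (a "Sessions with Ephemerals" header followed by `0x...:` lines and indented paths, and `sid=0x...` fields in "cons"), and the function names are made up for illustration. A session missing from "cons" is not necessarily expired, since a client can be disconnected while still inside its session timeout, so the result is a list of candidates to check in the logs.

```python
import re
import socket


def four_letter_word(host, port, word, timeout=5.0):
    """Send a four-letter word (e.g. "dump") and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def ephemeral_owners(dump_text):
    """Map session id (int) -> list of ephemeral paths, from "dump" output.

    Assumes the layout: a "Sessions with Ephemerals" header, then one
    "0x<sid>:" line per session followed by indented znode paths.
    """
    owners = {}
    current = None
    in_ephemerals = False
    for line in dump_text.splitlines():
        if line.startswith("Sessions with Ephemerals"):
            in_ephemerals = True
            continue
        if not in_ephemerals:
            continue
        match = re.match(r"^(0x[0-9a-fA-F]+):\s*$", line)
        if match:
            current = int(match.group(1), 16)
            owners[current] = []
        elif current is not None and line.strip().startswith("/"):
            owners[current].append(line.strip())
    return owners


def connected_sessions(cons_text):
    """Return the set of session ids (int) found in "cons" output."""
    return {int(s, 16) for s in re.findall(r"sid=(0x[0-9a-fA-F]+)", cons_text)}


def orphaned_ephemerals(dump_texts, cons_texts):
    """Ephemerals whose owning session is not connected to any server."""
    live = set()
    for text in cons_texts:
        live |= connected_sessions(text)
    orphans = {}
    for text in dump_texts:
        for sid, paths in ephemeral_owners(text).items():
            if sid not in live:
                orphans.setdefault(sid, set()).update(paths)
    return orphans
```

To use it, collect `four_letter_word(host, 2181, "dump")` and `four_letter_word(host, 2181, "cons")` from every server (port 2181 is an assumption here) and pass the two lists to `orphaned_ephemerals`. Session ids are compared as integers so that differences in hex formatting do not matter.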

Thanks for any advice or feedback,
John

Patrick Hunt wrote on 8/2/19 9:38 AM:
> The jira you ref'd is the only one that comes to mind. In terms of
> troubleshooting - try connecting a client to each of the servers in turn
> and see if it's a situation where they have a different view of the world
> wrt those znodes. You might also have the client create separate znodes on
> each server and ensure that they are consistent. The logs are also
> typically a good source of information - check against the session id.
>
> Patrick

Re: Ephemeral znodes not getting removed

Posted by Patrick Hunt <ph...@apache.org>.
The jira you ref'd is the only one that comes to mind. In terms of
troubleshooting - try connecting a client to each of the servers in turn
and see if it's a situation where they have a different view of the world
wrt those znodes. You might also have the client create separate znodes on
each server and ensure that they are consistent. The logs are also
typically a good source of information - check against the session id.
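
A rough sketch of the per-server comparison suggested above: ask each server for the children of the same parent znode and report which children are missing from which server's view. The `list_children` callable is a placeholder, not part of any ZooKeeper API; it is assumed to connect to exactly one server and return the child names under `path` (for example, a thin wrapper around a client library's get-children call with a single-host connect string).

```python
def compare_views(servers, list_children, path):
    """Return {child: [servers missing it]} for children not seen everywhere.

    servers:       iterable of server identifiers, e.g. "zk1:2181"
    list_children: hypothetical callable (server, path) -> iterable of names,
                   expected to query that one server only
    path:          parent znode whose children are compared
    """
    views = {server: set(list_children(server, path)) for server in servers}
    all_children = set().union(*views.values()) if views else set()
    return {
        child: sorted(s for s, seen in views.items() if child not in seen)
        for child in sorted(all_children)
        if any(child not in seen for seen in views.values())
    }
```

An empty result means every server reported the same children. Followers can briefly lag the leader, so only a difference that persists across repeated runs points to a real inconsistency.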

Patrick
