Posted to dev@zookeeper.apache.org by Jordan Zimmerman <jo...@jordanzimmerman.com> on 2015/04/09 17:00:12 UTC

[PROPOSAL] Container nodes

BACKGROUND
============
A recurring problem for ZooKeeper users is garbage collection of parent nodes. Many recipes (e.g. locks, leader election) call for the creation of a parent node under which participants create sequential nodes. When a participant is done, it deletes its own node, but nothing deletes the parent. In practice, the ZooKeeper tree begins to fill up with orphaned parent nodes that are no longer needed. The ZooKeeper APIs don’t provide a way to clean these up automatically. Over time, ZooKeeper can become unstable due to the number of these nodes.

CURRENT SOLUTIONS
===================
Apache Curator works around this with the Reaper class, which runs in the background looking for orphaned parent nodes and deleting them. This isn’t ideal; it would be better if ZooKeeper supported this directly.

PROPOSAL
=========
ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL nodes to contain child nodes. This is not optimal, as EPHEMERALs are tied to a session and the general use case of parent nodes is for PERSISTENT nodes. This proposal adds a new node type, CONTAINER. A CONTAINER node is the same as a PERSISTENT node with the additional property that when its last child is deleted, the container itself is deleted (and any CONTAINER nodes further up the tree are deleted recursively if they become empty).
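
To make the deletion rule concrete, here is a minimal in-memory sketch of the semantics described above. It is a toy model under stated assumptions, not ZooKeeper server code: the class ContainerTreeSketch, its NodeType enum, and its methods are hypothetical names invented for illustration, and no real ZooKeeper API is used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Toy model of the proposed CONTAINER semantics. All names here are
// hypothetical; this is not ZooKeeper code.
class ContainerTreeSketch {
    enum NodeType { PERSISTENT, CONTAINER }

    private final Map<String, NodeType> types = new HashMap<>();
    private final Map<String, TreeSet<String>> children = new HashMap<>();

    ContainerTreeSketch() {
        // The root always exists and is a plain persistent node.
        types.put("/", NodeType.PERSISTENT);
        children.put("/", new TreeSet<>());
    }

    private static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    boolean exists(String path) {
        return types.containsKey(path);
    }

    void create(String path, NodeType type) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + path);
        }
        String parent = parentOf(path);
        if (!exists(parent)) {
            throw new IllegalStateException("no such parent: " + parent);
        }
        if (exists(path)) {
            throw new IllegalStateException("already exists: " + path);
        }
        types.put(path, type);
        children.put(path, new TreeSet<>());
        children.get(parent).add(path);
    }

    // Deletes a leaf node, then walks up the tree removing every CONTAINER
    // ancestor that has just become empty. The check runs only on delete,
    // so a freshly created (still empty) container is left alone.
    void delete(String path) {
        if (path.equals("/") || !exists(path)) {
            throw new IllegalStateException("cannot delete: " + path);
        }
        if (!children.get(path).isEmpty()) {
            throw new IllegalStateException("not empty: " + path);
        }
        String current = path;
        while (true) {
            String parent = parentOf(current);
            types.remove(current);
            children.remove(current);
            children.get(parent).remove(current);
            boolean parentIsEmptyContainer =
                types.get(parent) == NodeType.CONTAINER
                    && children.get(parent).isEmpty();
            if (!parentIsEmptyContainer) {
                break;
            }
            current = parent;
        }
    }

    public static void main(String[] args) {
        ContainerTreeSketch tree = new ContainerTreeSketch();
        tree.create("/locks", NodeType.PERSISTENT);
        tree.create("/locks/my-lock", NodeType.CONTAINER);
        tree.create("/locks/my-lock/member-1", NodeType.PERSISTENT);
        tree.delete("/locks/my-lock/member-1");
        // The container went away with its last child; /locks did not.
        System.out.println(tree.exists("/locks/my-lock")); // false
        System.out.println(tree.exists("/locks"));         // true
    }
}
```

Note that the upward walk is bounded by the depth of the tree, since each step removes at most one ancestor.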

I have a first pass (untested) straw man proposal open for comment here:

https://github.com/apache/zookeeper/pull/28

In order to have minimum impact on existing implementations, a container node is denoted by having an ephemeralOwner id of Long.MIN_VALUE. This is pretty hackish, but I think it’s the most supportable without causing disruption. Also, a container behaves a “little bit” like an EPHEMERAL node so it isn’t totally illogical. Alternatively, a new field could be added to STAT.
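
As a rough illustration of the marker idea, the sketch below classifies a node from its ephemeralOwner value alone. It assumes the documented behaviour that ephemeralOwner is zero for non-ephemeral nodes and holds the owning session id otherwise; the Long.MIN_VALUE marker is the one suggested above. The class and method names are hypothetical and do not come from the patch.

```java
// Hypothetical helper; only the Long.MIN_VALUE marker comes from the proposal.
class OwnerMarkerSketch {
    enum Kind { PERSISTENT, EPHEMERAL, CONTAINER }

    // Marker value the proposal suggests for container nodes.
    static final long CONTAINER_OWNER = Long.MIN_VALUE;

    static Kind classify(long ephemeralOwner) {
        if (ephemeralOwner == CONTAINER_OWNER) {
            return Kind.CONTAINER;   // reserved marker, not a session id
        }
        if (ephemeralOwner == 0L) {
            return Kind.PERSISTENT;  // no owning session
        }
        return Kind.EPHEMERAL;       // any other value is treated as a session id
    }

    public static void main(String[] args) {
        System.out.println(classify(0L));             // PERSISTENT
        System.out.println(classify(0x1234L));        // EPHEMERAL
        System.out.println(classify(Long.MIN_VALUE)); // CONTAINER
    }
}
```

The trade-off is visible here: existing clients that only test ephemeralOwner against zero would see a container as ephemeral, which is the cost of not adding a field to Stat.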

I look forward to feedback on this. If people think it’s worthwhile I’ll open a Jira and work on a production-quality solution. If it’s rejected, I’d appreciate discussion of an alternative, as this is a real need in the ZK user community.

-Jordan



Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
I now have a first-pass production-level implementation. It still needs more testing and, more importantly, feedback from those who know ZK internals better than I do.

https://issues.apache.org/jira/browse/ZOOKEEPER-2163

-Jordan



On April 14, 2015 at 1:52:38 PM, kishore g (g.kishore@gmail.com) wrote:

Hi Jordan,

I like this feature and always thought it would be useful to have something
like this for Apache Helix as well. We do have a cleanup thread that
deletes the znodes, but I felt it was tied to Helix.

Here are some of the questions that made me think it's better to have the
user of ZooKeeper handle deleting the parent node according to the use case.

How would one go about using this feature? Perhaps a pseudo API and client
code will help me understand.

How can we guarantee that the last delete is actually the last delete? What
if there is a race between the delete and the creation of a new node under
the same parent? What kind of exception will we throw when a participant
tries to create a node under a container node but the parent has already
been deleted? How should the client handle such an exception?

What about libraries (such as Curator, zkclient) that provide a "mkdir -p"
kind of API, where they go ahead and create parent nodes automatically if
they don't exist?

At a very high level, the big question is: are we tying this feature to a
specific recipe and the way it's implemented?

Does this make sense?  

thanks,  
Kishore G  

On Tue, Apr 14, 2015 at 10:49 AM, Camille Fournier <ca...@apache.org>  
wrote:  

> Look at the session managers, they track what sessions are alive and clean  
> up when they aren't.  
>  
> On Tue, Apr 14, 2015 at 1:49 PM, Camille Fournier <cf...@renttherunway.com>  
> wrote:  
>  
> > Look at the session managers, they track what sessions are alive and
> > clean up when they aren't.
> >  
> > C  
> >  
> > On Tue, Apr 14, 2015 at 1:36 PM, Jordan Zimmerman <  
> > jordan@jordanzimmerman.com> wrote:  
> >  
> >> Another question…  
> >>  
> >> So, my two current questions are:  
> >>  
> >> * noting that a ZNode is a container, would you suggest the hack of a  
> >> special ephemeralOwner value or would you add a new field to Stat?  
> >>  
> >> * is there a current mechanism in the ZK server code (for the leader in  
> >> particular) to handle periodic housecleaning tasks? If so, where is that  
> >> code?  
> >>  
> >> -Jordan  
> >>  
> >>  
> >>  
> >> On April 13, 2015 at 2:48:27 PM, Jordan Zimmerman (  
> >> jordan@jordanzimmerman.com) wrote:  
> >>  
> >> As for noting that a ZNode is a container, would you suggest the hack of  
> >> a special ephemeralOwner value or would you add a new field to Stat?  
> >>  
> >> -Jordan  
> >>  
> >>  
> >>  
> >> On April 10, 2015 at 6:40:23 PM, Patrick Hunt (phunt@apache.org) wrote:  
> >>  
> >> Adding is typically good from a b/w compat perspective. If you use the
> >> new feature (at runtime) it generally precludes rollback though.
> >>
> >> See CreateTxn and CreateTxnV0
> >>
> >> A bit of background on convenience vs availability: Originally in ZK's
> >> life we explicitly stayed away from such operations at the API level
> >> (another example being "rm -r"). We wanted to have high availability, in
> >> the sense that a single operation performed a single discrete operation
> >> on the service. We didn't want to allow "unbounded" sets of changes that
> >> might affect availability - say a single operation that triggered a
> >> thousand discrete operations on the service, blocking out clients from
> >> doing other work. This seems pretty bounded to me though - at worst
> >> deleting the entire parent chain, which in general should be relatively
> >> small.
> >>  
> >> Patrick  
> >>  
> >> On Thu, Apr 9, 2015 at 12:41 PM, Jordan Zimmerman <  
> >> jordan@jordanzimmerman.com> wrote:  
> >>  
> >> > You don’t even need to look at cversion. If the parent node is a
> >> > container and has no children (i.e. the node being deleted is the last
> >> > child), it gets deleted.
> >> >
> >> > The trouble I’m currently having, though, is that I don’t want to
> >> > modify the CreateTxn record. I can’t find a place to mark that the node
> >> > should be a container. I guess I’ll have to add a new record type. What
> >> > are the ramifications of that?
> >> >
> >> > -JZ
> >> >
> >> > On April 9, 2015 at 2:24:16 PM, Michi Mutsuzaki (michi@cs.stanford.edu)
> >> > wrote:
> >> >  
> >> > I see, so the container znode is a znode that gets deleted if it's  
> >> > empty and it ever had a child (cversion is greater than zero). It  
> >> > sounds good to me. Let's see what other people say.  
> >> >  
> >> > Thanks Jordan!  
> >> >  
> >> > On Thu, Apr 9, 2015 at 10:20 AM, Jordan Zimmerman  
> >> > <jo...@jordanzimmerman.com> wrote:  
> >> > > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.
> >> > >
> >> > > The problem with both ZOOKEEPER-723 and ZOOKEEPER-834 is that they
> >> > > overload the concept of EPHEMERAL. EPHEMERALs are tied to sessions. In
> >> > > the use cases that I see, the parent node is always PERSISTENT - i.e.
> >> > > not tied to a session.
> >> > >
> >> > > I haven't looked at the patch yet, but how do you handle the "first
> >> > > child" problem?
> >> > >
> >> > > My solution applies only when a node is deleted. So, there is no need
> >> > > for a first child check. When a node is deleted, iff its parent has
> >> > > zero children and is of type CONTAINER, then the parent is deleted,
> >> > > and so on recursively up the tree.
> >> > >
> >> > > -Jordan
> >> > >
> >> > > On April 9, 2015 at 12:15:33 PM, Michi Mutsuzaki (michi@cs.stanford.edu)
> >> > > wrote:
> >> > >  
> >> > > Hi Jordan.  
> >> > >  
> >> > > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.  
> >> > > Different people had different ideas there, but the original  
> >> > > description was:  
> >> > >  
> >> > > "rather than changing the semantics of ephemeral nodes, i propose
> >> > > ephemeral parents: znodes that disappear when they have no more
> >> > > children. this cleanup would happen automatically when the last child
> >> > > is removed. an ephemeral parent is not tied to any particular session,
> >> > > so even if the creator goes away, the ephemeral parent will remain as
> >> > > long as there are children."
> >> > >
> >> > > I haven't looked at the patch yet, but how do you handle the "first
> >> > > child" problem? Is the container znode created with a first child to
> >> > > prevent getting deleted, or does the client rely on multi to create a
> >> > > container and its children, or something else?
> >> > >  
> >> > >  
> >> > > On Thu, Apr 9, 2015 at 8:00 AM, Jordan Zimmerman  
> >> > > <jo...@jordanzimmerman.com> wrote:  
> >> > >> BACKGROUND
> >> > >> ============
> >> > >> A recurring problem for ZooKeeper users is garbage collection of
> >> > >> parent nodes. Many recipes (e.g. locks, leaders, etc.) call for the
> >> > >> creation of a parent node under which participants create sequential
> >> > >> nodes. When the participant is done, it deletes its node. In
> >> > >> practice, the ZooKeeper tree begins to fill up with orphaned parent
> >> > >> nodes that are no longer needed. The ZooKeeper APIs don't provide a
> >> > >> way to clean these. Over time, ZooKeeper can become unstable due to
> >> > >> the number of these nodes.
> >> > >>
> >> > >> CURRENT SOLUTIONS
> >> > >> ===================
> >> > >> Apache Curator has a workaround solution for this by providing the
> >> > >> Reaper class which runs in the background looking for orphaned
> >> > >> parent nodes and deleting them. This isn't ideal and it would be
> >> > >> better if ZooKeeper supported this directly.
> >> > >>
> >> > >> PROPOSAL
> >> > >> =========
> >> > >> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow
> >> > >> EPHEMERAL nodes to contain child nodes. This is not optimum as
> >> > >> EPHEMERALs are tied to a session and the general use case of parent
> >> > >> nodes is for PERSISTENT nodes. This proposal adds a new node type,
> >> > >> CONTAINER. A CONTAINER node is the same as a PERSISTENT node with
> >> > >> the additional property that when its last child is deleted, it is
> >> > >> deleted (and CONTAINER nodes recursively up the tree are deleted if
> >> > >> empty).
> >> > >>
> >> > >> I have a first pass (untested) straw man proposal open for comment
> >> > >> here:
> >> > >>
> >> > >> https://github.com/apache/zookeeper/pull/28
> >> > >>
> >> > >> In order to have minimum impact on existing implementations, a
> >> > >> container node is denoted by having an ephemeralOwner id of
> >> > >> Long.MIN_VALUE. This is pretty hackish, but I think it's the most
> >> > >> supportable without causing disruption. Also, a container behaves a
> >> > >> "little bit" like an EPHEMERAL node so it isn't totally illogical.
> >> > >> Alternatively, a new field could be added to STAT.
> >> > >>
> >> > >> I look forward to feedback on this. If people think it's worthwhile
> >> > >> I'll open a Jira and work on a Production quality solution. If it's
> >> > >> rejected, I'd appreciate discussion of an alternate as this is a
> >> > >> real need in the ZK user community.
> >> > >>  
> >> > >> -Jordan  
> >> > >>  
> >> > >>  
> >> >  
> >>  
> >  
> >  
>  

Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
> How would one go about using this feature? Perhaps a pseudo api and client
> code will help me understand.

Here’s my current API:

zk.createContainer(path, data, acls);	// and standard variations for async

I opted to create a new API because I didn’t want to add a new CreateMode. Also, sequential and ephemeral don’t make sense for containers. Usage: in Curator, when you create a lock, leader, etc. instance, you pass in a path that is used to manage things. Now, instead of creating a standard PERSISTENT node, a container would be created. Internally, a container node is a normal persistent ZNode with a flag (TBD) that marks it as a container.

> How can we guarantee that the last delete is actually the last delete. What
> if there was a race condition on delete and creation of the new node under
> the same parent.

That guarantee isn’t necessary. A properly written recipe (IMO of course) re-creates parent nodes when necessary. Curator does this (via the creatingParentsIfNeeded() operation). So, any Curator recipe will be updated to re-create the container if needed.
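
A minimal sketch of that re-create-and-retry pattern, written against an in-memory stand-in rather than a real client. Everything here is an assumption made for illustration: FakeStore, its methods, the local NoNodeException class (standing in for the "no node" error a real client would surface), and createWithParent are hypothetical names, not the ZooKeeper or Curator API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of "re-create the container if needed".
class RecreateParentSketch {
    // Stand-in for the error raised when the parent does not exist.
    static class NoNodeException extends RuntimeException {
        NoNodeException(String path) {
            super("no node: " + path);
        }
    }

    // Minimal fake store that only tracks which paths exist.
    static class FakeStore {
        final Set<String> paths = new HashSet<>();

        void createContainer(String path) {
            paths.add(path); // idempotent in this toy model
        }

        void createChild(String parent, String name) {
            if (!paths.contains(parent)) {
                throw new NoNodeException(parent);
            }
            paths.add(parent + "/" + name);
        }
    }

    // Tries to create the child; if the container has been cleaned up in the
    // meantime, re-creates it and tries again. Returns the attempt number
    // that succeeded.
    static int createWithParent(FakeStore store, String parent, String name,
                                int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        NoNodeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                store.createChild(parent, name);
                return attempt;
            } catch (NoNodeException e) {
                last = e;
                store.createContainer(parent);
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        // The container does not exist yet, so the first attempt fails and
        // the second one succeeds after the container is re-created.
        System.out.println(createWithParent(store, "/locks/my-lock", "member-1", 3)); // 2
    }
}
```

The retry is bounded because, against a real server, another client could delete the last child (and with it the container) between the re-create and the next attempt.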

> At a very high level the big question is are we tying this feature to a
> specific recipe and the way its implemented.

I don’t think so. Of course I’m biased to how Curator does things. But, when following the recipes on the doc pages you will always end up in the state where there are a bunch of parent nodes lying around. There is actually an implied node type of container already being described by the docs - there’s just no support for it in ZK.



-Jordan






Re: [PROPOSAL] Container nodes

Posted by kishore g <g....@gmail.com>.
Hi Jordan,

I like this feature and always thought it would be useful to have something
like this for Apache Helix as well. We do have a cleanup thread that
deletes the znodes, but I felt it was tied to Helix.

Here are some of the questions that made me think it's better to have the
user of ZooKeeper handle deleting the parent node according to the use case.

How would one go about using this feature? Perhaps a pseudo API and client
code will help me understand.

How can we guarantee that the last delete is actually the last delete? What
if there is a race between the delete and the creation of a new node under
the same parent? What kind of exception will we throw when a participant
tries to create a node under a container node but the parent has already
been deleted? How should the client handle such an exception?

What about libraries (such as Curator, zkclient) that provide a "mkdir -p"
kind of API, where they go ahead and create parent nodes automatically if
they don't exist?

At a very high level, the big question is: are we tying this feature to a
specific recipe and the way it's implemented?

Does this make sense?

thanks,
Kishore G


Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
Thanks. How about the first question?

-Jordan



On April 14, 2015 at 12:49:52 PM, Camille Fournier (camille@apache.org) wrote:

Look at the session managers; they track which sessions are alive and clean
up when they aren't.


Re: [PROPOSAL] Container nodes

Posted by Camille Fournier <ca...@apache.org>.
Look at the session managers; they track which sessions are alive and clean
up when they aren't.


Re: [PROPOSAL] Container nodes

Posted by Camille Fournier <cf...@renttherunway.com>.
Look at the session managers; they track which sessions are alive and clean
up when they aren't.

C
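
As a rough illustration of the pattern being pointed to here (track liveness, clean up on expiry), a toy model might look like the following. This is a Python sketch, not ZooKeeper's session manager code; the class, its methods, and the explicit `now` argument are all hypothetical and exist only to show the shape of the idea.

```python
class SessionTracker:
    """Toy liveness tracker: sessions expire if not touched in time."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}          # session id -> time of last heartbeat

    def touch(self, session_id, now):
        """Record a heartbeat for the session at time `now`."""
        self.last_seen[session_id] = now

    def expire(self, now, on_expire):
        """Drop sessions idle for longer than the timeout and report them."""
        dead = sorted(sid for sid, seen in self.last_seen.items()
                      if now - seen > self.timeout)
        for sid in dead:
            del self.last_seen[sid]
            on_expire(sid)           # e.g. delete that session's ephemerals
        return dead
```

Time is passed in explicitly so that whatever drives `expire` (a timer thread, for instance) stays outside the sketch. A container cleanup task could presumably hang off a similar periodic loop.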


Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
Another question… 

So, my two current questions are:

* For noting that a ZNode is a container, would you suggest the hack of a special ephemeralOwner value or would you add a new field to Stat?

* Is there a current mechanism in the ZK server code (for the leader in particular) to handle periodic housecleaning tasks? If so, where is that code?

-Jordan
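
To make the second question concrete, here is a sketch of what one pass of such a housecleaning task could do for containers. It is a Python model under stated assumptions, not ZooKeeper server code: the `children` map, the `containers` set, and the function name are hypothetical, and nothing here implies that the server already has such a hook.

```python
def sweep_empty_containers(children, containers):
    """One housecleaning pass over a hypothetical path -> child-names map.

    `children` maps each znode path to the set of its child names;
    `containers` is the set of paths flagged as containers. Empty
    containers are removed, deepest paths first, so a parent emptied by
    this pass is collected in the same pass. Returns the removed paths.
    """
    removed = []
    # Deepest first; ties broken by path so the order is deterministic.
    for path in sorted(containers, key=lambda p: (-p.count("/"), p)):
        if children.get(path):
            continue  # still has children, keep it
        parent, _, name = path.rpartition("/")
        children.pop(path, None)
        children.get(parent or "/", set()).discard(name)
        containers.discard(path)
        removed.append(path)
    return removed
```

A caller would run this on a timer. Note that a sweep like this also collects a container that never had a child, so a real version would need some grace period or a "has ever had a child" check.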




Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
As for noting that a ZNode is a container, would you suggest the hack of a special ephemeralOwner value or would you add a new field to Stat?

-Jordan
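
The two options in this question can be compared with a small model. This is a Python sketch, not ZooKeeper's Java code: the `Stat` dataclass, its `is_container` field, and both function names are hypothetical. Only the sentinel value (Java's Long.MIN_VALUE, i.e. -2**63) comes from the proposal.

```python
from dataclasses import dataclass

# Java's Long.MIN_VALUE, the sentinel suggested in the proposal.
CONTAINER_OWNER = -(2 ** 63)


@dataclass
class Stat:
    """Hypothetical, trimmed-down stand-in for a znode's stat record."""
    ephemeral_owner: int = 0        # 0 means "not ephemeral" in this model
    is_container: bool = False      # option 2: a dedicated new field


def node_type_from_owner(stat: Stat) -> str:
    """Option 1: overload ephemeral_owner with a sentinel value."""
    if stat.ephemeral_owner == CONTAINER_OWNER:
        return "CONTAINER"
    if stat.ephemeral_owner != 0:
        return "EPHEMERAL"
    return "PERSISTENT"


def node_type_from_field(stat: Stat) -> str:
    """Option 2: consult the dedicated field first."""
    if stat.is_container:
        return "CONTAINER"
    return "EPHEMERAL" if stat.ephemeral_owner != 0 else "PERSISTENT"
```

One consequence of option 1 shows up in the ordering of the checks: the sentinel is non-zero, so any code that detects ephemerals with a plain "owner != 0" test would misclassify a container unless the sentinel is tested first.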



On April 10, 2015 at 6:40:23 PM, Patrick Hunt (phunt@apache.org) wrote:

Adding a new record type is typically good from a backward-compatibility
perspective. If you use the new feature (at runtime) it generally precludes
rollback though.

See CreateTxn and CreateTxnV0  

A bit of background on convenience vs availability: Originally in ZK's life
we explicitly stayed away from such operations at the API level (another
example being "rm -r"). We wanted to have high availability, in the sense
that a single operation performed a single discrete operation on the
service. We didn't want to allow "unbounded" sets of changes that might
affect availability - say a single operation that triggered a thousand
discrete operations on the service, blocking out clients from doing other
work. This seems pretty bounded to me though - at worst deleting the entire
parent chain, which in general should be relatively small.

Patrick  
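
The "at worst deleting the entire parent chain" bound can be seen in a small in-memory model of the delete rule from the proposal. This is a Python sketch under stated assumptions, not the patch in the pull request: the `Node` class and `delete` function are hypothetical, and the tree lives in plain dictionaries.

```python
class Node:
    """Hypothetical in-memory znode: a name, a parent link, and children."""

    def __init__(self, name, parent=None, container=False):
        self.name = name
        self.parent = parent
        self.container = container
        self.children = {}
        if parent is not None:
            parent.children[name] = self


def delete(node):
    """Delete a leaf node, then remove container ancestors left empty.

    Returns the names of all nodes removed, deepest first.
    """
    if node.children:
        raise ValueError("node has children")
    if node.parent is None:
        raise ValueError("cannot delete the root")
    removed = []
    while True:
        parent = node.parent
        del parent.children[node.name]
        removed.append(node.name)
        # Stop at the root, at a non-container, or at a non-empty container.
        if parent.parent is None or not parent.container or parent.children:
            return removed
        node = parent
```

The loop walks upward at most once per ancestor, so a single client delete triggers no more removals than the depth of the node. Because the check runs only when a child is deleted, a freshly created container with no children is left alone, which is why no cversion or "first child" test appears here.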

On Thu, Apr 9, 2015 at 12:41 PM, Jordan Zimmerman <  
jordan@jordanzimmerman.com> wrote:  

> You don’t even need to look at cversion. If the parent node is a container  
> and has no children (i.e. the node being deleted is the last child), it  
> gets deleted.  
>  
> The trouble I’m currently having, though, is that I don’t want to modify  
> the CreateTxn record. I can’t find a place to mark that the node should be  
> a container. I guess I’ll have to add a new record type. What are the  
> ramifications of that?  
>  
> -JZ  
>  
> On April 9, 2015 at 2:24:16 PM, Michi Mutsuzaki (michi@cs.stanford.edu)  
> wrote:  
>  
> I see, so the container znode is a znode that gets deleted if it's  
> empty and it ever had a child (cversion is greater than zero). It  
> sounds good to me. Let's see what other people say.  
>  
> Thanks Jordan!  
>  
> On Thu, Apr 9, 2015 at 10:20 AM, Jordan Zimmerman  
> <jo...@jordanzimmerman.com> wrote:  
> > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.  
> >  
> > The problem with both ZOOKEEPER-723 and ZOOKEEPER-834 is that they
> > overload the concept of EPHEMERAL. EPHEMERALs are tied to sessions. In
> > the use cases that I see, the parent node is always PERSISTENT - i.e.
> > not tied to a session.
> >  
> > I haven't looked at the patch yet, but how do you handle the "first  
> > child" problem?  
> >  
> > My solution applies only when a node is deleted. So, there is no need for a
> > first child check. When a node is deleted, iff its parent has zero children
> > and is of type CONTAINER, then the parent is deleted and recursively up the
> > tree.
> >  
> > -Jordan  
> >  
> > On April 9, 2015 at 12:15:33 PM, Michi Mutsuzaki (michi@cs.stanford.edu)  
> > wrote:  
> >  
> > Hi Jordan.  
> >  
> > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.  
> > Different people had different ideas there, but the original  
> > description was:  
> >  
> > "rather than changing the semantics of ephemeral nodes, i propose  
> > ephemeral parents: znodes that disappear when they have no more  
> > children. this cleanup would happen automatically when the last child  
> > is removed. an ephemeral parent is not tied to any particular session,  
> > so even if the creator goes away, the ephemeral parent will remain as  
> > long as there are children."  
> >  
> > I haven't looked at the patch yet, but how do you handle the "first  
> > child" problem? Is the container znode created with a first child to  
> > prevent getting deleted, or does the client rely on multi to create a  
> > container and its children, or something else?  
> >  
> >  
> > On Thu, Apr 9, 2015 at 8:00 AM, Jordan Zimmerman  
> > <jo...@jordanzimmerman.com> wrote:  
> >> BACKGROUND  
> >> ============  
> >> A recurring problem for ZooKeeper users is garbage collection of parent
> >> nodes. Many recipes (e.g. locks, leaders, etc.) call for the creation of a
> >> parent node under which participants create sequential nodes. When the
> >> participant is done, it deletes its node. In practice, the ZooKeeper tree
> >> begins to fill up with orphaned parent nodes that are no longer needed. The
> >> ZooKeeper APIs don't provide a way to clean these. Over time, ZooKeeper can
> >> become unstable due to the number of these nodes.
> >>  
> >> CURRENT SOLUTIONS  
> >> ===================  
> >> Apache Curator has a workaround solution for this by providing the Reaper
> >> class which runs in the background looking for orphaned parent nodes and
> >> deleting them. This isn't ideal and it would be better if ZooKeeper
> >> supported this directly.
> >>  
> >> PROPOSAL  
> >> =========  
> >> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL
> >> nodes to contain child nodes. This is not optimum as EPHEMERALs are tied to
> >> a session and the general use case of parent nodes is for PERSISTENT nodes.
> >> This proposal adds a new node type, CONTAINER. A CONTAINER node is the same
> >> as a PERSISTENT node with the additional property that when its last child
> >> is deleted, it is deleted (and CONTAINER nodes recursively up the tree are
> >> deleted if empty).
> >>  
> >> I have a first pass (untested) straw man proposal open for comment here:  
> >>  
> >> https://github.com/apache/zookeeper/pull/28  
> >>  
> >> In order to have minimum impact on existing implementations, a container
> >> node is denoted by having an ephemeralOwner id of Long.MIN_VALUE. This is
> >> pretty hackish, but I think it's the most supportable without causing
> >> disruption. Also, a container behaves a "little bit" like an EPHEMERAL node
> >> so it isn't totally illogical. Alternatively, a new field could be added to
> >> STAT.
> >>  
> >> I look forward to feedback on this. If people think it's worthwhile I'll
> >> open a Jira and work on a Production quality solution. If it's rejected, I'd
> >> appreciate discussion of an alternate as this is a real need in the ZK user
> >> community.
> >>  
> >> -Jordan  
> >>  
> >>  
>  
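
As a rough illustration of the ephemeralOwner idea in the proposal quoted above, the following Java sketch classifies a node from its ephemeralOwner value. It assumes that 0 means "no owner" (as in ZooKeeper's Stat for non-ephemeral nodes) and that no real session id ever equals Long.MIN_VALUE. The class OwnerSentinelSketch, its NodeKind enum, and the CONTAINER_OWNER constant are hypothetical names and are not taken from the patch.

```java
/** Hypothetical helper showing the sentinel-owner convention; not ZooKeeper code. */
final class OwnerSentinelSketch {
    /** Sentinel owner id proposed in the thread to mark a container node. */
    static final long CONTAINER_OWNER = Long.MIN_VALUE;

    enum NodeKind { PERSISTENT, EPHEMERAL, CONTAINER }

    private OwnerSentinelSketch() { }

    /**
     * Maps an ephemeralOwner value to a node kind:
     * the sentinel marks a container, 0 marks a persistent node, and
     * any other value is treated as the session id of an ephemeral node.
     */
    static NodeKind classify(long ephemeralOwner) {
        if (ephemeralOwner == CONTAINER_OWNER) {
            return NodeKind.CONTAINER;
        }
        return ephemeralOwner == 0L ? NodeKind.PERSISTENT : NodeKind.EPHEMERAL;
    }
}
```

The sentinel check has to come first; otherwise a container would be mistaken for an ephemeral node owned by a session, which is the kind of confusion in older code that makes this approach "hackish".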

Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
FYI

https://issues.apache.org/jira/browse/ZOOKEEPER-2163

RE: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
Never mind - found it (ant build-generated)



On April 13, 2015 at 2:14:24 PM, Jordan Zimmerman (jordan@jordanzimmerman.com) wrote:

OK - I found it. How do I build it so that the generated files get correctly placed in the source tree? I don’t see an ant task for it.

-Jordan



On April 13, 2015 at 1:53:08 PM, Hongchao Deng (fengjingchao@hotmail.com) wrote:

Those are generated by the jute compiler. Source file:
https://github.com/apache/zookeeper/blob/trunk/src/zookeeper.jute

- Hongchao Deng


RE: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
OK - I found it. How do I build it so that the generated files get correctly placed in the source tree? I don’t see an ant task for it.

-Jordan



On April 13, 2015 at 1:53:08 PM, Hongchao Deng (fengjingchao@hotmail.com) wrote:

Those are generated by the jute compiler. Source file:  
https://github.com/apache/zookeeper/blob/trunk/src/zookeeper.jute  

- Hongchao Deng  


RE: [PROPOSAL] Container nodes

Posted by Hongchao Deng <fe...@hotmail.com>.
Those are generated by the jute compiler. Source file:
https://github.com/apache/zookeeper/blob/trunk/src/zookeeper.jute

- Hongchao Deng

> Date: Mon, 13 Apr 2015 13:47:09 -0500
> From: jordan@jordanzimmerman.com
> To: phunt@apache.org; dev@zookeeper.apache.org
> CC: michi@cs.stanford.edu
> Subject: Re: [PROPOSAL] Container nodes
> 
> How are things such as Create2Request et al generated? I see the comment that it’s the Hadoop compiler but I don’t see the source files anywhere. Is it OK to manually create these (for new classes) or am I missing some source?
> 
> -Jordan

Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
How are things such as Create2Request et al generated? I see the comment that it’s the Hadoop compiler but I don’t see the source files anywhere. Is it OK to manually create these (for new classes) or am I missing some source?

-Jordan



On April 10, 2015 at 6:40:23 PM, Patrick Hunt (phunt@apache.org) wrote:

Adding a new record type is typically fine from a backward-compatibility
perspective. If you use the new feature at runtime, though, it generally
precludes rollback.

See CreateTxn and CreateTxnV0.

A bit of background on convenience vs availability: Originally in ZK's life
we explicitly stayed away from such operations at the API level (another
example being "rm -r"). We wanted to have high availability, in the sense
that a single request performed a single discrete operation on the
service. We didn't want to allow "unbounded" sets of changes that might
affect availability - say a single request that triggered a thousand
discrete operations on the service, blocking out clients from doing other
work. This seems pretty bounded to me though - at worst deleting the entire
parent chain, which in general should be relatively small.

Patrick  


Re: [PROPOSAL] Container nodes

Posted by Patrick Hunt <ph...@apache.org>.
Adding a new record type is typically fine from a backward-compatibility
perspective. If you use the new feature at runtime, though, it generally
precludes rollback.

See CreateTxn and CreateTxnV0.

A bit of background on convenience vs availability: Originally in ZK's life
we explicitly stayed away from such operations at the API level (another
example being "rm -r"). We wanted to have high availability, in the sense
that a single request performed a single discrete operation on the
service. We didn't want to allow "unbounded" sets of changes that might
affect availability - say a single request that triggered a thousand
discrete operations on the service, blocking out clients from doing other
work. This seems pretty bounded to me though - at worst deleting the entire
parent chain, which in general should be relatively small.
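
To make the "bounded" argument concrete, here is a minimal sketch, assuming
normalized absolute znode paths and that the root is never a container. The
class and method names are hypothetical and are not part of ZooKeeper. It
shows that one client delete can trigger at most as many deletions as the
path has components: the node itself plus each ancestor below the root.

```java
/** Illustrative only: bounds the work one delete can trigger under the proposal. */
final class CascadeBoundSketch {

    /** Number of path components, e.g. "/a/b/c" -> 3 and "/" -> 0. */
    static int depth(String path) {
        if (path.equals("/")) {
            return 0;
        }
        int components = 0;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') {
                components++;
            }
        }
        return components;
    }

    /**
     * Worst case: the deleted node plus every ancestor below the root,
     * which happens only if each ancestor is an otherwise-empty container.
     */
    static int maxDeletes(String path) {
        return depth(path);
    }

    public static void main(String[] args) {
        // Hypothetical lock-recipe path, three components deep.
        System.out.println(maxDeletes("/locks/lock-1/member-0")); // prints 3
    }
}
```

The bound grows with tree depth, not with the number of znodes, which is
what separates this from an "rm -r" style operation.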

Patrick

On Thu, Apr 9, 2015 at 12:41 PM, Jordan Zimmerman <
jordan@jordanzimmerman.com> wrote:

> You don’t even need to look at cversion. If the parent node is a container
> and has no children (i.e. the node being deleted is the last child), it
> gets deleted.
>
> The trouble I’m currently having, though, is that I don’t want to modify
> the CreateTxn record. I can’t find a place to mark that the node should be
> a container. I guess I’ll have to add a new record type. What are the
> ramifications of that?
>
> -JZ

Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
You don’t even need to look at cversion. If the parent node is a container and has no children (i.e. the node being deleted is the last child), it gets deleted.

The trouble I’m currently having, though, is that I don’t want to modify the CreateTxn record. I can’t find a place to mark that the node should be a container. I guess I’ll have to add a new record type. What are the ramifications of that?

-JZ

On April 9, 2015 at 2:24:16 PM, Michi Mutsuzaki (michi@cs.stanford.edu) wrote:

I see, so the container znode is a znode that gets deleted if it's  
empty and it ever had a child (cversion is greater than zero). It  
sounds good to me. Let's see what other people say.  

Thanks Jordan!  


Re: [PROPOSAL] Container nodes

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
I see, so the container znode is a znode that gets deleted if it's
empty and it ever had a child (cversion is greater than zero). It
sounds good to me. Let's see what other people say.
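
The reading above can be written as a predicate over a znode's state. This
is a minimal sketch with hypothetical names, assuming cversion starts at
zero and is incremented on every child create and delete:

```java
/** Illustrative only: the "empty and ever had a child" reading as a state check. */
final class FirstChildSketch {

    /**
     * True if a container may be removed under this reading: it has no
     * children now, but its child version shows it had one at some point.
     */
    static boolean reapableByState(boolean isContainer, int numChildren, int cversion) {
        return isContainer && numChildren == 0 && cversion > 0;
    }

    public static void main(String[] args) {
        // Freshly created container: empty, but it never had a child.
        System.out.println(reapableByState(true, 0, 0)); // prints false
        // One child created and then deleted: cversion was bumped twice.
        System.out.println(reapableByState(true, 0, 2)); // prints true
        // Container that still has a child.
        System.out.println(reapableByState(true, 1, 1)); // prints false
    }
}
```

The cversion test is what keeps a brand-new, still-empty container from
being removed before its first child arrives.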

Thanks Jordan!

On Thu, Apr 9, 2015 at 10:20 AM, Jordan Zimmerman
<jo...@jordanzimmerman.com> wrote:
> > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.
>
> The problem with both ZOOKEEPER-723 and ZOOKEEPER-834 is that they overload
> the concept of EPHEMERAL. EPHEMERALs are tied to sessions. In the use cases
> that I see, the parent node is always PERSISTENT - i.e. not tied to a
> session.
>
> > I haven't looked at the patch yet, but how do you handle the "first
> > child" problem?
>
> My solution applies only when a node is deleted. So, there is no need for a
> first child check. When a node is deleted, iff its parent has zero children
> and is of type CONTAINER, then the parent is deleted, and so on recursively
> up the tree.
>
> -Jordan

Re: [PROPOSAL] Container nodes

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
> This sounds great to me, but it sounds a lot like ZOOKEEPER-723.

The problem with both ZOOKEEPER-723 and ZOOKEEPER-834 is that they overload the concept of EPHEMERAL. EPHEMERALs are tied to sessions. In the use cases that I see, the parent node is always PERSISTENT - i.e. not tied to a session.

> I haven't looked at the patch yet, but how do you handle the "first
> child" problem?

My solution applies only when a node is deleted. So, there is no need for a first child check. When a node is deleted, iff its parent has zero children and is of type CONTAINER, then the parent is deleted, and so on recursively up the tree.
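
The delete-time rule can be sketched against a toy in-memory tree. This is an illustration under stated assumptions, not the code in the pull request: the class ContainerTreeSketch and its methods are hypothetical, only the Long.MIN_VALUE ephemeralOwner marker comes from the proposal, and sessions, ACLs, versions and watches are ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the proposed rule; not ZooKeeper's real data tree. */
final class ContainerTreeSketch {
    // Marker taken from the proposal: containers carry this ephemeralOwner value.
    static final long CONTAINER_OWNER = Long.MIN_VALUE;

    static final class Node {
        final long ephemeralOwner;
        int childCount;

        Node(long ephemeralOwner) {
            this.ephemeralOwner = ephemeralOwner;
        }

        boolean isContainer() {
            return ephemeralOwner == CONTAINER_OWNER;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    ContainerTreeSketch() {
        nodes.put("/", new Node(0L)); // the root is an ordinary persistent node
    }

    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    boolean exists(String path) {
        return nodes.containsKey(path);
    }

    void create(String path, boolean container) {
        Node parent = nodes.get(parentOf(path));
        if (parent == null || nodes.containsKey(path)) {
            throw new IllegalStateException("cannot create " + path);
        }
        nodes.put(path, new Node(container ? CONTAINER_OWNER : 0L));
        parent.childCount++;
        // No check here: a new, empty container is left alone.
    }

    void delete(String path) {
        Node node = nodes.get(path);
        if (path.equals("/") || node == null || node.childCount > 0) {
            throw new IllegalStateException("cannot delete " + path);
        }
        nodes.remove(path);
        String parentPath = parentOf(path);
        Node parent = nodes.get(parentPath);
        parent.childCount--;
        // The rule: the last child is gone and the parent is a container,
        // so delete the parent too, which repeats the check one level up.
        if (parent.isContainer() && parent.childCount == 0) {
            delete(parentPath);
        }
    }

    public static void main(String[] args) {
        ContainerTreeSketch tree = new ContainerTreeSketch();
        tree.create("/locks", true);
        tree.create("/locks/lock-1", true);
        tree.create("/locks/lock-1/member-0", false);
        tree.delete("/locks/lock-1/member-0");
        System.out.println(tree.exists("/locks/lock-1")); // prints false
        System.out.println(tree.exists("/locks"));        // prints false
    }
}
```

Because the check runs only inside delete, a container that has just been created with no children is never touched, which is why no first child check is needed.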

-Jordan

On April 9, 2015 at 12:15:33 PM, Michi Mutsuzaki (michi@cs.stanford.edu) wrote:

Hi Jordan.  

This sounds great to me, but it sounds a lot like ZOOKEEPER-723.  
Different people had different ideas there, but the original  
description was:  

"rather than changing the semantics of ephemeral nodes, i propose  
ephemeral parents: znodes that disappear when they have no more  
children. this cleanup would happen automatically when the last child  
is removed. an ephemeral parent is not tied to any particular session,  
so even if the creator goes away, the ephemeral parent will remain as  
long as there are children."  

I haven't looked at the patch yet, but how do you handle the "first  
child" problem? Is the container znode created with a first child to  
prevent getting deleted, or does the client rely on multi to create a  
container and its children, or something else?  



Re: [PROPOSAL] Container nodes

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
Hi Jordan.

This sounds great to me, but it sounds a lot like ZOOKEEPER-723.
Different people had different ideas there, but the original
description was:

"rather than changing the semantics of ephemeral nodes, i propose
ephemeral parents: znodes that disappear when they have no more
children. this cleanup would happen automatically when the last child
is removed. an ephemeral parent is not tied to any particular session,
so even if the creator goes away, the ephemeral parent will remain as
long as there are children."

I haven't looked at the patch yet, but how do you handle the "first
child" problem? Is the container znode created with a first child to
prevent getting deleted, or does the client rely on multi to create a
container and its children, or something else?


On Thu, Apr 9, 2015 at 8:00 AM, Jordan Zimmerman
<jo...@jordanzimmerman.com> wrote:
> BACKGROUND
> ============
> A recurring problem for ZooKeeper users is garbage collection of parent nodes. Many recipes (e.g. locks, leaders, etc.) call for the creation of a parent node under which participants create sequential nodes. When the participant is done, it deletes its node. In practice, the ZooKeeper tree begins to fill up with orphaned parent nodes that are no longer needed. The ZooKeeper APIs don't provide a way to clean these. Over time, ZooKeeper can become unstable due to the number of these nodes.
>
> CURRENT SOLUTIONS
> ===================
> Apache Curator has a workaround solution for this by providing the Reaper class which runs in the background looking for orphaned parent nodes and deleting them. This isn't ideal and it would be better if ZooKeeper supported this directly.
>
> PROPOSAL
> =========
> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL nodes to contain child nodes. This is not optimum as EPHEMERALs are tied to a session and the general use case of parent nodes is for PERSISTENT nodes. This proposal adds a new node type, CONTAINER. A CONTAINER node is the same as a PERSISTENT node with the additional property that when its last child is deleted, it is deleted (and CONTAINER nodes recursively up the tree are deleted if empty).
>
> I have a first pass (untested) straw man proposal open for comment here:
>
> https://github.com/apache/zookeeper/pull/28
>
> In order to have minimum impact on existing implementations, a container node is denoted by having an ephemeralOwner id of Long.MIN_VALUE. This is pretty hackish, but I think it's the most supportable without causing disruption. Also, a container behaves a "little bit" like an EPHEMERAL node so it isn't totally illogical. Alternatively, a new field could be added to STAT.
>
> I look forward to feedback on this. If people think it's worthwhile I'll open a Jira and work on a Production quality solution. If it's rejected, I'd appreciate discussion of an alternate as this is a real need in the ZK user community.
>
> -Jordan
>
>