Posted to dev@zookeeper.apache.org by Dominic Williams <th...@googlemail.com> on 2010/03/22 13:05:47 UTC
A deficiency? You can only add children to persistent nodes
Hi all,
I'm developing a library of primitives backed by ZooKeeper. An issue I've
recently hit is that you can only add children to persistent nodes.
When you're creating primitives for things like locking - for use, for example,
in synchronizing cluster-wide processing of a NoSQL database like Cassandra -
this is a real pain:
1/ If a node crashes or something else goes wrong, you leave behind
persistent nodes. Over time these will grow and grow, rather like the old
tmp folders used to fill with files under Windows
2/ Persistent nodes = nasty scalability *bottleneck* because you're actually
having to write to disk somewhere.
To avoid this I'm actually thinking of writing a locking system where you work
out the existing chain not by enumerating sequential children, but by
looking at the contents of each temporary lock node to see what it is
waiting on. But... that's quite horrible. I was wondering whether there is
some technical reason why ephemeral nodes can't have children??
Otherwise it is acceptable, e.g.
ZkPath ensurePath = new ZkPath("/Starburst/cluster/nodes", CreateMode.PERSISTENT);
ensurePath.waitSynchronized();
clusterMembers = new ZkDistributedSet("/Starburst/cluster/nodes",
    new String[] { thisNodeEndPoint }, true);
clusterMembers.addChangeListener(clusterNodesChangedProcessor, true);
clusterMembers.waitSynchronized();
Best, Dominic
ria101.wordpress.com
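For reference, the "enumerating sequential children" approach mentioned in the post can be sketched without a ZooKeeper connection. This is only an illustration of the ordering logic, under the assumption that each waiter has created a sequential child whose name ends in ZooKeeper's zero-padded 10-digit counter; the class name LockQueue and the "lock-" prefix are hypothetical and not part of Dominic's library.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class LockQueue {
    // Sequential znode names end in a zero-padded 10-digit counter,
    // e.g. "lock-0000000007"; parse that suffix.
    static long sequenceOf(String childName) {
        return Long.parseLong(childName.substring(childName.length() - 10));
    }

    // Children as returned by a getChildren call arrive in no particular
    // order, so sort them by sequence number first.
    static List<String> ordered(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(LockQueue::sequenceOf));
        return sorted;
    }

    // The child with the lowest sequence number holds the lock.
    static boolean holdsLock(List<String> children, String me) {
        List<String> sorted = ordered(children);
        return !sorted.isEmpty() && sorted.get(0).equals(me);
    }

    // Every other waiter watches only its immediate predecessor.
    static Optional<String> predecessorOf(List<String> children, String me) {
        List<String> sorted = ordered(children);
        int i = sorted.indexOf(me);
        return i > 0 ? Optional.of(sorted.get(i - 1)) : Optional.empty();
    }

    public static void main(String[] args) {
        List<String> children =
                List.of("lock-0000000012", "lock-0000000010", "lock-0000000011");
        System.out.println(holdsLock(children, "lock-0000000010"));
        System.out.println(predecessorOf(children, "lock-0000000012"));
    }
}
```

Watching only the predecessor, rather than the whole parent, means a release wakes one waiter instead of all of them. The parent of these children is the node that currently has to be persistent, which is the limitation the post is about.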
Re: A deficiency? You can only add children to persistent nodes
Posted by Patrick Hunt <ph...@apache.org>.
Hi Lei, that's perfectly fine. However, the issue would be that
operations such as "getchildren(/path)" would return all of these nodes, whereas
with an explicit path structure you could call
getchildren(/path/starburst/cluster/nodes) - i.e. flat vs hierarchical.
Patrick
Lei Zhang wrote:
> Hi Dominic,
>
> Is it acceptable to use ephemeral nodes with hierarchical names, such as
> "starburst.cluster.nodes"?
>
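To make the flat vs hierarchical trade-off concrete: with flat, dot-separated names the client has to filter the full child list itself. The sketch below shows that filtering on a plain list of names; the class FlatNames and the example node names are made up for illustration, and no ZooKeeper call is involved.

```java
import java.util.List;
import java.util.stream.Collectors;

final class FlatNames {
    // Keep only the children whose flat, dot-separated name falls under
    // the given prefix; a hierarchical layout would get this for free
    // by listing the children of the deeper path.
    static List<String> under(List<String> children, String prefix) {
        String p = prefix + ".";
        return children.stream()
                .filter(name -> name.startsWith(p))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> all = List.of(
                "starburst.cluster.nodes.host-a",
                "starburst.locks.job-1",
                "starburst.cluster.nodes.host-b");
        System.out.println(under(all, "starburst.cluster.nodes"));
    }
}
```

The cost is that every listing transfers all siblings, including unrelated ones, and a child watch on the shared parent fires for changes to any of them.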
Re: A deficiency? You can only add children to persistent nodes
Posted by Lei Zhang <lz...@gmail.com>.
Hi Dominic,
Is it acceptable to use ephemeral nodes with hierarchical names, such as
"starburst.cluster.nodes"?
Re: A deficiency? You can only add children to persistent nodes
Posted by Gustavo Niemeyer <gu...@niemeyer.net>.
Hi Ben,
This would certainly be very interesting! FWIW, I missed something like
that before as well.
On 23 Mar 2010 02:44, "Benjamin Reed" <br...@yahoo-inc.com> wrote:
let me put out an idea that we have kicked around for a while: ephemeral
containers. the idea is that the znode disappears if it doesn't have
children. you would create the znode with create("/path", data, acl,
EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path and
/path/child. (we have to create it with a child otherwise it immediately
disappears.)
i think this mechanism would address your need in a way that is easy to
implement and use. it would also allow you to do a cool barrier
implementation!
ben
On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>
> Dominic Williams wrote:
>
>>
>> What I'd sugges...
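The ephemeral container idea described above can be modelled in a few lines. This is a toy in-memory model of the proposed semantics only, not ZooKeeper code: ContainerModel and its methods are hypothetical names, and EPHEMERAL_CONTAINER is the proposed flag from the message, not an existing create mode at the time of this thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ContainerModel {
    // container path -> names of its current children
    private final Map<String, Set<String>> containers = new HashMap<>();

    // Mirrors the proposed create(..., EPHEMERAL_CONTAINER): the container
    // is created together with a first child, so it is never empty.
    void createContainer(String path, String firstChild) {
        Set<String> kids = new HashSet<>();
        kids.add(firstChild);
        containers.put(path, kids);
    }

    // Returns false if the container is gone or the child already exists.
    boolean addChild(String path, String child) {
        Set<String> kids = containers.get(path);
        return kids != null && kids.add(child);
    }

    // Removing the last child makes the container itself disappear.
    void removeChild(String path, String child) {
        Set<String> kids = containers.get(path);
        if (kids != null && kids.remove(child) && kids.isEmpty()) {
            containers.remove(path);
        }
    }

    boolean exists(String path) {
        return containers.containsKey(path);
    }

    public static void main(String[] args) {
        ContainerModel zk = new ContainerModel();
        zk.createContainer("/path", "child");
        zk.addChild("/path", "other");
        zk.removeChild("/path", "child");
        System.out.println(zk.exists("/path")); // still has "other"
        zk.removeChild("/path", "other");
        System.out.println(zk.exists("/path")); // last child gone
    }
}
```

In this model the children would be ordinary ephemerals owned by different sessions, so the container outlives whichever session created it and goes away only when nobody is left under it.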
Re: A deficiency? You can only add children to persistent nodes
Posted by Dominic Williams <th...@googlemail.com>.
Just for the "correct" programming approach given the latest libraries. For
example, with 1.6 you can create a weak set like this:
set = Collections.newSetFromMap(new WeakHashMap<ZkSyncPrimitive, Boolean>());
Otherwise you need to bundle a WeakSet implementation. I'm guessing the fewer
dependencies the better...
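A slightly fuller sketch of that weak set, used as a listener registry, is below. It is an assumption about how a weak-watcher registry might look, not the patch discussed in this thread; WeakListeners, Listener and fire are hypothetical names. Only Collections.newSetFromMap and WeakHashMap come from the JDK.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

final class WeakListeners {
    interface Listener {
        void onChange(String path);
    }

    // A set view over a WeakHashMap: an entry vanishes once its listener
    // is no longer strongly reachable anywhere else. Note that membership
    // uses equals()/hashCode(), not identity.
    private final Set<Listener> listeners =
            Collections.newSetFromMap(new WeakHashMap<Listener, Boolean>());

    synchronized void register(Listener listener) {
        listeners.add(listener);
    }

    // Calls every listener that is still alive; returns how many were called.
    synchronized int fire(String path) {
        int delivered = 0;
        for (Listener listener : listeners) {
            listener.onChange(path);
            delivered++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        WeakListeners registry = new WeakListeners();
        Listener keep = path -> System.out.println("changed: " + path);
        registry.register(keep);
        System.out.println(registry.fire("/Starburst/cluster/nodes"));
        // The caller must hold its own strong reference, as it does here;
        // otherwise the listener may be collected and silently dropped.
        keep.onChange("direct call");
    }
}
```

The design consequence is that registration alone does not keep a listener alive, so anonymous listeners registered inline can disappear at the next garbage collection.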
On 24 March 2010 16:12, Patrick Hunt <ph...@apache.org> wrote:
> A while back (3.1? basically long after the mac port of the 1.6 jvm was
> considered "stable") we stopped "officially" supporting 1.5
> http://bit.ly/9IlzUq
>
> However there have been occasional requests from ppl trying to use 1.5 and
> we do our best not to break things for them. Last I checked the code still
> compiled under 1.5 (but it's been a while). IMO it's best to use 1.6 though,
> even if it compiles/runs for 1.5 there have been a number of significant
> bugs fixed in the jvm since then.
>
> Dominic, is there some reason in particular in this case?
>
> Patrick
Re: A deficiency? You can only add children to persistent nodes
Posted by Patrick Hunt <ph...@apache.org>.
A while back (3.1? basically long after the mac port of the 1.6 jvm was
considered "stable") we stopped "officially" supporting 1.5
http://bit.ly/9IlzUq
However there have been occasional requests from ppl trying to use 1.5
and we do our best not to break things for them. Last I checked the code
still compiled under 1.5 (but it's been a while). IMO it's best to use
1.6 though, even if it compiles/runs for 1.5 there have been a number of
significant bugs fixed in the jvm since then.
Dominic, is there some reason in particular in this case?
Patrick
Dominic Williams wrote:
> Hi Patrick, re: Weak Watchers implementation. Is it ok to assume JDK 6?
Re: A deficiency? You can only add children to persistent nodes
Posted by Dominic Williams <th...@googlemail.com>.
Hi Patrick, re: Weak Watchers implementation. Is it ok to assume JDK 6?
On 23 March 2010 17:32, Patrick Hunt <ph...@apache.org> wrote:
> Feel free to assign yourself to WW. I encourage you to document (comments
> in jira or wiki page) some rough approximation of the api/approach for WW,
> it's better to get the comments/concerns up front than after you took the
> time to work it all out in a patch. There are good tests for watchers
> already, you'll need to extend those for WW and add appropriate
> javadoc/forrest docs. Asking lots of questions is fine. :-)
>
> http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute
>
> Patrick
>
>
> Dominic Williams wrote:
>
>> Ok will do. Will create patch for weak watchers and corral existing
>> discussion into ephem in ephem.
>>
>> This will happen some time this week
>>
>> Thanks for all the feedback.
>>
>> On 23 March 2010 16:29, Patrick Hunt <ph...@apache.org> wrote:
>>
>> Excellent idea Jeff. Dominic feel free to open 2 JIRAs, one for ephem in
>>> ephem and a second for the weak watchers. Summarize your goals and the
>>> discussion so far as appropriate. BTW, either of these JIRAs would be
>>> good
>>> for a new contributor interested in gaining experience with ZooKeeper.
>>> ephem
>>> is a bit tougher, but not so extensive that it's insurmountable (of
>>> course
>>> our current dev base will help out).
>>>
>>> Regards,
>>>
>>> Patrick
>>>
>>>
>>> Jeff Hammerbacher wrote:
>>>
>>> Hey Ben,
>>>>
>>>> Perhaps you should open a JIRA for further discussion?
>>>>
>>>> Thanks,
>>>> Jeff
>>>>
>>>> On Mon, Mar 22, 2010 at 7:42 PM, Benjamin Reed <br...@yahoo-inc.com>
>>>> wrote:
>>>>
>>>> let me put out an idea that we have kicked around for a while:
>>>> ephemeral
>>>>
>>>>> containers. the idea is that the znode disappears if it doesn't have
>>>>> children. you would create the znode with create("/path", data, acl,
>>>>> EPHEMERAL_CONTAINER). this would result in the creation of two znodes:
>>>>> /path
>>>>> and
>>>>> /path/child. (we have to create it with a child otherwise it
>>>>> immediately
>>>>> disappears.)
>>>>>
>>>>> i think this mechanism would address your need in a way that is easy to
>>>>> implement and use. it would also allow you to do a cool barrier
>>>>> implementation!
>>>>>
>>>>> ben
>>>>>
>>>>>
>>>>> On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>>>>>
>>>>> Dominic Williams wrote:
>>>>>
>>>>>>
>>>>>> What I'd suggest might work:
>>>>>>
>>>>>>> - when the session that created the parent ends, ownership of the
>>>>>>> parent
>>>>>>> could either be transferred to the owner/session that created the
>>>>>>> oldest
>>>>>>> child, or instead ownership could be transferred to some kind of
>>>>>>> nominal
>>>>>>> system session (which would delete the parent once the last ephemeral
>>>>>>> child
>>>>>>> disappeared)
>>>>>>>
>>>>>>>
>>>>>>> There may be some issues with idempotency here, also it could
>>>>>>> require
>>>>>>>
>>>>>> extensive locking which drives up operation latencies (essentially
>>>>>> "recursive delete"). It sounds possible, but someone would have to
>>>>>> take
>>>>>> a closer look as to the technical challenges involved.
>>>>>>
>>>>>>
>>>>>> Our general philosophy is to keep things as simple as possible wrt
>>>>>> api,
>>>>>> semantics, implementation, etc... Distributed communication is hard
>>>>>> and
>>>>>> while we handle a lot of the issues for you it's still complex.
>>>>>> Following our philosophy generally makes the easy things simple and
>>>>>> the
>>>>>> hard things possible, additionally it reduces the number of bugs that
>>>>>> we
>>>>>> have in the implementation itself (both user and service code).
>>>>>>
>>>>>> I don't wish to discourage you as much as provide insight/background
>>>>>> into some of our decisions.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Patrick
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Dominic Williams wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>>>>>>
>>>>>>>>> persistent nodes. Over time these will grow and grow, rather like
>>>>>>>>> the
>>>>>>>>> old
>>>>>>>>> tmp folders used to fill with files under Windows
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That's true. One either needs to use ephemerals or use persistent
>>>>>>>>>
>>>>>>>> and
>>>>>>>> have
>>>>>>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>>>>>>> preferable to use the ephemeral.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>>>>>>
>>>>>>>>
>>>>>>>> actually
>>>>>>>>
>>>>>>>>> having to write to disk somewhere.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is not actually how ZK works. All znodes regardless of
>>>>>>>>>
>>>>>>>> persistent/ephemeral are written to disk persistently. Ephemeral
>>>>>>>> nodes
>>>>>>>> are
>>>>>>>> tied to the session that created them. As long as the session is
>>>>>>>> alive
>>>>>>>> the
>>>>>>>> ephemeral node is alive. Sessions themselves are
>>>>>>>> persistently/reliably
>>>>>>>> stored by the ZK cluster. This allows the shutdown of the entire
>>>>>>>> cluster
>>>>>>>> and
>>>>>>>> restart it, all sessions/ephemerals will be maintained. Sessions can
>>>>>>>> move
>>>>>>>> from server to server (if say network connectivity to server A
>>>>>>>> fails,
>>>>>>>> or
>>>>>>>> server A itself fails then the client will move to server B). The
>>>>>>>> session
>>>>>>>> and all ephemerals are maintained (well, as long as the client moves
>>>>>>>> within
>>>>>>>> the expiration timeout value).
>>>>>>>>
>>>>>>>>
>>>>>>>> To avoid this I'm actually thinking of writing a locking system where
>>>>>>>> you
>>>>>>>>
>>>>>>>>
>>>>>>>> work
>>>>>>>>
>>>>>>>>> out the existing chain not by enumerating sequential children, but
>>>>>>>>> by
>>>>>>>>> looking at the contents of each temporary lock node to see what it
>>>>>>>>> is
>>>>>>>>> waiting on. But... that's quite horrible. Was wondering whether
>>>>>>>>> there
>>>>>>>>> is
>>>>>>>>> some technical reason why ephemeral nodes can't have children??
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> There are a few cases to think about.
>>>>>>>>>
>>>>>>>> 1) obviously ephemeral nodes can't have persistent children, this
>>>>>>>> just
>>>>>>>> doesn't make sense
>>>>>>>>
>>>>>>>> 2) ephemeral nodes have an owner - the session that created them. so
>>>>>>>> it
>>>>>>>> would also not make sense (in my mind at least) to have an ephemeral
>>>>>>>> /foo
>>>>>>>> with another ephemeral /foo/bar with a different owner.
>>>>>>>>
>>>>>>>> 3) so you are left with "ephemerals can be a child of an ephemeral
>>>>>>>> with
>>>>>>>> the
>>>>>>>> same owner".
>>>>>>>>
>>>>>>>> 4) there are also issues of order. in particular what is the
>>>>>>>> "deletion
>>>>>>>> order" depth first or breadth first, etc...
>>>>>>>>
>>>>>>>> I believe the answer so far has been "we don't do this because it's
>>>>>>>> fairly
>>>>>>>> complicated and we haven't seen any use cases that require it." In
>>>>>>>> the
>>>>>>>> cases
>>>>>>>> I've seen so far there was either a misunderstanding of how zk
>>>>>>>> worked,
>>>>>>>> or a
>>>>>>>> simpler way available.
>>>>>>>>
>>>>>>>> Does that make sense? Thoughts?
>>>>>>>>
>>>>>>>> Patrick
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>
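Patrick's cases 3) and 4) quoted above (same-owner ephemeral children, and the question of deletion order) can be illustrated with a small ordering function. This is one possible answer under the assumption that children must be removed before their parents when a session ends; ZooKeeper does not support ephemeral children, and EphemeralCleanup is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

final class EphemeralCleanup {
    // Depth of a znode path: "/foo" -> 1, "/foo/bar" -> 2.
    static int depth(String path) {
        return (int) path.chars().filter(c -> c == '/').count();
    }

    // Order in which one session's ephemerals could be deleted so that
    // no child outlives its parent: deepest paths first, ties by name.
    static List<String> deletionOrder(List<String> ownedPaths) {
        List<String> order = new ArrayList<>(ownedPaths);
        order.sort((a, b) -> {
            int byDepth = Integer.compare(depth(b), depth(a)); // deeper first
            return byDepth != 0 ? byDepth : a.compareTo(b);
        });
        return order;
    }

    public static void main(String[] args) {
        List<String> owned = List.of("/foo", "/foo/bar", "/foo/bar/baz", "/foo/qux");
        System.out.println(deletionOrder(owned));
    }
}
```

With a single owner this ordering is easy to compute. With several owners under one parent, the parent's lifetime depends on sessions other than its creator's, which is where the ownership-transfer and locking concerns raised in the thread come in.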
Re: A deficiency? You can only add children to persistent nodes
Posted by Patrick Hunt <ph...@apache.org>.
Feel free to assign yourself to WW. I encourage you to document
(comments in jira or wiki page) some rough approximation of the
api/approach for WW, it's better to get the comments/concerns up front
than after you took the time to work it all out in a patch. There are
good tests for watchers already, you'll need to extend those for WW and
add appropriate javadoc/forrest docs. Asking lots of questions is fine. :-)
http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute
Patrick
Dominic Williams wrote:
> Ok will do. Will create patch for weak watchers and corral existing
> discussion into ephem in ephem.
>
> This will happen some time this week
>
> Thanks for all the feedback.
>
> On 23 March 2010 16:29, Patrick Hunt <ph...@apache.org> wrote:
>
>> Excellent idea Jeff. Dominic feel free to open 2 JIRAs, one for ephem in
>> ephem and a second for the weak watchers. Summarize your goals and the
>> discussion so far as appropriate. BTW, either of these JIRAs would be good
>> for a new contributor interested in gaining experience with ZooKeeper. ephem
>> is a bit tougher, but not so extensive that it's insurmountable (of course
>> our current dev base will help out).
>>
>> Regards,
>>
>> Patrick
>>
>>
>> Jeff Hammerbacher wrote:
>>
>>> Hey Ben,
>>>
>>> Perhaps you should open a JIRA for further discussion?
>>>
>>> Thanks,
>>> Jeff
>>>
>>> On Mon, Mar 22, 2010 at 7:42 PM, Benjamin Reed <br...@yahoo-inc.com>
>>> wrote:
>>>
>>> let me put out an idea that we have kicked around for a while: ephemeral
>>>> containers. the idea is that the znode disappears if it doesn't have
>>>> children. you would create the znode with create("/path", data, acl,
>>>> EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path
>>>> and
>>>> /path/child. (we have to create it with a child otherwise it immediately
>>>> disappears.)
>>>>
>>>> i think this mechanism would address your need in a way that is easy to
>>>> implement and use. it would also allow you to do a cool barrier
>>>> implementation!
>>>>
>>>> ben
>>>>
>>>>
>>>> On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>>>>
>>>> Dominic Williams wrote:
>>>>>
>>>>> What I'd suggest might work:
>>>>>> - when the session that created the parent ends, ownership of the
>>>>>> parent
>>>>>> could either be transferred to the owner/session that created the
>>>>>> oldest
>>>>>> child, or instead ownership could be transferred to some kind of
>>>>>> nominal
>>>>>> system session (which would delete the parent once the last ephemeral
>>>>>> child
>>>>>> disappeared)
>>>>>>
>>>>>>
>>>>>> There may be some issues with idempotency here, also it could require
>>>>> extensive locking which drives up operation latencies (essentially
>>>>> "recursive delete"). It sounds possible, but someone would have to take
>>>>> a closer look as to the technical challenges involved.
>>>>>
>>>>>
>>>>> Our general philosophy is to keep things as simple as possible wrt api,
>>>>> semantics, implementation, etc... Distributed communication is hard and
>>>>> while we handle a lot of the issues for you it's still complex.
>>>>> Following our philosophy generally makes the easy things simple and the
>>>>> hard things possible, additionally it reduces the number of bugs that we
>>>>> have in the implementation itself (both user and service code).
>>>>>
>>>>> I don't wish to discourage you as much as provide insight/background
>>>>> into some of our decisions.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Patrick
>>>>>
>>>>>
>>>>>
>>>>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>>>>>
>>>>>>
>>>>>> Dominic Williams wrote:
>>>>>>>
>>>>>>>
>>>>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>>>>>> persistent nodes. Over time these will grow and grow, rather like the
>>>>>>>> old
>>>>>>>> tmp folders used to fill with files under Windows
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> That's true. One either needs to use ephemerals or use persistent
>>>>>>> and
>>>>>>> have
>>>>>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>>>>>> preferable to use the ephemeral.
>>>>>>>
>>>>>>>
>>>>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>>>>>
>>>>>>>
>>>>>>> actually
>>>>>>>> having to write to disk somewhere.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This is not actually how ZK works. All znodes regardless of
>>>>>>> persistent/ephemeral are written to disk persistently. Ephemeral nodes
>>>>>>> are
>>>>>>> tied to the session that created them. As long as the session is alive
>>>>>>> the
>>>>>>> ephemeral node is alive. Sessions themselves are persistently/reliably
>>>>>>> stored by the ZK cluster. This allows the shutdown of the entire
>>>>>>> cluster
>>>>>>> and
>>>>>>> restart it, all sessions/ephemerals will be maintained. Sessions can
>>>>>>> move
>>>>>>> from server to server (if say network connectivity to server A fails,
>>>>>>> or
>>>>>>> server A itself fails then the client will move to server B). The
>>>>>>> session
>>>>>>> and all ephemerals are maintained (well, as long as the client moves
>>>>>>> within
>>>>>>> the expiration timeout value).
>>>>>>>
>>>>>>>
>>>>>>> To avoid this I'm actually thinking of writing locking system where
>>>>>>> you
>>>>>>>
>>>>>>>
>>>>>>> work
>>>>>>>> out the existing chain not by enumerating sequential children, but by
>>>>>>>> looking at the contents of each temporary lock node to see what it is
>>>>>>>> waiting on. But... that's quite horrible. Was wondering whether there
>>>>>>>> is
>>>>>>>> some technical reason why ephemeral nodes can't have children?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> There are a few cases to think about.
>>>>>>> 1) obviously ephemeral nodes can't have persistent children, this just
>>>>>>> doesn't make sense
>>>>>>>
>>>>>>> 2) ephemeral nodes have an owner - the session that created them. so
>>>>>>> it
>>>>>>> would also not make sense (in my mind at least) to have an ephemeral
>>>>>>> /foo
>>>>>>> with another ephemeral /foo/bar with a different owner.
>>>>>>>
>>>>>>> 3) so you are left with "ephemerals can be a child of an ephemeral
>>>>>>> with
>>>>>>> the
>>>>>>> same owner".
>>>>>>>
>>>>>>> 4) there are also issues of order. in particular what is the "deletion
>>>>>>> order" depth first or breadth first, etc...
>>>>>>>
>>>>>>> I believe the answer so far has been "we don't do this because it's
>>>>>>> fairly
>>>>>>> complicated and we haven't seen any use cases that require it." In the
>>>>>>> cases
>>>>>>> I've seen so far there was either a misunderstanding of how zk worked,
>>>>>>> or a
>>>>>>> simpler way available.
>>>>>>>
>>>>>>> Does that make sense? Thoughts?
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>
Re: A deficiency? You can only add children to persistent nodes
Posted by Dominic Williams <th...@googlemail.com>.
OK, will do. I'll create a patch for weak watchers and corral the existing
discussion into a JIRA for ephem in ephem.
This will happen some time this week.
Thanks for all the feedback.
On 23 March 2010 16:29, Patrick Hunt <ph...@apache.org> wrote:
> Excellent idea Jeff. Dominic feel free to open 2 JIRAs, one for ephem in
> ephem and a second for the weak watchers. Summarize your goals and the
> discussion so far as appropriate. BTW, either of these JIRAs would be good
> for a new contributor interested in gaining experience with ZooKeeper. ephem
> is a bit tougher, but not so extensive that it's insurmountable (of course
> our current dev base will help out).
>
> Regards,
>
> Patrick
>
>
> Jeff Hammerbacher wrote:
>
>> Hey Ben,
>>
>> Perhaps you should open a JIRA for further discussion?
>>
>> Thanks,
>> Jeff
>>
>> On Mon, Mar 22, 2010 at 7:42 PM, Benjamin Reed <br...@yahoo-inc.com>
>> wrote:
>>
>>> let me put out an idea that we have kicked around for a while: ephemeral
>>> containers. the idea is that the znode disappears if it doesn't have
>>> children. you would create the znode with create("/path", data, acl,
>>> EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path
>>> and /path/child. (we have to create it with a child, otherwise it immediately
>>> disappears.)
>>>
>>> i think this mechanism would address your need in a way that is easy to
>>> implement and use. it would also allow you to do a cool barrier
>>> implementation!
>>>
>>> ben
>>>
>>>
>>> On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>>>
>>> Dominic Williams wrote:
>>>>
>>>>
>>>> What I'd suggest might work:
>>>>> - when the session that created the parent ends, ownership of the
>>>>> parent
>>>>> could either be transferred to the owner/session that created the
>>>>> oldest
>>>>> child, or instead ownership could be transferred to some kind of
>>>>> nominal
>>>>> system session (which would delete the parent once the last ephemeral
>>>>> child
>>>>> disappeared)
>>>>>
>>>>>
>>>>> There may be some issues with idempotency here, also it could require
>>>> extensive locking which drives up operation latencies (essentially
>>>> "recursive delete"). It sounds possible, but someone would have to take
>>>> a closer look as to the technical challenges involved.
>>>>
>>>>
>>>> Our general philosophy is to keep things as simple as possible wrt api,
>>>> semantics, implementation, etc... Distributed communication is hard and
>>>> while we handle a lot of the issues for you it's still complex.
>>>> Following our philosophy generally makes the easy things simple and the
>>>> hard things possible, additionally it reduces the number of bugs that we
>>>> have in the implementation itself (both user and service code).
>>>>
>>>> I don't wish to discourage you as much as provide insight/background
>>>> into some of our decisions.
>>>>
>>>> Regards,
>>>>
>>>> Patrick
>>>>
>>>>
>>>>
>>>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Dominic Williams wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>>>>> persistent nodes. Over time these will grow and grow, rather like the
>>>>>>> old
>>>>>>> tmp folders used to fill with files under Windows
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> That's true. One either needs to use ephemerals or use persistent
>>>>>> and
>>>>>> have
>>>>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>>>>> preferable to use the ephemeral.
>>>>>>
>>>>>>
>>>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>>>>
>>>>>>
>>>>>> actually
>>>>>>> having to write to disk somewhere.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This is not actually how ZK works. All znodes regardless of
>>>>>> persistent/ephemeral are written to disk persistently. Ephemeral nodes
>>>>>> are
>>>>>> tied to the session that created them. As long as the session is alive
>>>>>> the
>>>>>> ephemeral node is alive. Sessions themselves are persistently/reliably
>>>>>> stored by the ZK cluster. This allows the shutdown of the entire
>>>>>> cluster
>>>>>> and
>>>>>> restart it, all sessions/ephemerals will be maintained. Sessions can
>>>>>> move
>>>>>> from server to server (if say network connectivity to server A fails,
>>>>>> or
>>>>>> server A itself fails then the client will move to server B). The
>>>>>> session
>>>>>> and all ephemerals are maintained (well, as long as the client moves
>>>>>> within
>>>>>> the expiration timeout value).
>>>>>>
>>>>>>
>>>>>> To avoid this I'm actually thinking of writing locking system where
>>>>>> you
>>>>>>
>>>>>>
>>>>>> work
>>>>>>> out the existing chain not by enumerating sequential children, but by
>>>>>>> looking at the contents of each temporary lock node to see what it is
>>>>>>> waiting on. But... that's quite horrible. Was wondering whether there
>>>>>>> is
>>>>>>> some technical reason why ephemeral nodes can't have children?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> There are a few cases to think about.
>>>>>>
>>>>>> 1) obviously ephemeral nodes can't have persistent children, this just
>>>>>> doesn't make sense
>>>>>>
>>>>>> 2) ephemeral nodes have an owner - the session that created them. so
>>>>>> it
>>>>>> would also not make sense (in my mind at least) to have an ephemeral
>>>>>> /foo
>>>>>> with another ephemeral /foo/bar with a different owner.
>>>>>>
>>>>>> 3) so you are left with "ephemerals can be a child of an ephemeral
>>>>>> with
>>>>>> the
>>>>>> same owner".
>>>>>>
>>>>>> 4) there are also issues of order. in particular what is the "deletion
>>>>>> order" depth first or breadth first, etc...
>>>>>>
>>>>>> I believe the answer so far has been "we don't do this because it's
>>>>>> fairly
>>>>>> complicated and we haven't seen any use cases that require it." In the
>>>>>> cases
>>>>>> I've seen so far there was either a misunderstanding of how zk worked,
>>>>>> or a
>>>>>> simpler way available.
>>>>>>
>>>>>> Does that make sense? Thoughts?
>>>>>>
>>>>>> Patrick
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>
Re: A deficiency? You can only add children to persistent nodes
Posted by Patrick Hunt <ph...@apache.org>.
Excellent idea, Jeff. Dominic, feel free to open two JIRAs, one for ephem in
ephem and a second for the weak watchers. Summarize your goals and the
discussion so far as appropriate. BTW, either of these JIRAs would be
good for a new contributor interested in gaining experience with
ZooKeeper. Ephem in ephem is a bit tougher, but not so extensive that it's
insurmountable (of course, our current dev base will help out).
Regards,
Patrick
Jeff Hammerbacher wrote:
> Hey Ben,
>
> Perhaps you should open a JIRA for further discussion?
>
> Thanks,
> Jeff
>
> On Mon, Mar 22, 2010 at 7:42 PM, Benjamin Reed <br...@yahoo-inc.com> wrote:
>
>> let me put out an idea that we have kicked around for a while: ephemeral
>> containers. the idea is that the znode disappears if it doesn't have
>> children. you would create the znode with create("/path", data, acl,
>> EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path and
>> /path/child. (we have to create it with a child otherwise it immediately
>> disappears.)
>>
>> i think this mechanism would address your need in a way that is easy to
>> implement and use. it would also allow you to do a cool barrier
>> implementation!
>>
>> ben
>>
>>
>> On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>>
>>> Dominic Williams wrote:
>>>
>>>
>>>> What I'd suggest might work:
>>>> - when the session that created the parent ends, ownership of the parent
>>>> could either be transferred to the owner/session that created the oldest
>>>> child, or instead ownership could be transferred to some kind of nominal
>>>> system session (which would delete the parent once the last ephemeral
>>>> child
>>>> disappeared)
>>>>
>>>>
>>> There may be some issues with idempotency here, also it could require
>>> extensive locking which drives up operation latencies (essentially
>>> "recursive delete"). It sounds possible, but someone would have to take
>>> a closer look as to the technical challenges involved.
>>>
>>>
>>> Our general philosophy is to keep things as simple as possible wrt api,
>>> semantics, implementation, etc... Distributed communication is hard and
>>> while we handle a lot of the issues for you it's still complex.
>>> Following our philosophy generally makes the easy things simple and the
>>> hard things possible, additionally it reduces the number of bugs that we
>>> have in the implementation itself (both user and service code).
>>>
>>> I don't wish to discourage you as much as provide insight/background
>>> into some of our decisions.
>>>
>>> Regards,
>>>
>>> Patrick
>>>
>>>
>>>
>>>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>>>
>>>>
>>>>
>>>>> Dominic Williams wrote:
>>>>>
>>>>>
>>>>>
>>>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>>>> persistent nodes. Over time these will grow and grow, rather like the
>>>>>> old
>>>>>> tmp folders used to fill with files under Windows
>>>>>>
>>>>>>
>>>>>>
>>>>> That's true. One either needs to use ephemerals or use persistent and
>>>>> have
>>>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>>>> preferable to use the ephemeral.
>>>>>
>>>>>
>>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>>>
>>>>>
>>>>>> actually
>>>>>> having to write to disk somewhere.
>>>>>>
>>>>>>
>>>>>>
>>>>> This is not actually how ZK works. All znodes regardless of
>>>>> persistent/ephemeral are written to disk persistently. Ephemeral nodes
>>>>> are
>>>>> tied to the session that created them. As long as the session is alive
>>>>> the
>>>>> ephemeral node is alive. Sessions themselves are persistently/reliably
>>>>> stored by the ZK cluster. This allows the shutdown of the entire cluster
>>>>> and
>>>>> restart it, all sessions/ephemerals will be maintained. Sessions can
>>>>> move
>>>>> from server to server (if say network connectivity to server A fails, or
>>>>> server A itself fails then the client will move to server B). The
>>>>> session
>>>>> and all ephemerals are maintained (well, as long as the client moves
>>>>> within
>>>>> the expiration timeout value).
>>>>>
>>>>>
>>>>> To avoid this I'm actually thinking of writing locking system where you
>>>>>
>>>>>
>>>>>> work
>>>>>> out the existing chain not by enumerating sequential children, but by
>>>>>> looking at the contents of each temporary lock node to see what it is
>>>>>> waiting on. But... that's quite horrible. Was wondering whether there
>>>>>> is
>>>>>> some technical reason why ephemeral nodes can't have children?
>>>>>>
>>>>>>
>>>>>>
>>>>> There are a few cases to think about.
>>>>>
>>>>> 1) obviously ephemeral nodes can't have persistent children, this just
>>>>> doesn't make sense
>>>>>
>>>>> 2) ephemeral nodes have an owner - the session that created them. so it
>>>>> would also not make sense (in my mind at least) to have an ephemeral
>>>>> /foo
>>>>> with another ephemeral /foo/bar with a different owner.
>>>>>
>>>>> 3) so you are left with "ephemerals can be a child of an ephemeral with
>>>>> the
>>>>> same owner".
>>>>>
>>>>> 4) there are also issues of order. in particular what is the "deletion
>>>>> order" depth first or breadth first, etc...
>>>>>
>>>>> I believe the answer so far has been "we don't do this because it's
>>>>> fairly
>>>>> complicated and we haven't seen any use cases that require it." In the
>>>>> cases
>>>>> I've seen so far there was either a misunderstanding of how zk worked,
>>>>> or a
>>>>> simpler way available.
>>>>>
>>>>> Does that make sense? Thoughts?
>>>>>
>>>>> Patrick
>>>>>
>>>>>
>>>>>
>>>>
>
Re: A deficiency? You can only add children to persistent nodes
Posted by Jeff Hammerbacher <ha...@cloudera.com>.
Hey Ben,
Perhaps you should open a JIRA for further discussion?
Thanks,
Jeff
On Mon, Mar 22, 2010 at 7:42 PM, Benjamin Reed <br...@yahoo-inc.com> wrote:
> let me put out an idea that we have kicked around for a while: ephemeral
> containers. the idea is that the znode disappears if it doesn't have
> children. you would create the znode with create("/path", data, acl,
> EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path and
> /path/child. (we have to create it with a child otherwise it immediately
> disappears.)
>
> i think this mechanism would address your need in a way that is easy to
> implement and use. it would also allow you to do a cool barrier
> implementation!
>
> ben
>
>
> On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>
>> Dominic Williams wrote:
>>
>>
>>> What I'd suggest might work:
>>> - when the session that created the parent ends, ownership of the parent
>>> could either be transferred to the owner/session that created the oldest
>>> child, or instead ownership could be transferred to some kind of nominal
>>> system session (which would delete the parent once the last ephemeral
>>> child
>>> disappeared)
>>>
>>>
>> There may be some issues with idempotency here, also it could require
>> extensive locking which drives up operation latencies (essentially
>> "recursive delete"). It sounds possible, but someone would have to take
>> a closer look as to the technical challenges involved.
>>
>>
>> Our general philosophy is to keep things as simple as possible wrt api,
>> semantics, implementation, etc... Distributed communication is hard and
>> while we handle a lot of the issues for you it's still complex.
>> Following our philosophy generally makes the easy things simple and the
>> hard things possible, additionally it reduces the number of bugs that we
>> have in the implementation itself (both user and service code).
>>
>> I don't wish to discourage you as much as provide insight/background
>> into some of our decisions.
>>
>> Regards,
>>
>> Patrick
>>
>>
>>
>>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>>
>>>
>>>
>>>> Dominic Williams wrote:
>>>>
>>>>
>>>>
>>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>>> persistent nodes. Over time these will grow and grow, rather like the
>>>>> old
>>>>> tmp folders used to fill with files under Windows
>>>>>
>>>>>
>>>>>
>>>> That's true. One either needs to use ephemerals or use persistent and
>>>> have
>>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>>> preferable to use the ephemeral.
>>>>
>>>>
>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>>
>>>>
>>>>> actually
>>>>> having to write to disk somewhere.
>>>>>
>>>>>
>>>>>
>>>> This is not actually how ZK works. All znodes regardless of
>>>> persistent/ephemeral are written to disk persistently. Ephemeral nodes
>>>> are
>>>> tied to the session that created them. As long as the session is alive
>>>> the
>>>> ephemeral node is alive. Sessions themselves are persistently/reliably
>>>> stored by the ZK cluster. This allows the shutdown of the entire cluster
>>>> and
>>>> restart it, all sessions/ephemerals will be maintained. Sessions can
>>>> move
>>>> from server to server (if say network connectivity to server A fails, or
>>>> server A itself fails then the client will move to server B). The
>>>> session
>>>> and all ephemerals are maintained (well, as long as the client moves
>>>> within
>>>> the expiration timeout value).
>>>>
>>>>
>>>> To avoid this I'm actually thinking of writing locking system where you
>>>>
>>>>
>>>>> work
>>>>> out the existing chain not by enumerating sequential children, but by
>>>>> looking at the contents of each temporary lock node to see what it is
>>>>> waiting on. But... that's quite horrible. Was wondering whether there
>>>>> is
>>>>> some technical reason why ephemeral nodes can't have children?
>>>>>
>>>>>
>>>>>
>>>> There are a few cases to think about.
>>>>
>>>> 1) obviously ephemeral nodes can't have persistent children, this just
>>>> doesn't make sense
>>>>
>>>> 2) ephemeral nodes have an owner - the session that created them. so it
>>>> would also not make sense (in my mind at least) to have an ephemeral
>>>> /foo
>>>> with another ephemeral /foo/bar with a different owner.
>>>>
>>>> 3) so you are left with "ephemerals can be a child of an ephemeral with
>>>> the
>>>> same owner".
>>>>
>>>> 4) there are also issues of order. in particular what is the "deletion
>>>> order" depth first or breadth first, etc...
>>>>
>>>> I believe the answer so far has been "we don't do this because it's
>>>> fairly
>>>> complicated and we haven't seen any use cases that require it." In the
>>>> cases
>>>> I've seen so far there was either a misunderstanding of how zk worked,
>>>> or a
>>>> simpler way available.
>>>>
>>>> Does that make sense? Thoughts?
>>>>
>>>> Patrick
>>>>
>>>>
>>>>
>>>
>>>
>>
>
Re: A deficiency? You can only add children to persistent nodes
Posted by Benjamin Reed <br...@yahoo-inc.com>.
in some sense the children will "own" the parent. the nice thing about
it is that it isn't tied to any particular session, so we don't have to
worry about weird cases like owners going away or switching ownership.
ben
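The ownership point can be illustrated with a toy, in-memory model. This is only a sketch of the semantics described in this message, not ZooKeeper code: the class ChildOwnedContainerSketch and its methods are hypothetical names, and session expiry is modelled as a plain method call. It shows that the container carries no owner of its own, so the session that created the initial pair can exit without affecting the container while other children remain.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: the container has no owner; only its ephemeral children do.
// All names here are hypothetical and nothing below is ZooKeeper API.
class ChildOwnedContainerSketch {
    // child name -> id of the session that created that ephemeral child
    private final Map<String, Long> childOwner = new HashMap<>();
    private boolean containerExists = false;

    // A session creates the container together with its first child.
    void createContainer(long sessionId, String firstChild) {
        containerExists = true;
        childOwner.put(firstChild, sessionId);
    }

    // Any session may add an ephemeral child while the container exists.
    void addChild(long sessionId, String child) {
        if (!containerExists) {
            throw new IllegalStateException("container is gone");
        }
        childOwner.put(child, sessionId);
    }

    // Ending a session removes only that session's children. The container
    // disappears when no children are left, whoever created it.
    void closeSession(long sessionId) {
        childOwner.values().removeIf(owner -> owner == sessionId);
        if (childOwner.isEmpty()) {
            containerExists = false;
        }
    }

    boolean containerExists() {
        return containerExists;
    }

    int childCount() {
        return childOwner.size();
    }

    public static void main(String[] args) {
        ChildOwnedContainerSketch c = new ChildOwnedContainerSketch();
        c.createContainer(1L, "a");   // session 1 creates the initial pair
        c.addChild(2L, "b");          // session 2 adds another child
        c.closeSession(1L);           // the creating session exits
        System.out.println(c.containerExists()); // kept alive by child "b"
        c.closeSession(2L);
        System.out.println(c.containerExists()); // no children left
    }
}
```

Because the deletion rule looks only at the child count, there is no ownership to transfer when the creating session goes away.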
On 03/23/2010 03:04 AM, Dominic Williams wrote:
> Hi, would work nicely.
>
> Who would own the parent node after the session that created the initial
> pair exits (assume additional children exist)?
>
> On 23 March 2010 02:42, Benjamin Reed<br...@yahoo-inc.com> wrote:
>
>
>> let me put out an idea that we have kicked around for a while: ephemeral
>> containers. the idea is that the znode disappears if it doesn't have
>> children. you would create the znode with create("/path", data, acl,
>> EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path and
>> /path/child. (we have to create it with a child otherwise it immediately
>> disappears.)
>>
>> i think this mechanism would address your need in a way that is easy to
>> implement and use. it would also allow you to do a cool barrier
>> implementation!
>>
>> ben
>>
>>
>> On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>>
>>
>>> Dominic Williams wrote:
>>>
>>>
>>>
>>>> What I'd suggest might work:
>>>> - when the session that created the parent ends, ownership of the parent
>>>> could either be transferred to the owner/session that created the oldest
>>>> child, or instead ownership could be transferred to some kind of nominal
>>>> system session (which would delete the parent once the last ephemeral
>>>> child
>>>> disappeared)
>>>>
>>>>
>>>>
>>> There may be some issues with idempotency here, also it could require
>>> extensive locking which drives up operation latencies (essentially
>>> "recursive delete"). It sounds possible, but someone would have to take
>>> a closer look as to the technical challenges involved.
>>>
>>>
>>> Our general philosophy is to keep things as simple as possible wrt api,
>>> semantics, implementation, etc... Distributed communication is hard and
>>> while we handle a lot of the issues for you it's still complex.
>>> Following our philosophy generally makes the easy things simple and the
>>> hard things possible, additionally it reduces the number of bugs that we
>>> have in the implementation itself (both user and service code).
>>>
>>> I don't wish to discourage you as much as provide insight/background
>>> into some of our decisions.
>>>
>>> Regards,
>>>
>>> Patrick
>>>
>>>
>>>
>>>
>>>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> Dominic Williams wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>>>> persistent nodes. Over time these will grow and grow, rather like the
>>>>>> old
>>>>>> tmp folders used to fill with files under Windows
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> That's true. One either needs to use ephemerals or use persistent and
>>>>> have
>>>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>>>> preferable to use the ephemeral.
>>>>>
>>>>>
>>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>>>
>>>>>
>>>>>
>>>>>> actually
>>>>>> having to write to disk somewhere.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> This is not actually how ZK works. All znodes regardless of
>>>>> persistent/ephemeral are written to disk persistently. Ephemeral nodes
>>>>> are
>>>>> tied to the session that created them. As long as the session is alive
>>>>> the
>>>>> ephemeral node is alive. Sessions themselves are persistently/reliably
>>>>> stored by the ZK cluster. This allows the shutdown of the entire cluster
>>>>> and
>>>>> restart it, all sessions/ephemerals will be maintained. Sessions can
>>>>> move
>>>>> from server to server (if say network connectivity to server A fails, or
>>>>> server A itself fails then the client will move to server B). The
>>>>> session
>>>>> and all ephemerals are maintained (well, as long as the client moves
>>>>> within
>>>>> the expiration timeout value).
>>>>>
>>>>>
>>>>> To avoid this I'm actually thinking of writing locking system where you
>>>>>
>>>>>
>>>>>
>>>>>> work
>>>>>> out the existing chain not by enumerating sequential children, but by
>>>>>> looking at the contents of each temporary lock node to see what it is
>>>>>> waiting on. But... that's quite horrible. Was wondering whether there
>>>>>> is
>>>>>> some technical reason why ephemeral nodes can't have children?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> There are a few cases to think about.
>>>>>
>>>>> 1) obviously ephemeral nodes can't have persistent children, this just
>>>>> doesn't make sense
>>>>>
>>>>> 2) ephemeral nodes have an owner - the session that created them. so it
>>>>> would also not make sense (in my mind at least) to have an ephemeral
>>>>> /foo
>>>>> with another ephemeral /foo/bar with a different owner.
>>>>>
>>>>> 3) so you are left with "ephemerals can be a child of an ephemeral with
>>>>> the
>>>>> same owner".
>>>>>
>>>>> 4) there are also issues of order. in particular what is the "deletion
>>>>> order" depth first or breadth first, etc...
>>>>>
>>>>> I believe the answer so far has been "we don't do this because it's
>>>>> fairly
>>>>> complicated and we haven't seen any use cases that require it." In the
>>>>> cases
>>>>> I've seen so far there was either a misunderstanding of how zk worked,
>>>>> or a
>>>>> simpler way available.
>>>>>
>>>>> Does that make sense? Thoughts?
>>>>>
>>>>> Patrick
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
Re: A deficiency? You can only add children to persistent nodes
Posted by Dominic Williams <th...@googlemail.com>.
Hi, that would work nicely.
Who would own the parent node after the session that created the initial
pair exits (assume additional children exist)?
On 23 March 2010 02:42, Benjamin Reed <br...@yahoo-inc.com> wrote:
> let me put out an idea that we have kicked around for a while: ephemeral
> containers. the idea is that the znode disappears if it doesn't have
> children. you would create the znode with create("/path", data, acl,
> EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path and
> /path/child. (we have to create it with a child otherwise it immediately
> disappears.)
>
> i think this mechanism would address your need in a way that is easy to
> implement and use. it would also allow you to do a cool barrier
> implementation!
>
> ben
>
>
> On 03/22/2010 10:37 AM, Patrick Hunt wrote:
>
>> Dominic Williams wrote:
>>
>>
>>> What I'd suggest might work:
>>> - when the session that created the parent ends, ownership of the parent
>>> could either be transferred to the owner/session that created the oldest
>>> child, or instead ownership could be transferred to some kind of nominal
>>> system session (which would delete the parent once the last ephemeral
>>> child
>>> disappeared)
>>>
>>>
>> There may be some issues with idempotency here, also it could require
>> extensive locking which drives up operation latencies (essentially
>> "recursive delete"). It sounds possible, but someone would have to take
>> a closer look as to the technical challenges involved.
>>
>>
>> Our general philosophy is to keep things as simple as possible wrt api,
>> semantics, implementation, etc... Distributed communication is hard and
>> while we handle a lot of the issues for you it's still complex.
>> Following our philosophy generally makes the easy things simple and the
>> hard things possible, additionally it reduces the number of bugs that we
>> have in the implementation itself (both user and service code).
>>
>> I don't wish to discourage you as much as provide insight/background
>> into some of our decisions.
>>
>> Regards,
>>
>> Patrick
>>
>>
>>
>>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>>
>>>
>>>
>>>> Dominic Williams wrote:
>>>>
>>>>
>>>>
>>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>>> persistent nodes. Over time these will grow and grow, rather like the
>>>>> old
>>>>> tmp folders used to fill with files under Windows
>>>>>
>>>>>
>>>>>
>>>> That's true. One either needs to use ephemerals or use persistent and
>>>> have
>>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>>> preferable to use the ephemeral.
>>>>
>>>>
>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>>
>>>>
>>>>> actually
>>>>> having to write to disk somewhere.
>>>>>
>>>>>
>>>>>
>>>> This is not actually how ZK works. All znodes regardless of
>>>> persistent/ephemeral are written to disk persistently. Ephemeral nodes
>>>> are
>>>> tied to the session that created them. As long as the session is alive
>>>> the
>>>> ephemeral node is alive. Sessions themselves are persistently/reliably
>>>> stored by the ZK cluster. This allows the shutdown of the entire cluster
>>>> and
>>>> restart it, all sessions/ephemerals will be maintained. Sessions can
>>>> move
>>>> from server to server (if say network connectivity to server A fails, or
>>>> server A itself fails then the client will move to server B). The
>>>> session
>>>> and all ephemerals are maintained (well, as long as the client moves
>>>> within
>>>> the expiration timeout value).
>>>>
>>>>
>>>> To avoid this I'm actually thinking of writing locking system where you
>>>>
>>>>
>>>>> work
>>>>> out the existing chain not by enumerating sequential children, but by
>>>>> looking at the contents of each temporary lock node to see what it is
>>>>> waiting on. But... that's quite horrible. Was wondering whether there
>>>>> is
>>>>> some technical reason why ephemeral nodes can't have children?
>>>>>
>>>>>
>>>>>
>>>> There are a few cases to think about.
>>>>
>>>> 1) obviously ephemeral nodes can't have persistent children, this just
>>>> doesn't make sense
>>>>
>>>> 2) ephemeral nodes have an owner - the session that created them. so it
>>>> would also not make sense (in my mind at least) to have an ephemeral
>>>> /foo
>>>> with another ephemeral /foo/bar with a different owner.
>>>>
>>>> 3) so you are left with "ephemerals can be a child of an ephemeral with
>>>> the
>>>> same owner".
>>>>
>>>> 4) there are also issues of order. in particular what is the "deletion
>>>> order" depth first or breadth first, etc...
>>>>
>>>> I believe the answer so far has been "we don't do this because it's
>>>> fairly
>>>> complicated and we haven't seen any use cases that require it." In the
>>>> cases
>>>> I've seen so far there was either a misunderstanding of how zk worked,
>>>> or a
>>>> simpler way available.
>>>>
>>>> Does that make sense? Thoughts?
>>>>
>>>> Patrick
>>>>
>>>>
>>>>
>>>
>>>
>>
>
Re: A deficiency? You can only add children to persistent nodes
Posted by Benjamin Reed <br...@yahoo-inc.com>.
let me put out an idea that we have kicked around for a while: ephemeral
containers. the idea is that the znode disappears if it doesn't have
children. you would create the znode with create("/path", data, acl,
EPHEMERAL_CONTAINER). this would result in the creation of two znodes: /path
and /path/child. (we have to create it with a child, otherwise it
immediately disappears.)
i think this mechanism would address your need in a way that is easy to
implement and use. it would also allow you to do a cool barrier
implementation!
ben
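The proposed behaviour can be sketched as a small in-memory model. This is a toy under stated assumptions, not an implementation: EPHEMERAL_CONTAINER is only an idea in this thread, and the class EphemeralContainerSketch with its createContainer, addChild and removeChild methods is hypothetical. It models the two rules above: the container is created together with its first child, and it disappears once its last child is removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy in-memory model of the proposed "ephemeral container" semantics.
// Nothing here is ZooKeeper API; all names are hypothetical.
class EphemeralContainerSketch {
    // container path -> names of the children currently under it
    private final Map<String, TreeSet<String>> containers = new HashMap<>();

    // Models create(path, data, acl, EPHEMERAL_CONTAINER): the container and
    // its first child appear together, so the container is never empty.
    void createContainer(String path, String firstChild) {
        if (containers.containsKey(path)) {
            throw new IllegalStateException("already exists: " + path);
        }
        TreeSet<String> children = new TreeSet<>();
        children.add(firstChild);
        containers.put(path, children);
    }

    void addChild(String path, String child) {
        TreeSet<String> children = containers.get(path);
        if (children == null) {
            throw new IllegalStateException("no such container: " + path);
        }
        children.add(child);
    }

    // Removing the last child makes the container itself disappear.
    void removeChild(String path, String child) {
        TreeSet<String> children = containers.get(path);
        if (children == null || !children.remove(child)) {
            throw new IllegalStateException("no such child: " + path + "/" + child);
        }
        if (children.isEmpty()) {
            containers.remove(path);
        }
    }

    boolean exists(String path) {
        return containers.containsKey(path);
    }

    // Returns the children in sorted order, or an empty list if the
    // container does not exist.
    List<String> getChildren(String path) {
        List<String> result = new ArrayList<String>();
        TreeSet<String> children = containers.get(path);
        if (children != null) {
            result.addAll(children);
        }
        return result;
    }

    public static void main(String[] args) {
        EphemeralContainerSketch tree = new EphemeralContainerSketch();
        tree.createContainer("/path", "child");
        tree.addChild("/path", "other");
        tree.removeChild("/path", "child");
        System.out.println(tree.exists("/path")); // still holds "other"
        tree.removeChild("/path", "other");
        System.out.println(tree.exists("/path")); // last child removed
    }
}
```

In a real server the children would be ephemeral znodes removed on session expiry; here removeChild stands in for that event.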
On 03/22/2010 10:37 AM, Patrick Hunt wrote:
> Dominic Williams wrote:
>
>> What I'd suggest might work:
>> - when the session that created the parent ends, ownership of the parent
>> could either be transferred to the owner/session that created the oldest
>> child, or instead ownership could be transferred to some kind of nominal
>> system session (which would delete the parent once the last ephemeral child
>> disappeared)
>>
> There may be some issues with idempotency here, also it could require
> extensive locking which drives up operation latencies (essentially
> "recursive delete"). It sounds possible, but someone would have to take
> a closer look as to the technical challenges involved.
>
>
> Our general philosophy is to keep things as simple as possible wrt api,
> semantics, implementation, etc... Distributed communication is hard and
> while we handle a lot of the issues for you it's still complex.
> Following our philosophy generally makes the easy things simple and the
> hard things possible, additionally it reduces the number of bugs that we
> have in the implementation itself (both user and service code).
>
> I don't wish to discourage you as much as provide insight/background
> into some of our decisions.
>
> Regards,
>
> Patrick
>
>
>> On 22 March 2010 16:44, Patrick Hunt<ph...@apache.org> wrote:
>>
>>
>>> Dominic Williams wrote:
>>>
>>>
>>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>>> persistent nodes. Over time these will grow and grow, rather like the old
>>>> tmp folders used to fill with files under Windows
>>>>
>>>>
>>> That's true. One either needs to use ephemerals or use persistent and have
>>> a "garbage collector" (implicit or explicit gc). In most cases it's
>>> preferable to use the ephemeral.
>>>
>>>
>>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>>> actually having to write to disk somewhere.
>>>>
>>>>
>>> This is not actually how ZK works. All znodes regardless of
>>> persistent/ephemeral are written to disk persistently. Ephemeral nodes are
>>> tied to the session that created them. As long as the session is alive the
>>> ephemeral node is alive. Sessions themselves are persistently/reliably
>>> stored by the ZK cluster. This allows the shutdown of the entire cluster and
>>> restart it, all sessions/ephemerals will be maintained. Sessions can move
>>> from server to server (if say network connectivity to server A fails, or
>>> server A itself fails then the client will move to server B). The session
>>> and all ephemerals are maintained (well, as long as the client moves within
>>> the expiration timeout value).
>>>
>>>
>>>> To avoid this I'm actually thinking of writing a locking system where you
>>>> work out the existing chain not by enumerating sequential children, but by
>>>> looking at the contents of each temporary lock node to see what it is
>>>> waiting on. But... that's quite horrible. Was wondering whether there is
>>>> some technical reason why ephemeral nodes can't have children??
>>>>
>>>>
>>> There are a few cases to think about.
>>>
>>> 1) obviously ephemeral nodes can't have persistent children, this just
>>> doesn't make sense
>>>
>>> 2) ephemeral nodes have an owner - the session that created them. so it
>>> would also not make sense (in my mind at least) to have an ephemeral /foo
>>> with another ephemeral /foo/bar with a different owner.
>>>
>>> 3) so you are left with "ephemerals can be a child of an ephemeral with the
>>> same owner".
>>>
>>> 4) there are also issues of order. in particular what is the "deletion
>>> order" depth first or breadth first, etc...
>>>
>>> I believe the answer so far has been "we don't do this because it's fairly
>>> complicated and we haven't seen any use cases that require it." In the cases
>>> I've seen so far there was either a misunderstanding of how zk worked, or a
>>> simpler way available.
>>>
>>> Does that make sense? Thoughts?
>>>
>>> Patrick
>>>
>>>
>>
Re: A deficiency? You can only add children to persistent nodes
Posted by Patrick Hunt <ph...@apache.org>.
Dominic Williams wrote:
> What I'd suggest might work:
> - when the session that created the parent ends, ownership of the parent
> could either be transferred to the owner/session that created the oldest
> child, or instead ownership could be transferred to some kind of nominal
> system session (which would delete the parent once the last ephemeral child
> disappeared)
There may be some issues with idempotency here, also it could require
extensive locking which drives up operation latencies (essentially
"recursive delete"). It sounds possible, but someone would have to take
a closer look at the technical challenges involved.
Our general philosophy is to keep things as simple as possible wrt api,
semantics, implementation, etc... Distributed communication is hard and
while we handle a lot of the issues for you it's still complex.
Following our philosophy generally makes the easy things simple and the
hard things possible, additionally it reduces the number of bugs that we
have in the implementation itself (both user and service code).
I don't wish to discourage you so much as to provide insight/background
into some of our decisions.
Regards,
Patrick
>
> On 22 March 2010 16:44, Patrick Hunt <ph...@apache.org> wrote:
>
>> Dominic Williams wrote:
>>
>>> 1/ If a node crashes or something else goes wrong, you leave behind
>>> persistent nodes. Over time these will grow and grow, rather like the old
>>> tmp folders used to fill with files under Windows
>>>
>> That's true. One either needs to use ephemerals or use persistent and have
>> a "garbage collector" (implicit or explicit gc). In most cases it's
>> preferable to use the ephemeral.
>>
>>
>>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>>> actually having to write to disk somewhere.
>>>
>> This is not actually how ZK works. All znodes regardless of
>> persistent/ephemeral are written to disk persistently. Ephemeral nodes are
>> tied to the session that created them. As long as the session is alive the
>> ephemeral node is alive. Sessions themselves are persistently/reliably
>> stored by the ZK cluster. This allows the shutdown of the entire cluster and
>> restart it, all sessions/ephemerals will be maintained. Sessions can move
>> from server to server (if say network connectivity to server A fails, or
>> server A itself fails then the client will move to server B). The session
>> and all ephemerals are maintained (well, as long as the client moves within
>> the expiration timeout value).
>>
>>
>>> To avoid this I'm actually thinking of writing a locking system where you
>>> work out the existing chain not by enumerating sequential children, but by
>>> looking at the contents of each temporary lock node to see what it is
>>> waiting on. But... that's quite horrible. Was wondering whether there is
>>> some technical reason why ephemeral nodes can't have children??
>>>
>> There are a few cases to think about.
>>
>> 1) obviously ephemeral nodes can't have persistent children, this just
>> doesn't make sense
>>
>> 2) ephemeral nodes have an owner - the session that created them. so it
>> would also not make sense (in my mind at least) to have an ephemeral /foo
>> with another ephemeral /foo/bar with a different owner.
>>
>> 3) so you are left with "ephemerals can be a child of an ephemeral with the
>> same owner".
>>
>> 4) there are also issues of order. in particular what is the "deletion
>> order" depth first or breadth first, etc...
>>
>> I believe the answer so far has been "we don't do this because it's fairly
>> complicated and we haven't seen any use cases that require it." In the cases
>> I've seen so far there was either a misunderstanding of how zk worked, or a
>> simpler way available.
>>
>> Does that make sense? Thoughts?
>>
>> Patrick
>>
>
Re: A deficiency? You can only add children to persistent nodes
Posted by Dominic Williams <th...@googlemail.com>.
Hi Patrick,
This is a little more complex than I'd initially given consideration to. The
biggest point being the question about what would happen to an ephemeral
parent node when the session that owns it exits.
What I'd suggest might work:
- when the session that created the parent ends, ownership of the parent
could either be transferred to the owner/session that created the oldest
child, or instead ownership could be transferred to some kind of nominal
system session (which would delete the parent once the last ephemeral child
disappeared)
- when someone tries to create a persistent child of an ephemeral node, they
simply get back an appropriate error code
Best, Dominic
On 22 March 2010 16:44, Patrick Hunt <ph...@apache.org> wrote:
> Dominic Williams wrote:
>
>> 1/ If a node crashes or something else goes wrong, you leave behind
>> persistent nodes. Over time these will grow and grow, rather like the old
>> tmp folders used to fill with files under Windows
>>
>
> That's true. One either needs to use ephemerals or use persistent and have
> a "garbage collector" (implicit or explicit gc). In most cases it's
> preferable to use the ephemeral.
>
>
>> 2/ Persistent nodes = nasty scalability *bottleneck* because you're
>> actually having to write to disk somewhere.
>>
>
> This is not actually how ZK works. All znodes regardless of
> persistent/ephemeral are written to disk persistently. Ephemeral nodes are
> tied to the session that created them. As long as the session is alive the
> ephemeral node is alive. Sessions themselves are persistently/reliably
> stored by the ZK cluster. This allows the shutdown of the entire cluster and
> restart it, all sessions/ephemerals will be maintained. Sessions can move
> from server to server (if say network connectivity to server A fails, or
> server A itself fails then the client will move to server B). The session
> and all ephemerals are maintained (well, as long as the client moves within
> the expiration timeout value).
>
>
>> To avoid this I'm actually thinking of writing a locking system where you
>> work out the existing chain not by enumerating sequential children, but by
>> looking at the contents of each temporary lock node to see what it is
>> waiting on. But... that's quite horrible. Was wondering whether there is
>> some technical reason why ephemeral nodes can't have children??
>>
>
> There are a few cases to think about.
>
> 1) obviously ephemeral nodes can't have persistent children, this just
> doesn't make sense
>
> 2) ephemeral nodes have an owner - the session that created them. so it
> would also not make sense (in my mind at least) to have an ephemeral /foo
> with another ephemeral /foo/bar with a different owner.
>
> 3) so you are left with "ephemerals can be a child of an ephemeral with the
> same owner".
>
> 4) there are also issues of order. in particular what is the "deletion
> order" depth first or breadth first, etc...
>
> I believe the answer so far has been "we don't do this because it's fairly
> complicated and we haven't seen any use cases that require it." In the cases
> I've seen so far there was either a misunderstanding of how zk worked, or a
> simpler way available.
>
> Does that make sense? Thoughts?
>
> Patrick
>
Re: A deficiency? You can only add children to persistent nodes
Posted by Patrick Hunt <ph...@apache.org>.
Dominic Williams wrote:
> 1/ If a node crashes or something else goes wrong, you leave behind
> persistent nodes. Over time these will grow and grow, rather like the old
> tmp folders used to fill with files under Windows
That's true. One either needs to use ephemerals or use persistent and
have a "garbage collector" (implicit or explicit gc). In most cases it's
preferable to use the ephemeral.
> 2/ Persistent nodes = nasty scalability *bottleneck* because you're actually
> having to write to disk somewhere.
This is not actually how ZK works. All znodes regardless of
persistent/ephemeral are written to disk persistently. Ephemeral nodes
are tied to the session that created them. As long as the session is
alive the ephemeral node is alive. Sessions themselves are
persistently/reliably stored by the ZK cluster. This allows you to shut
down the entire cluster and restart it; all sessions/ephemerals will be
maintained. Sessions can move from server to server (if say network
connectivity to server A fails, or server A itself fails then the client
will move to server B). The session and all ephemerals are maintained
(well, as long as the client moves within the expiration timeout value).
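As a rough illustration of the lifecycle described above, the following toy model ties ephemerals to a session id rather than to a connection, and drops them only when the session has been silent for longer than the timeout. It is a sketch under stated assumptions, not ZooKeeper code: SessionModel and all of its methods are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of session-owned ephemerals; this is NOT the ZooKeeper API.
class SessionModel {
    private final long timeoutMs;
    // Session id -> time (ms) at which the session was last heard from.
    private final Map<Long, Long> lastHeard = new HashMap<>();
    // Ephemeral path -> id of the session that owns it.
    private final Map<String, Long> ephemeralOwner = new HashMap<>();

    SessionModel(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    void openSession(long sessionId, long nowMs) {
        lastHeard.put(sessionId, nowMs);
    }

    // A heartbeat may arrive through any server; only the session id matters.
    void heartbeat(long sessionId, long nowMs) {
        if (!lastHeard.containsKey(sessionId)) {
            throw new IllegalStateException("session expired: " + sessionId);
        }
        lastHeard.put(sessionId, nowMs);
    }

    void createEphemeral(String path, long sessionId) {
        if (!lastHeard.containsKey(sessionId)) {
            throw new IllegalStateException("session expired: " + sessionId);
        }
        ephemeralOwner.put(path, sessionId);
    }

    // Expires every session silent for longer than the timeout and deletes
    // the ephemerals those sessions owned.
    void expireSessions(long nowMs) {
        Set<Long> expired = new HashSet<>();
        for (Map.Entry<Long, Long> entry : lastHeard.entrySet()) {
            if (nowMs - entry.getValue() > timeoutMs) {
                expired.add(entry.getKey());
            }
        }
        lastHeard.keySet().removeAll(expired);
        ephemeralOwner.values().removeIf(expired::contains);
    }

    boolean exists(String path) {
        return ephemeralOwner.containsKey(path);
    }
}

class SessionModelDemo {
    public static void main(String[] args) {
        SessionModel zk = new SessionModel(10_000);
        zk.openSession(1L, 0);
        zk.createEphemeral("/lock/holder", 1L);
        zk.heartbeat(1L, 8_000);      // client reconnected elsewhere, within the timeout
        zk.expireSessions(12_000);
        System.out.println(zk.exists("/lock/holder")); // session alive, node kept
        zk.expireSessions(30_000);    // no heartbeat since 8_000
        System.out.println(zk.exists("/lock/holder")); // session expired, node removed
    }
}
```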
> To avoid this I'm actually thinking of writing a locking system where you work
> out the existing chain not by enumerating sequential children, but by
> looking at the contents of each temporary lock node to see what it is
> waiting on. But... that's quite horrible. Was wondering whether there is
> some technical reason why ephemeral nodes can't have children??
There are a few cases to think about.
1) obviously ephemeral nodes can't have persistent children, this just
doesn't make sense
2) ephemeral nodes have an owner - the session that created them. so it
would also not make sense (in my mind at least) to have an ephemeral
/foo with another ephemeral /foo/bar with a different owner.
3) so you are left with "ephemerals can be a child of an ephemeral with
the same owner".
4) there are also issues of order. in particular, what is the "deletion
order": depth first, breadth first, etc.?
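Cases 1 to 3 can be written down as a single check. This is only a sketch of what such a rule might look like if it were ever adopted; ZooKeeper today rejects any create under an ephemeral parent. EphemeralChildRule, Mode and mayCreateChild are hypothetical names, and the deletion-order question in case 4 is deliberately left out.

```java
// Hypothetical rule check for cases 1-3; this is NOT part of ZooKeeper.
class EphemeralChildRule {
    enum Mode { PERSISTENT, EPHEMERAL }

    // Owners are session ids; the owner value is ignored for persistent nodes.
    static boolean mayCreateChild(Mode parentMode, long parentOwner,
                                  Mode childMode, long childOwner) {
        if (parentMode == Mode.PERSISTENT) {
            return true;  // existing behaviour: persistent parents take any child
        }
        if (childMode == Mode.PERSISTENT) {
            return false; // case 1: no persistent child under an ephemeral
        }
        // cases 2 and 3: an ephemeral child needs the same owner as its parent
        return parentOwner == childOwner;
    }
}

class EphemeralChildRuleDemo {
    public static void main(String[] args) {
        long sessionA = 1L;
        long sessionB = 2L;
        System.out.println(EphemeralChildRule.mayCreateChild(
                EphemeralChildRule.Mode.EPHEMERAL, sessionA,
                EphemeralChildRule.Mode.EPHEMERAL, sessionA)); // same owner
        System.out.println(EphemeralChildRule.mayCreateChild(
                EphemeralChildRule.Mode.EPHEMERAL, sessionA,
                EphemeralChildRule.Mode.EPHEMERAL, sessionB)); // different owner
    }
}
```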
I believe the answer so far has been "we don't do this because it's
fairly complicated and we haven't seen any use cases that require it."
In the cases I've seen so far there was either a misunderstanding of how
zk worked, or a simpler way available.
Does that make sense? Thoughts?
Patrick