Posted to dev@zookeeper.apache.org by Ted Yu <yu...@gmail.com> on 2012/03/20 14:57:13 UTC

ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

I looked at the patch for ZOOKEEPER-1059, which should have converted the
NPE to KeeperException.NoNodeException.

Why would the 'zkcli stat' command return 0 when the HBase master znode has
expired?

Advice is appreciated.

FYI Jon filed a JIRA for the issue below which is a blocker for HBase trunk.

On Tue, Mar 20, 2012 at 12:36 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:

> I'm trying to test HBASE-5589 -- to see if I can add an API call to
> HMasterInterface and do a rolling-restart / upgrade on a live cluster,
> which led me down another rabbit hole.
>
> I'm wondering how the rolling-restart.sh script worked in the past (I can
> spend more time setting up an older version to test this, but figured I'd
> ask).
>
> I'm getting stuck when bin/rolling-restart.sh tries to wait until the
> Master ZNode expires.  In this particular case, the script seems to hang
> there forever (even after the /hbase/master ephemeral node expires).
>
> Here's the code in the script:
> ----
> # make sure the master znode has been deleted before continuing
>    zparent=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.parent`
>    if [ "$zparent" == "null" ]; then zparent="/hbase"; fi
>    zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.master`
>    if [ "$zmaster" == "null" ]; then zmaster="master"; fi
>    zmaster=$zparent/$zmaster
>    echo -n "Waiting for Master ZNode ${zmaster} to expire"
>    while bin/hbase zkcli stat $zmaster >/dev/null 2>&1; do
>      echo -n "."
>      sleep 1
>    done
>    echo #force a newline
> ----
>
> The problem is that 'bin/hbase zkcli stat /hbase/master ...' seems to
> always return with $? == 0 regardless of whether the znode is present!
> I've checked with Patrick Hunt (ZK committer) and this is the expected
> behavior.  The only non-zero retcodes are for abnormal exits (exceptions
> thrown).
>
> Here's the ZK code I was looking through
>
> https://github.com/apache/zookeeper/blob/release-3.4.3/src/java/main/org/apache/zookeeper/ZooKeeperMain.java#L736
>
>
> https://github.com/apache/zookeeper/blob/release-3.4.3/src/java/main/org/apache/zookeeper/ZooKeeper.java#L980
>
>
> Thoughts?
>
> Jon.
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>

Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

Posted by Patrick Hunt <ph...@apache.org>.

That's great news, glad it worked out. Thanks for the update Ted.

Patrick

On Tue, Mar 20, 2012 at 3:33 PM, Ted Yu <yu...@gmail.com> wrote:
> We're using the trick Patrick proposed, see:
> https://issues.apache.org/jira/browse/HBASE-5603
>
> FYI

Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

Posted by Ted Yu <yu...@gmail.com>.

We're using the trick Patrick proposed, see:
https://issues.apache.org/jira/browse/HBASE-5603

FYI

On Tue, Mar 20, 2012 at 10:14 AM, Patrick Hunt <ph...@apache.org> wrote:

> Great. Thanks Ted.

Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

Posted by Patrick Hunt <ph...@apache.org>.

Great. Thanks Ted.

On Tue, Mar 20, 2012 at 10:09 AM, Ted Yu <yu...@gmail.com> wrote:
> Patrick:
> I logged https://issues.apache.org/jira/browse/ZOOKEEPER-1428
>
> If you feel there is anything missing in the JIRA, feel free to add it.
>
> Thanks for your help on this issue.
>
> Cheers

Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

Posted by Ted Yu <yu...@gmail.com>.

Patrick:
I logged https://issues.apache.org/jira/browse/ZOOKEEPER-1428

If you feel there is anything missing in the JIRA, feel free to add it.

Thanks for your help on this issue.

Cheers

On Tue, Mar 20, 2012 at 9:42 AM, Patrick Hunt <ph...@apache.org> wrote:

> On Tue, Mar 20, 2012 at 9:32 AM, Ted Yu <yu...@gmail.com> wrote:
> > Near term, if we can find a way for a shell script to detect the absence
> > of a particular ZooKeeper node, rolling-restart.sh can be restored.
> > Otherwise we may need to remove it.
>
> I just tested this out with 3.4, and I see the following for statting
> a nonexistent znode:
>
> [zk: (CONNECTED) 1] stat /foobar
> Node does not exist: /foobar
>
> vs statting one that does exist:
>
> [zk: (CONNECTED) 2] stat /
> cZxid = 0x0
> ctime = Wed Dec 31 16:00:00 PST 1969
> mZxid = 0x0
> mtime = Wed Dec 31 16:00:00 PST 1969
> pZxid = 0x0
> cversion = -1
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 1
>
> You can look for "^Node does not exist" in the stat output instead of
> checking the exit code. This would get around the problem until a more
> permanent solution could be found.
>
> I hear you re: being time-bound (I'd love to work on this myself). In that
> case, would you mind creating a JIRA based on my suggestion of a new
> command line tool, giving your HBase case as an example along with any
> requirements you can think of? Perhaps Hartmut or one of the other
> contributors might be interested in working on this.
> https://issues.apache.org/jira/browse/ZOOKEEPER
>
> Patrick

Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

Posted by Patrick Hunt <ph...@apache.org>.

On Tue, Mar 20, 2012 at 9:32 AM, Ted Yu <yu...@gmail.com> wrote:
> Near term, if we can find a way for a shell script to detect the absence
> of a particular ZooKeeper node, rolling-restart.sh can be restored.
> Otherwise we may need to remove it.

I just tested this out with 3.4, and I see the following for statting
a nonexistent znode:

[zk: (CONNECTED) 1] stat /foobar
Node does not exist: /foobar

vs statting one that does exist:

[zk: (CONNECTED) 2] stat /
cZxid = 0x0
ctime = Wed Dec 31 16:00:00 PST 1969
mZxid = 0x0
mtime = Wed Dec 31 16:00:00 PST 1969
pZxid = 0x0
cversion = -1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1

You can look for "^Node does not exist" in the stat output instead of
checking the exit code. This would get around the problem until a more
permanent solution could be found.
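
As a rough sketch of that workaround (not necessarily what the HBase script
ended up doing), the wait loop could test the stat output rather than the exit
status. STAT_CMD, znode_missing and wait_for_znode_expiry are names made up for
this illustration, and the retry limit is an added assumption so the loop
cannot hang forever:

```shell
#!/usr/bin/env bash
# Illustrative only: STAT_CMD and both function names are hypothetical,
# not part of HBase or ZooKeeper.
STAT_CMD=${STAT_CMD:-"bin/hbase zkcli stat"}

# Returns 0 when the stat output reports that the znode is missing.
znode_missing() {
  local znode=$1
  # STAT_CMD is left unquoted on purpose so it can carry its own arguments.
  $STAT_CMD "$znode" 2>&1 | grep -q "^Node does not exist"
}

# Polls once per second until the znode is gone; gives up after max_tries.
wait_for_znode_expiry() {
  local znode=$1 max_tries=${2:-60} tries=0
  until znode_missing "$znode"; do
    tries=$((tries + 1))
    if [ "$tries" -ge "$max_tries" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}
```

A caller might then write something like
wait_for_znode_expiry /hbase/master 120 || echo "master znode still present".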

I hear you re: being time-bound (I'd love to work on this myself). In that
case, would you mind creating a JIRA based on my suggestion of a new
command line tool, giving your HBase case as an example along with any
requirements you can think of? Perhaps Hartmut or one of the other
contributors might be interested in working on this.
https://issues.apache.org/jira/browse/ZOOKEEPER
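
To make the request concrete, one possible contract for such a tool is
sketched below as a shell function that layers distinct exit codes over the
existing stat output. zk_exists, ZK_STAT_CMD and the 0/1/2 codes are
hypothetical, chosen only for illustration; a real tool would presumably talk
to ZooKeeper directly instead of parsing text:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: zk_exists is not an existing ZooKeeper command.
ZK_STAT_CMD=${ZK_STAT_CMD:-"bin/hbase zkcli stat"}

# Assumed exit status contract:
#   0 = znode exists, 1 = znode is absent, 2 = could not determine
zk_exists() {
  local znode=$1 output
  # A non-zero exit from the underlying command is treated as "unknown".
  output=$($ZK_STAT_CMD "$znode" 2>&1) || return 2
  case "$output" in
    *"Node does not exist"*) return 1 ;;
    *"cZxid = "*)            return 0 ;;
    *)                       return 2 ;;
  esac
}
```

With a contract like this, a script can tell "the node is gone" apart from
"the check itself failed" instead of treating every outcome as success.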

Patrick

>
> On Tue, Mar 20, 2012 at 9:16 AM, Patrick Hunt <ph...@apache.org> wrote:
>
>> On Tue, Mar 20, 2012 at 6:57 AM, Ted Yu <yu...@gmail.com> wrote:
>> > I looked at the patch for ZOOKEEPER-1059, which should have converted the
>> > NPE to KeeperException.NoNodeException.
>> >
>> > Why would the 'zkcli stat' command return 0 when the HBase master znode
>> > has expired?
>> >
>> > Advice is appreciated.
>>
>> Hi Ted, sorry to see you're having troubles. I think I see the
>> disconnect. ZooKeeperMain is first and foremost a user shell. As such
>> it should not exit unless the quit command is run (or it is killed
>> explicitly, etc.). In this case ZOOKEEPER-1059 is fixing a bug in
>> the shell. It is indeed converting the NPE into a NoNodeException,
>> which the shell then converts into an error message to the user, and
>> continues. Prior to this patch the shell was failing on the NPE, which
>> then generated the non-0 exit from the process.
>>
>> Note that trunk has some further improvements along these lines that
>> you might also run into at some point in the future (3.5+):
>>
>> https://issues.apache.org/jira/browse/ZOOKEEPER-271
>> https://issues.apache.org/jira/browse/ZOOKEEPER-1391
>> https://issues.apache.org/jira/browse/ZOOKEEPER-1307
>>
>> I think what we need is to have a tool that's intended for use both
>> programmatically and by humans, with more strict requirements about
>> input, output formatting and command handling, etc... Please see the
>> work Hartmut has been doing as part of 271 on trunk (3.5.0). Perhaps
>> we can augment these new classes to also support such a tool. However
>> it should instead be a true command line tool, rather than an shell.
>> Would you be available to work on this?
>>
>> Patrick
>>
>> ps. bigtop is now helping to verify cross project compatibility, it
>> would be great if you could introduce some hbase tests  that would
>> flag these breakages in future. When bigtop does it's integration (ie
>> runs the hbase tests using the corresponding version of zk) it would
>> find these problems. We'd catch it much earlier. Thanks!
>>
>>
>> > FYI Jon filed a JIRA for the issue below which is a blocker for HBase
>> trunk.
>> >
>> > On Tue, Mar 20, 2012 at 12:36 AM, Jonathan Hsieh <jo...@cloudera.com>
>> wrote:
>> >
>> >> I'm trying to test HBASE-5589 -- to see if I can add an API call to
>> >> HMasterInterface and do a rolling-restart / upgrade on a live cluster
>> which
>> >> lead me down another rabbit hole.
>> >>
>> >> I'm wondering how rolling-restart.sh script worked in the past (I can
>> spend
>> >> more time setting up an older version to test this, but figured I'd
>> ask).
>> >>
>> >> I'm getting stuck when the bin/rolling-restart.sh tries to wait until
>> the
>> >> Master ZNode expires.  In this particular case, the script seems to hang
>> >> there forever (even after the /hbase/master ephemeral node expires).
>> >>
>> >> Here's the code in the script:
>> >> ----
>> >> # make sure the master znode has been deleted before continuing
>> >>    zparent=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool
>> >> zookeeper.znode.parent`
>> >>    if [ "$zparent" == "null" ]; then zparent="/hbase"; fi
>> >>    zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool
>> >> zookeeper.znode.master`
>> >>    if [ "$zmaster" == "null" ]; then zmaster="master"; fi
>> >>    zmaster=$zparent/$zmaster
>> >>    echo -n "Waiting for Master ZNode ${zmaster} to expire"
>> >>    while bin/hbase zkcli stat $zmaster >/dev/null 2>&1; do
>> >>      echo -n "."
>> >>      sleep 1
>> >>    done
>> >>    echo #force a newline
>> >> ----
>> >>
>> >> The problem is that 'bin/hbase zkcli stat /hbase/master ...' seems to
>> >> always returns with $? == 0 regardless if the znode is present or not
>> >> present!  I've checked with Patrick Hunt (ZK committer) and this the
>> >> expected behavior.  The only non-zero retcodes are for abnormal exits
>> >> (exceptions thrown)
>> >>
>> >> Here's the ZK code I was looking through
>> >>
>> >>
>> https://github.com/apache/zookeeper/blob/release-3.4.3/src/java/main/org/apache/zookeeper/ZooKeeperMain.java#L736
>> >>
>> >>
>> >>
>> https://github.com/apache/zookeeper/blob/release-3.4.3/src/java/main/org/apache/zookeeper/ZooKeeper.java#L980
>> >>
>> >>
>> >> Thoughts?
>> >>
>> >> Jon.
>> >>
>> >> --
>> >> // Jonathan Hsieh (shay)
>> >> // Software Engineer, Cloudera
>> >> // jon@cloudera.com
>> >>
>>

Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

Posted by Ted Yu <yu...@gmail.com>.

Patrick:
Appreciate your detailed response.

I haven't finished my work on ZOOKEEPER-1407 :-(
So I don't think I have the bandwidth to start working on another
ZooKeeper issue.

Near term, if we can find a way for a shell script to detect the absence
of a particular ZooKeeper node, rolling-restart.sh can be restored.
Otherwise we may need to remove it.

FYI, as an HBase committer, I often need to finish incomplete features
such as HBASE-3996.
This takes a significant amount of time.

Cheers


Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?

Posted by Patrick Hunt <ph...@apache.org>.

On Tue, Mar 20, 2012 at 6:57 AM, Ted Yu <yu...@gmail.com> wrote:
> I looked at the patch for ZOOKEEPER-1059 which should have converted the
> NPE to KeeperException.NoNodeException
>
> Why would 'zkcli stat' command return 0 in case hbase master znode expires ?
>
> Advice is appreciated.

Hi Ted, sorry to see you're having trouble. I think I see the
disconnect. ZooKeeperMain is first and foremost a user shell. As such
it should not exit unless the quit command is run (or the process is
killed explicitly, etc.). In this case ZOOKEEPER-1059 is fixing a bug in
the shell. It is indeed converting the NPE into a NoNodeException,
which the shell then converts into an error message to the user before
continuing. Prior to this patch the shell was failing on the NPE, which
then generated the non-zero exit from the process.
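
To illustrate why an exit-code check stops working once the shell handles
the error itself, here is a small simulation. It does not call ZooKeeper:
shell_style_stat is a hypothetical stand-in that prints the error text and
still exits 0, and count_polls is an illustrative helper with a cap so the
demonstration terminates.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a shell-style tool: it reports a missing node
# as text and still exits 0, as described above.
shell_style_stat() {
  echo "Node does not exist: $1" >&2
  return 0
}

# Counts how many times an exit-code-based wait loop iterates, capped at a
# limit because the loop condition never becomes false on its own.
count_polls() {
  local znode=$1 limit=$2 polls=0
  while shell_style_stat "$znode" >/dev/null 2>&1; do
    polls=$((polls + 1))
    [ "$polls" -ge "$limit" ] && break
  done
  echo "$polls"
}
```

Calling count_polls /hbase/master 3 always runs into the cap, which is the
same reason a loop of the form "while ... stat ...; do" never exits when
the znode is gone.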

Note that trunk has some further improvements along these lines that
you might also run into at some point in the future (3.5+):

https://issues.apache.org/jira/browse/ZOOKEEPER-271
https://issues.apache.org/jira/browse/ZOOKEEPER-1391
https://issues.apache.org/jira/browse/ZOOKEEPER-1307

I think what we need is a tool that's intended for use both
programmatically and by humans, with stricter requirements about
input, output formatting, command handling, etc. Please see the
work Hartmut has been doing as part of 271 on trunk (3.5.0). Perhaps
we can augment these new classes to also support such a tool. However,
it should be a true command-line tool rather than a shell.
Would you be available to work on this?

Patrick

P.S. Bigtop is now helping to verify cross-project compatibility. It
would be great if you could introduce some HBase tests that would
flag these breakages in the future. When Bigtop does its integration
(i.e., runs the HBase tests using the corresponding version of ZK) it
would find these problems. We'd catch them much earlier. Thanks!
