Posted to dev@zookeeper.apache.org by "Kathryn Hogg (JIRA)" <ji...@apache.org> on 2018/02/13 22:47:00 UTC

[jira] [Comment Edited] (ZOOKEEPER-645) Bug in WriteLock recipe implementation?

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363174#comment-16363174 ] 

Kathryn Hogg edited comment on ZOOKEEPER-645 at 2/13/18 10:46 PM:
------------------------------------------------------------------

I'm seeing WriteLocks that are never granted on 3.4.11, and I initially raised this on the user mailing list. I'm working with ZookeeperNetEx in C#, but I have verified that the Java code behaves the same way.

I've encountered two issues:
 # When setting the watch on the predecessor, it's possible that the predecessor has been deleted between the time we fetched the children and the time we set the watch. If this happens, there is no watch and we exit the loop. We should set id to null in this case so that the loop runs again instead of terminating.
 # We still need the change that prepends the dir name to the name returned from the getChildren call in findPrefixInChildren.
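
As an illustration of issue 2, here is a minimal, self-contained sketch of the lookup with the dir name prepended. It is not the attached patch: the class name, the helper `findExistingId`, and the example node names are hypothetical, and the list of children is passed in directly instead of being fetched from ZooKeeper.

```java
import java.util.Arrays;
import java.util.List;

public class PrefixLookupSketch {
    /**
     * Returns the full path of the first child whose name starts with prefix,
     * or null if no such child exists. getChildren returns bare node names,
     * so dir is prepended before the value is used as an id.
     */
    public static String findExistingId(String dir, String prefix, List<String> childNames) {
        for (String name : childNames) {
            if (name.startsWith(prefix)) {
                return dir + "/" + name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up child names, as a getChildren call might return them.
        List<String> children = Arrays.asList("x-111-0000000001", "x-222-0000000002");
        System.out.println(findExistingId("/locks/a", "x-222-", children)); // /locks/a/x-222-0000000002
        System.out.println(findExistingId("/locks/a", "x-333-", children)); // null
    }
}
```

Without the prefix, a later call that expects a full path would be handed only the bare node name.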

#1 change
{code:java}
Stat stat = zookeeper.exists(lastChildId, new LockWatcher());
if (stat != null) {
    return Boolean.FALSE;
} else {
    LOG.warn("Could not find the" +
            " stats for less than me: " + lastChildName.getName());
}
{code}
to
{code:java}
Stat stat = zookeeper.exists(lastChildId, new LockWatcher());
if (stat != null) {
    return Boolean.FALSE;
} else {
    LOG.warn("Could not find the" +
            " stats for less than me: " + lastChildName.getName());
    // The predecessor is gone and no watch was set: reset id so the loop retries.
    id = null;
}
{code}
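
The effect of the #1 change can be shown in isolation with a small simulation. This is only a sketch under stated assumptions: the `LockDir` interface, the `Outcome` enum, and the fake directory are hypothetical stand-ins for the ZooKeeper calls, and the `retryIfPredecessorGone` flag plays the role of the `id = null` reset.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PredecessorRaceSketch {
    /** Hypothetical stand-in for the two ZooKeeper calls involved in the race. */
    public interface LockDir {
        List<String> getChildren();
        boolean existsAndWatch(String name); // true means the node exists and a watch was set
    }

    public enum Outcome { ACQUIRED, WAITING_ON_WATCH, STUCK_WITHOUT_WATCH }

    public static Outcome attempt(LockDir dir, String myName, boolean retryIfPredecessorGone) {
        while (true) {
            SortedSet<String> lessThanMe = new TreeSet<String>(dir.getChildren()).headSet(myName);
            if (lessThanMe.isEmpty()) {
                return Outcome.ACQUIRED; // nobody is ahead of us
            }
            if (dir.existsAndWatch(lessThanMe.last())) {
                return Outcome.WAITING_ON_WATCH; // the watch will wake us up later
            }
            if (!retryIfPredecessorGone) {
                return Outcome.STUCK_WITHOUT_WATCH; // old behaviour: no watch, no retry
            }
            // New behaviour: the predecessor vanished, so look at the children again.
        }
    }

    /** Fake directory whose first node is deleted right after the first getChildren call. */
    public static LockDir racyDir() {
        final Deque<List<String>> snapshots = new ArrayDeque<List<String>>(Arrays.asList(
                Arrays.asList("n-0000000001", "n-0000000002"),
                Arrays.asList("n-0000000002")));
        return new LockDir() {
            @Override
            public List<String> getChildren() {
                return snapshots.size() > 1 ? snapshots.poll() : snapshots.peek();
            }

            @Override
            public boolean existsAndWatch(String name) {
                return snapshots.peek().contains(name);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(attempt(racyDir(), "n-0000000002", false)); // STUCK_WITHOUT_WATCH
        System.out.println(attempt(racyDir(), "n-0000000002", true));  // ACQUIRED
    }
}
```

In the simulation, the attempt without the retry ends with neither the lock nor a watch, which matches the hang described above.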
 

I've been running with these changes on 3.4.11 with two processes contending for three different locks, and so far I have seen none of the hangs that occurred consistently without them.


was (Author: khogg):
I'm seeing WriteLocks that are never granted on 3.4.11, and I initially raised this on the user mailing list. I'm working with ZookeeperNetEx in C#, but I have verified that the Java code behaves the same way.

I've encountered two issues:
 # When setting the watch on the predecessor, it's possible that the predecessor has been deleted between the time we fetched the children and the time we set the watch. If this happens, there is no watch and we exit the loop. We should set id to null in this case so that the loop runs again instead of terminating.
 # We still need the change that prepends the dir name to the name returned from the getChildren call in findPrefixInChildren.

#1 change
{code:java}
Stat stat = zookeeper.exists(lastChildId, new LockWatcher());
if (stat != null) {
    return Boolean.FALSE;
} else {
    LOG.warn("Could not find the" +
            " stats for less than me: " + lastChildName.getName());
}
{code}
to
{code:java}
Stat stat = zookeeper.exists(lastChildId, new LockWatcher());
if (stat != null) {
    return Boolean.FALSE;
} else {
    LOG.warn("Could not find the" +
            " stats for less than me: " + lastChildName.getName());
    id = null;
}
{code}

I've been running with these changes on 3.4.11 with two processes contending for three different locks, and so far I have seen none of the hangs that occurred consistently without them.

> Bug in WriteLock recipe implementation?
> ---------------------------------------
>
>                 Key: ZOOKEEPER-645
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-645
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: recipes
>    Affects Versions: 3.2.2
>         Environment: 3.2.2 java 1.6.0_12
>            Reporter: Jaakko Laine
>            Assignee: Mahadev konar
>            Priority: Minor
>             Fix For: 3.6.0
>
>         Attachments: 645-fix-findPrefixInChildren.patch, ZOOKEEPER-645-compareTo.patch, ZOOKEEPER-645.3.patch.txt
>
>
> Not sure, but there seem to be two issues in the example WriteLock:
> (1) ZNodeName is sorted according to session ID first, and then according to znode sequence number. This might cause starvation as lower session IDs always get priority. WriteLock is not thread-safe in the first place, so having session ID involved in compare operation does not seem to make sense.
> (2) if findPrefixInChildren finds previous ID, it should add dir in front of the ID
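
Point (1) of the original report can be illustrated with a short sketch that orders lock nodes by sequence number alone. It is not the attached compareTo patch: the class and method names are hypothetical, and the node-name layout `x-<sessionId>-<sequence>` is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SequenceOrderSketch {
    /** Parses the digits after the last '-' as the sequence number. */
    public static long sequenceOf(String name) {
        return Long.parseLong(name.substring(name.lastIndexOf('-') + 1));
    }

    /** Returns a copy of names ordered by sequence number only, ignoring the session ID. */
    public static List<String> sortBySequence(List<String> names) {
        List<String> sorted = new ArrayList<String>(names);
        sorted.sort(Comparator.comparingLong(SequenceOrderSketch::sequenceOf));
        return sorted;
    }

    public static void main(String[] args) {
        // Session 900 created its node first (sequence 1), session 100 second (sequence 2).
        List<String> names = Arrays.asList("x-100-0000000002", "x-900-0000000001");
        System.out.println(sortBySequence(names)); // [x-900-0000000001, x-100-0000000002]
    }
}
```

An ordering that compares the session ID first would place `x-100-...` ahead of `x-900-...` here, even though the latter was created earlier; that is the starvation risk the report describes.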


