Posted to dev@zookeeper.apache.org by "Thawan Kooburat (JIRA)" <ji...@apache.org> on 2013/02/01 22:22:12 UTC

[jira] [Updated] (ZOOKEEPER-1624) PrepRequestProcessor abort multi-operation incorrectly

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thawan Kooburat updated ZOOKEEPER-1624:
---------------------------------------

    Attachment: ZOOKEEPER-1624.patch

Clean up wait counter so that future tests can run correctly
                
> PrepRequestProcessor abort multi-operation incorrectly
> ------------------------------------------------------
>
>                 Key: ZOOKEEPER-1624
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1624
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>            Reporter: Thawan Kooburat
>            Assignee: Thawan Kooburat
>            Priority: Critical
>              Labels: zk-review
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1624.patch, ZOOKEEPER-1624.patch
>
>
> We found this issue when issuing multiple instances of the following multi-op concurrently:
> multi {
> 1. create sequential node /a- 
> 2. create node /b
> }
> The expected result is that only the first multi-op request succeeds and the rest fail because /b already exists.
> However, the reported result is that a subsequent multi-op fails because its sequential node creation failed, which should not be possible.
> Below are the return codes for each sub-op when issuing 3 instances of the above multi-op asynchronously:
> 1. ZOK, ZOK
> 2. ZOK, ZNODEEXISTS
> 3. ZNODEEXISTS, ZRUNTIMEINCONSISTENCY
> After adding more debug logging, I found that the cause is that PrepRequestProcessor rolls back the outstandingChanges of the second multi-op incorrectly, which makes sequential node name generation incorrect. Below are the sequential node names generated by PrepRequestProcessor:
> 1. create /a-0001
> 2. create /a-0003
> 3. create /a-0001
> The bug is in the getPendingChanges() method. It fails to copy the ChangeRecord for the parent node ("/"), so rollbackPendingChanges() cannot restore the correct previous change record of the parent node when aborting the second multi-op.
> The impact of this bug is that sequential node creation on the same parent node may fail until the previous one is committed. I am not sure whether there are other implications.
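
The rollback problem described above can be reproduced in a small toy model. This is only a sketch under stated assumptions, not ZooKeeper code: the class MultiRollbackSketch, its Change record, the committed map, and the copyParent flag are all hypothetical stand-ins for PrepRequestProcessor, ChangeRecord, the committed data tree, and the presence or absence of the parent copy in getPendingChanges(). The starting child version of 1 and the 4-digit suffix are chosen only so that the generated names line up with the ones listed in the report.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: names loosely mirror PrepRequestProcessor; this is not ZooKeeper code.
class MultiRollbackSketch {
    // Hypothetical stand-in for ChangeRecord: one pending state of one node.
    static final class Change {
        final long zxid; final String path; final int cversion;
        Change(long zxid, String path, int cversion) {
            this.zxid = zxid; this.path = path; this.cversion = cversion;
        }
    }

    final List<Change> outstanding = new ArrayList<>();            // pending changes, in order
    final Map<String, Change> outstandingForPath = new HashMap<>(); // latest pending change per path
    final Map<String, Integer> committed = new HashMap<>();         // committed child versions
    final boolean copyParent;                                        // true models the proposed fix

    MultiRollbackSketch(boolean copyParent) {
        this.copyParent = copyParent;
        committed.put("/", 1); // assumed starting child version of the parent
    }

    // Latest view of a node: pending change first, committed state otherwise.
    Change getRecord(String path) {
        Change pending = outstandingForPath.get(path);
        if (pending != null) return pending;
        Integer cversion = committed.get(path);
        return cversion == null ? null : new Change(-1, path, cversion);
    }

    void addChange(Change c) {
        outstanding.add(c);
        outstandingForPath.put(c.path, c);
    }

    // Returns false if the node already exists (pending or committed).
    boolean create(long zxid, String name) {
        if (getRecord(name) != null) return false;
        Change parent = getRecord("/");
        addChange(new Change(zxid, "/", parent.cversion + 1));
        addChange(new Change(zxid, name, 0));
        return true;
    }

    // Drop this multi-op's changes, then restore whatever the snapshot captured.
    void rollback(long zxid, Map<String, Change> snapshot) {
        for (int i = outstanding.size() - 1; i >= 0 && outstanding.get(i).zxid == zxid; i--) {
            outstandingForPath.remove(outstanding.remove(i).path);
        }
        outstandingForPath.putAll(snapshot);
    }

    // One multi-op: create sequential /a-, then create /b. Returns the generated name.
    String multi(long zxid) {
        Map<String, Change> snapshot = new HashMap<>();
        for (String p : new String[] {"/a-", "/b", "/"}) {
            if (p.equals("/") && !copyParent) continue; // the bug: parent record not copied
            Change c = outstandingForPath.get(p);
            if (c != null) snapshot.put(p, c);
        }
        String seqName = "/a-" + String.format("%04d", getRecord("/").cversion);
        if (!create(zxid, seqName) || !create(zxid, "/b")) rollback(zxid, snapshot);
        return seqName;
    }

    // Issues three multi-ops with nothing committed in between.
    static List<String> run(boolean copyParent) {
        MultiRollbackSketch s = new MultiRollbackSketch(copyParent);
        List<String> names = new ArrayList<>();
        for (long zxid = 1; zxid <= 3; zxid++) names.add(s.multi(zxid));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("parent not copied: " + run(false)); // [/a-0001, /a-0003, /a-0001]
        System.out.println("parent copied:     " + run(true));  // [/a-0001, /a-0003, /a-0003]
    }
}
```

In this model, when the parent record is not part of the snapshot, rolling back the second multi-op removes the pending entry for "/" entirely, so the third multi-op falls back to the committed child version and regenerates a name that is still pending. With the parent record copied, the third multi-op fails only on /b, which is the expected behavior.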

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira