Posted to dev@zookeeper.apache.org by "Ted Dunning (JIRA)" <ji...@apache.org> on 2011/07/13 18:38:59 UTC

[jira] [Commented] (ZOOKEEPER-1124) Multiop submitted to non-leader always fails due to timeout

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064683#comment-13064683 ] 

Ted Dunning commented on ZOOKEEPER-1124:
----------------------------------------

Marshall,

This fix is clearly important. Do you have any tests?

The role of these tests is not just to verify the fix for this bug, but also to provide a prototype for anyone who later implements new operations.

> Multiop submitted to non-leader always fails due to timeout
> -----------------------------------------------------------
>
>                 Key: ZOOKEEPER-1124
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1124
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.0
>         Environment: all
>            Reporter: Marshall McMullen
>            Priority: Critical
>             Fix For: 3.4.0
>
>         Attachments: multi-non-observer.patch
>
>
> The new Multiop support added under zookeeper-965 fails every single time if the multiop is submitted to a non-leader in quorum mode. In standalone mode it always works properly and this bug only presents itself in quorum mode (with 2 or more nodes). After 12 hours of debugging (*sigh*) it turns out to be a really simple fix. There are a couple of missing case statements inside FollowerRequestProcessor.java and ObserverRequestProcessor.java to ensure that multiop is forwarded to the leader for commit. I've attached a patch that fixes this problem.
> It's probably worth noting that zookeeper-965 has already been committed to trunk. But this is a fatal flaw that will prevent multiop support from working properly and as such needs to get committed to 3.4.0 as well. Is there a way to tie these two cases together in some way?
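
For readers unfamiliar with the failure mode described above, here is a minimal, self-contained sketch of the idea. It is not the attached patch and not the actual ZooKeeper source: the class ForwardingSketch, the enum Op, and the method route are hypothetical names invented for illustration. It only models the behavior the report describes, namely that a non-leader must list every write-type operation in its forwarding switch, and that an operation without a case label is never sent to the leader.

```java
import java.util.ArrayList;
import java.util.List;

public class ForwardingSketch {
    // Hypothetical stand-in for the server's request op codes.
    enum Op { CREATE, DELETE, SET_DATA, SET_ACL, GET_DATA, MULTI }

    // Records which requests this simulated non-leader forwarded to the leader.
    static final List<Op> forwardedToLeader = new ArrayList<>();

    // Returns true if the request is forwarded to the leader for commit,
    // false if the non-leader handles it locally.
    static boolean route(Op op) {
        switch (op) {
            case CREATE:
            case DELETE:
            case SET_DATA:
            case SET_ACL:
            case MULTI: // the kind of case label the report says was missing
                forwardedToLeader.add(op);
                return true;
            default:
                // Reads and anything not listed above stay local. A write-type
                // op that lands here is never proposed by the leader.
                return false;
        }
    }

    public static void main(String[] args) {
        for (Op op : Op.values()) {
            System.out.println(op + " forwarded=" + route(op));
        }
    }
}
```

A regression test in this spirit would submit a multiop through a client connected to a non-leader and check that it commits, which is also the sort of prototype that later implementors of new operations could copy.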

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira