Posted to dev@bookkeeper.apache.org by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org> on 2011/12/16 12:08:30 UTC

[jira] [Commented] (BOOKKEEPER-140) Hub server doesn't subscribe remote region correctly when a region is down.

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170899#comment-13170899 ] 

Ivan Kelly commented on BOOKKEEPER-140:
---------------------------------------

I don't understand the second scenario in the description. Could you double-check that you've labelled the regions and subscribers correctly?

Otherwise, I need to clarify some things about how cross-region messaging should work before I can properly review the patch, as I don't quite understand how message ordering should work between regions. I've sent a mail to bookkeeper-dev about this.
                
> Hub server doesn't subscribe remote region correctly when a region is down.
> ---------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-140
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-140
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: hedwig-server
>    Affects Versions: 4.0.0
>            Reporter: Sijie Guo
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BOOKKEEPER-140.patch, BOOKKEEPER-140.patch
>
>
> The hub server doesn't subscribe to a remote region correctly in the following cases (assume there are 3 regions: A, B and C):
> 1. A region shuts down before the first subscribe.
> 1) Region C is down.
> 2) Subscriber a subscribes to a topic in region A. A subscription state is created in region A's ZooKeeper, but the remote subscribe to region C fails because region C is down. The hub server responds to the client that the subscribe failed, without deleting the subscription state. Subsequent subscriptions using the same subscriber id and the same topic then fail with NodeExists.
> 2. A region shuts down while existing subscriptions are being attached.
> 1) In region A, there is a local subscriber a for topic T. In region B, there is a subscriber b for topic T. In region B, there is a subscriber c for topic T.
> 2) The servers in all three regions are restarted, but region C is network-partitioned (or shut down) from regions A and B.
> 3) Subscriber b and subscriber c try to subscribe to T again. The hub servers in regions B and C will try to remote-subscribe to region A, but this should fail. There is no mechanism to retry the remote subscribe, so if messages are published to topic T in region A, subscriber b and subscriber c would not receive any message.
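
The first scenario in the description can be illustrated with a small, self-contained sketch. It is not taken from the Hedwig code base or from the attached patch: the class `SubscribeRollbackSketch`, its `Result` enum, the in-memory set standing in for region A's ZooKeeper state, and the `rollbackOnFailure` flag are all hypothetical names introduced only for illustration. The sketch assumes that removing the local subscription state after a failed remote subscribe is one possible remedy; the actual fix may differ.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from Hedwig itself.
class SubscribeRollbackSketch {
    enum Result { OK, NODE_EXISTS, REMOTE_FAILED }

    // Stands in for the subscription state kept in the local region's ZooKeeper.
    private final Set<String> localState = new HashSet<>();
    // Stands in for whether the remote region can currently be reached.
    private boolean remoteRegionUp;

    SubscribeRollbackSketch(boolean remoteRegionUp) {
        this.remoteRegionUp = remoteRegionUp;
    }

    void setRemoteRegionUp(boolean up) {
        this.remoteRegionUp = up;
    }

    // With rollbackOnFailure == false this mimics the reported behaviour:
    // the local state survives a failed remote subscribe.
    Result subscribe(String topic, String subscriberId, boolean rollbackOnFailure) {
        String key = topic + "/" + subscriberId;
        if (!localState.add(key)) {
            // Analogous to ZooKeeper reporting NodeExists on create.
            return Result.NODE_EXISTS;
        }
        if (!remoteRegionUp) {
            if (rollbackOnFailure) {
                // Assumed remedy: undo the local state so a retry can succeed.
                localState.remove(key);
            }
            return Result.REMOTE_FAILED;
        }
        return Result.OK;
    }
}
```

Without the rollback, a subscribe attempted while the remote region is down leaves state behind, so a later attempt with the same topic and subscriber id reports `NODE_EXISTS` even after the remote region is back. With the rollback, the later attempt can succeed.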
