Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2011/03/15 19:42:31 UTC

[jira] Updated: (HBASE-3596) [replication] Wait a few seconds before transferring queues

     [ https://issues.apache.org/jira/browse/HBASE-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-3596:
--------------------------------------

    Attachment: HBASE-3596.patch

Simple patch that adds a configurable sleep before trying to lock a dead region server in order to take over its replication queues.
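
For readers without the attachment, here is a rough sketch of the idea, not the patch itself. It assumes an in-memory map standing in for the ZooKeeper lock znodes, and the names DelayedQueueTransfer, sleepBeforeFailoverMs and stop() are made up for illustration. The point is only the ordering: sleep first, re-check whether we were told to stop, and only then try to take the lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DelayedQueueTransfer {
    // Hypothetical stand-in for the lock znodes: dead server -> server holding its lock.
    private final Map<String, String> locks = new ConcurrentHashMap<>();
    // Configurable delay; in a real deployment this would come from the configuration.
    private final long sleepBeforeFailoverMs;
    private volatile boolean stopping = false;

    public DelayedQueueTransfer(long sleepBeforeFailoverMs) {
        this.sleepBeforeFailoverMs = sleepBeforeFailoverMs;
    }

    /** Called when this server is itself told to shut down. */
    public void stop() {
        stopping = true;
    }

    /**
     * Waits before trying to claim the queues of a dead server.
     * Returns true only if this server obtained the lock.
     */
    public boolean transferQueues(String self, String deadServer) {
        try {
            // Give a cluster-wide shutdown time to reach this server too.
            Thread.sleep(sleepBeforeFailoverMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        // If we were told to stop during the sleep, do not start a transfer
        // that we might not be able to finish.
        if (stopping) {
            return false;
        }
        // Only one server may take over a given dead server's queues.
        return locks.putIfAbsent(deadServer, self) == null;
    }
}
```

With a delay of a few seconds, a server that receives its own shutdown request while waiting gives up without ever touching the lock, so no half-finished transfer is left behind.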

> [replication] Wait a few seconds before transferring queues 
> ------------------------------------------------------------
>
>                 Key: HBASE-3596
>                 URL: https://issues.apache.org/jira/browse/HBASE-3596
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3596.patch
>
>
> ReplicationSourceManager.transferQueues currently runs a little too fast, which has the bad side effect of making us run into HBASE-2611 at almost every cluster restart. The reason is that some servers shut down faster than others, so the last region servers to be notified see their peers dying at the same time and try to pick up their queues. They are then also told to shut down and may close their ZK session before the queue transfer completes, which is what HBASE-2611 is about.
> Currently the only way to fix that is to delete the lock znode by hand and bounce a region server so that it picks up the queue on startup.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira