
Posted to issues@hbase.apache.org by "Phil Yang (JIRA)" <ji...@apache.org> on 2016/06/29 07:03:45 UTC

[jira] [Commented] (HBASE-16144) Replication queue's lock will live forever if RS acquiring the lock has dead

    [ https://issues.apache.org/jira/browse/HBASE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354697#comment-15354697 ] 

Phil Yang commented on HBASE-16144:
-----------------------------------

We set hbase.zookeeper.useMulti=false when a cluster has many peers, because a large ZK batch operation may fail if the request size is too large.
I think we can add a periodic lock cleaner on the master that deletes any lock held by a dead region server.
I'll submit a patch later.
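
To make the cleaner idea concrete, here is a minimal sketch of a single cleaner pass. It is not HBase code and not the patch: the class name, the in-memory map standing in for the lock znodes, and the way live servers are passed in are all assumptions for illustration only.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not HBase code: an in-memory stand-in for the lock znodes. */
class LockCleanerSketch {
    // Key: the dead RS whose queues are locked. Value: the RS that holds the lock.
    final Map<String, String> locks = new ConcurrentHashMap<>();

    /**
     * One cleaner pass: delete every lock whose holder is not a live server.
     * Returns the number of locks removed.
     */
    int cleanOnce(Set<String> liveServers) {
        int removed = 0;
        for (Map.Entry<String, String> e : locks.entrySet()) {
            // remove(key, value) only deletes if the holder is unchanged,
            // so a lock re-acquired by a live RS in the meantime is kept.
            if (!liveServers.contains(e.getValue())
                    && locks.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }
        return removed;
    }
}
```

On the master, a pass like this could be run at a fixed interval, for example from a java.util.concurrent.ScheduledExecutorService via scheduleAtFixedRate. The interval and the source of the live server list are left open here.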

> Replication queue's lock will live forever if RS acquiring the lock has dead
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-16144
>                 URL: https://issues.apache.org/jira/browse/HBASE-16144
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.1, 1.1.5, 0.98.20
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>
> By default, we use a multi operation when we claimQueues from ZK. But if we set hbase.zookeeper.useMulti=false, we first add a lock, then copy the nodes, and finally clean up the old queue and the lock.
> However, if the RS that acquired the lock crashes before claimQueues finishes, the lock stays there forever and no other RS can ever claim the queue.
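
The failure described above can be reproduced with a small model of the non-multi path. This is a hedged sketch, not the HBase implementation: the class, the two in-memory maps, and the crashAfterLock flag are hypothetical and only mirror the lock, copy, clean-up order from the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not HBase code: models the non-multi claim on in-memory maps. */
class ClaimQueueSketch {
    final Map<String, String> locks = new HashMap<>();      // dead RS -> lock holder
    final Map<String, String> queueOwner = new HashMap<>(); // queue id -> owning RS

    /**
     * Returns true if claimer took over the queues of deadRs.
     * crashAfterLock simulates the claimer dying right after it took the lock.
     */
    boolean claim(String deadRs, String claimer, boolean crashAfterLock) {
        // Step 1: take the lock; give up if another RS already holds it.
        if (locks.putIfAbsent(deadRs, claimer) != null) {
            return false;
        }
        if (crashAfterLock) {
            return false; // the claimer dies here, so the lock is never released
        }
        // Step 2: move the queues to the claimer (copy, then drop the old ones).
        queueOwner.replaceAll((q, owner) -> owner.equals(deadRs) ? claimer : owner);
        // Step 3: delete the lock.
        locks.remove(deadRs);
        return true;
    }
}
```

In this model, once a claimer crashes after step 1, every later call to claim for the same dead RS returns false, which is the stuck state reported in this issue.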


