Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2011/08/28 20:30:37 UTC

[jira] [Commented] (HBASE-4265) zookeeper.KeeperException$NodeExistsException if HMaster restarts while table is being disabled

    [ https://issues.apache.org/jira/browse/HBASE-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092529#comment-13092529 ] 

Ted Yu commented on HBASE-4265:
-------------------------------

We should handle KeeperException$NodeExistsException in the catch block of unassign().
We also need to be flexible with the statement after the catch block, which currently always sets the state to PENDING_CLOSE:
{code}
        state = new RegionState(region, RegionState.State.PENDING_CLOSE);
{code}
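
As a rough illustration of the suggestion above, here is a minimal, self-contained sketch. It does not use the real HBase or ZooKeeper classes: FakeZk, NodeExistsException, State and this unassign() are hypothetical stand-ins, and treating an existing RS_ZK_REGION_CLOSED node as "already closed" is an assumption about what the fix might do, not the committed patch.

```java
import java.util.HashMap;
import java.util.Map;

class UnassignSketch {
    /** Hypothetical stand-in for KeeperException$NodeExistsException. */
    static class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("KeeperErrorCode = NodeExists for " + path);
        }
    }

    /** Simplified region states; the real RegionState.State has more values. */
    enum State { PENDING_CLOSE, CLOSED }

    /** Hypothetical in-memory substitute for the znodes under /hbase/unassigned. */
    static class FakeZk {
        private final Map<String, String> nodes = new HashMap<>();

        private static String path(String region) {
            return "/hbase/unassigned/" + region;
        }

        /** Mimics a create-if-absent call: fails when the node is already there. */
        void createNodeClosing(String region) throws NodeExistsException {
            if (nodes.containsKey(path(region))) {
                throw new NodeExistsException(path(region));
            }
            nodes.put(path(region), "M_ZK_REGION_CLOSING");
        }

        void put(String region, String data) { nodes.put(path(region), data); }

        String get(String region) { return nodes.get(path(region)); }
    }

    /**
     * Assumed handling: if the node already exists and records that the region
     * server closed the region, do not force PENDING_CLOSE; report CLOSED instead.
     */
    static State unassign(FakeZk zk, String region) {
        try {
            zk.createNodeClosing(region);
        } catch (NodeExistsException e) {
            String existing = zk.get(region);
            if ("RS_ZK_REGION_CLOSED".equals(existing)) {
                return State.CLOSED; // region was closed before the master restarted
            }
            throw new IllegalStateException("Unexpected znode state: " + existing, e);
        }
        // Only the normal path, where the CLOSING node was created, ends up here.
        return State.PENDING_CLOSE;
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        String region = "419b902243c836c285108ba555b712fa";
        zk.put(region, "RS_ZK_REGION_CLOSED"); // leftover node from before the restart
        System.out.println(region + " -> " + unassign(zk, region));
        System.out.println("fresh-region -> " + unassign(zk, "fresh-region"));
    }
}
```

The point of the sketch is only that the state assigned after the try/catch depends on what was found in ZooKeeper, rather than being PENDING_CLOSE unconditionally.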

> zookeeper.KeeperException$NodeExistsException if HMaster restarts while table is being disabled
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4265
>                 URL: https://issues.apache.org/jira/browse/HBASE-4265
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.92.0
>
>
> There seems to be more than one issue in the following scenario. I will provide a fix later for this exception only.
> 1. A table is being disabled.
> 2. HMaster is restarted.
> 3. At startup, HMaster tries to transition the table from the disabling to the disabled state and gets the following exception:
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/419b902243c836c285108ba555b712fa
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> 	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:475)
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:457)
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:742)
> 	at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:461)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1440)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1406)
> 	at org.apache.hadoop.hbase.master.handler.DisableTableHandler$BulkDisabler$1.run(DisableTableHandler.java:141)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> The issue is that this specific region is in a special state before HMaster restarts: it has been closed properly by the RS, so the zk state is RS_ZK_REGION_CLOSED. However, HMaster has not had a chance to process ClosedRegionHandler yet, so the node remains in zk. After the restart, the region is first added to the online region list in AssignmentManager.rebuildUserRegions, and HMaster later tries to unassign it.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira