Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2016/01/05 19:45:39 UTC
[jira] [Updated] (HBASE-15056) Split fails with
KeeperException$NoNodeException when namespace quota is enabled
[ https://issues.apache.org/jira/browse/HBASE-15056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-15056:
---------------------------
Attachment: 15056-branch-1-v1.txt
In patch v1, services.reportRegionStateTransition() is called at the beginning of stepsBeforePONR() so that regionStateListener has a chance to check quota.
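The ordering change can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the contents of 15056-branch-1-v1.txt: the class SplitQuotaSketch, the QuotaListener type, the Outcome enum and the reportFirst flag are all hypothetical stand-ins, and the model assumes the patched code reports READY_TO_SPLIT regardless of useZKForAssignment. It shows only whether the quota check is consulted before the split continues, not the znode handling seen in the log.

```java
/** Toy model only; none of these types are the real HBase classes. */
class SplitQuotaSketch {

    /** Possible results of the simplified pre-PONR step. */
    enum Outcome { QUOTA_REJECTED, CONTINUED_ON_ZK_PATH, CONTINUED_ON_RPC_PATH }

    /** Hypothetical stand-in for the master-side listener that enforces the namespace quota. */
    static class QuotaListener {
        private final int maxRegions;
        private final int regionCount;

        QuotaListener(int maxRegions, int regionCount) {
            this.maxRegions = maxRegions;
            this.regionCount = regionCount;
        }

        /** A split replaces one parent with two daughters, so the count grows by one. */
        boolean onReadyToSplit() {
            return regionCount + 1 <= maxRegions;
        }
    }

    /**
     * Simplified pre-PONR step.
     *
     * @param useZkForAssignment mirrors the branch-1 flag of the same name
     * @param reportFirst        hypothetical switch: true models reporting
     *                           READY_TO_SPLIT before anything else (assumed v1 ordering)
     */
    static Outcome stepsBeforePonr(QuotaListener listener,
                                   boolean useZkForAssignment,
                                   boolean reportFirst) {
        if (reportFirst && !listener.onReadyToSplit()) {
            // Assumed v1 ordering: the listener can veto before any ZK work starts.
            return Outcome.QUOTA_REJECTED;
        }
        if (useZkForAssignment) {
            // Original guard: the report is skipped on this path, so without
            // reportFirst the quota is never consulted here.
            return Outcome.CONTINUED_ON_ZK_PATH;
        }
        if (!reportFirst && !listener.onReadyToSplit()) {
            // Original non-ZK path: the report doubles as the quota check.
            return Outcome.QUOTA_REJECTED;
        }
        return Outcome.CONTINUED_ON_RPC_PATH;
    }

    public static void main(String[] args) {
        QuotaListener full = new QuotaListener(2, 2); // namespace already at its limit
        System.out.println("original ordering, ZK on:  " + stepsBeforePonr(full, true, false));
        System.out.println("report first, ZK on:       " + stepsBeforePonr(full, true, true));
        System.out.println("original ordering, ZK off: " + stepsBeforePonr(full, false, false));
    }
}
```

In this model the quota rejection surfaces on the ZK path only when the report is made first, which matches the motivation given for patch v1.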
> Split fails with KeeperException$NoNodeException when namespace quota is enabled
> --------------------------------------------------------------------------------
>
> Key: HBASE-15056
> URL: https://issues.apache.org/jira/browse/HBASE-15056
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.2.0
> Reporter: Ted Yu
> Attachments: 15056-branch-1-v1.txt, split-fails-when-exceeding-quota-with-znode-loss.test
>
>
> When trying to port HBASE-15044 to branch-1, I found that a region split fails with KeeperException$NoNodeException when namespace quota is enabled and the split would exceed the allocated quota.
> Here is related test output:
> {code}
> 2015-12-30 09:50:16,764 WARN [RS:0;10.22.24.71:65256-splits-1451497816754] zookeeper.ZKAssign(885): regionserver:65256-0x151f402c21c0001, quorum=localhost:57662, baseZNode=/hbase Attempt to transition the unassigned node for 17fc99c04a8027b653e9d5ef5d578461 from RS_ZK_REQUEST_REGION_SPLIT to RS_ZK_REQUEST_REGION_SPLIT failed, the node existed and was in the expected state but then when setting data it no longer existed
> 2015-12-30 09:50:16,866 DEBUG [RS:0;10.22.24.71:65256-splits-1451497816754] zookeeper.ZKUtil(718): regionserver:65256-0x151f402c21c0001, quorum=localhost:57662, baseZNode=/hbase Unable to get data of znode /hbase/region-in-transition/17fc99c04a8027b653e9d5ef5d578461 because node does not exist (not necessarily an error)
> 2015-12-30 09:50:16,866 INFO [RS:0;10.22.24.71:65256-splits-1451497816754] regionserver.SplitRequest(97): Running rollback/cleanup of failed split of np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.; Failed getting SPLITTING znode on np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
> java.io.IOException: Failed getting SPLITTING znode on np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
> at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:200)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:381)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:277)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:560)
> at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
> at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Data is null, splitting node 17fc99c04a8027b653e9d5ef5d578461 no longer exists
> at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.waitForSplitTransaction(ZKSplitTransactionCoordination.java:166)
> ... 8 more
> 2015-12-30 09:50:16,869 DEBUG [RS:0;10.22.24.71:65256-splits-1451497816754] zookeeper.ZKUtil(718): regionserver:65256-0x151f402c21c0001, quorum=localhost:57662, baseZNode=/hbase Unable to get data of znode /hbase/region-in-transition/17fc99c04a8027b653e9d5ef5d578461 because node does not exist (not necessarily an error)
> 2015-12-30 09:50:16,869 INFO [RS:0;10.22.24.71:65256-splits-1451497816754] coordination.ZKSplitTransactionCoordination(268): Failed cleanup zk node of np2:testRegionNormalizationSplitOnCluster,zzzzz,1451497806295.17fc99c04a8027b653e9d5ef5d578461.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:452)
> at org.apache.hadoop.hbase.zookeeper.ZKAssign.deleteNode(ZKAssign.java:381)
> at org.apache.hadoop.hbase.coordination.ZKSplitTransactionCoordination.clean(ZKSplitTransactionCoordination.java:261)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.rollback(SplitTransactionImpl.java:948)
> at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.rollback(SplitTransactionImpl.java:900)
> at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:99)
> at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}
> Strangely, no QuotaExceededException is thrown.
> In the master branch, the quota check is done in response to TransitionCode.READY_TO_SPLIT.
> In branch-1, that code path is not executed when useZKForAssignment is true (the default case):
> {code}
> } else if (services != null && !useZKForAssignment) {
>   if (!services.reportRegionStateTransition(TransitionCode.READY_TO_SPLIT,
>       parent.getRegionInfo(), hri_a, hri_b)) {
> {code}