Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2014/06/04 02:41:01 UTC
[jira] [Resolved] (HBASE-11282) Load balancer may move a region
which is participating in snapshot
[ https://issues.apache.org/jira/browse/HBASE-11282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu resolved HBASE-11282.
----------------------------
Resolution: Later
> Load balancer may move a region which is participating in snapshot
> ------------------------------------------------------------------
>
> Key: HBASE-11282
> URL: https://issues.apache.org/jira/browse/HBASE-11282
> Project: HBase
> Issue Type: Bug
> Reporter: Ted Yu
>
> The region was tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.
> From master log:
> {code}
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Found an existing plan for tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. destination server is h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812 accepted as a dest server = true
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Using pre-existing plan for tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.; plan=hri=tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7., src=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165, dest=h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
> 2014-03-10 23:48:09,035 INFO [AM.ZK.Worker-pool2-t42] master.RegionStates: Transitioned {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=CLOSED, ts=1394495289035, server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165} to {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=OFFLINE, ts=1394495289035, server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165}
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] zookeeper.ZKAssign: master:60000-0x244aa9920190b04, quorum=h2-ubuntu12-sec-1394425849-hbase-8.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-1.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal:2181, baseZNode=/hbase Creating (or updating) unassigned node 289ebdee6adf0a3b9c2bbcbe2ff522e7 with OFFLINE state
> 2014-03-10 23:48:09,044 INFO [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Assigning tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. to h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
> {code}
> From hbase-hbase-regionserver-h2-ubuntu12-sec-1394425849-hbase-9.log :
> {code}
> 2014-03-10 23:48:08,487 WARN [member: 'h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165' subprocedure-pool1-thread-1] snapshot.RegionServerSnapshotManager: Got Exception in SnapshotSubprocedurePool
> java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.NotServingRegionException: tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:325)
> at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
> at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
> at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
> at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.hbase.NotServingRegionException: tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
> at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5699)
> at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5663)
> at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
> at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:65)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
> The load balancer moved the underlying region while the snapshot was in progress, which caused FlushSnapshotSubprocedure to fail.
> A mechanism that makes the load balancer aware of in-progress region operations is desirable, so that a snapshot does not fail in the scenario above.
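
One way to picture the mechanism asked for above is a filter that drops region moves for any region currently taking part in a snapshot. The following is only a minimal sketch under stated assumptions: `MovePlan`, `SnapshotRegionRegistry`, and `filterPlans` are hypothetical names invented for illustration and are not HBase classes or APIs. The second region name and the shortened server names in `main` are also made up. It does not reflect how HBase implements balancing or snapshots.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these types are HBase classes. */
public class SnapshotAwareBalancerSketch {

    /** A proposed region move, loosely modeled on the src/dest plan seen in the master log. */
    static final class MovePlan {
        final String encodedRegionName;
        final String source;
        final String destination;

        MovePlan(String encodedRegionName, String source, String destination) {
            this.encodedRegionName = encodedRegionName;
            this.source = source;
            this.destination = destination;
        }
    }

    /** Hypothetical registry of regions that currently participate in a snapshot. */
    static final class SnapshotRegionRegistry {
        private final Set<String> inSnapshot = new HashSet<>();

        synchronized void begin(String encodedRegionName) {
            inSnapshot.add(encodedRegionName);
        }

        synchronized void end(String encodedRegionName) {
            inSnapshot.remove(encodedRegionName);
        }

        synchronized boolean isInSnapshot(String encodedRegionName) {
            return inSnapshot.contains(encodedRegionName);
        }
    }

    /** Returns only the plans whose region is not participating in a snapshot. */
    static List<MovePlan> filterPlans(List<MovePlan> plans, SnapshotRegionRegistry registry) {
        List<MovePlan> allowed = new ArrayList<>();
        for (MovePlan plan : plans) {
            if (!registry.isInSnapshot(plan.encodedRegionName)) {
                allowed.add(plan);
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        SnapshotRegionRegistry registry = new SnapshotRegionRegistry();
        String snapshotRegion = "289ebdee6adf0a3b9c2bbcbe2ff522e7"; // region from the report
        String otherRegion = "hypothetical-other-region";          // made-up second region

        List<MovePlan> plans = new ArrayList<>();
        plans.add(new MovePlan(snapshotRegion, "hbase-9", "hbase-4"));
        plans.add(new MovePlan(otherRegion, "hbase-9", "hbase-4"));

        // While the snapshot is running, the move of the snapshot region is held back.
        registry.begin(snapshotRegion);
        System.out.println("moves during snapshot: " + filterPlans(plans, registry).size());

        // Once the snapshot finishes, the balancer may move the region again.
        registry.end(snapshotRegion);
        System.out.println("moves after snapshot: " + filterPlans(plans, registry).size());
    }
}
```

In this sketch the balancer simply skips the affected region for one round. A real fix would also have to handle the window between computing a plan and executing it, which this example does not attempt.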