Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2014/06/04 02:41:01 UTC

[jira] [Resolved] (HBASE-11282) Load balancer may move a region which is participating in snapshot

     [ https://issues.apache.org/jira/browse/HBASE-11282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved HBASE-11282.
----------------------------

    Resolution: Later

> Load balancer may move a region which is participating in snapshot
> ------------------------------------------------------------------
>
>                 Key: HBASE-11282
>                 URL: https://issues.apache.org/jira/browse/HBASE-11282
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>
> The region was tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.
> From master log:
> {code}
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Found an existing plan for tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. destination server is h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812 accepted as a dest server = true
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Using pre-existing plan for tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.; plan=hri=tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7., src=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165, dest=h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
> 2014-03-10 23:48:09,035 INFO  [AM.ZK.Worker-pool2-t42] master.RegionStates: Transitioned {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=CLOSED, ts=1394495289035, server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165} to {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=OFFLINE, ts=1394495289035, server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165}
> 2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] zookeeper.ZKAssign: master:60000-0x244aa9920190b04, quorum=h2-ubuntu12-sec-1394425849-hbase-8.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-1.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal:2181, baseZNode=/hbase Creating (or updating) unassigned node 289ebdee6adf0a3b9c2bbcbe2ff522e7 with OFFLINE state
> 2014-03-10 23:48:09,044 INFO  [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Assigning tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. to h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
> {code}
> From hbase-hbase-regionserver-h2-ubuntu12-sec-1394425849-hbase-9.log :
> {code}
> 2014-03-10 23:48:08,487 WARN  [member: 'h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165' subprocedure-pool1-thread-1] snapshot.RegionServerSnapshotManager: Got Exception in SnapshotSubprocedurePool
> java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.NotServingRegionException: tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:325)
>   at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
>   at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
>   at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
>   at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.hadoop.hbase.NotServingRegionException: tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
>   at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5699)
>   at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5663)
>   at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
>   at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:65)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
> The load balancer moved the region while it was participating in a snapshot, which caused FlushSnapshotSubprocedure to fail.
> A mechanism that makes the load balancer aware of in-progress region operations is desirable, so that a snapshot does not fail in the scenario above.
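
One way to picture the mechanism requested above is a shared record of regions that are currently taking part in a snapshot, which the balancer consults before acting on its move plans. The Java sketch below is illustrative only: SnapshotRegionTracker, MovePlan and dropPlansForSnapshotRegions are hypothetical names invented for this example, not HBase classes or APIs, and the issue itself does not say how such a mechanism would be implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these types are part of HBase. */
public class SnapshotAwareBalancerSketch {

    /** Minimal stand-in for a balancer move plan (assumption, not HBase's own plan type). */
    static final class MovePlan {
        final String encodedRegionName;
        final String source;
        final String destination;

        MovePlan(String encodedRegionName, String source, String destination) {
            this.encodedRegionName = encodedRegionName;
            this.source = source;
            this.destination = destination;
        }
    }

    /** Records which regions are currently participating in a snapshot. */
    static final class SnapshotRegionTracker {
        private final Set<String> inSnapshot = ConcurrentHashMap.newKeySet();

        void snapshotStarted(String encodedRegionName) {
            inSnapshot.add(encodedRegionName);
        }

        void snapshotFinished(String encodedRegionName) {
            inSnapshot.remove(encodedRegionName);
        }

        boolean isInSnapshot(String encodedRegionName) {
            return inSnapshot.contains(encodedRegionName);
        }
    }

    /** Returns only the plans whose region is not currently in a snapshot. */
    static List<MovePlan> dropPlansForSnapshotRegions(List<MovePlan> plans,
                                                      SnapshotRegionTracker tracker) {
        List<MovePlan> kept = new ArrayList<>();
        for (MovePlan plan : plans) {
            if (!tracker.isInSnapshot(plan.encodedRegionName)) {
                kept.add(plan);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        SnapshotRegionTracker tracker = new SnapshotRegionTracker();
        // Region name taken from the log above; the short server names are placeholders.
        tracker.snapshotStarted("289ebdee6adf0a3b9c2bbcbe2ff522e7");

        List<MovePlan> plans = new ArrayList<>();
        plans.add(new MovePlan("289ebdee6adf0a3b9c2bbcbe2ff522e7", "hbase-9", "hbase-4"));

        List<MovePlan> safe = dropPlansForSnapshotRegions(plans, tracker);
        System.out.println("plans kept while snapshot is running: " + safe.size());

        tracker.snapshotFinished("289ebdee6adf0a3b9c2bbcbe2ff522e7");
        safe = dropPlansForSnapshotRegions(plans, tracker);
        System.out.println("plans kept after snapshot finished: " + safe.size());
    }
}
```

This check alone leaves a window between filtering a plan and executing the move, during which a snapshot could still start on the region. A real fix would also need coordination between the master and the region servers, which this sketch does not attempt.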



--
This message was sent by Atlassian JIRA
(v6.2#6252)