Posted to issues@hbase.apache.org by "Prakash Khemani (JIRA)" <ji...@apache.org> on 2011/04/29 01:32:03 UTC
[jira] [Resolved] (HBASE-3822) region server stuck in waitOnAllRegionsToClose
[ https://issues.apache.org/jira/browse/HBASE-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prakash Khemani resolved HBASE-3822.
------------------------------------
Resolution: Invalid
Release Note: The description is invalid. Will open a new one.
> region server stuck in waitOnAllRegionsToClose
> ----------------------------------------------
>
> Key: HBASE-3822
> URL: https://issues.apache.org/jira/browse/HBASE-3822
> Project: HBase
> Issue Type: Bug
> Reporter: Prakash Khemani
>
> The regionserver is not able to exit because the rs thread is stuck here:
> "regionserver60020" prio=10 tid=0x00002ab2b039e000 nid=0x760a waiting on condition [0x000000004365e000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:126)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:736)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:689)
>         at java.lang.Thread.run(Thread.java:619)
> ===
> In CloseRegionHandler.process() we do not call removeFromOnlineRegions() if there is an exception. (In this case I suspect there was a log-rolling exception caused by another issue.)
>     // Close the region
>     try {
>       // TODO: If we need to keep updating CLOSING stamp to prevent against
>       // a timeout if this is long-running, need to spin up a thread?
>       if (region.close(abort) == null) {
>         // This region got closed. Most likely due to a split. So instead
>         // of doing the setClosedState() below, let's just ignore and continue.
>         // The split message will clean up the master state.
>         LOG.warn("Can't close region: was already closed during close(): " +
>           regionInfo.getRegionNameAsString());
>         return;
>       }
>     } catch (IOException e) {
>       LOG.error("Unrecoverable exception while closing region " +
>         regionInfo.getRegionNameAsString() + ", still finishing close", e);
>     }
>     this.rsServices.removeFromOnlineRegions(regionInfo.getEncodedName());
> ===
> I think that once we set the closing flag on the region, it won't take any more requests, so it is as good as offline.
> We should either refine the check in waitOnAllRegionsToClose() or have CloseRegionHandler.process() remove the region from the online-regions set.
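
The two options suggested above can be illustrated with a small standalone sketch. This is not HBase code: the class CloseRegionSketch, the RegionCloser interface, and every method below are hypothetical stand-ins, assumed only for illustration. It shows (a) removing a region from the online set in a finally block so that a failed close cannot leave it registered, and (b) a wait loop that gives up after a deadline instead of sleeping forever.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are real HBase APIs.
public class CloseRegionSketch {

    /** Hypothetical stand-in for the work done by a region close. */
    public interface RegionCloser {
        void close() throws IOException;
    }

    // Encoded names of regions currently considered online.
    private final Set<String> onlineRegions = ConcurrentHashMap.newKeySet();

    public void addOnline(String encodedName) {
        onlineRegions.add(encodedName);
    }

    public boolean isOnline(String encodedName) {
        return onlineRegions.contains(encodedName);
    }

    /**
     * Closes a region and always removes it from the online set,
     * even when the close itself fails. Returns true on a clean close.
     */
    public boolean closeRegion(String encodedName, RegionCloser closer) {
        try {
            closer.close();
            return true;
        } catch (IOException e) {
            System.err.println("Close failed for " + encodedName + ": " + e.getMessage());
            return false;
        } finally {
            // Runs on success, on IOException, and on unchecked exceptions.
            onlineRegions.remove(encodedName);
        }
    }

    /**
     * Polls until no regions are online or the timeout elapses.
     * Returns true if the set drained, false on timeout or interrupt.
     */
    public boolean waitOnAllRegionsToClose(long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!onlineRegions.isEmpty()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of sleeping forever
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

With this shape, a close that throws still leaves the online set empty, so the wait returns promptly. Whether a bounded wait is appropriate for a real region server shutdown is a separate design question that this sketch does not settle.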
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira