Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/11/23 21:12:15 UTC

[jira] Updated: (HBASE-3265) Regionservers waiting for ROOT while Master waiting for RegionServers

     [ https://issues.apache.org/jira/browse/HBASE-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3265:
-------------------------

    Fix Version/s: 0.90.0

Bringing in for triage

> Regionservers waiting for ROOT while Master waiting for RegionServers
> ---------------------------------------------------------------------
>
>                 Key: HBASE-3265
>                 URL: https://issues.apache.org/jira/browse/HBASE-3265
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.0
>
>
> After a cluster disastrophe due to a disconnected switch, I ended up in a state where the master was up with no region servers (see HBASE-3263). When I brought the RS back up, because of the aforementioned bug, the master didn't get itself into a happy state (an internal data structure had a null in it). So I killed the master and started it again. Now the master is in "Waiting for region servers to check in" mode, and the region servers are stuck at the following stack:
>         - locked <0x00002aaab1bda5d0> (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
>         at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRoot(CatalogTracker.java:177)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:537)
>         at java.lang.Thread.run(Thread.java:619)
> I imagine what happened is that the RS got through "tryReportForDuty" with the old master, but the old master was unable to assign anything due to its bad state. So when it crashed, all the RS were stuck in waitForRoot(), and when I brought the new master up, none of them reported for duty again.
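
The suspected sequence can be modeled with a small, single-threaded toy simulation. This is only a sketch of the reasoning in the report, not HBase code: ToyMaster, runRegionServer, and the reReportOnTimeout flag are hypothetical names invented for illustration, and each loop iteration stands in for one bounded wait for ROOT that timed out. It shows that a region server which reported only to the old (broken) master never sees ROOT under the new master, while one that reports again after a timed-out wait does.

```java
public class ReportForDutyRetry {

    /** Hypothetical stand-in for a master; not the real HMaster. */
    static class ToyMaster {
        private final boolean broken;   // models the bad internal state of the old master
        private int checkedIn = 0;
        private boolean rootAssigned = false;

        ToyMaster(boolean broken) {
            this.broken = broken;
        }

        /** A region server checks in; a healthy master can then assign ROOT. */
        void reportForDuty() {
            checkedIn++;
            if (!broken) {
                rootAssigned = true;
            }
        }

        boolean isRootAssigned() {
            return rootAssigned;
        }

        int checkedIn() {
            return checkedIn;
        }
    }

    /**
     * Models one region server. It reports to the old master once, then
     * repeatedly waits for ROOT. After the first unsuccessful wait the old
     * master is assumed to have been replaced by the new one.
     *
     * @return true if the region server ever observes ROOT as assigned
     */
    static boolean runRegionServer(ToyMaster oldMaster, ToyMaster newMaster,
                                   boolean reReportOnTimeout, int maxRounds) {
        ToyMaster current = oldMaster;
        current.reportForDuty();            // initial report to the old master

        for (int round = 0; round < maxRounds; round++) {
            if (current.isRootAssigned()) {
                return true;
            }
            // The wait for ROOT timed out; the old master is gone by now.
            current = newMaster;
            if (reReportOnTimeout) {
                current.reportForDuty();    // check in again with the live master
            }
        }
        return current.isRootAssigned();
    }

    public static void main(String[] args) {
        // Region server never reports again: the new master sees no check-ins.
        System.out.println(runRegionServer(
                new ToyMaster(true), new ToyMaster(false), false, 5));
        // Region server reports again after a timed-out wait.
        System.out.println(runRegionServer(
                new ToyMaster(true), new ToyMaster(false), true, 5));
    }
}
```

Whether a timeout plus a repeated report is the right remedy for this issue is an open question; the sketch only illustrates why neither side makes progress when the wait is unbounded.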
