Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2010/06/21 06:17:23 UTC

[jira] Created: (HBASE-2758) META region stuck in RS2ZK_REGION_OPENED state

META region stuck in RS2ZK_REGION_OPENED state
----------------------------------------------

                 Key: HBASE-2758
                 URL: https://issues.apache.org/jira/browse/HBASE-2758
             Project: HBase
          Issue Type: Bug
          Components: master, regionserver
    Affects Versions: 0.21.0
            Reporter: Todd Lipcon
            Priority: Blocker


In cluster testing trunk, I ended up with a situation where META was unassigned and no amount of restarting various pieces would fix it. On master startup, I see:

2010-06-20 21:08:05,431 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of .META.,,1.1028785192 is not valid;  serverAddress=, startCode=0 unknown.
2010-06-20 21:08:05,436 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: While creating UNASSIGNED region 1028785192 exists, state = RS2ZK_REGION_OPENED
2010-06-20 21:08:05,438 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: <monster01.sf.cloudera.com:/hbase,org.apache.hadoop.hbase.master.HMaster>Failed to create ZNode /hbase/UNASSIGNED/1028785192 in ZooKeeper
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/UNASSIGNED/1028785192
2010-06-20 21:08:05,438 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created UNASSIGNED zNode .META.,,1.1028785192 in state M2ZK_REGION_OFFLINE

then on the RS:
2010-06-20 21:08:05,899 ERROR org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater: ZNode /hbase/UNASSIGNED/1028785192 is not in CLOSED/OFFLINE state (state = RS2ZK_REGION_OPENED), will NOT open region.
2010-06-20 21:08:05,899 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening .META.,,1.1028785192
java.io.IOException: ZNode /hbase/UNASSIGNED/1028785192 is not in CLOSED/OFFLINE state (state = RS2ZK_REGION_OPENED), will NOT open region.

and the region never opens
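
The loop these logs suggest can be sketched roughly as follows. This is only an illustrative model, not the HBase code: the class, the in-memory map standing in for /hbase/UNASSIGNED, and both method names are hypothetical, and RS2ZK_REGION_CLOSED is an assumed name for the "CLOSED" state (only the OFFLINE and OPENED names appear in the logs).

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class OpenGuardSketch {
    // OFFLINE and OPENED are taken from the logs; CLOSED is an assumed name.
    enum State { M2ZK_REGION_OFFLINE, RS2ZK_REGION_CLOSED, RS2ZK_REGION_OPENED }

    // Hypothetical in-memory stand-in for the znodes under /hbase/UNASSIGNED.
    final Map<String, State> unassigned = new HashMap<>();

    // Models the master's create: if the znode already exists it is left
    // untouched (the NodeExists case) and false is returned.
    boolean createOffline(String region) {
        return unassigned.putIfAbsent(region, State.M2ZK_REGION_OFFLINE) == null;
    }

    // Models the region server's guard: only open from CLOSED or OFFLINE.
    void checkCanOpen(String region) throws IOException {
        State s = unassigned.get(region);
        if (s != State.M2ZK_REGION_OFFLINE && s != State.RS2ZK_REGION_CLOSED) {
            throw new IOException("ZNode /hbase/UNASSIGNED/" + region
                + " is not in CLOSED/OFFLINE state (state = " + s
                + "), will NOT open region.");
        }
    }

    public static void main(String[] args) {
        OpenGuardSketch zk = new OpenGuardSketch();
        // Stale znode left over from the previous run of the cluster.
        zk.unassigned.put("1028785192", State.RS2ZK_REGION_OPENED);
        for (int restart = 1; restart <= 3; restart++) {
            boolean created = zk.createOffline("1028785192");
            try {
                zk.checkCanOpen("1028785192");
                System.out.println("restart " + restart + ": region opened");
            } catch (IOException e) {
                System.out.println("restart " + restart + ": created=" + created
                    + "; " + e.getMessage());
            }
        }
    }
}
```

In this model nothing ever rewrites the stale state, so every restart hits the same guard, which would match the behavior I'm seeing.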


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-2758) META region stuck in RS2ZK_REGION_OPENED state

Posted by "Karthik Ranganathan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Ranganathan updated HBASE-2758:
---------------------------------------

    Attachment: HBASE-2758-0.21.patch

If the cluster was shut down before the regions in transition in ZK were cleared, those regions do not get assigned out on startup. The fix is to delete the UNASSIGNED znode in ZK on a new cluster start.
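
A minimal sketch of the cleanup idea, for illustration only. It is not taken from the attached patch: the class, the TreeMap standing in for ZooKeeper, and the method names are all hypothetical, and it assumes the cleanup removes the children under the UNASSIGNED path (whether the parent znode itself is also removed is not modeled here).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class UnassignedCleanupSketch {
    // Hypothetical stand-in for ZooKeeper: full znode path -> data.
    final TreeMap<String, String> znodes = new TreeMap<>();

    // Returns false if the znode already exists, like a NodeExists failure.
    boolean create(String path, String data) {
        return znodes.putIfAbsent(path, data) == null;
    }

    // Lists the direct children of a parent path.
    List<String> children(String parent) {
        String prefix = parent + "/";
        List<String> out = new ArrayList<>();
        for (String path : znodes.keySet()) {
            if (path.startsWith(prefix) && path.indexOf('/', prefix.length()) < 0) {
                out.add(path);
            }
        }
        return out;
    }

    // On a new cluster start, drop every leftover region-in-transition znode.
    // Returns how many stale znodes were removed.
    int clearUnassigned(String unassignedPath) {
        List<String> stale = children(unassignedPath);
        for (String path : stale) {
            znodes.remove(path);
        }
        return stale.size();
    }

    public static void main(String[] args) {
        UnassignedCleanupSketch zk = new UnassignedCleanupSketch();
        zk.create("/hbase/UNASSIGNED/1028785192", "RS2ZK_REGION_OPENED"); // stale
        int removed = zk.clearUnassigned("/hbase/UNASSIGNED");
        boolean created = zk.create("/hbase/UNASSIGNED/1028785192", "M2ZK_REGION_OFFLINE");
        System.out.println("removed=" + removed + ", offlineCreated=" + created);
    }
}
```

With the stale entry gone, the master's create of the OFFLINE znode no longer collides with the old OPENED one.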


[jira] Commented: (HBASE-2758) META region stuck in RS2ZK_REGION_OPENED state

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881528#action_12881528 ] 

Jonathan Gray commented on HBASE-2758:
--------------------------------------

Patch looks good.  This also fixes an existing race condition where the master node in ZK was put up before the master got the listing of regionservers.  Nothing blocked the RSs from putting up their ephemeral nodes, so it was possible for an HMaster to think it was a failover when it was actually a clean startup.  The test added in the patch verifies that a cluster will start up even if there are unassigned znodes in ZooKeeper.

Running full test suite and then will commit.
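
To illustrate the ordering issue, here is a rough model; it is not the patch itself. The class and method names are hypothetical, and it assumes that region servers only register their ephemeral nodes once the master node is visible in ZK.

```java
import java.util.HashSet;
import java.util.Set;

public class StartupOrderSketch {
    // Hypothetical stand-ins for the RS ephemeral nodes and the master node.
    final Set<String> rsEphemeralNodes = new HashSet<>();
    boolean masterNodePublished = false;

    // Assumption: an RS registers only after it sees the master node.
    boolean tryRegisterRegionServer(String name) {
        if (!masterNodePublished) {
            return false;
        }
        rsEphemeralNodes.add(name);
        return true;
    }

    // Racy order: publish the master node, then list region servers.
    // An RS can register in between, so a clean start looks like a failover.
    boolean looksLikeFailoverRacy(Runnable concurrentRs) {
        masterNodePublished = true;
        concurrentRs.run();
        return !rsEphemeralNodes.isEmpty();
    }

    // Fixed order: list region servers first, then publish the master node.
    boolean looksLikeFailoverFixed(Runnable concurrentRs) {
        boolean failover = !rsEphemeralNodes.isEmpty();
        concurrentRs.run(); // RS is still blocked: master node not published yet
        masterNodePublished = true;
        return failover;
    }

    public static void main(String[] args) {
        StartupOrderSketch racy = new StartupOrderSketch();
        System.out.println("racy order sees failover: "
            + racy.looksLikeFailoverRacy(() -> racy.tryRegisterRegionServer("rs1")));

        StartupOrderSketch fixed = new StartupOrderSketch();
        System.out.println("fixed order sees failover: "
            + fixed.looksLikeFailoverFixed(() -> fixed.tryRegisterRegionServer("rs1")));
    }
}
```

Taking the listing before the master node goes up means a clean startup always sees an empty set of region servers in this model.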


[jira] Commented: (HBASE-2758) META region stuck in RS2ZK_REGION_OPENED state

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880882#action_12880882 ] 

Jonathan Gray commented on HBASE-2758:
--------------------------------------

I think Karthik has a fix for this in another patch.  Basically, when a master starts up (before we fully handle master failover) he just needs to clean out all the znodes from /UNASSIGNED.  Should be a simple fix.


[jira] Resolved: (HBASE-2758) META region stuck in RS2ZK_REGION_OPENED state

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray resolved HBASE-2758.
----------------------------------

    Fix Version/s: 0.21.0
       Resolution: Fixed

Confirmed with Karthik that he ran full test suite.  Committed to trunk.


[jira] Assigned: (HBASE-2758) META region stuck in RS2ZK_REGION_OPENED state

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray reassigned HBASE-2758:
------------------------------------

    Assignee: Karthik Ranganathan

Assigning to Karthik so he sees this.
