Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/01/17 22:35:44 UTC

[jira] Created: (HBASE-3449) Server shutdown handlers deadlocked waiting for META

Server shutdown handlers deadlocked waiting for META
----------------------------------------------------

                 Key: HBASE-3449
                 URL: https://issues.apache.org/jira/browse/HBASE-3449
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.90.0
            Reporter: Todd Lipcon
            Priority: Blocker


I have a situation where both of my MASTER_META_SERVER_OPERATIONS handlers are handling server shutdowns, and both of them are waiting on ROOT, which isn't coming up. Unclear exactly how this happened, but I triggered it by doing a rolling restart.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HBASE-3449) Server shutdown handlers deadlocked waiting for META

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-3449:
----------------------------

    Assignee: stack



[jira] Commented: (HBASE-3449) Server shutdown handlers deadlocked waiting for META

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983019#action_12983019 ] 

stack commented on HBASE-3449:
------------------------------

A workaround would be to raise the ExecutorType.MASTER_META_SERVER_OPERATIONS thread count from 2 to 5 to lower the incidence of this happening.  A longer-term fix would require architectural change.  The executor service is autonomous.  In the old days, a handler would put itself back on a queue if it couldn't proceed.  We don't have that facility any more.
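
The "put ourselves back on a queue" idea can be illustrated with a minimal sketch. This is not HBase code: the class RequeueSketch, the method allHandlersFinish, and the rootAssigned flag are hypothetical names, and a plain java.util.concurrent fixed thread pool stands in for the master's executor service. A handler that finds ROOT unassigned resubmits itself and returns, which frees its thread for a handler that can make progress.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; none of these names come from HBase.
class RequeueSketch {

    // Runs two handlers that need ROOT and one handler that assigns it.
    // Returns true if all three complete within the timeout.
    static boolean allHandlersFinish(int poolSize, long timeoutMillis) {
        final ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        final AtomicBoolean rootAssigned = new AtomicBoolean(false);
        final CountDownLatch finished = new CountDownLatch(3);

        Runnable needsRoot = new Runnable() {
            @Override
            public void run() {
                if (!rootAssigned.get()) {
                    // Cannot proceed yet: go to the back of the queue
                    // instead of blocking this worker thread.
                    try {
                        pool.execute(this);
                    } catch (RejectedExecutionException e) {
                        // Pool was shut down after a timeout; give up.
                    }
                    return;
                }
                finished.countDown();
            }
        };
        Runnable assignsRoot = () -> {
            rootAssigned.set(true);
            finished.countDown();
        };

        pool.execute(needsRoot);
        pool.execute(needsRoot);
        pool.execute(assignsRoot);

        boolean done;
        try {
            done = finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            done = false;
        }
        pool.shutdownNow();
        return done;
    }

    public static void main(String[] args) {
        System.out.println("pool of 2 with requeue: " + allHandlersFinish(2, 2000));
    }
}
```

Because a waiting handler never holds its thread, the pool size stops mattering for liveness in this sketch. A real implementation would presumably add a delay or backoff before requeueing rather than spinning through the queue.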



[jira] Commented: (HBASE-3449) Server shutdown handlers deadlocked waiting for META

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985462#action_12985462 ] 

Hudson commented on HBASE-3449:
-------------------------------

Integrated in HBase-TRUNK #1719 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1719/])
    



[jira] Updated: (HBASE-3449) Server shutdown handlers deadlocked waiting for META

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3449:
-------------------------

    Fix Version/s: 0.90.1

Bringing into 0.90.1.  Let me make the workaround configuration change suggested above for 0.90.1.



[jira] Commented: (HBASE-3449) Server shutdown handlers deadlocked waiting for META

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982887#action_12982887 ] 

Todd Lipcon commented on HBASE-3449:
------------------------------------

I'm guessing what happened is this:

- server 1 went down, was hosting META
- shutdown handler started running, said "reassign ROOT to server 2, reassign meta to server 3"
- server 2 went down (this is rolling restart after all)
- shutdown handler for server 2 started running, reassigned ROOT to server 3
- server 3 went down

- so ROOT is still unassigned, but both meta shutdown handlers are blocked waiting on it. It can't get reassigned, since server 3's shutdown can't get processed while both handler threads are occupied.
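
The guessed sequence above is a classic pool-exhaustion deadlock, and it can be reproduced in miniature. The sketch below is not HBase code: HandlerPoolSketch, allHandlersFinish, and the rootAssigned latch are hypothetical names, and a fixed java.util.concurrent thread pool stands in for the MASTER_META_SERVER_OPERATIONS executor. As a simplification, the third handler only assigns ROOT and does not wait on anything itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from HBase.
class HandlerPoolSketch {

    // Submits three simulated shutdown handlers to a pool of the given size.
    // The first two block until ROOT is assigned; only the third assigns it.
    // Returns true if all three handlers finish within the timeout.
    static boolean allHandlersFinish(int poolSize, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch rootAssigned = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(3);

        Runnable waitsOnRoot = () -> {
            try {
                rootAssigned.await();      // holds the worker thread while waiting
                finished.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Runnable assignsRoot = () -> {
            rootAssigned.countDown();      // stands in for "server 3's shutdown gets processed"
            finished.countDown();
        };

        pool.submit(waitsOnRoot);          // shutdown handler for server 1
        pool.submit(waitsOnRoot);          // shutdown handler for server 2
        pool.submit(assignsRoot);          // shutdown handler for server 3

        boolean done;
        try {
            done = finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            done = false;
        }
        pool.shutdownNow();                // interrupts any handlers still blocked
        return done;
    }

    public static void main(String[] args) {
        System.out.println("pool of 2: " + allHandlersFinish(2, 500));
        System.out.println("pool of 5: " + allHandlersFinish(5, 500));
    }
}
```

With two threads, both are held by waiting handlers and the third task never leaves the queue, so the call times out and returns false. With five threads the third handler gets a thread and everything completes. A larger pool lowers the odds of this state but does not rule it out: enough blocked handlers can still occupy every thread.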



[jira] Resolved: (HBASE-3449) Server shutdown handlers deadlocked waiting for META

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3449.
--------------------------

    Resolution: Fixed

I committed the workaround below to branch and trunk, and opened HBASE-3458 for a fix that prevents the issue from ever happening.

{code}
Index: src/main/java/org/apache/hadoop/hbase/master/HMaster.java
===================================================================
--- src/main/java/org/apache/hadoop/hbase/master/HMaster.java   (revision 1061564)
+++ src/main/java/org/apache/hadoop/hbase/master/HMaster.java   (working copy)
@@ -518,7 +518,7 @@
       this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS,
         conf.getInt("hbase.master.executor.serverops.threads", 3));
       this.executorService.startExecutorService(ExecutorType.MASTER_META_SERVER_OPERATIONS,
-        conf.getInt("hbase.master.executor.serverops.threads", 2));
+        conf.getInt("hbase.master.executor.serverops.threads", 5));
       // We depend on there being only one instance of this executor running
       // at a time.  To do concurrency, would need fencing of enable/disable of
       // tables.
{code}
