Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/01/17 22:35:44 UTC
[jira] Created: (HBASE-3449) Server shutdown handlers deadlocked waiting for META
Server shutdown handlers deadlocked waiting for META
----------------------------------------------------
Key: HBASE-3449
URL: https://issues.apache.org/jira/browse/HBASE-3449
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.0
Reporter: Todd Lipcon
Priority: Blocker
I have a situation where both of my MASTER_META_SERVER_OPERATIONS handlers are handling server shutdowns, and both of them are waiting on ROOT, which isn't coming up. It is unclear exactly how this happened, but I triggered it by doing a rolling restart.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HBASE-3449) Server shutdown handlers deadlocked waiting for META
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack reassigned HBASE-3449:
----------------------------
Assignee: stack
[jira] Commented: (HBASE-3449) Server shutdown handlers deadlocked waiting for META
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983019#action_12983019 ]
stack commented on HBASE-3449:
------------------------------
A workaround would be to raise the ExecutorType.MASTER_META_SERVER_OPERATIONS thread count from 2 to 5 to lower the incidence of this happening. A longer-term fix would require architectural change. The executor service is autonomous. In the old days, we'd put ourselves back on a queue if we couldn't proceed. We don't have that facility any more.
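To illustrate the "put ourselves back on a queue" idea, here is a minimal sketch using plain java.util.concurrent. It is not the old HBase code and not the HBASE-3458 fix: RequeueSketch, handlersFinish, and the rootOnline flag are all hypothetical names made up for this example. Each handler checks whether ROOT is online and, if not, resubmits itself and returns, so the thread is freed for whatever task can bring ROOT up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class RequeueSketch {

    /**
     * Runs two handlers plus a ROOT-assigning task on a two-thread pool.
     * Returns true if both handlers finish within the timeout.
     */
    public static boolean handlersFinish(long timeoutMs) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicBoolean rootOnline = new AtomicBoolean(false); // stand-in for "ROOT is assigned"
        CountDownLatch done = new CountDownLatch(2);
        try {
            for (int i = 0; i < 2; i++) {
                pool.execute(new Runnable() {
                    @Override
                    public void run() {
                        if (!rootOnline.get()) {
                            // Cannot proceed: go to the back of the queue and free this thread.
                            pool.execute(this);
                            return;
                        }
                        done.countDown(); // ROOT is up, so this shutdown can be processed
                    }
                });
            }
            // The task that brings ROOT online waits in the same queue as the handlers.
            pool.execute(() -> rootOnline.set(true));
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("handlers finished = " + handlersFinish(2000));
    }
}
```

Because a handler never holds a thread while waiting, the ROOT-assigning task always gets a turn even with only two threads. A real version would presumably add a delay or back-off before requeueing rather than spinning through the queue.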
[jira] Commented: (HBASE-3449) Server shutdown handlers deadlocked waiting for META
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985462#action_12985462 ]
Hudson commented on HBASE-3449:
-------------------------------
Integrated in HBase-TRUNK #1719 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1719/])
[jira] Updated: (HBASE-3449) Server shutdown handlers deadlocked waiting for META
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-3449:
-------------------------
Fix Version/s: 0.90.1
Bringing into 0.90.1. Let me make the workaround configuration change suggested above for 0.90.1.
[jira] Commented: (HBASE-3449) Server shutdown handlers deadlocked waiting for META
Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982887#action_12982887 ]
Todd Lipcon commented on HBASE-3449:
------------------------------------
I'm guessing what happened is this:
- server 1 went down, was hosting META
- shutdown handler started running, said "reassign ROOT to server 2, reassign meta to server 3"
- server 2 went down (this is rolling restart after all)
- shutdown handler for server 2 started running, reassigned ROOT to server 3
- server 3 went down
- so ROOT is still unassigned, but both meta shutdown handlers are blocked waiting on it. It can't get reassigned, since server 3's shutdown can't get processed while both handler threads are occupied
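The sequence guessed at above is a thread-pool starvation pattern, and it can be reproduced outside HBase with a small sketch. This is only an illustration under assumptions: HandlerStarvationDemo, rootComesUp, and the rootAssigned latch are hypothetical names, and the tasks stand in for shutdown handlers rather than reproducing them. Two tasks block waiting for ROOT while the one task that could assign it sits queued behind them in the same fixed-size pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HandlerStarvationDemo {

    /**
     * Submits two "shutdown handlers" that block until ROOT is assigned, then a
     * third task that would assign it. Returns true if ROOT came up in time.
     */
    public static boolean rootComesUp(int poolSize, long timeoutMs) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch rootAssigned = new CountDownLatch(1);
        try {
            // Handlers for servers 1 and 2: each one parks a pool thread waiting on ROOT.
            for (int i = 0; i < 2; i++) {
                pool.submit(() -> {
                    rootAssigned.await();
                    return null;
                });
            }
            // Handler for server 3: the only task that can bring ROOT back.
            pool.execute(() -> rootAssigned.countDown());
            return rootAssigned.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow(); // interrupt any handlers that are still blocked
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("2 threads: ROOT assigned = " + rootComesUp(2, 500));
        System.out.println("5 threads: ROOT assigned = " + rootComesUp(5, 500));
    }
}
```

With two threads the third task never gets to run, so the wait times out. With five threads it runs immediately. A larger pool only lowers the odds: enough simultaneous blocked handlers would fill it just the same.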
[jira] Resolved: (HBASE-3449) Server shutdown handlers deadlocked waiting for META
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-3449.
--------------------------
Resolution: Fixed
I committed the workaround below to branch and trunk, and opened HBASE-3458 to fix it properly so that the issue cannot ever happen.
{code}
Index: src/main/java/org/apache/hadoop/hbase/master/HMaster.java
===================================================================
--- src/main/java/org/apache/hadoop/hbase/master/HMaster.java (revision 1061564)
+++ src/main/java/org/apache/hadoop/hbase/master/HMaster.java (working copy)
@@ -518,7 +518,7 @@
     this.executorService.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS,
       conf.getInt("hbase.master.executor.serverops.threads", 3));
     this.executorService.startExecutorService(ExecutorType.MASTER_META_SERVER_OPERATIONS,
-      conf.getInt("hbase.master.executor.serverops.threads", 2));
+      conf.getInt("hbase.master.executor.serverops.threads", 5));
     // We depend on there being only one instance of this executor running
     // at a time. To do concurrency, would need fencing of enable/disable of
     // tables.
{code}