Posted to issues@hbase.apache.org by "Hiroshi Ikeda (JIRA)" <ji...@apache.org> on 2015/08/13 15:09:45 UTC

[jira] [Commented] (HBASE-11902) RegionServer was blocked while aborting

    [ https://issues.apache.org/jira/browse/HBASE-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695194#comment-14695194 ] 

Hiroshi Ikeda commented on HBASE-11902:
---------------------------------------

{code}
    if (getValue(oldValAndFlags) == 1) return; // There were no operations outstanding.
    synchronized (this) { this.wait(); }
{code}

If DrainBarrier#endOp calls notifyAll after the check above but just before this thread enters the synchronized block, the notification is lost and wait() may block forever.
(BTW, some of the tests for DrainBarrier also need to be fixed, because they catch the AssertionError thrown by JUnit.)
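
To illustrate the lost-notification problem, here is a minimal sketch of a race-free variant. It is not the actual DrainBarrier code: the class SimpleDrainBarrier, its fields, and the methods beginOp and drain are hypothetical names made up for this example, and it uses a plain monitor instead of the atomic counter. The point is only that the condition is checked and waited on while holding the same lock that the notifying side takes, and that the wait sits in a loop.

```java
// Hypothetical, simplified barrier; NOT the real DrainBarrier implementation.
class SimpleDrainBarrier {
    private int outstanding = 0;      // operations currently in flight
    private boolean draining = false; // set once drain() has been called

    /** Returns false if the barrier is already draining and the op must not start. */
    synchronized boolean beginOp() {
        if (draining) {
            return false;
        }
        outstanding++;
        return true;
    }

    /** Marks one operation as finished and wakes the drainer when it was the last one. */
    synchronized void endOp() {
        outstanding--;
        if (draining && outstanding == 0) {
            notifyAll();
        }
    }

    /** Blocks until all outstanding operations have ended. */
    synchronized void drain() throws InterruptedException {
        draining = true;
        // The check and the wait happen under the same monitor that endOp() holds
        // when it calls notifyAll(), so the notification cannot slip in between them.
        // The loop also guards against spurious wakeups.
        while (outstanding > 0) {
            wait();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleDrainBarrier barrier = new SimpleDrainBarrier();
        barrier.beginOp();
        Thread worker = new Thread(barrier::endOp);
        worker.start();
        barrier.drain(); // returns once the worker has called endOp()
        worker.join();
        System.out.println("drained");
    }
}
```

If the atomic counter is kept, the same idea applies: the waiting thread would need to re-check the counter inside the synchronized block, in a loop, before calling wait().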

> RegionServer was blocked while aborting
> ---------------------------------------
>
>                 Key: HBASE-11902
>                 URL: https://issues.apache.org/jira/browse/HBASE-11902
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>    Affects Versions: 0.98.4
>         Environment: hbase-0.98.4, hadoop-2.3.0-cdh5.1, jdk1.7
>            Reporter: Victor Xu
>            Assignee: Qiang Tian
>         Attachments: hbase-hadoop-regionserver-hadoop461.cm6.log, hbase11902-master.patch, hbase11902-master_v2.patch, hbase11902-master_v3.patch, jstack_hadoop461.cm6.log
>
>
> Generally, the regionserver automatically aborts when isHealth() returns false, but it sometimes gets blocked while aborting. I saved the jstack and logs, and found that the cause was datanode failures. The "regionserver60020" thread was blocked while closing the WAL.
> This issue doesn't happen very frequently, but when it does, it always leads to a huge number of request failures. The only way out is KILL -9.
> I think it's a bug, but I haven't found a decent solution. Does anyone have the same problem?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)