Posted to issues@hbase.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2018/03/01 11:35:00 UTC

[jira] [Commented] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL

    [ https://issues.apache.org/jira/browse/HBASE-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381862#comment-16381862 ] 

Hudson commented on HBASE-14317:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4669 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4669/])
HBASE-20107 Add a test case for HBASE-14317 (Zephyr Guo) (tedyu: rev d7adc58e5203567b8083160d45f85f9986e272cd)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestWALLockup.java


> Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
> -----------------------------------------------------
>
>                 Key: HBASE-14317
>                 URL: https://issues.apache.org/jira/browse/HBASE-14317
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 1.2.0, 1.1.1
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: 14317.branch-1.txt, 14317.branch-1.txt, 14317.branch-1.v2.txt, 14317.branch-1.v2.txt, 14317.branch-1.v2.txt, 14317.branch-1.v2.txt, 14317.branch-1.v2.txt, 14317.branch-1.v2.txt, 14317.branch-1.v2.txt, 14317.branch-1.v2.txt, 14317.test.txt, 14317v10.txt, 14317v11.txt, 14317v12.txt, 14317v13.txt, 14317v14.txt, 14317v15.txt, 14317v5.branch-1.2.txt, 14317v5.txt, 14317v9.txt, HBASE-14317-v1.patch, HBASE-14317-v2.patch, HBASE-14317-v3.patch, HBASE-14317-v4.patch, HBASE-14317.patch, [Java] RS stuck on WAL sync to a dead DN - Pastebin.com.html, append-only-test.patch, raw.php, repro.txt, san_dump.txt, subset.of.rs.log, timeouts.branch-1.txt
>
>
> hbase-1.1.1 and hadoop-2.7.1
> We try to roll logs because we can't append (see HDFS-8960), but we get stuck. See the attached thread dump and associated log. What is interesting is that the syncers are waiting to take syncs to run and, at the same time, we want to flush, so we are waiting on a safe point, but there seems to be nothing in our ring buffer. Did we go to roll the log and not add a safe point sync to clear out the ring buffer?
> Needs a bit of study. Try to reproduce.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)