Posted to issues@hbase.apache.org by "Duo Zhang (Jira)" <ji...@apache.org> on 2021/11/01 09:54:00 UTC

[jira] [Commented] (HBASE-26411) Wal do not roll and write a big wal

    [ https://issues.apache.org/jira/browse/HBASE-26411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436722#comment-17436722 ] 

Duo Zhang commented on HBASE-26411:
-----------------------------------

OK, so first, in AsyncFSWAL.syncCompleted, besides calling requestLogRoll, we should also add a check that stops writing to the current writer if the WAL file is already too big; maybe just abort the region server.
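
A rough sketch of what such a check could look like, as a standalone class rather than a patch. Every name here (WalSizeGuard, Action, onSyncCompleted, the two thresholds) is hypothetical and not part of HBase; the real check would live in AsyncFSWAL.syncCompleted and use whatever roll size and limit the project settles on.

```java
// Hypothetical sketch only: none of these names are HBase APIs.
final class WalSizeGuard {

    /** What the caller should do after a sync completes. */
    enum Action { CONTINUE, REQUEST_ROLL, STOP_WRITING }

    private final long rollSizeBytes;   // assumed soft threshold: ask for a roll
    private final long hardLimitBytes;  // assumed hard threshold: refuse further writes

    WalSizeGuard(long rollSizeBytes, long hardLimitBytes) {
        if (rollSizeBytes <= 0 || hardLimitBytes < rollSizeBytes) {
            throw new IllegalArgumentException("need 0 < rollSize <= hardLimit");
        }
        this.rollSizeBytes = rollSizeBytes;
        this.hardLimitBytes = hardLimitBytes;
    }

    /** Decide based on the current length of the WAL file being written. */
    Action onSyncCompleted(long currentFileLength) {
        if (currentFileLength >= hardLimitBytes) {
            // The roll has evidently not happened; stop appending (or abort the server).
            return Action.STOP_WRITING;
        }
        if (currentFileLength >= rollSizeBytes) {
            return Action.REQUEST_ROLL;
        }
        return Action.CONTINUE;
    }

    public static void main(String[] args) {
        // Illustrative thresholds: 128 MiB roll size, 1 GiB hard limit.
        WalSizeGuard guard = new WalSizeGuard(128L << 20, 1L << 30);
        System.out.println(guard.onSyncCompleted(64L << 20));
        System.out.println(guard.onSyncCompleted(256L << 20));
        System.out.println(guard.onSyncCompleted(3L << 40));
    }
}
```

The point of the second threshold is that requesting a roll alone does not help when the roller thread itself is stuck, which is the situation in the jstack quoted below.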

And second, why do we have an infinite wait for the write here...
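
For illustration, one way to bound such a wait is the timed variant of CompletableFuture.get instead of the no-argument get() that appears in the stack trace. This is only a sketch: BoundedWriteWait, awaitWrite and the timeout value are made up for the example, and whether a timeout is the right fix for AsyncProtobufLogWriter is exactly the open question.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not HBase code.
final class BoundedWriteWait {

    /** Wait for the write to finish, but give up after timeoutMs milliseconds. */
    static long awaitWrite(CompletableFuture<Long> future, long timeoutMs) throws IOException {
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Surface the hang to the caller instead of parking the thread forever.
            future.cancel(true);
            throw new IOException("write did not complete within " + timeoutMs + " ms", e);
        } catch (ExecutionException e) {
            throw new IOException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("interrupted while waiting for write");
        }
    }

    public static void main(String[] args) {
        try {
            // A future that nobody ever completes stands in for the hung write.
            awaitWrite(new CompletableFuture<Long>(), 100);
        } catch (IOException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

With a bound like this, a hung header write would fail the roll attempt and let the caller retry or abort, rather than leaving the roller thread parked indefinitely.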

> Wal do not roll and write a big wal 
> ------------------------------------
>
>                 Key: HBASE-26411
>                 URL: https://issues.apache.org/jira/browse/HBASE-26411
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.4.8
>            Reporter: Lijin Bin
>            Priority: Major
>
> We see that the WAL takes a long time to roll and writes one big WAL file, which has reached 3TB.
> And according to the jstack, the WAL creation is hanging.
> {code}
> "regionserver/11.149.48.227:60020.logRoller" #667 daemon prio=5 os_prio=0 cpu=116916.81ms elapsed=447455.26s tid=0x00007fa35d231000 nid=0xbdd2 waiting on condition [0x00007f79c7407000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f9f10df5158> (a java.util.concurrent.CompletableFuture$Signaller)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
>         at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>         at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
>         at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>         at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.write(AsyncProtobufLogWriter.java:178)
>         at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.writeMagicAndWALHeader(AsyncProtobufLogWriter.java:191)
>         at org.apache.hadoop.hbase.regionserver.wal.AbstractProtobufLogWriter.init(AbstractProtobufLogWriter.java:170)
>         at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createAsyncWriter(AsyncFSWALProvider.java:113)
>         at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:615)
>         at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.createWriterInstance(AsyncFSWAL.java:126)
>         at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.rollWriter(AbstractFSWAL.java:763)
>         at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:184)
>         at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)