Posted to dev@hbase.apache.org by "Jingyun Tian (JIRA)" <ji...@apache.org> on 2017/11/28 06:09:00 UTC
[jira] [Created] (HBASE-19358) Improve the stability of splitting log when do fail over
Jingyun Tian created HBASE-19358:
------------------------------------
Summary: Improve the stability of splitting log when do fail over
Key: HBASE-19358
URL: https://issues.apache.org/jira/browse/HBASE-19358
Project: HBase
Issue Type: Improvement
Components: MTTR
Affects Versions: 0.98.24
Reporter: Jingyun Tian
Currently, log splitting works as shown in the following figure:
!previous-logic.png|thumbnail!
The problem is that the OutputSink writes the recovered edits while the log is being split, which means it creates one WriterAndPath for each region. If the cluster is small and the number of regions per region server is large, too many HDFS streams are created at the same time. Splitting is then prone to failure, because each DataNode has to handle too many streams.
I have therefore come up with a new way to split the log.
!attachment-name.jpg|thumbnail!
We cache the recovered edits until they exceed the memory limit we set or we reach the end of the log. A thread pool then does the rest: it writes the edits to files and moves the files to their destination.
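The buffering scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the class BufferedSplitSketch, its methods, and the in-memory "write" are all hypothetical stand-ins, and the real change would go through HBase's WAL splitting code and HDFS writers. The counters exist only to show that the number of concurrently open streams stays within the writer pool size.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of "buffer first, write with a bounded pool". Not HBase API. */
class BufferedSplitSketch {
    private final Map<String, List<byte[]>> buffered = new HashMap<>();
    private final List<Future<?>> pending = new ArrayList<>();
    private final ExecutorService writerPool;
    private final long memoryLimitBytes;
    private long bufferedBytes = 0;

    // Instrumentation for the sketch only.
    private final AtomicInteger openStreams = new AtomicInteger();
    private final AtomicInteger peakOpenStreams = new AtomicInteger();
    private final AtomicInteger editsWritten = new AtomicInteger();

    BufferedSplitSketch(long memoryLimitBytes, int writerThreads) {
        this.memoryLimitBytes = memoryLimitBytes;
        this.writerPool = Executors.newFixedThreadPool(writerThreads);
    }

    /** Called by the single thread that reads the log: cache the edit per region. */
    void append(String region, byte[] edit) {
        buffered.computeIfAbsent(region, r -> new ArrayList<>()).add(edit);
        bufferedBytes += edit.length;
        if (bufferedBytes >= memoryLimitBytes) {
            flush(); // memory limit reached: hand the cached edits to the pool
        }
    }

    /** Submit one write task per buffered region; the pool size caps concurrency. */
    private void flush() {
        for (Map.Entry<String, List<byte[]>> e : buffered.entrySet()) {
            final String region = e.getKey();
            final List<byte[]> edits = e.getValue();
            pending.add(writerPool.submit(() -> writeRegionFile(region, edits)));
        }
        buffered.clear();
        bufferedBytes = 0;
        // A real implementation would also need back-pressure here, since
        // queued tasks still hold their edits in memory.
    }

    /** Stand-in for: open a stream, write the edits, close, move to destination. */
    private void writeRegionFile(String region, List<byte[]> edits) {
        int open = openStreams.incrementAndGet();
        peakOpenStreams.accumulateAndGet(open, Math::max);
        try {
            editsWritten.addAndGet(edits.size());
        } finally {
            openStreams.decrementAndGet();
        }
    }

    /** End of log reached: flush the remainder and wait for all writers. */
    void finish() {
        flush();
        try {
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException ex) {
            throw new RuntimeException(ex);
        } finally {
            writerPool.shutdown();
        }
    }

    int peakOpenStreams() { return peakOpenStreams.get(); }
    int editsWritten() { return editsWritten.get(); }
}
```

In this sketch, a splitter created with 3 writer threads never has more than 3 simulated streams open at once, no matter how many regions appear in the log.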
The biggest benefit is that we can control the number of streams created during log splitting: it will not exceed hbase.regionserver.wal.max.splitters * hbase.regionserver.hlog.splitlog.writer.threads, whereas previously it was hbase.regionserver.wal.max.splitters * the number of regions the hlog contains.
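To make the two bounds concrete, here is a small worked calculation. The helper class and the numbers used with it (2 splitters, 3 writer threads, 500 regions in the hlog) are illustrative assumptions, not measurements from a real cluster.

```java
/** Hypothetical helper comparing the two per-region-server stream bounds. */
class StreamBoundSketch {
    /** Previous logic: one stream per region contained in each hlog being split. */
    static long oldBound(int maxSplitters, int regionsInHlog) {
        return (long) maxSplitters * regionsInHlog;
    }

    /** Proposed logic: streams are capped by the writer thread pool of each splitter. */
    static long newBound(int maxSplitters, int writerThreads) {
        return (long) maxSplitters * writerThreads;
    }
}
```

With the assumed values, the previous logic allows up to 2 * 500 = 1000 concurrent streams on a region server, while the proposed logic allows at most 2 * 3 = 6.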
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)