Posted to issues@hbase.apache.org by "Yu Li (JIRA)" <ji...@apache.org> on 2016/01/12 11:46:39 UTC

[jira] [Updated] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

     [ https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-14457:
--------------------------
    Attachment: Action in Multiple WAL.pdf

Here comes the doc. Sorry for the lag, but I hope it's worth the wait. :-)

I'd like to highlight the testing results:
* PerformanceEvaluation testing on pure SATA disks shows a ~20% improvement in write performance with 4 WALs per regionserver
* Monitoring data from our online production cluster (800+ nodes) shows a ~40% improvement in mutate latency under mixed workloads
* hsync writes with 4 WALs on PCIe SSD show promising throughput (20k per server) and latency (5.5ms on average)

Refer to the doc for more details; it also covers the design and usage of multiple WAL.
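
For quick orientation, here is a minimal sketch of an hbase-site.xml fragment that enables multiple WAL with 4 WALs per regionserver. The property names are the ones read by RegionGroupingProvider. The "bounded" strategy is an assumption made for illustration; the exact settings used in the tests above are described in the attached doc, not here.

```xml
<!-- Sketch only: enable the multiwal provider on each regionserver. -->
<property>
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>
<!-- Assumption: bounded grouping, which spreads regions over a fixed number of WALs. -->
<property>
  <name>hbase.wal.regiongrouping.strategy</name>
  <value>bounded</value>
</property>
<!-- 4 WAL groups, matching the "4 WALs per regionserver" figure quoted above. -->
<property>
  <name>hbase.wal.regiongrouping.numgroups</name>
  <value>4</value>
</property>
```

Regionservers need a restart to pick up a change of WAL provider.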

Feel free to let me know if you have any comments/questions. Thanks.

> Umbrella: Improve Multiple WAL for production usage
> ---------------------------------------------------
>
>                 Key: HBASE-14457
>                 URL: https://issues.apache.org/jira/browse/HBASE-14457
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: Action in Multiple WAL.pdf
>
>
> HBASE-5699 proposed running with multiple WALs in a regionserver and did great initial work there, but when trying to use it in our production cluster we still found several issues to resolve, such as tracking multiple WAL paths in replication (HBASE-6617), fixing UTs with the multiwal provider (HBASE-14411), introducing a namespace-based strategy for RegionGroupingProvider (HBASE-14456), etc. This is an umbrella including (but not limited to) all these works and efforts to make multiple WAL ready for production usage and give users a clear picture of it.
> Besides the development work done, I'd also like to share some scenarios and testing/online data in this JIRA about our usage and performance of multiple WAL, to (hopefully) help people better judge whether to enable multiple WAL in their own clusters and what they might gain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)