Posted to issues@hbase.apache.org by "Sean Busbey (Jira)" <ji...@apache.org> on 2020/07/21 13:08:00 UTC
[jira] [Commented] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

    [ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162022#comment-17162022 ] 

Sean Busbey commented on HBASE-24749:
-------------------------------------

Excellent! It's worth a heads-up to dev@hbase pointing folks here, I think.

{quote}
For recovery from region server crashes or region reload, we persist the in-memory HFiles tracking in store file manager to a new HBase admin table, ‘hbase:storefile’. To prevent loading HFiles from incomplete flushes and compactions, and reduce the number of expensive LIST files calls against the file system, we will read directly from the hbase:storefile table. A write to the storefile table is used as the commit mechanism for a HFile, removing the rename from .tmp to the data directory.
{quote}
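
To make the quoted commit mechanism concrete, here is a minimal in-memory sketch of the idea as I read it: the HFile is written straight into the data directory, and it only becomes visible once a row for it lands in the tracking table. Everything here is an assumption for illustration. The class and method names are hypothetical, the map stands in for the proposed hbase:storefile table, and none of this is HBase API or the design's actual implementation.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names are HBase APIs.
final class StoreFileCommitSketch {

    // Stand-in for the proposed hbase:storefile table: store name -> committed HFile names.
    private final Map<String, Set<String>> trackingTable = new ConcurrentHashMap<>();

    // Stand-in for the store's data directory on the file system or object store.
    private final Set<String> dataDirectory = ConcurrentHashMap.newKeySet();

    // Step 1 of a flush or compaction: write the HFile directly into the data directory (no .tmp).
    void writeHFile(String fileName) {
        dataDirectory.add(fileName);
    }

    // Step 2: commit by recording the file in the tracking table.
    // Until this runs, loadStoreFiles() does not return the file.
    void commit(String store, String fileName) {
        if (!dataDirectory.contains(fileName)) {
            throw new IllegalStateException("cannot commit a file that was never written: " + fileName);
        }
        trackingTable.computeIfAbsent(store, s -> ConcurrentHashMap.newKeySet()).add(fileName);
    }

    // Region open: read the committed set from the table instead of listing the directory.
    Set<String> loadStoreFiles(String store) {
        return new TreeSet<>(trackingTable.getOrDefault(store, Collections.emptySet()));
    }

    public static void main(String[] args) {
        StoreFileCommitSketch sketch = new StoreFileCommitSketch();

        // A completed flush: written, then committed.
        sketch.writeHFile("hfile-1");
        sketch.commit("cf1", "hfile-1");

        // An incomplete flush: written, but the server "crashed" before the commit.
        sketch.writeHFile("hfile-2");

        // Only the committed file is visible on reload.
        System.out.println(sketch.loadStoreFiles("cf1")); // prints [hfile-1]
    }
}
```

In this sketch the leftover hfile-2 is simply ignored on reload. How such orphaned files get cleaned up is not covered here.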

Could this be a part of meta instead? We just recently got through moving {{hbase:namespace}} into meta to improve operational robustness, and this proposed storefile lookup seems very likely to be an even greater tripping point, since all the RS need access.

{quote}
To avoid a circular dependency on the storefile table, the store file manager for the meta and storefile tables will be persisted in ZooKeeper.
{quote}

No persistent state in ZooKeeper, please. We could do this via a local region controlled by whoever is handling meta. Or at least I think that feature would work for this; what do you think, [~zhangduo]?

> Direct insert HFiles and Persist in-memory HFile tracking
> ---------------------------------------------------------
>
>                 Key: HBASE-24749
>                 URL: https://issues.apache.org/jira/browse/HBASE-24749
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Compaction, HFile
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Tak-Lon (Stephen) Wu
>            Priority: Major
>              Labels: design, discussion, objectstore, storeFile, storeengine
>         Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) that removes the {{.tmp}} directory used in the commit stage of common HFile operations such as flush and compaction, to improve write throughput and latency on object stores. Specifically for S3 filesystems, this will also mitigate read-after-write inconsistencies caused by immediate HFile validation after moving the HFile(s) to the data directory.
> Please see the attached proposal and the initial results captured with 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, and workload C RUN.
> The goal of this JIRA is to discuss with the community whether the proposed improvement for the object store use case makes sense, and whether we have missed anything that should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove consistency check dependencies (e.g. DynamoDB) supported by file system imple



--
This message was sent by Atlassian Jira
(v8.3.4#803005)