Posted to issues@hbase.apache.org by "Michael Stack (Jira)" <ji...@apache.org> on 2020/01/24 05:27:00 UTC

[jira] [Commented] (HBASE-23730) Optimize Memstore Flush for HBase on S3 (Object Store)

    [ https://issues.apache.org/jira/browse/HBASE-23730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022700#comment-17022700 ] 

Michael Stack commented on HBASE-23730:
---------------------------------------

What if there's a crash during flush? On reopen of the region, how do we know the file is only half-done rather than just corrupt? Will WAL replay clean up half-done files?

Otherwise, sounds good. Make the implementation so we can use it against HDFS too -- skipping the .tmp file and rename. Thanks.

> Optimize Memstore Flush for HBase on S3 (Object Store)
> ------------------------------------------------------
>
>                 Key: HBASE-23730
>                 URL: https://issues.apache.org/jira/browse/HBASE-23730
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Jarred Li
>            Priority: Major
>
> The current memstore flush process is divided into 2 stages:
>  # Flushcache: In this stage, a “.tmp” region file is written to S3/HDFS for the memstore;
>  # Commit: In this stage, the “.tmp” file created in stage 1 is renamed to the final destination of the HBase region file.
> The above design (flush and commit) is fine for HDFS, because “rename” is a lightweight operation (metadata-only). However, for storage like S3 or other object stores, a rename is a “copy” plus “delete” operation.
> We can follow the same pattern as V2 of “FileOutputCommitter” in MapReduce. That means we can write the hfile directly to the S3 destination directory without the “copy” and “delete”. With fewer S3 operations, the HBase memstore flush becomes more efficient.
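The two flush strategies described above can be sketched as follows. This is a minimal illustration using the local filesystem as a stand-in for HDFS/S3; the class and method names (FlushSketch, flushViaTmp, flushDirect) are hypothetical and not HBase's actual API:

```java
import java.io.IOException;
import java.nio.file.*;

// Hypothetical sketch of the two flush strategies, using the local
// filesystem as a stand-in for HDFS/S3. Not HBase's actual code.
public class FlushSketch {

    // Current two-stage flush: write to a ".tmp" file, then rename (commit).
    // The rename is cheap on HDFS (metadata-only), but on S3 it becomes a
    // copy plus a delete.
    static Path flushViaTmp(Path regionDir, String hfileName, byte[] data) throws IOException {
        Path tmp = regionDir.resolve(".tmp").resolve(hfileName);
        Files.createDirectories(tmp.getParent());
        Files.write(tmp, data);                                    // stage 1: flushcache
        Path dest = regionDir.resolve(hfileName);
        return Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING); // stage 2: commit
    }

    // Proposed direct flush (FileOutputCommitter-v2 style): write the hfile
    // straight to its final destination, skipping the rename entirely.
    static Path flushDirect(Path regionDir, String hfileName, byte[] data) throws IOException {
        Files.createDirectories(regionDir);
        Path dest = regionDir.resolve(hfileName);
        Files.write(dest, data);
        return dest;
    }

    public static void main(String[] args) throws IOException {
        Path region = Files.createTempDirectory("region");
        Path a = flushViaTmp(region, "hfile-1", new byte[]{1, 2, 3});
        Path b = flushDirect(region, "hfile-2", new byte[]{4, 5, 6});
        System.out.println(Files.exists(a) + " " + Files.exists(b)); // true true
    }
}
```

The direct-write variant trades the rename away, which is exactly why the crash question above matters: without the commit step, a reader must be able to tell a completed hfile from a partially written one.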



--
This message was sent by Atlassian Jira
(v8.3.4#803005)