Posted to issues@hbase.apache.org by "Tak-Lon (Stephen) Wu (Jira)" <ji...@apache.org> on 2020/08/04 21:00:05 UTC

[jira] [Comment Edited] (HBASE-24749) Direct insert HFiles and Persist in-memory HFile tracking

    [ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171116#comment-17171116 ] 

Tak-Lon (Stephen) Wu edited comment on HBASE-24749 at 8/4/20, 8:59 PM:
-----------------------------------------------------------------------

Sorry for the delay, I was out for a few days last week.
{quote}every flush and compaction will result in an update inline w/ the flush/compaction completion – if it fails, the flush/compaction fail?
{quote}
If updating the HFile set in {{hbase:meta}} fails, it should be considered a flush/compaction failure when this feature is enabled. Do you have concerns about blocking the actual flush from completing? (It should be similar to other features like {{hbase:quota}}.)
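To make the intended semantics concrete, here is a minimal sketch (all names are hypothetical and illustrative, not actual HBase classes): the flush commit succeeds only if the HFile-set update is persisted, and a tracking failure fails the whole flush.

```java
import java.io.IOException;
import java.util.List;

class FlushCommitSketch {
    // Hypothetical tracker that persists the HFile set (to hbase:meta in the
    // proposal); it throws when the update cannot be persisted.
    interface HFileSetTracker {
        void updateHFileSet(String region, List<String> hfiles) throws IOException;
    }

    static boolean commitFlush(HFileSetTracker tracker, String region,
                               List<String> newFiles) {
        try {
            tracker.updateHFileSet(region, newFiles);
            return true;  // flush completes only after tracking succeeds
        } catch (IOException e) {
            // tracking update failed => treat the whole flush as failed
            return false;
        }
    }

    public static void main(String[] args) {
        HFileSetTracker ok = (r, f) -> { /* persisted */ };
        HFileSetTracker failing = (r, f) -> { throw new IOException("meta unavailable"); };
        System.out.println(commitFlush(ok, "r1", List.of("hfile-a")));
        System.out.println(commitFlush(failing, "r1", List.of("hfile-b")));
    }
}
```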
{quote}Master would update, or RS writes meta, a violation of a simplification we made trying to ensure one-writer
{quote}
Ensuring one writer to {{hbase:meta}} is a good point, and we haven't considered the one-writer scenario yet. I'm not sure of the right approach, but since flushes happen on the RS side, either the RS creates a direct connection to {{hbase:meta}} that is limited to writing only this column family outside of the Master (suggested by [~zyork], pending investigation), or, as you suggested, we package the HFile set information into an RPC call to the Master and the Master updates the HFile set. The amount of traffic (direct table connection or RPC call) should be the same; I still need to compare whether the overhead (throughput) differs.
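As a rough illustration of the two options (purely schematic, no real HBase APIs; the proposed {{hbase:meta}} column family is modeled as an in-memory map), both paths carry one identical update per flush/compaction, and the only difference is who holds the write path to meta:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class HFileSetPublishSketch {
    // Stand-in for the proposed hfile-set column family in hbase:meta:
    // region row key -> list of committed HFile names.
    static final Map<String, List<String>> metaHFileSetCf = new ConcurrentHashMap<>();

    interface HFileSetPublisher {
        void publish(String regionRow, List<String> hfiles);
    }

    // Option 1: the RS writes meta directly, restricted to this one column
    // family (breaks the one-writer simplification).
    static final HFileSetPublisher directFromRs = metaHFileSetCf::put;

    // Option 2: the RS ships the HFile set to the Master over RPC, and the
    // Master remains the single writer to meta.
    static final HFileSetPublisher viaMasterRpc = (row, files) -> {
        // ...simulated RPC hop to the Master...
        metaHFileSetCf.put(row, files);  // Master performs the identical update
    };
}
```

Either way the payload and update count are the same; the open question is what the extra RPC hop does to flush/compaction throughput.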

In addition, I will try to come up with a set of sub-tasks and update the proposal doc in the coming week. Please bear with me; the plan may include some transition tasks (the goal is to deliver in stages), e.g.: 1) build the separate system table first, then follow up with tasks to 2) compare migrating into {{hbase:meta}} and 3) actually merge into {{hbase:meta}} (as a throughput sanity check).



> Direct insert HFiles and Persist in-memory HFile tracking
> ---------------------------------------------------------
>
>                 Key: HBASE-24749
>                 URL: https://issues.apache.org/jira/browse/HBASE-24749
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Compaction, HFile
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Tak-Lon (Stephen) Wu
>            Assignee: Tak-Lon (Stephen) Wu
>            Priority: Major
>              Labels: design, discussion, objectstore, storeFile, storeengine
>         Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) to remove the {{.tmp}} directory used in the commit stage for common HFile operations such as flush and compaction, improving write throughput and latency on object stores. Specifically for S3 filesystems, this will also mitigate read-after-write inconsistencies caused by validating HFiles immediately after moving them to the data directory.
> Please see the attachments for this proposal and the initial results captured with 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and RUN, and workload C RUN.
> The goal of this JIRA is to discuss with the community whether the proposed improvement for the object store use case makes sense and whether we have missed anything that should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction 
>  3. Lower MTTR on region (re)open or assignment 
>  4. Remove the consistency-check dependency (e.g. DynamoDB) provided by the file system implementation
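As a hedged sketch of the commit-path difference the description refers to (a local filesystem stands in for the object store; file names and helpers are illustrative, not HBase code), the classic flow stages the HFile under {{.tmp}} and renames it into the data directory, while the proposed flow writes directly into the data directory and records the file in a tracked HFile set:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Set;

class DirectInsertSketch {
    // Classic commit: write under .tmp, then move into the data directory.
    // On S3-like stores the move is really a copy, and an immediate existence
    // check afterwards is where read-after-write inconsistency can bite.
    static Path flushWithRename(Path storeDir, byte[] data) throws IOException {
        Path tmpDir = storeDir.resolve(".tmp");
        Files.createDirectories(tmpDir);
        Path staged = Files.write(tmpDir.resolve("hfile-1"), data);
        return Files.move(staged, storeDir.resolve("hfile-1"));
    }

    // Proposed commit: write directly into the data directory; the tracked
    // HFile set (persisted to hbase:meta in the proposal) is the source of
    // truth, so no rename or post-move validation is needed.
    static Path flushDirect(Path storeDir, byte[] data, Set<String> hfileSet)
            throws IOException {
        Files.createDirectories(storeDir);
        Path committed = Files.write(storeDir.resolve("hfile-2"), data);
        hfileSet.add(committed.getFileName().toString());
        return committed;
    }
}
```

Dropping the rename removes one object-store round trip per committed HFile, which is where the latency and p99 gains in the attached results come from.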



--
This message was sent by Atlassian Jira
(v8.3.4#803005)