Posted to issues@hbase.apache.org by "Yechao Chen (JIRA)" <ji...@apache.org> on 2019/02/22 06:22:00 UTC
[jira] [Updated] (HBASE-21810) bulkload support set hfile compression on client
[ https://issues.apache.org/jira/browse/HBASE-21810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yechao Chen updated HBASE-21810:
--------------------------------
Description:
HBase bulkload (HFileOutputFormat2) generates HFiles using the compression configured on the table's column family (CF).
It would sometimes be useful to be able to set the compression on the client instead.
Some cases from our production:
1. When bulkloaded HFiles are replicated between data centers over a bandwidth-limited link, we could set the compression of the bulkload HFiles without changing the table compression.
2. When the table compression is gz/zstd/snappy..., generating the bulkload HFiles without compression reduces HFile creation time, and compaction will eventually rewrite them with the table's compression.
3. Sometimes the YARN nodes (where the reducers create the HFiles) or the doBulkLoad client have no compression library, but the HBase cluster does. A client-side setting is useful in this case.
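
The requested behavior amounts to a precedence rule: a compression value set on the client, when present, wins over the column family's compression. The following is only a minimal sketch of that rule in plain Java, not the code from the attached patches. The class name, the method name, and the configuration key `example.bulkload.hfile.compression` are all hypothetical and chosen for illustration. The maps stand in for the job configuration and the table's per-family settings.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch: choose the compression for bulkload HFiles, letting a client-side value win. */
class BulkloadCompressionSketch {

    // Hypothetical key for illustration only; not taken from the HBASE-21810 patches.
    static final String CLIENT_COMPRESSION_KEY = "example.bulkload.hfile.compression";

    /**
     * Returns the compression name to use for HFiles of the given column family.
     * A non-blank client-side value takes precedence; otherwise the table's
     * per-family setting applies, with "NONE" as the fallback.
     */
    static String resolveCompression(Map<String, String> clientConf,
                                     Map<String, String> tableFamilyCompression,
                                     String family) {
        String override = clientConf.get(CLIENT_COMPRESSION_KEY);
        if (override != null && !override.trim().isEmpty()) {
            return override.trim().toUpperCase(Locale.ROOT);
        }
        return tableFamilyCompression.getOrDefault(family, "NONE");
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("cf", "ZSTD"); // compression configured on the table's column family

        Map<String, String> client = new HashMap<>();
        // No client-side value: the column family's compression is used.
        System.out.println(resolveCompression(client, table, "cf")); // prints ZSTD

        // Case 2 from the description: write uncompressed HFiles and let
        // compaction apply the table's compression later.
        client.put(CLIENT_COMPRESSION_KEY, "none");
        System.out.println(resolveCompression(client, table, "cf")); // prints NONE
    }
}
```

Keeping the table setting as the default means existing bulkload jobs would behave as before unless the client opts in.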
> bulkload support set hfile compression on client
> --------------------------------------------------
>
> Key: HBASE-21810
> URL: https://issues.apache.org/jira/browse/HBASE-21810
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Affects Versions: 1.3.3, 1.4.9, 2.1.2, 1.2.10, 2.0.4
> Reporter: Yechao Chen
> Assignee: Yechao Chen
> Priority: Major
> Attachments: HBASE-21810.branch-1.001.patch, HBASE-21810.branch-1.2.001.patch, HBASE-21810.branch-2.001.patch, HBASE-21810.master.001.patch