Posted to issues@kylin.apache.org by "fengpod (Jira)" <ji...@apache.org> on 2020/12/02 06:37:01 UTC
[jira] [Created] (KYLIN-4833) use distcp to control the speed of
writing hfile data to hbase cluster
fengpod created KYLIN-4833:
------------------------------
Summary: use distcp to control the speed of writing hfile data to hbase cluster
Key: KYLIN-4833
URL: https://issues.apache.org/jira/browse/KYLIN-4833
Project: Kylin
Issue Type: Improvement
Components: Storage - HBase
Affects Versions: v3.1.1
Reporter: fengpod
When a large amount of data is written to the HBase cluster at once, the cluster load becomes very high, which degrades query performance. This PR allows the data to be written to Hadoop HDFS during the "Convert Cuboid Data to HFile" step; the HFiles are then transferred to the HBase cluster by DistCp. DistCp throttles the write speed, which reduces the pressure on the cluster. This PR adds a new step, "HFile Distcp To HBase", between "Convert Cuboid Data to HFile" and "Load HFile to HBase Table". It looks like this:
!https://user-images.githubusercontent.com/4843586/100835711-013fae00-34a9-11eb-8de8-e69228ba0991.png!
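To illustrate how DistCp can limit the write speed, here is a minimal sketch that assembles a throttled DistCp command line. It is not the code from this PR. It assumes the throttling relies on DistCp's standard -m (maximum number of map tasks) and -bandwidth (MB per second per map) options; the function names and the two HDFS paths are hypothetical and used for illustration only.

```python
from typing import List


def build_distcp_command(src: str, dst: str, max_maps: int,
                         bandwidth_mb_per_map: int) -> List[str]:
    """Build an argv list for a throttled `hadoop distcp` run.

    Hypothetical helper, not part of Kylin. It only assembles the
    command; it does not launch it.
    """
    if max_maps < 1 or bandwidth_mb_per_map < 1:
        raise ValueError("max_maps and bandwidth_mb_per_map must be >= 1")
    return [
        "hadoop", "distcp",
        "-m", str(max_maps),                      # cap on concurrent map tasks
        "-bandwidth", str(bandwidth_mb_per_map),  # MB/s allowed per map task
        src,
        dst,
    ]


def max_aggregate_mb_per_sec(max_maps: int, bandwidth_mb_per_map: int) -> int:
    """Approximate upper bound on total copy throughput in MB/s."""
    return max_maps * bandwidth_mb_per_map


if __name__ == "__main__":
    # Both paths are placeholders (assumptions), not values from the issue.
    cmd = build_distcp_command(
        "hdfs://compute-cluster/kylin/hfile/",
        "hdfs://hbase-cluster/kylin/hfile/",
        max_maps=10,
        bandwidth_mb_per_map=5,
    )
    print(" ".join(cmd))
    print("approx. max MB/s:", max_aggregate_mb_per_sec(10, 5))
```

Under these assumptions, the copy into the HBase cluster is bounded by roughly (number of maps) x (bandwidth per map), so lowering either value reduces the write pressure at the cost of a longer transfer.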
--
This message was sent by Atlassian Jira
(v8.3.4#803005)