Posted to commits@hudi.apache.org by "sherhomhuang (Jira)" <ji...@apache.org> on 2022/06/19 12:48:00 UTC
[jira] [Closed] (HUDI-4280) Support more parallelisms in flink when writing data to less bucket num but more than one partition path.
[ https://issues.apache.org/jira/browse/HUDI-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sherhomhuang closed HUDI-4280.
------------------------------
Resolution: Fixed
This was addressed in HUDI-4101.
> Support more parallelisms in flink when writing data to less bucket num but more than one partition path.
> --------------------------------------------------------------------------------------------------------
>
> Key: HUDI-4280
> URL: https://issues.apache.org/jira/browse/HUDI-4280
> Project: Apache Hudi
> Issue Type: Improvement
> Components: flink
> Reporter: sherhomhuang
> Assignee: sherhomhuang
> Priority: Major
> Fix For: 0.12.0
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> Support higher write parallelism in Flink when a table has a small bucket number but more than one partition path.
> *Existing shortcoming:*
> Suppose a table is configured with only _*N*_ buckets but holds a large amount of historical data across _*M*_ partition paths (_*M >> N*_). When importing the historical data, write speed is limited, because the parallelism cannot be set greater than _*N*_ under the algorithm in class _BucketIndexPartitioner_.
> *Improvement:*
> Optimize the partitioner's method to support a parallelism of up to _*M * N*_ when importing into a table with _*N*_ buckets.
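The idea in the description can be illustrated with a minimal Java sketch. This is not the code from HUDI-4280 or HUDI-4101, and it does not use Hudi's BucketIndexPartitioner; the class name, method names, and the exact hashing scheme are hypothetical and chosen only for illustration. It assumes that bucket-only routing picks a subtask from the bucket id alone, and that a partition-aware variant offsets the bucket id by a hash of the partition path.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and hashing scheme are assumptions, not Hudi's API.
final class BucketPartitionSketch {

    // Bucket-only routing: the target subtask depends on the bucket id alone,
    // so at most numBuckets distinct subtasks ever receive records.
    static int bucketOnly(int bucketId, int parallelism) {
        return bucketId % parallelism;
    }

    // Partition-aware routing: offset the bucket id by a non-negative hash of
    // the partition path, so equal bucket ids in different partition paths
    // can land on different subtasks.
    static int partitionAware(String partitionPath, int bucketId, int parallelism) {
        int partitionOffset = (partitionPath.hashCode() & Integer.MAX_VALUE) % parallelism;
        return (partitionOffset + bucketId) % parallelism;
    }

    // Counts how many distinct subtasks receive data under bucket-only routing.
    static int distinctBucketOnly(int numBuckets, int parallelism) {
        Set<Integer> used = new HashSet<>();
        for (int b = 0; b < numBuckets; b++) {
            used.add(bucketOnly(b, parallelism));
        }
        return used.size();
    }

    // Counts how many distinct subtasks receive data under partition-aware routing.
    static int distinctPartitionAware(List<String> partitionPaths, int numBuckets, int parallelism) {
        Set<Integer> used = new HashSet<>();
        for (String path : partitionPaths) {
            for (int b = 0; b < numBuckets; b++) {
                used.add(partitionAware(path, b, parallelism));
            }
        }
        return used.size();
    }

    public static void main(String[] args) {
        // N = 4 buckets, M = 8 partition paths, parallelism = 16.
        List<String> paths = List.of("a", "b", "c", "d", "e", "f", "g", "h");
        System.out.println("bucket-only subtasks used: " + distinctBucketOnly(4, 16));
        System.out.println("partition-aware subtasks used: " + distinctPartitionAware(paths, 4, 16));
    }
}
```

In this sketch every record with the same partition path and bucket id still goes to the same subtask, which is the property a bucket writer needs. How many subtasks are actually used depends on how the partition path hashes spread, so M * N is an upper bound rather than a guarantee.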
--
This message was sent by Atlassian Jira
(v8.20.7#820007)