Posted to commits@hudi.apache.org by "Udit Mehrotra (Jira)" <ji...@apache.org> on 2021/08/25 09:22:00 UTC

[jira] [Updated] (HUDI-1753) Assigns the buckets by record key for Flink writer

     [ https://issues.apache.org/jira/browse/HUDI-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Udit Mehrotra updated HUDI-1753:
--------------------------------
    Fix Version/s:     (was: 0.9.0)
                   0.10.0

> Assigns the buckets by record key for Flink writer
> --------------------------------------------------
>
>                 Key: HUDI-1753
>                 URL: https://issues.apache.org/jira/browse/HUDI-1753
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Flink Integration
>            Reporter: Danny Chen
>            Priority: Major
>             Fix For: 0.10.0
>
>
> Currently we assign the buckets by the record's partition path, which can cause hotspots when the partition field is a datetime type. Instead, we can decide the buckets by first grouping records by their record keys; the assignment is valid only if there is no conflict (two tasks writing to the same bucket).
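
[Editor's note] A minimal sketch of the idea follows, in Java. It is not the actual Hudi or Flink writer code; the class, method, and parameter names (e.g. numBuckets) are illustrative only. It contrasts hashing the partition path with hashing the record key when picking a bucket, showing why a hot datetime partition pins every record to a single bucket under the former scheme while the latter spreads them out.

    // Illustrative sketch only (not Hudi's implementation): compare bucket
    // assignment by partition path vs. by record key.
    public final class BucketAssignSketch {

      // Partition-path based assignment: every record of a hot partition
      // lands in the same bucket, creating a write hotspot.
      static int bucketByPartitionPath(String partitionPath, int numBuckets) {
        return Math.floorMod(partitionPath.hashCode(), numBuckets);
      }

      // Record-key based assignment: records spread across buckets
      // regardless of which partition they belong to.
      static int bucketByRecordKey(String recordKey, int numBuckets) {
        return Math.floorMod(recordKey.hashCode(), numBuckets);
      }

      public static void main(String[] args) {
        int numBuckets = 4;
        String partitionPath = "2021/08/25"; // a hot datetime partition
        String[] recordKeys = {"id_1", "id_2", "id_3", "id_4"};

        for (String key : recordKeys) {
          System.out.printf("key=%s partitionBucket=%d keyBucket=%d%n",
              key,
              bucketByPartitionPath(partitionPath, numBuckets),
              bucketByRecordKey(key, numBuckets));
        }
      }
    }

Running the sketch prints the same partitionBucket for every record but varying keyBucket values, which is the load-spreading effect the issue describes; the remaining concern noted above is ensuring two tasks never write to the same bucket.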



--
This message was sent by Atlassian Jira
(v8.3.4#803005)