Posted to commits@hudi.apache.org by "Jian Feng (Jira)" <ji...@apache.org> on 2021/09/11 06:36:00 UTC

[jira] [Assigned] (HUDI-2414) enable Hot and cold data separate when ingest data

     [ https://issues.apache.org/jira/browse/HUDI-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian Feng reassigned HUDI-2414:
-------------------------------

    Assignee: Jian Feng

> enable Hot and cold data separate when ingest data
> --------------------------------------------------
>
>                 Key: HUDI-2414
>                 URL: https://issues.apache.org/jira/browse/HUDI-2414
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Writer Core
>            Reporter: Jian Feng
>            Assignee: Jian Feng
>            Priority: Major
>
> When using Hudi to ingest an e-commerce company's item data, a large volume of updates lands in old partitions. If a single record needs to be updated, the whole file it belongs to must be rewritten, so nearly every commit ends up rewriting almost the whole table.
> I'm wondering whether Hudi could provide a tool for separating hot and cold data. It would use specific columns (such as create time and update time) to distinguish hot data from cold data, then rebuild the table so that the two are placed in different file groups. After the table is rebuilt, performance should be much better.
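
To make the idea above concrete, here is a minimal sketch in plain Python. It does not use any Hudi API: the record layout, the 7-day `HOT_WINDOW` threshold, and the helper names `split_hot_cold`, `assign_file_groups`, and `groups_touched` are all hypothetical and exist only for illustration. It also assumes that future updates mostly hit records that were updated recently, which is the premise of the proposal and not something the issue demonstrates.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: records updated within this window count as "hot".
HOT_WINDOW = timedelta(days=7)


def split_hot_cold(records, now, hot_window=HOT_WINDOW):
    """Split records into (hot, cold) lists based on their update_time."""
    hot, cold = [], []
    for rec in records:
        if now - rec["update_time"] <= hot_window:
            hot.append(rec)
        else:
            cold.append(rec)
    return hot, cold


def assign_file_groups(records, prefix, group_size):
    """Pack records into fixed-size groups and return {group_id: [keys]}."""
    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(f"{prefix}-{i // group_size}", []).append(rec["key"])
    return groups


def groups_touched(groups, updated_keys):
    """Return the ids of groups that contain at least one updated key."""
    updated = set(updated_keys)
    return {gid for gid, keys in groups.items() if updated & set(keys)}


# Toy data: even-numbered records were updated yesterday, odd ones 90 days ago.
now = datetime(2021, 9, 11)
records = [
    {"key": f"k{i}", "update_time": now - timedelta(days=1 if i % 2 == 0 else 90)}
    for i in range(8)
]
updated_keys = ["k0", "k2", "k4", "k6"]  # the next batch updates only hot records

# Layout 1: hot and cold records interleaved in the same groups.
mixed = assign_file_groups(records, "mixed", 2)

# Layout 2: hot and cold records packed into separate groups.
hot, cold = split_hot_cold(records, now)
separated = {
    **assign_file_groups(hot, "hot", 2),
    **assign_file_groups(cold, "cold", 2),
}

print(len(groups_touched(mixed, updated_keys)), "of", len(mixed))          # 4 of 4
print(len(groups_touched(separated, updated_keys)), "of", len(separated))  # 2 of 4
```

In this toy model the interleaved layout forces every group to be rewritten, while the separated layout confines the rewrite to the two hot groups. How closely this carries over to a real table depends on how well the chosen column predicts which records will be updated next.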



--
This message was sent by Atlassian Jira
(v8.3.4#803005)