Posted to commits@hudi.apache.org by "Alexey Kudinkin (Jira)" <ji...@apache.org> on 2022/02/02 21:32:00 UTC

[jira] [Updated] (HUDI-2763) Metadata table records key deduplication

     [ https://issues.apache.org/jira/browse/HUDI-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Kudinkin updated HUDI-2763:
----------------------------------
    Epic Link: HUDI-1292

> Metadata table records key deduplication
> ----------------------------------------
>
>                 Key: HUDI-2763
>                 URL: https://issues.apache.org/jira/browse/HUDI-2763
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: writer-core
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.11.0
>
>
> The metadata record payload has a key field that duplicates the record key. Because the metadata table is stored on disk in the HFile key-value format, the record key is already persisted as the key in each HFile data block. Dropping the redundant key field from the payload would therefore reduce the storage cost of the metadata table.
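
The idea in the description can be illustrated with a minimal sketch. This is not Hudi's implementation or API: the function names, the "key" field name, and the JSON encoding are all assumptions made for illustration, standing in for the real payload schema and HFile writer. The sketch drops the duplicated key field before the value is written and restores it from the stored key on read.

```python
import json

# Hypothetical name of the payload field that duplicates the record key.
KEY_FIELD = "key"


def encode_record(record_key: str, payload: dict) -> tuple[bytes, bytes]:
    """Build a (key, value) pair, omitting the redundant key field from the value."""
    if payload.get(KEY_FIELD) != record_key:
        raise ValueError("payload key field does not match the record key")
    # The key is already stored as the KV key, so keep it out of the value.
    value = {name: v for name, v in payload.items() if name != KEY_FIELD}
    return record_key.encode("utf-8"), json.dumps(value, sort_keys=True).encode("utf-8")


def decode_record(raw_key: bytes, raw_value: bytes) -> dict:
    """Rebuild the full payload by re-attaching the key taken from the KV key."""
    payload = json.loads(raw_value)
    payload[KEY_FIELD] = raw_key.decode("utf-8")
    return payload


if __name__ == "__main__":
    original = {"key": "2022/02/02", "type": 2, "files": ["a.parquet"]}
    k, v = encode_record(original["key"], original)
    restored = decode_record(k, v)
    print(restored == original)
```

Under these assumptions the reader sees the same payload as before, while each stored value is smaller by the size of the key field. The saving grows with key length and record count.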



--
This message was sent by Atlassian Jira
(v8.20.1#820001)