Posted to commits@hudi.apache.org by "kazdy (Jira)" <ji...@apache.org> on 2023/02/25 20:41:00 UTC

[jira] [Updated] (HUDI-5848) No PreCombineField mode - make COMBINE_BEFORE_UPSERT=false automatically

     [ https://issues.apache.org/jira/browse/HUDI-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kazdy updated HUDI-5848:
------------------------
    Summary: No PreCombineField mode - make COMBINE_BEFORE_UPSERT=false automatically  (was: If no precombine field is provided make COMBINE_BEFORE_UPSERT=false automatically)

> No PreCombineField mode - make COMBINE_BEFORE_UPSERT=false automatically
> ------------------------------------------------------------------------
>
>                 Key: HUDI-5848
>                 URL: https://issues.apache.org/jira/browse/HUDI-5848
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: dev-experience
>            Reporter: kazdy
>            Assignee: kazdy
>            Priority: Minor
>             Fix For: 0.13.1
>
>
> Starting from 0.13, the precombine field is optional in Spark.
> Previously this was only available in Flink. In Flink, COMBINE_BEFORE_UPSERT is set to false by default, so if no precombine field is provided, upserts can be done without any configuration changes.
> In Hudi + Spark, on the other hand, users must first explicitly set the COMBINE_BEFORE_UPSERT option to false in order to do upserts in the absence of a precombine field.
> As a Hudi user, if no precombine field is provided, I would like Hudi to automatically set COMBINE_BEFORE_UPSERT to the appropriate value, to provide a seamless experience.
> I assume the precombine field can be optional only if the table type is CoW. For MoR, a precombine field is required for the table to work properly, so it's OK to throw an error when the operation is upsert and no precombine field is set.
> Therefore this should work only for CoW.
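
The requested behavior could be sketched roughly as follows. This is only an illustration in plain Python, not Hudi's actual implementation: the function name resolve_combine_before_upsert is hypothetical, the option keys are the Spark datasource keys as I understand them, and the fallback values (COPY_ON_WRITE, upsert) are assumptions about the defaults.

```python
# Illustrative sketch only; Hudi's real option handling lives in its Spark
# writer code and may differ. The option keys below are assumed to match the
# Hudi Spark datasource configs.
PRECOMBINE_KEY = "hoodie.datasource.write.precombine.field"
COMBINE_BEFORE_UPSERT_KEY = "hoodie.combine.before.upsert"
TABLE_TYPE_KEY = "hoodie.datasource.write.table.type"
OPERATION_KEY = "hoodie.datasource.write.operation"


def resolve_combine_before_upsert(options):
    """Return a copy of `options` with combine-before-upsert resolved.

    Hypothetical helper: when no precombine field is given, a CoW upsert
    gets COMBINE_BEFORE_UPSERT=false, and a MoR upsert raises an error.
    """
    resolved = dict(options)  # never mutate the caller's options

    # A precombine field is present: leave everything as the user set it.
    if resolved.get(PRECOMBINE_KEY, "").strip():
        return resolved

    # Assumed defaults when the user did not specify these options.
    operation = resolved.get(OPERATION_KEY, "upsert")
    table_type = resolved.get(TABLE_TYPE_KEY, "COPY_ON_WRITE")

    # Only upserts are affected by the combine-before-upsert setting.
    if operation != "upsert":
        return resolved

    if table_type == "MERGE_ON_READ":
        raise ValueError(
            "A precombine field is required for upserts into MERGE_ON_READ tables"
        )

    # CoW upsert without a precombine field: disable pre-combining, but keep
    # any value the user set explicitly.
    resolved.setdefault(COMBINE_BEFORE_UPSERT_KEY, "false")
    return resolved
```

With this sketch, a CoW upsert with no precombine field comes back with hoodie.combine.before.upsert set to "false", which is the setting Spark users currently have to add by hand. Using setdefault is a design choice made here so that an explicit user setting is never overridden; the issue itself does not say how that case should be handled.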



--
This message was sent by Atlassian Jira
(v8.20.10#820010)