Posted to commits@hudi.apache.org by "kazdy (Jira)" <ji...@apache.org> on 2023/03/10 10:59:00 UTC

[jira] [Commented] (HUDI-5824) COMBINE_BEFORE_UPSERT=false option does not work for upsert

    [ https://issues.apache.org/jira/browse/HUDI-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17698867#comment-17698867 ] 

kazdy commented on HUDI-5824:
-----------------------------

[~xushiyan] I see that you marked this as critical, but after discussing it with Danny in the PR I am not sure whether this should be considered a bug. Could you take a look and maybe come to an agreement with Danny on what to do with this?
IMO this is a bug, since a public, user-facing config does not work, but it seems Danny would rather keep the behavior where upsert always precombines. If so, maybe this config should be made internal or marked as deprecated?

> COMBINE_BEFORE_UPSERT=false option does not work for upsert
> -----------------------------------------------------------
>
>                 Key: HUDI-5824
>                 URL: https://issues.apache.org/jira/browse/HUDI-5824
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: spark
>    Affects Versions: 0.12.1, 0.12.2, 0.13.0
>            Reporter: kazdy
>            Assignee: kazdy
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.13.1, 0.12.3
>
>
> hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
> shouldCombine does not take into account the situation where the write operation is UPSERT but COMBINE_BEFORE_UPSERT is false.
> Currently, Hudi always combines records on UPSERT, and option COMBINE_BEFORE_UPSERT is not honored.
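
To make the reported mismatch concrete, here is a minimal sketch in Python that models the decision described above. It is not the code from HoodieSparkSqlWriter.scala: the function names are hypothetical, the option is assumed to arrive as a string under the key hoodie.combine.before.upsert with a default of "true", and operations other than upsert are deliberately left out of the model.

```python
# Hypothetical model of the combine decision described in this issue.
# This is NOT the actual Hudi implementation; names are illustrative only.

# Assumed config key and default for COMBINE_BEFORE_UPSERT.
COMBINE_BEFORE_UPSERT_KEY = "hoodie.combine.before.upsert"
COMBINE_BEFORE_UPSERT_DEFAULT = "true"

UPSERT = "upsert"


def should_combine_reported(operation: str, options: dict) -> bool:
    """Behavior as reported: an upsert always combines, the option is ignored."""
    return operation == UPSERT


def should_combine_expected(operation: str, options: dict) -> bool:
    """Behavior the reporter expects: an upsert combines only if the option allows it."""
    if operation != UPSERT:
        # Other operations are out of scope for this sketch.
        return False
    raw = options.get(COMBINE_BEFORE_UPSERT_KEY, COMBINE_BEFORE_UPSERT_DEFAULT)
    return raw.strip().lower() == "true"


if __name__ == "__main__":
    opts = {COMBINE_BEFORE_UPSERT_KEY: "false"}
    print("reported:", should_combine_reported(UPSERT, opts))
    print("expected:", should_combine_expected(UPSERT, opts))
```

With the option set to "false", the two functions disagree for an upsert, which is the discrepancy this issue describes. With the option unset or "true", they agree.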


