Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2022/05/16 11:11:00 UTC

[jira] [Assigned] (HUDI-2839) Align configs across Spark datasource, write client, etc

     [ https://issues.apache.org/jira/browse/HUDI-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu reassigned HUDI-2839:
--------------------------------

    Assignee: Sagar Sumit

> Align configs across Spark datasource, write client, etc
> --------------------------------------------------------
>
>                 Key: HUDI-2839
>                 URL: https://issues.apache.org/jira/browse/HUDI-2839
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: configs, spark
>            Reporter: Ethan Guo
>            Assignee: Sagar Sumit
>            Priority: Critical
>             Fix For: 0.12.0
>
>
> This came up while discussing HUDI-2818. For the same logic (key generator, compaction, clustering, etc.), the Spark datasource and the write client use different configs, and these may conflict. This can cause unexpected behavior on the write path.
>  
> Raymond: I encountered this NPE when trying to run 0.10 over a 0.8 table: https://issues.apache.org/jira/browse/HUDI-2818.
> To align configs, do you think we should auto-set {{hoodie.table.keygenerator.class}} when the user sets {{hoodie.datasource.write.keygenerator.class}}, and vice versa?
> Siva: I guess this is what happens in the regular write path (HoodieSparkSqlWriter): the user sets only {{hoodie.datasource.write.keygenerator.class}}, but internally we set {{hoodie.table.keygenerator.class}} from the datasource write config.
> Vinoth: {{HoodieConfig}} has an alternatives/fallback mechanism, which is something to consider.
> But overall we should fix these.
> Ethan: When working on compaction/clustering, I also saw different configs for the same logic in the Spark datasource and the write client. Maybe we can take a pass over all configs later and make them consistent.
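
The two-way auto-setting Raymond asks about could look roughly like the sketch below. This is an illustration only, not Hudi's implementation: the `ALIAS_GROUPS` table, the `align_configs` helper, and the `com.example.*` class names are hypothetical. Only the two config keys come from the discussion above. The sketch copies a value to every key in its group and fails fast when two keys in the same group disagree.

```python
from typing import Dict, List, Tuple

# Hypothetical alias table: every key in a group is meant to carry the same value.
# The two key names are taken from the discussion; the grouping itself is an assumption.
ALIAS_GROUPS: List[Tuple[str, ...]] = [
    (
        "hoodie.datasource.write.keygenerator.class",
        "hoodie.table.keygenerator.class",
    ),
]


def align_configs(props: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of props in which aliased keys share one value.

    Raises ValueError if two keys of the same group are set to different values.
    """
    aligned = dict(props)
    for group in ALIAS_GROUPS:
        # Collect the distinct values the user supplied for this group.
        values = {props[key] for key in group if key in props}
        if len(values) > 1:
            raise ValueError(
                "Conflicting values for aliased configs %s: %s"
                % (list(group), sorted(values))
            )
        if values:
            value = next(iter(values))
            # Mirror the single supplied value onto every key in the group.
            for key in group:
                aligned[key] = value
    return aligned


if __name__ == "__main__":
    user_props = {
        # Hypothetical key generator class name, used only for illustration.
        "hoodie.datasource.write.keygenerator.class": "com.example.MyKeyGenerator",
    }
    for key, value in sorted(align_configs(user_props).items()):
        print(key, "=", value)
```

Raising on a conflict rather than silently preferring one key is a design choice of this sketch. A fallback order like the one Vinoth mentions would instead pick the first key that is set.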



--
This message was sent by Atlassian Jira
(v8.20.7#820007)