Posted to commits@hudi.apache.org by "zhuanshenbsj1 (Jira)" <ji...@apache.org> on 2023/04/06 11:44:00 UTC

[jira] [Created] (HUDI-6045) Adjust HoodieTableSink for sink operator generation

zhuanshenbsj1 created HUDI-6045:
-----------------------------------

             Summary: Adjust HoodieTableSink for sink operator generation
                 Key: HUDI-6045
                 URL: https://issues.apache.org/jira/browse/HUDI-6045
             Project: Apache Hudi
          Issue Type: Improvement
          Components: writer-core
            Reporter: zhuanshenbsj1
             Fix For: 0.14.0


 # Insert scenario on a COW table: if online sync-clustering and online async-clustering (plan generation || plan execution) are both configured, sometimes sync-clustering takes effect and sometimes async-clustering takes effect:

sync-clustering=true & generate async-clustering=true & execute async-clustering=true: sync-clustering takes effect
sync-clustering=true & generate async-clustering=false & execute async-clustering=true: async-clustering takes effect
 # Insert scenario on a MOR table: sometimes log files are generated and sometimes parquet files are generated:
async-compaction=true & generate async-clustering=false: log files are generated
async-compaction=false & generate async-clustering=true: parquet files are generated

This inconsistency is confusing for users.

After modification:
 # Insert scenario on a COW table: sync-clustering has higher priority than online async-clustering.
 # Insert scenario on a MOR table: parquet files are always generated.
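
The intended behavior after the modification could be sketched roughly as below. This is an illustrative Python sketch only, not the HoodieTableSink code: SinkOptions, its field names, clustering_mode and mor_insert_file_format are hypothetical names introduced here, and the "none" result for a COW table with no clustering configured is an assumption that the issue does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SinkOptions:
    """Hypothetical stand-in for the writer options named in this issue."""
    table_type: str                          # "COW" or "MOR"
    sync_clustering: bool = False
    async_clustering_schedule: bool = False  # "generate async-clustering"
    async_clustering_execute: bool = False   # "execute async-clustering"
    async_compaction: bool = False


def clustering_mode(opts: SinkOptions) -> str:
    """Pick the clustering mode for an insert into a COW table.

    Sync-clustering wins whenever it is enabled, regardless of the
    async-clustering flags.
    """
    if opts.table_type != "COW":
        raise ValueError("this sketch only covers COW tables")
    if opts.sync_clustering:
        return "sync"
    if opts.async_clustering_schedule or opts.async_clustering_execute:
        return "async"
    return "none"  # assumption: nothing configured means no clustering


def mor_insert_file_format(opts: SinkOptions) -> str:
    """File type written by an insert into a MOR table.

    After the change the answer no longer depends on the compaction or
    clustering flags.
    """
    if opts.table_type != "MOR":
        raise ValueError("this sketch only covers MOR tables")
    return "parquet"
```

With this sketch, both COW combinations listed above (generate async-clustering true or false, with sync-clustering=true) resolve to "sync", and both MOR combinations resolve to "parquet".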



--
This message was sent by Atlassian Jira
(v8.20.10#820010)