Posted to commits@hudi.apache.org by "YangXuan (Jira)" <ji...@apache.org> on 2022/01/12 02:16:00 UTC

[jira] [Commented] (HUDI-2776) Cluster update strategy should not be fenced by write config

    [ https://issues.apache.org/jira/browse/HUDI-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17473243#comment-17473243 ] 

YangXuan commented on HUDI-2776:
--------------------------------

Please add a link to the PR. Thank you.

> Cluster update strategy should not be fenced by write config
> ------------------------------------------------------------
>
>                 Key: HUDI-2776
>                 URL: https://issues.apache.org/jira/browse/HUDI-2776
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: sivabalan narayanan
>            Assignee: Sagar Sumit
>            Priority: Blocker
>             Fix For: 0.10.0
>
>
> In a multi-writer scenario, not every writer necessarily enables clustering in its write config.
> BaseSparkCommitActionExecutor
> {code:java}
> private JavaRDD<HoodieRecord<T>> clusteringHandleUpdate(JavaRDD<HoodieRecord<T>> inputRecordsRDD) {
>   if (config.isClusteringEnabled()) {
>     Set<HoodieFileGroupId> fileGroupsInPendingClustering =
>         table.getFileSystemView().getFileGroupsInPendingClustering().map(entry -> entry.getKey()).collect(Collectors.toSet());
>     UpdateStrategy updateStrategy = (UpdateStrategy)ReflectionUtils
>         .loadClass(config.getClusteringUpdatesStrategyClass(), this.context, fileGroupsInPendingClustering);
>     return (JavaRDD<HoodieRecord<T>>)updateStrategy.handleUpdate(inputRecordsRDD);
>   } else {
>     return inputRecordsRDD;
>   }
> } {code}
> When clustering is scheduled and being executed by writer1, writer2 can still update the same file group without the update strategy ever being applied, because writer2 did not enable clustering in its write config and so takes the else branch above.
>  
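For illustration only, here is a minimal, dependency-free sketch of the problem described in the issue. It is not Hudi code and not the actual fix. The class `ClusteringGuardSketch`, both method names, and the use of plain strings as file group ids are hypothetical, and rejecting conflicting updates is just one policy, chosen to keep the example short. The point is that the check depends on the table state (file groups in pending clustering) rather than on the individual writer's config flag.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names exist in Hudi.
public final class ClusteringGuardSketch {

    // Mirrors the behaviour in the issue: the check runs only if THIS writer enabled clustering.
    public static List<String> fencedByConfig(boolean clusteringEnabled,
                                              Set<String> pendingClusteringFileGroups,
                                              List<String> updatedFileGroups) {
        if (!clusteringEnabled) {
            // A writer without the flag never looks at pending clustering at all.
            return updatedFileGroups;
        }
        return guardedByTableState(pendingClusteringFileGroups, updatedFileGroups);
    }

    // Alternative: always consult the table state, whatever the writer's config says.
    // Rejecting the conflicting updates is an assumption made for this example.
    public static List<String> guardedByTableState(Set<String> pendingClusteringFileGroups,
                                                   List<String> updatedFileGroups) {
        List<String> conflicts = updatedFileGroups.stream()
                .filter(pendingClusteringFileGroups::contains)
                .distinct()
                .collect(Collectors.toList());
        if (!conflicts.isEmpty()) {
            throw new IllegalStateException(
                    "Updates target file groups in pending clustering: " + conflicts);
        }
        return updatedFileGroups;
    }

    public static void main(String[] args) {
        Set<String> pending = Set.of("fg-1");            // writer1 is clustering fg-1
        List<String> updates = List.of("fg-1", "fg-2");  // writer2 wants to update fg-1 and fg-2

        // writer2 has clustering disabled, so the conflicting update slips through.
        System.out.println("fenced by config, passed through: "
                + fencedByConfig(false, pending, updates));

        // With the table-state check, the same updates are caught.
        try {
            guardedByTableState(pending, updates);
        } catch (IllegalStateException e) {
            System.out.println("guarded by table state: " + e.getMessage());
        }
    }
}
```

In this sketch, the config-fenced variant returns writer2's updates unchanged even though `fg-1` is in pending clustering, while the table-state variant detects the conflict regardless of how writer2 is configured.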



--
This message was sent by Atlassian Jira
(v8.20.1#820001)