You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/05 23:22:43 UTC

[GitHub] [hudi] yihua commented on a diff in pull request #6574: Keep a clustering running at the same time.#6573

yihua commented on code in PR #6574:
URL: https://github.com/apache/hudi/pull/6574#discussion_r963150834


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/ClusteringPlanActionExecutor.java:
##########
@@ -63,6 +63,7 @@ protected Option<HoodieClusteringPlan> createClusteringPlan() {
     int commitsSinceLastClustering = table.getActiveTimeline().getCommitsTimeline().filterCompletedInstants()
         .findInstantsAfter(lastClusteringInstant.map(HoodieInstant::getTimestamp).orElse("0"), Integer.MAX_VALUE)
         .countInstants();
+
     if (config.inlineClusteringEnabled() && config.getInlineClusterMaxCommits() > commitsSinceLastClustering) {

Review Comment:
   This and the condition below guarantee that the clustering is only scheduled based on the max_commits config.  @eric9204 could you double check the logic?



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/ClusteringPlanActionExecutor.java:
##########
@@ -77,11 +78,14 @@ protected Option<HoodieClusteringPlan> createClusteringPlan() {
       return Option.empty();
     }
 
-    LOG.info("Generating clustering plan for table " + config.getBasePath());
-    ClusteringPlanStrategy strategy = (ClusteringPlanStrategy)
-        ReflectionUtils.loadClass(ClusteringPlanStrategy.checkAndGetClusteringPlanStrategy(config), table, context, config);
+    ClusteringPlanStrategy strategy = null;
+    if (config.getAsyncClusterMaxCommits() <= commitsSinceLastClustering) {
+      LOG.info("Generating clustering plan for table " + config.getBasePath());
+      strategy = (ClusteringPlanStrategy)
+          ReflectionUtils.loadClass(ClusteringPlanStrategy.checkAndGetClusteringPlanStrategy(config), table, context, config);
+    }
 
-    return strategy.generateClusteringPlan();
+    return strategy == null ? Option.empty() : strategy.generateClusteringPlan();

Review Comment:
   @eric9204 This does not seem to solve the problem you mentioned, around frequent clustering scheduling.  This only avoids NPE.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org