Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/11/09 06:56:34 UTC

[GitHub] [incubator-hudi] bhasudha commented on a change in pull request #1004: [HUDI-15] Adding delete api to HoodieWriteClient

bhasudha commented on a change in pull request #1004: [HUDI-15] Adding delete api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r344432040
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##########
 @@ -325,6 +326,31 @@ public static SparkConf registerClasses(SparkConf conf) {
     }
   }
 
+  /**
+   * Deletes a bunch of keys from the Hoodie table, at the supplied commitTime
+   */
+  public JavaRDD<WriteStatus> delete(JavaRDD<HoodieKey> keys, final String commitTime) {
+    HoodieTable<T> table = getTableAndInitCtx();
+    try {
+      // De-dupe/merge if needed
+      JavaRDD<HoodieKey> dedupedKeys =
+          combineKeysOnCondition(config.shouldCombineBeforeUpsert(), keys, config.getUpsertShuffleParallelism());
 
 Review comment:
   @bvaradar Just for my understanding: if we pre-combine for upsert, do we also pre-combine for the other write actions? Is it possible, either now or in the future, that we would want to pre-combine only for insert/upsert and not for delete?
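
   To make the question concrete, here is a minimal sketch of what decoupling the two decisions might look like. Everything in it is an assumption for illustration: `SketchConfig`, `shouldCombineBeforeDelete()`, and the list-based `combineKeysOnCondition` are hypothetical stand-ins (plain strings and `List` instead of `HoodieKey` and `JavaRDD`), not the actual HoodieWriteConfig or HoodieWriteClient code in this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class DeleteDedupSketch {

  /** Hypothetical stand-in for the write config; not the real HoodieWriteConfig. */
  static final class SketchConfig {
    private final boolean combineBeforeUpsert;
    private final boolean combineBeforeDelete; // assumed separate flag, for illustration only

    SketchConfig(boolean combineBeforeUpsert, boolean combineBeforeDelete) {
      this.combineBeforeUpsert = combineBeforeUpsert;
      this.combineBeforeDelete = combineBeforeDelete;
    }

    boolean shouldCombineBeforeUpsert() {
      return combineBeforeUpsert;
    }

    boolean shouldCombineBeforeDelete() {
      return combineBeforeDelete;
    }
  }

  /** De-duplicates the keys only when the condition holds; keeps first-seen order. */
  static List<String> combineKeysOnCondition(boolean condition, List<String> keys) {
    if (!condition) {
      return keys;
    }
    return new ArrayList<>(new LinkedHashSet<>(keys));
  }

  /** The delete path consults its own flag instead of reusing the upsert flag. */
  static List<String> keysToDelete(SketchConfig config, List<String> keys) {
    return combineKeysOnCondition(config.shouldCombineBeforeDelete(), keys);
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("k1", "k2", "k1");
    // Upsert pre-combine is on, delete pre-combine is off: duplicates are kept.
    System.out.println(keysToDelete(new SketchConfig(true, false), keys));
    // Delete pre-combine is on: duplicates are dropped.
    System.out.println(keysToDelete(new SketchConfig(true, true), keys));
  }
}
```

   With a separate flag, turning off pre-combine for upsert would no longer silently turn it off for delete, and vice versa. Whether that extra knob is worth having is exactly what I am asking.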
