Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/07 04:39:03 UTC

[GitHub] [hudi] nsivabalan commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

nsivabalan commented on a change in pull request #2111:
URL: https://github.com/apache/hudi/pull/2111#discussion_r553104712



##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
##########
@@ -191,14 +201,29 @@ private void assignInserts(WorkloadProfile profile, HoodieEngineContext context)
           long recordsToAppend = Math.min((config.getParquetMaxFileSize() - smallFile.sizeBytes) / averageRecordSize,
               totalUnassignedInserts);
           if (recordsToAppend > 0 && totalUnassignedInserts > 0) {
-            // create a new bucket or re-use an existing bucket
             int bucket;
-            if (updateLocationToBucket.containsKey(smallFile.location.getFileId())) {
-              bucket = updateLocationToBucket.get(smallFile.location.getFileId());
-              LOG.info("Assigning " + recordsToAppend + " inserts to existing update bucket " + bucket);
-            } else {
-              bucket = addUpdateBucket(partitionPath, smallFile.location.getFileId());
-              LOG.info("Assigning " + recordsToAppend + " inserts to new update bucket " + bucket);
+            if (config.isRouteInsertsToNewFiles()) {
+              // if insert operation, route inserts to new files regardless of small file handling.

Review comment:
       I didn't want to repeat the config.isRouteInsertsToNewFiles() check in three separate if/else conditions, so I wrapped the older and newer code paths in one larger if/else instead.
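
       To illustrate the structure described above, here is a minimal, self-contained sketch. It is not the actual UpsertPartitioner code: the class BucketRoutingSketch, its fields, and the helper addInsertBucket() are hypothetical stand-ins, and the behavior of the new path (always opening a fresh bucket) is an assumption based on the inline comment in the diff. The point is only that the config flag is checked once, at the top, and each branch holds one complete code path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT Hudi's UpsertPartitioner. Names are illustrative only.
public class BucketRoutingSketch {
  // Stand-in for the partitioner's fileId -> bucket mapping.
  private final Map<String, Integer> updateLocationToBucket = new HashMap<>();
  private int totalBuckets = 0;
  // Stand-in for config.isRouteInsertsToNewFiles().
  private final boolean routeInsertsToNewFiles;

  public BucketRoutingSketch(boolean routeInsertsToNewFiles) {
    this.routeInsertsToNewFiles = routeInsertsToNewFiles;
  }

  // Registers a bucket tied to an existing file and returns its index.
  private int addUpdateBucket(String fileId) {
    int bucket = totalBuckets++;
    updateLocationToBucket.put(fileId, bucket);
    return bucket;
  }

  // Assumed helper: opens a bucket that is not tied to any existing file.
  private int addInsertBucket() {
    return totalBuckets++;
  }

  // The flag is tested once; each branch is a full, separate code path.
  public int assignBucket(String smallFileId) {
    if (routeInsertsToNewFiles) {
      // New path (assumed): ignore the small file and use a fresh bucket.
      return addInsertBucket();
    } else {
      // Older path: re-use the small file's bucket, or create one for it.
      Integer existing = updateLocationToBucket.get(smallFileId);
      if (existing != null) {
        return existing;
      }
      return addUpdateBucket(smallFileId);
    }
  }
}
```

       With the flag off, two calls for the same file id return the same bucket; with it on, each call returns a new bucket. The trade-off of this layout is some duplication between the branches in exchange for not scattering the flag check through the older logic.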



