Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/11/09 23:11:15 UTC

[GitHub] [hudi] manojpec opened a new pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

manojpec opened a new pull request #3955:
URL: https://github.com/apache/hudi/pull/3955


   ## What is the purpose of the pull request
   
   Avoid ArithmeticException from ExternalSpillableMap put operations.
   
   ## Brief change log
   
   - ExternalSpillableMap estimates the payload/value size on the first put to
     determine when to spill over to the disk map. The payload size is also
     re-estimated after a minimum threshold of puts. This re-estimation uses the
     current in-memory map size to calculate the average payload size, and it
     attempts a divide-by-zero operation when the map is empty. This change avoids
     the ArithmeticException during payload size re-estimation by checking the map
     size upfront.
   
   - Added a unit test to reproduce the case and exercise the fix.
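
The failure mode described in the change log can be sketched in isolation. The class below is a hypothetical, simplified stand-in and not the actual ExternalSpillableMap: the name `PayloadSizeReestimateSketch`, the string-length size estimate, and the threshold value of 100 are all assumptions made for illustration. It only shows why a put, remove, put sequence reaches a division by zero when the empty-map check is missing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the re-estimation logic; not the Hudi class.
final class PayloadSizeReestimateSketch {
  // Assumed threshold for illustration; the real constant's value is not shown in this thread.
  static final int RECORDS_PER_REESTIMATE = 100;

  private final Map<String, String> inMemoryMap = new HashMap<>();
  private final boolean guardEmptyMap;
  private long currentInMemoryMapSize = 0L;
  private long estimatedPayloadSize = 0L;

  PayloadSizeReestimateSketch(boolean guardEmptyMap) {
    this.guardEmptyMap = guardEmptyMap;
  }

  // Crude size estimate used only by this sketch.
  private static long sizeOf(String key, String value) {
    return key.length() + value.length();
  }

  void put(String key, String value) {
    if (estimatedPayloadSize == 0) {
      // First put: seed the estimate from the record being inserted.
      estimatedPayloadSize = sizeOf(key, value);
    } else if ((!guardEmptyMap || !inMemoryMap.isEmpty())
        && inMemoryMap.size() % RECORDS_PER_REESTIMATE == 0) {
      // Re-estimate as the average over the entries currently held in memory.
      // With an empty map and no guard, this is an integer division by zero.
      estimatedPayloadSize = currentInMemoryMapSize / inMemoryMap.size();
    }
    if (!inMemoryMap.containsKey(key)) {
      currentInMemoryMapSize += estimatedPayloadSize;
    }
    inMemoryMap.put(key, value);
  }

  void remove(String key) {
    if (inMemoryMap.remove(key) != null) {
      currentInMemoryMapSize -= estimatedPayloadSize;
    }
  }

  long getEstimatedPayloadSize() {
    return estimatedPayloadSize;
  }
}
```

In this sketch, constructing the class with `guardEmptyMap = false` and calling put, remove, put on the same key throws ArithmeticException on the second put, because the map size is 0 and `0 % 100 == 0` lets the branch run. With `guardEmptyMap = true` the branch is skipped and the put succeeds.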
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746288477



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -204,19 +205,21 @@ public R get(Object key) {
   public R put(T key, R value) {
     if (this.currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
       if (shouldEstimatePayloadSize && estimatedPayloadSize == 0) {
-        // At first, use the sizeEstimate of a record being inserted into the spillable map.
-        // Note, the converter may over estimate the size of a record in the JVM
+        // At first, use the size estimate of a record being inserted into the Spillable map.
+        // Note, the converter may overestimate the size of a record in the JVM.
         this.estimatedPayloadSize = keySizeEstimator.sizeEstimate(key) + valueSizeEstimator.sizeEstimate(value);
-        LOG.info("Estimated Payload size => " + estimatedPayloadSize);
-      } else if (shouldEstimatePayloadSize && inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0) {
+        LOG.debug("Estimated Payload size => " + estimatedPayloadSize);
+      } else if (shouldEstimatePayloadSize
+          && !inMemoryMap.isEmpty()
+          && (inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0)) {

Review comment:
       So the exception is from this line? Doing `0 % 100`? Isn't that OK?
   
   ```
   scala> 0 % 100
   res0: Int = 0
   ```
   
   What am I missing? (Pretty brain fried, so it's possible I am missing something obvious.)
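
For reference, the arithmetic in question can be checked with a small sketch. It assumes, based on the PR description, that the failing operation is an integer division by the map size inside the branch rather than the modulo in the condition; the branch body is not shown in this diff hunk. The class and method names are hypothetical.

```java
// Hypothetical helper illustrating modulo of zero versus division by zero in Java.
final class ModuloVersusDivision {
  // 0 % n is well defined for any non-zero n, so the condition itself does not throw.
  static int remainderOfZero(int n) {
    return 0 % n;
  }

  // Integer division by zero throws ArithmeticException in Java.
  static boolean divisionThrows(long total, int size) {
    try {
      long unused = total / size;
      return false;
    } catch (ArithmeticException e) {
      return true;
    }
  }
}
```

Under that assumption, an empty map satisfies `size % threshold == 0`, so the branch is entered and the subsequent average calculation divides by a size of zero.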







[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966105747


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294) 
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] manojpec commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746356713



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,42 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {
+  }
+
+  @Test
+  public void testPayloadSizeEstimate() throws IOException, URISyntaxException {
+    final ExternalSpillableMap.DiskMapType diskMapType = ExternalSpillableMap.DiskMapType.BITCASK;
+    final boolean isCompressionEnabled = false;
+    final Schema schema = SchemaTestUtil.getSimpleSchema();
+
+    ExternalSpillableMap<String, HoodieRecord<? extends HoodieRecordPayload>> records =
+        new ExternalSpillableMap<>(16L, basePath, new DefaultSizeEstimator(),
+            new HoodieRecordSizeEstimator(schema), diskMapType, isCompressionEnabled);
+
+    List<String> recordKeys = new ArrayList<>();
+
+    // Put a single record. Payload size estimation happens as part of this initial put.
+    HoodieRecord seedRecord = SchemaTestUtil.generateHoodieTestRecordsWithoutHoodieMetadata(0, 1).get(0);
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Remove the key immediately to make the map empty again.
+    records.remove(seedRecord.getRecordKey());
+
+    // Payload size re-estimation should not happen as the map
+    // size has not reached the minimum size threshold for
+    // recalculation.
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Put more records than the threshold to trigger payload size re-estimation
+    while (records.getDiskBasedMapNumEntries() < 1) {

Review comment:
       Without the fix, the map's put operation throws an exception and fails the test. To make the test's expectation clearer, I added `assertDoesNotThrow` on the line where the exception would be thrown without the fix.







[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-965924351


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270) 
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964631311


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964645345


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966104036


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294) 
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966575591


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316",
       "triggerID" : "966544292",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966545615


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316",
       "triggerID" : "966544292",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-965925448


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270) 
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964991561


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966157078


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746132469



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -204,19 +205,21 @@ public R get(Object key) {
   public R put(T key, R value) {
     if (this.currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
       if (shouldEstimatePayloadSize && estimatedPayloadSize == 0) {
-        // At first, use the sizeEstimate of a record being inserted into the spillable map.
-        // Note, the converter may over estimate the size of a record in the JVM
+        // At first, use the size estimate of a record being inserted into the Spillable map.
+        // Note, the converter may overestimate the size of a record in the JVM.
         this.estimatedPayloadSize = keySizeEstimator.sizeEstimate(key) + valueSizeEstimator.sizeEstimate(value);
-        LOG.info("Estimated Payload size => " + estimatedPayloadSize);
-      } else if (shouldEstimatePayloadSize && inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0) {
+        LOG.debug("Estimated Payload size => " + estimatedPayloadSize);

Review comment:
       I wonder why we have not hit this issue before now.
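
One reading of the change log is that the failure needs a specific sequence: a first put (which sets the initial estimate), a remove that empties the in-memory map, and then another put. At that point the map size is 0, so `size % N == 0` holds and the average is computed by dividing by zero. The sketch below reproduces that sequence with a hypothetical stand-in class. `PayloadSizeSketch`, `REESTIMATE_EVERY`, `roughSize`, and the `guardEmptyMap` flag are all illustrative assumptions, not Hudi's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the wrapper's size bookkeeping; NOT Hudi's ExternalSpillableMap.
class PayloadSizeSketch {
  // Assumed interval, standing in for NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE.
  static final int REESTIMATE_EVERY = 100;

  private final Map<String, String> inMemoryMap = new HashMap<>();
  private final boolean guardEmptyMap;
  private long estimatedPayloadSize = 0;

  PayloadSizeSketch(boolean guardEmptyMap) {
    this.guardEmptyMap = guardEmptyMap;
  }

  String put(String key, String value) {
    if (estimatedPayloadSize == 0) {
      // First put: estimate from the single record being inserted.
      estimatedPayloadSize = roughSize(key, value);
    } else if ((!guardEmptyMap || !inMemoryMap.isEmpty())
        && inMemoryMap.size() % REESTIMATE_EVERY == 0) {
      // Re-estimate as the average over the current entries.
      // For an empty map, 0 % REESTIMATE_EVERY == 0 is true, so without
      // the guard the division below throws ArithmeticException.
      long total = 0;
      for (Map.Entry<String, String> e : inMemoryMap.entrySet()) {
        total += roughSize(e.getKey(), e.getValue());
      }
      estimatedPayloadSize = total / inMemoryMap.size();
    }
    return inMemoryMap.put(key, value);
  }

  String remove(String key) {
    return inMemoryMap.remove(key);
  }

  long estimatedPayloadSize() {
    return estimatedPayloadSize;
  }

  // Toy size function (two bytes per char); a real estimator would inspect the object graph.
  private static long roughSize(String key, String value) {
    return 2L * (key.length() + value.length());
  }

  public static void main(String[] args) {
    PayloadSizeSketch guarded = new PayloadSizeSketch(true);
    guarded.put("k", "v");
    guarded.remove("k");
    guarded.put("k", "v"); // empty map at this point; the guard skips re-estimation
    System.out.println("estimate with guard: " + guarded.estimatedPayloadSize());

    PayloadSizeSketch unguarded = new PayloadSizeSketch(false);
    unguarded.put("k", "v");
    unguarded.remove("k");
    try {
      unguarded.put("k", "v");
    } catch (ArithmeticException e) {
      System.out.println("without guard: " + e.getMessage());
    }
  }
}
```

If this reading is right, callers that only insert and never empty the map would not reach the zero-size case, which may explain why it stayed hidden.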







[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964646448


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 UNKNOWN
   



[GitHub] [hudi] manojpec commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746136059



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,41 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {
+  }
+
+  @ParameterizedTest
+  @MethodSource("testArguments")
+  public void testPayloadSizeEstimate(ExternalSpillableMap.DiskMapType diskMapType,

Review comment:
       Right, the issue is in the map wrapper class and not in the backing map. Removed the parameterization.







[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964901528


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256) 
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-965944804


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294) 
   



[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966545615


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316",
       "triggerID" : "966544292",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316) 
   



[GitHub] [hudi] vinothchandar commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746734023



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,42 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {
+  }
+
+  @Test
+  public void testPayloadSizeEstimate() throws IOException, URISyntaxException {
+    final ExternalSpillableMap.DiskMapType diskMapType = ExternalSpillableMap.DiskMapType.BITCASK;
+    final boolean isCompressionEnabled = false;
+    final Schema schema = SchemaTestUtil.getSimpleSchema();
+
+    ExternalSpillableMap<String, HoodieRecord<? extends HoodieRecordPayload>> records =
+        new ExternalSpillableMap<>(16L, basePath, new DefaultSizeEstimator(),
+            new HoodieRecordSizeEstimator(schema), diskMapType, isCompressionEnabled);
+
+    List<String> recordKeys = new ArrayList<>();
+
+    // Put a single record. Payload size estimation happens as part of this initial put.
+    HoodieRecord seedRecord = SchemaTestUtil.generateHoodieTestRecordsWithoutHoodieMetadata(0, 1).get(0);
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Remove the key immediately to make the map empty again.
+    records.remove(seedRecord.getRecordKey());
+
+    // Payload size re-estimation should not happen as the map
+    // size has not reached the minimum size threshold for
+    // recalculation.
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Put more records than the threshold to trigger payload size re-estimation
+    while (records.getDiskBasedMapNumEntries() < 1) {

Review comment:
       Do we need lines 371-379? They seem only to validate that no error is thrown. In general, tests should have clear asserts. My suggestion is to scope this test down to just the divide-by-zero scenario.







[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966959106


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316",
       "triggerID" : "966544292",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3331",
       "triggerID" : "966922303",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3331) 
   



[GitHub] [hudi] vinothchandar commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746734984



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -337,9 +342,41 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
     assertEquals(gRecord.get(fieldName).toString(), newValue);
   }
 
-  // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testPayloadSizeEstimate() throws IOException, URISyntaxException {

Review comment:
       Rename to testEstimationWithEmptyMap, or something else that captures the scenario. The current name feels very broad; we are not really testing the actual estimate.







[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966104036


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294) 
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966105747


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294) 
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964645345


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 UNKNOWN
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964631311


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   



[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964661625


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256) 
   



[GitHub] [hudi] manojpec commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r747147107



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,42 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {
+  }
+
+  @Test
+  public void testPayloadSizeEstimate() throws IOException, URISyntaxException {
+    final ExternalSpillableMap.DiskMapType diskMapType = ExternalSpillableMap.DiskMapType.BITCASK;
+    final boolean isCompressionEnabled = false;
+    final Schema schema = SchemaTestUtil.getSimpleSchema();
+
+    ExternalSpillableMap<String, HoodieRecord<? extends HoodieRecordPayload>> records =
+        new ExternalSpillableMap<>(16L, basePath, new DefaultSizeEstimator(),
+            new HoodieRecordSizeEstimator(schema), diskMapType, isCompressionEnabled);
+
+    List<String> recordKeys = new ArrayList<>();
+
+    // Put a single record. Payload size estimation happens as part of this initial put.
+    HoodieRecord seedRecord = SchemaTestUtil.generateHoodieTestRecordsWithoutHoodieMetadata(0, 1).get(0);
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Remove the key immediately to make the map empty again.
+    records.remove(seedRecord.getRecordKey());
+
+    // Payload size re-estimation should not happen as the map
+    // size has not reached the minimum size threshold for
+    // recalculation.
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Put more records than the threshold to trigger payload size re-estimation
+    while (records.getDiskBasedMapNumEntries() < 1) {

Review comment:
       Removed the while block. However, I still need to put more than 100 entries to exercise the payload size re-estimation code path. Updated the test; please take one more look.







[GitHub] [hudi] manojpec commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r747310834



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -337,9 +342,41 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
     assertEquals(gRecord.get(fieldName).toString(), newValue);
   }
 
-  // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testPayloadSizeEstimate() throws IOException, URISyntaxException {

Review comment:
       done.







[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964903272


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256) 
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] manojpec commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966544292


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964991561


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966923426


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316",
       "triggerID" : "966544292",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3331",
       "triggerID" : "966922303",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3331) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-965924351


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270) 
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] manojpec commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966922070


   CI has been failing because of a dependency resolution error that has nothing to do with the changes in this PR.
   ```
   [ERROR] Failed to execute goal on project hudi-utilities_2.11: Could not resolve dependencies for project org.apache.hudi:hudi-utilities_2.11:jar:0.10.0-SNAPSHOT: Failed to collect dependencies at io.confluent:kafka-avro-serializer:jar:5.3.4: Failed to read artifact descriptor for io.confluent:kafka-avro-serializer:jar:5.3.4: Could not transfer artifact io.confluent:kafka-avro-serializer:pom:5.3.4 from/to confluent (https://packages.confluent.io/maven/): transfer failed for https://packages.confluent.io/maven/io/confluent/kafka-avro-serializer/5.3.4/kafka-avro-serializer-5.3.4.pom: Connection reset -> [Help 1]
   ```





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964630006


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] manojpec commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746355186



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -204,19 +205,21 @@ public R get(Object key) {
   public R put(T key, R value) {
     if (this.currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
       if (shouldEstimatePayloadSize && estimatedPayloadSize == 0) {
-        // At first, use the sizeEstimate of a record being inserted into the spillable map.
-        // Note, the converter may over estimate the size of a record in the JVM
+        // At first, use the size estimate of a record being inserted into the Spillable map.
+        // Note, the converter may overestimate the size of a record in the JVM.
         this.estimatedPayloadSize = keySizeEstimator.sizeEstimate(key) + valueSizeEstimator.sizeEstimate(value);
-        LOG.info("Estimated Payload size => " + estimatedPayloadSize);
-      } else if (shouldEstimatePayloadSize && inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0) {
+        LOG.debug("Estimated Payload size => " + estimatedPayloadSize);
+      } else if (shouldEstimatePayloadSize
+          && !inMemoryMap.isEmpty()
+          && (inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0)) {

Review comment:
       The issue is on line 218, where the denominator is zero when the in-memory map is empty:
   
   ```
           this.estimatedPayloadSize = totalMapSize / inMemoryMap.size();
   ```
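
   To spell out why the empty map reaches that division: `0 % n == 0` for any positive `n`, so the modulo check alone passes when `inMemoryMap.size()` is 0, and the integer division then throws `ArithmeticException`. A minimal standalone sketch of the guarded check follows. The class name `PayloadSizeReestimateSketch`, the `reestimate` helper, the `byte[]` payloads, and the threshold value 100 are assumptions made for illustration; this is not the actual Hudi code.

   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical stand-in for the re-estimation step; not the actual Hudi class.
   public class PayloadSizeReestimateSketch {
     // Assumed value for illustration only.
     static final int NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE = 100;

     // Returns the new average payload size, or the current estimate when no re-estimation applies.
     static long reestimate(Map<String, byte[]> inMemoryMap, long currentEstimate) {
       // An empty map satisfies "size % threshold == 0", so it has to be excluded explicitly.
       if (inMemoryMap.isEmpty()
           || inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE != 0) {
         return currentEstimate;
       }
       long totalMapSize = 0;
       for (byte[] payload : inMemoryMap.values()) {
         totalMapSize += payload.length;
       }
       // Safe: the divisor is at least the threshold at this point.
       return totalMapSize / inMemoryMap.size();
     }

     public static void main(String[] args) {
       Map<String, byte[]> map = new HashMap<>();
       System.out.println(reestimate(map, 42L)); // empty map: estimate is left unchanged
       for (int i = 0; i < 100; i++) {
         map.put("key-" + i, new byte[8]);
       }
       System.out.println(reestimate(map, 42L)); // at the threshold: average payload length
     }
   }
   ```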







[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964903272


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256) 
   * 80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] manojpec commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966922303


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966575591


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316",
       "triggerID" : "966544292",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746731186



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -204,19 +205,21 @@ public R get(Object key) {
   public R put(T key, R value) {
     if (this.currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
       if (shouldEstimatePayloadSize && estimatedPayloadSize == 0) {
-        // At first, use the sizeEstimate of a record being inserted into the spillable map.
-        // Note, the converter may over estimate the size of a record in the JVM
+        // At first, use the size estimate of a record being inserted into the Spillable map.
+        // Note, the converter may overestimate the size of a record in the JVM.
         this.estimatedPayloadSize = keySizeEstimator.sizeEstimate(key) + valueSizeEstimator.sizeEstimate(value);
-        LOG.info("Estimated Payload size => " + estimatedPayloadSize);
-      } else if (shouldEstimatePayloadSize && inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0) {
+        LOG.debug("Estimated Payload size => " + estimatedPayloadSize);
+      } else if (shouldEstimatePayloadSize
+          && !inMemoryMap.isEmpty()
+          && (inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0)) {

Review comment:
       Ah, OK. I don't think we remove from the map anywhere, per se. That's probably why it's all fine now. Good catch.







[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-965944804


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6ec10a72a469e665cbefd62c908aa4835ef36ceb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964661625


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964646448


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a580e6377fbf3e65d1c04eef66f40793a410acb9 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255) 
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] manojpec commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746354529



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -204,19 +205,21 @@ public R get(Object key) {
   public R put(T key, R value) {
     if (this.currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
       if (shouldEstimatePayloadSize && estimatedPayloadSize == 0) {
-        // At first, use the sizeEstimate of a record being inserted into the spillable map.
-        // Note, the converter may over estimate the size of a record in the JVM
+        // At first, use the size estimate of a record being inserted into the Spillable map.
+        // Note, the converter may overestimate the size of a record in the JVM.
         this.estimatedPayloadSize = keySizeEstimator.sizeEstimate(key) + valueSizeEstimator.sizeEstimate(value);
-        LOG.info("Estimated Payload size => " + estimatedPayloadSize);
-      } else if (shouldEstimatePayloadSize && inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0) {
+        LOG.debug("Estimated Payload size => " + estimatedPayloadSize);
+      } else if (shouldEstimatePayloadSize

Review comment:
       Sure, I have reverted the formatting changes.

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,42 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org






[GitHub] [hudi] nsivabalan commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746130666



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,41 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {
+  }
+
+  @ParameterizedTest
+  @MethodSource("testArguments")
+  public void testPayloadSizeEstimate(ExternalSpillableMap.DiskMapType diskMapType,

Review comment:
       I guess this test does not need to be parameterized.







[GitHub] [hudi] vinothchandar commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746270073



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,42 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {

Review comment:
       Let's remove this empty test method while you are at it.

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestExternalSpillableMap.java
##########
@@ -339,7 +343,42 @@ public void testDataCorrectnessWithoutHoodieMetadata(ExternalSpillableMap.DiskMa
 
   // TODO : come up with a performance eval test for spillableMap
   @Test
-  public void testLargeInsertUpsert() {}
+  public void testLargeInsertUpsert() {
+  }
+
+  @Test
+  public void testPayloadSizeEstimate() throws IOException, URISyntaxException {
+    final ExternalSpillableMap.DiskMapType diskMapType = ExternalSpillableMap.DiskMapType.BITCASK;
+    final boolean isCompressionEnabled = false;
+    final Schema schema = SchemaTestUtil.getSimpleSchema();
+
+    ExternalSpillableMap<String, HoodieRecord<? extends HoodieRecordPayload>> records =
+        new ExternalSpillableMap<>(16L, basePath, new DefaultSizeEstimator(),
+            new HoodieRecordSizeEstimator(schema), diskMapType, isCompressionEnabled);
+
+    List<String> recordKeys = new ArrayList<>();
+
+    // Put a single record. Payload size estimation happens as part of this initial put.
+    HoodieRecord seedRecord = SchemaTestUtil.generateHoodieTestRecordsWithoutHoodieMetadata(0, 1).get(0);
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Remove the key immediately to make the map empty again.
+    records.remove(seedRecord.getRecordKey());
+
+    // Payload size re-estimation should not happen as the map
+    // size has not reached the minimum size threshold for
+    // recalculation.
+    records.put(seedRecord.getRecordKey(), seedRecord);
+
+    // Put more records than the threshold to trigger payload size re-estimation
+    while (records.getDiskBasedMapNumEntries() < 1) {

Review comment:
       What is this test asserting? I don't see any conditions being tested.










[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-964688002


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 6c6b38a59d917c0a637f74ced4a623b1e82cb314 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>














[GitHub] [hudi] nsivabalan merged pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #3955:
URL: https://github.com/apache/hudi/pull/3955


   





[GitHub] [hudi] hudi-bot commented on pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#issuecomment-966923426


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3255",
       "triggerID" : "a580e6377fbf3e65d1c04eef66f40793a410acb9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3256",
       "triggerID" : "6c6b38a59d917c0a637f74ced4a623b1e82cb314",
       "triggerType" : "PUSH"
     }, {
       "hash" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3270",
       "triggerID" : "80fd8feff9f7f7d64a5cbe3b3b0c9d8378aacd3b",
       "triggerType" : "PUSH"
     }, {
       "hash" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3294",
       "triggerID" : "6ec10a72a469e665cbefd62c908aa4835ef36ceb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304",
       "triggerID" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316",
       "triggerID" : "966544292",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "59219ac3d6b159a5dcff0c78b7a5aca09ddf497e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3331",
       "triggerID" : "966922303",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 59219ac3d6b159a5dcff0c78b7a5aca09ddf497e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3304) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3316) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=3331) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] vinothchandar commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746137889



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -204,19 +205,21 @@ public R get(Object key) {
   public R put(T key, R value) {
     if (this.currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
       if (shouldEstimatePayloadSize && estimatedPayloadSize == 0) {
-        // At first, use the sizeEstimate of a record being inserted into the spillable map.
-        // Note, the converter may over estimate the size of a record in the JVM
+        // At first, use the size estimate of a record being inserted into the Spillable map.
+        // Note, the converter may overestimate the size of a record in the JVM.
         this.estimatedPayloadSize = keySizeEstimator.sizeEstimate(key) + valueSizeEstimator.sizeEstimate(value);
-        LOG.info("Estimated Payload size => " + estimatedPayloadSize);
-      } else if (shouldEstimatePayloadSize && inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0) {
+        LOG.debug("Estimated Payload size => " + estimatedPayloadSize);
+      } else if (shouldEstimatePayloadSize

Review comment:
       Can we please not change formatting on lines the PR does not intend to change?







[GitHub] [hudi] manojpec commented on a change in pull request #3955: [HUDI-2718] ExternalSpillableMap payload size re-estimation throws ArithmeticException

Posted by GitBox <gi...@apache.org>.
manojpec commented on a change in pull request #3955:
URL: https://github.com/apache/hudi/pull/3955#discussion_r746136652



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -204,19 +205,21 @@ public R get(Object key) {
   public R put(T key, R value) {
     if (this.currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
       if (shouldEstimatePayloadSize && estimatedPayloadSize == 0) {
-        // At first, use the sizeEstimate of a record being inserted into the spillable map.
-        // Note, the converter may over estimate the size of a record in the JVM
+        // At first, use the size estimate of a record being inserted into the Spillable map.
+        // Note, the converter may overestimate the size of a record in the JVM.
         this.estimatedPayloadSize = keySizeEstimator.sizeEstimate(key) + valueSizeEstimator.sizeEstimate(value);
-        LOG.info("Estimated Payload size => " + estimatedPayloadSize);
-      } else if (shouldEstimatePayloadSize && inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0) {
+        LOG.debug("Estimated Payload size => " + estimatedPayloadSize);

Review comment:
       The map needs to go back to empty after the initial put, so that a later put runs the size re-estimation against an empty map. I guess the MetaIndex code path is triggering this now.
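
The condition described in this comment can be reproduced in isolation. The following is a minimal sketch, not the Hudi implementation: `PayloadSizeReestimateSketch`, `REESTIMATE_EVERY`, `guardEmptyMap`, and the character-count `sizeOf` estimator are all hypothetical names introduced for illustration, and the empty-map check is only assumed to resemble the guard this PR adds. It shows why a put, remove, put sequence can divide by zero when the re-estimation is gated only by a modulo check on the map size.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a spillable map; NOT the Hudi implementation.
class PayloadSizeReestimateSketch {
  // Hypothetical threshold, playing the role of NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE.
  static final int REESTIMATE_EVERY = 100;

  private final Map<String, String> inMemoryMap = new HashMap<>();
  private final boolean guardEmptyMap;
  private long estimatedPayloadSize = 0;

  PayloadSizeReestimateSketch(boolean guardEmptyMap) {
    this.guardEmptyMap = guardEmptyMap;
  }

  // Crude size estimate (character count), used only for illustration.
  private static long sizeOf(String key, String value) {
    return key.length() + value.length();
  }

  private long totalSize() {
    long total = 0;
    for (Map.Entry<String, String> e : inMemoryMap.entrySet()) {
      total += sizeOf(e.getKey(), e.getValue());
    }
    return total;
  }

  void put(String key, String value) {
    if (estimatedPayloadSize == 0) {
      // First put: seed the estimate from the record being inserted.
      estimatedPayloadSize = sizeOf(key, value);
    } else if ((!guardEmptyMap || !inMemoryMap.isEmpty())
        && inMemoryMap.size() % REESTIMATE_EVERY == 0) {
      // Re-estimate as the average size over the current entries.
      // Without the guard, size() == 0 passes the modulo check and the
      // long division below throws ArithmeticException.
      estimatedPayloadSize = totalSize() / inMemoryMap.size();
    }
    inMemoryMap.put(key, value);
  }

  void remove(String key) {
    inMemoryMap.remove(key);
  }

  long estimatedPayloadSize() {
    return estimatedPayloadSize;
  }

  // Runs put, remove, put and reports whether the second put threw.
  static boolean secondPutThrows(boolean guardEmptyMap) {
    PayloadSizeReestimateSketch map = new PayloadSizeReestimateSketch(guardEmptyMap);
    map.put("key", "value");
    map.remove("key");
    try {
      map.put("key", "value");
      return false;
    } catch (ArithmeticException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("unguarded throws: " + secondPutThrows(false));
    System.out.println("guarded throws:   " + secondPutThrows(true));
  }
}
```

In this sketch the unguarded variant throws on the second put because `0 % REESTIMATE_EVERY == 0` holds for an empty map, while the guarded variant skips the re-estimation and keeps the initial estimate.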



