Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/06/21 21:07:08 UTC

[GitHub] [hudi] rmahindra123 opened a new pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

rmahindra123 opened a new pull request #3128:
URL: https://github.com/apache/hudi/pull/3128


   ## What is the purpose of the pull request
   
   Based on our benchmark of ExternalSpillableMap with large data, we noticed that more data is spilled to disk with DiskBasedMap than with RocksDb, causing much higher GET latencies. One strong reason is that DiskBasedMap did not use compression, whereas RocksDb does. This PR adds a way to enable compression when storing KVs in DiskBasedMap, which should alleviate these performance concerns.
   
   ## Brief change log
   - Added compression and decompression when storing KVs in DiskBasedMap.java
   
   ## Verify this pull request
   - Added tests to TestDiskBasedMap.java and TestExternalSpillableMap.java
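
The change described above, compressing a value before it is written to the spill file and decompressing it after it is read back, can be sketched roughly as follows. This is a minimal illustration that assumes the JDK's `java.util.zip` streams; the class `SpillCompressionSketch` and its method names are hypothetical and are not taken from the PR.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper for illustration; not the class added by the PR.
class SpillCompressionSketch {

  // Compress a serialized value before it is appended to the spill file.
  static byte[] compress(byte[] value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Closing the DeflaterOutputStream flushes the final compressed block.
    try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
      dos.write(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  // Decompress bytes that were read back from the spill file.
  static byte[] decompress(byte[] compressed) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
      int n;
      while ((n = in.read(chunk)) != -1) {
        out.write(chunk, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    // Highly repetitive payload, so the stored form is much smaller than the raw form.
    byte[] value = new byte[100_000];
    Arrays.fill(value, (byte) 'a');
    byte[] stored = compress(value);
    byte[] restored = decompress(stored);
    System.out.println("raw=" + value.length + " stored=" + stored.length
        + " roundTripOk=" + Arrays.equals(value, restored));
  }
}
```

The trade-off is extra CPU work per put and get in exchange for fewer bytes written to and read from disk; how much that helps depends on how compressible the stored values are.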


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328",
       "triggerID" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 264aaba2e451e1579fd1e3c4fc378ad7563ce3bd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5f8d112) into [master](https://codecov.io/gh/apache/hudi/commit/01ad449ad68d9c77ee1493fcdf833df53df6106a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (01ad449) will **decrease** coverage by `24.87%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3128       +/-   ##
   =============================================
   - Coverage     52.19%   27.31%   -24.88%     
   + Complexity     2659     1291     -1368     
   =============================================
     Files           335      386       +51     
     Lines         14981    15334      +353     
     Branches       1505     1337      -168     
   =============================================
   - Hits           7819     4189     -3630     
   - Misses         6536    10842     +4306     
   + Partials        626      303      -323     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `20.92% <0.00%> (∅)` | |
   | hudicommon | `?` | |
   | hudihadoopmr | `?` | |
   | hudisync | `4.85% <ø> (?)` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (-10.39%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ient/http/HoodieWriteCommitHttpCallbackClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL2NsaWVudC9odHRwL0hvb2RpZVdyaXRlQ29tbWl0SHR0cENhbGxiYWNrQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...udi/callback/util/HoodieCommitCallbackFactory.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL3V0aWwvSG9vZGllQ29tbWl0Q2FsbGJhY2tGYWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...a/org/apache/hudi/client/AbstractHoodieClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdEhvb2RpZUNsaWVudC5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [.../apache/hudi/client/AbstractHoodieWriteClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdEhvb2RpZVdyaXRlQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...va/org/apache/hudi/client/AsyncCleanerService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9Bc3luY0NsZWFuZXJTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/hudi/client/CompactionAdminClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9Db21wYWN0aW9uQWRtaW5DbGllbnQuamF2YQ==) | `0.00% <ø> (ø)` | |
   | ... and [728 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [f0a2f37...5f8d112](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] codecov-commenter commented on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (264aaba) into [master](https://codecov.io/gh/apache/hudi/commit/c08fbb4268ee4b227452fd27d5e6ba322eeef00e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c08fbb4) will **increase** coverage by `3.86%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3128      +/-   ##
   ============================================
   + Coverage     46.01%   49.88%   +3.86%     
   + Complexity     5306      394    -4912     
   ============================================
     Files           911       66     -845     
     Lines         39476     2919   -36557     
     Branches       4254      318    -3936     
   ============================================
   - Hits          18166     1456   -16710     
   + Misses        19456     1327   -18129     
   + Partials       1854      136    -1718     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `49.88% <ø> (-8.50%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | [...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh) | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | [...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-71.82%)` | :arrow_down: |
   | [...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh) | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | `40.69% <0.00%> (-23.84%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/HoodieSnapshotCopier.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90Q29waWVyLmphdmE=) | `0.00% <0.00%> (-13.80%)` | :arrow_down: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `86.36% <0.00%> (-0.91%)` | :arrow_down: |
   | [...e/hudi/sink/transform/RowDataToHoodieFunction.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3RyYW5zZm9ybS9Sb3dEYXRhVG9Ib29kaWVGdW5jdGlvbi5qYXZh) | | |
   | [...rg/apache/hudi/cli/commands/HoodieSyncCommand.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL0hvb2RpZVN5bmNDb21tYW5kLmphdmE=) | | |
   | ... and [838 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c08fbb4...264aaba](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
rmahindra123 commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r667254655



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];
+    }
+
+    public byte[] compressBytes(final byte [] value) throws IOException {

Review comment:
       Yeah, this is optimized for running on a single thread.
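
To illustrate why cached buffers tie the class to single-threaded use, here is a rough sketch of the reuse pattern. It assumes `java.util.zip.DeflaterOutputStream`; the class `CachedBufferCompressorSketch` and the `PER_THREAD` holder are hypothetical and only show one possible way to keep the reuse if several threads were ever involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;

// Hypothetical class illustrating the buffer-reuse pattern; not the PR's code.
class CachedBufferCompressorSketch {
  private static final int INITIAL_BUFFER_SIZE = 1024 * 1024;

  // One buffer per instance, reused across calls. Two threads calling compress()
  // on the same instance would interleave writes into this buffer.
  private final ByteArrayOutputStream compressBaos =
      new ByteArrayOutputStream(INITIAL_BUFFER_SIZE);

  // Assumption: one instance per thread keeps the reuse without any locking.
  static final ThreadLocal<CachedBufferCompressorSketch> PER_THREAD =
      ThreadLocal.withInitial(CachedBufferCompressorSketch::new);

  byte[] compress(byte[] value) {
    // Drop the previous call's bytes but keep the already allocated capacity.
    compressBaos.reset();
    // Closing the wrapper also closes compressBaos, which is a no-op for
    // ByteArrayOutputStream, so the buffer stays usable for the next call.
    try (DeflaterOutputStream dos = new DeflaterOutputStream(compressBaos)) {
      dos.write(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // toByteArray() copies, so the result is unaffected by the next reset().
    return compressBaos.toByteArray();
  }

  public static void main(String[] args) {
    CachedBufferCompressorSketch compressor = PER_THREAD.get();
    byte[] first = compressor.compress("aaaaaaaaaa".getBytes(StandardCharsets.UTF_8));
    byte[] snapshot = first.clone();
    byte[] second = compressor.compress("bbbbbbbbbbbbbbbbbbbb".getBytes(StandardCharsets.UTF_8));
    System.out.println("first result intact: " + Arrays.equals(first, snapshot));
    System.out.println("outputs differ: " + !Arrays.equals(first, second));
  }
}
```

Sequential calls are safe because each call resets the buffer and returns a copy; the hazard only appears when two calls on the same instance overlap in time.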







[GitHub] [hudi] vinothchandar commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r659870711



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];

Review comment:
       Pull `8192` into a `static final` constant?

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {

Review comment:
       Rename to `Compressor` or `CompressionHandler`.

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];

Review comment:
       What if a value is larger than 8192 bytes? Would something dynamic be better?
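
A minimal sketch of why a fixed scratch buffer need not cap the value size, assuming decompression reads in a loop and appends to a growable `ByteArrayOutputStream`. The decompress method itself is not shown in this hunk, so the class, method, and constant names below are hypothetical, and `java.util.zip` is an assumed choice of codec.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch, not the PR's code.
class ChunkedDecompressSketch {
  // Size of the scratch buffer only; it bounds one read, not the whole value.
  static final int SCRATCH_BUFFER_SIZE = 8192;

  static byte[] compress(byte[] value) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    // Closing the DeflaterOutputStream flushes the remaining compressed bytes.
    try (DeflaterOutputStream dos = new DeflaterOutputStream(baos)) {
      dos.write(value);
    }
    return baos.toByteArray();
  }

  static byte[] decompress(byte[] compressed) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] scratch = new byte[SCRATCH_BUFFER_SIZE];
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
      int len;
      // Each pass drains at most SCRATCH_BUFFER_SIZE bytes; the output stream grows as needed.
      while ((len = in.read(scratch)) != -1) {
        baos.write(scratch, 0, len);
      }
    }
    return baos.toByteArray();
  }
}
```

With this shape, a 100,000-byte value round-trips through the 8192-byte scratch buffer in several passes; the constant affects the number of reads, not correctness.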

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];
+    }
+
+    public byte[] compressBytes(final byte [] value) throws IOException {
+      compressBaos.reset();

Review comment:
       You may need some resizing logic here with respect to the 8MB limit.
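
One possible reading of this comment, sketched under stated assumptions: `ByteArrayOutputStream.reset()` empties the stream but keeps its grown internal array, so a single very large value would leave a large buffer cached. The sketch below replaces the stream once the previous payload exceeded a cap. The 8 MiB cap, the class name, and the `acquire()` helper are hypothetical; only the 1048576 initial size comes from the diff.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not the PR's code.
class ReusableBufferSketch {
  static final int INITIAL_SIZE = 1 << 20;      // 1048576, as in the diff
  static final int MAX_RETAINED_SIZE = 8 << 20; // assumed 8 MiB cap

  private ByteArrayOutputStream baos = new ByteArrayOutputStream(INITIAL_SIZE);
  private int reallocations = 0;

  // Call before each use. size() still reports the previous payload here,
  // which serves as a lower bound on the retained internal capacity.
  ByteArrayOutputStream acquire() {
    if (baos.size() > MAX_RETAINED_SIZE) {
      // Drop the oversized array instead of keeping it cached.
      baos = new ByteArrayOutputStream(INITIAL_SIZE);
      reallocations++;
    } else {
      baos.reset();
    }
    return baos;
  }

  int reallocations() {
    return reallocations;
  }
}
```

The check is a heuristic: the internal array can be larger than the last payload, so a stricter bound would need a stream subclass that exposes its capacity.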







[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5f8d112) into [master](https://codecov.io/gh/apache/hudi/commit/01ad449ad68d9c77ee1493fcdf833df53df6106a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (01ad449) will **decrease** coverage by `4.41%`.
   > The diff coverage is `37.44%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3128      +/-   ##
   ============================================
   - Coverage     52.19%   47.77%   -4.42%     
   - Complexity     2659     5560    +2901     
   ============================================
     Files           335      935     +600     
     Lines         14981    41576   +26595     
     Branches       1505     4183    +2678     
   ============================================
   + Hits           7819    19864   +12045     
   - Misses         6536    19946   +13410     
   - Partials        626     1766    +1140     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <21.39%> (+1.14%)` | :arrow_up: |
   | hudiclient | `34.47% <39.38%> (∅)` | |
   | hudicommon | `48.63% <ø> (-6.11%)` | :arrow_down: |
   | hudiflink | `60.03% <ø> (?)` | |
   | hudihadoopmr | `51.55% <ø> (+18.25%)` | :arrow_up: |
   | hudisparkdatasource | `67.21% <ø> (?)` | |
   | hudisync | `55.73% <ø> (?)` | |
   | huditimelineservice | `64.07% <ø> (-1.23%)` | :arrow_down: |
   | hudiutilities | `59.26% <ø> (-10.39%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...a/org/apache/hudi/cli/HoodieTableHeaderFields.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL0hvb2RpZVRhYmxlSGVhZGVyRmllbGRzLmphdmE=) | `0.00% <ø> (ø)` | |
   | [...rg/apache/hudi/cli/commands/SavepointsCommand.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NhdmVwb2ludHNDb21tYW5kLmphdmE=) | `13.84% <0.00%> (-0.44%)` | :arrow_down: |
   | [...org/apache/hudi/cli/utils/InputStreamConsumer.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL3V0aWxzL0lucHV0U3RyZWFtQ29uc3VtZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ain/scala/org/apache/hudi/cli/DedupeSparkJob.scala](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9EZWR1cGVTcGFya0pvYi5zY2FsYQ==) | `0.00% <0.00%> (ø)` | |
   | [.../main/scala/org/apache/hudi/cli/SparkHelpers.scala](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9TcGFya0hlbHBlcnMuc2NhbGE=) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ient/http/HoodieWriteCommitHttpCallbackClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL2NsaWVudC9odHRwL0hvb2RpZVdyaXRlQ29tbWl0SHR0cENhbGxiYWNrQ2xpZW50LmphdmE=) | `51.35% <0.00%> (ø)` | |
   | [...udi/callback/util/HoodieCommitCallbackFactory.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL3V0aWwvSG9vZGllQ29tbWl0Q2FsbGJhY2tGYWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [907 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [f0a2f37...5f8d112](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5f8d112) into [master](https://codecov.io/gh/apache/hudi/commit/01ad449ad68d9c77ee1493fcdf833df53df6106a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (01ad449) will **decrease** coverage by `49.36%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3128       +/-   ##
   ============================================
   - Coverage     52.19%   2.83%   -49.37%     
   + Complexity     2659      85     -2574     
   ============================================
     Files           335     284       -51     
     Lines         14981   11830     -3151     
     Branches       1505     982      -523     
   ============================================
   - Hits           7819     335     -7484     
   - Misses         6536   11469     +4933     
   + Partials        626      26      -600     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (∅)` | |
   | hudicommon | `?` | |
   | hudihadoopmr | `?` | |
   | hudisync | `4.85% <ø> (?)` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.11% <ø> (-60.54%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ient/http/HoodieWriteCommitHttpCallbackClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL2NsaWVudC9odHRwL0hvb2RpZVdyaXRlQ29tbWl0SHR0cENhbGxiYWNrQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...udi/callback/util/HoodieCommitCallbackFactory.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL3V0aWwvSG9vZGllQ29tbWl0Q2FsbGJhY2tGYWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...a/org/apache/hudi/client/AbstractHoodieClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdEhvb2RpZUNsaWVudC5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [.../apache/hudi/client/AbstractHoodieWriteClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdEhvb2RpZVdyaXRlQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...va/org/apache/hudi/client/AsyncCleanerService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9Bc3luY0NsZWFuZXJTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/hudi/client/CompactionAdminClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9Db21wYWN0aW9uQWRtaW5DbGllbnQuamF2YQ==) | `0.00% <ø> (ø)` | |
   | ... and [647 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [f0a2f37...5f8d112](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
rmahindra123 commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r669296999



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -108,7 +116,7 @@ public ExternalSpillableMap(Long maxInMemorySizeInBytes, String baseFilePath, Si
                 break;
               case BITCASK:
               default:
-                diskBasedMap = new BitCaskDiskMap<>(baseFilePath);
+                diskBasedMap = isCompressionEnabled ? (new BitCaskDiskMap<>(baseFilePath, true)) : (new BitCaskDiskMap<>(baseFilePath, false));

Review comment:
       lol this was stupid :)
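
For context, the ternary in the added line passes a literal that always equals the condition, so the flag can be passed directly. A small sketch with a hypothetical stand-in class (`DiskMapSketch` is not `BitCaskDiskMap`; only the two-argument constructor shape is taken from the diff):

```java
// Hypothetical stand-in; only the (path, boolean) constructor shape matters.
class DiskMapSketch {
  final String baseFilePath;
  final boolean compressionEnabled;

  DiskMapSketch(String baseFilePath, boolean compressionEnabled) {
    this.baseFilePath = baseFilePath;
    this.compressionEnabled = compressionEnabled;
  }

  // Shape of the line in the diff: each branch hard-codes the value of the condition.
  static DiskMapSketch viaTernary(String path, boolean isCompressionEnabled) {
    return isCompressionEnabled ? new DiskMapSketch(path, true) : new DiskMapSketch(path, false);
  }

  // Equivalent form: pass the flag through.
  static DiskMapSketch direct(String path, boolean isCompressionEnabled) {
    return new DiskMapSketch(path, isCompressionEnabled);
  }
}
```

Both factories produce the same state for either flag value, so the second is a drop-in simplification.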







[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328",
       "triggerID" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "triggerType" : "PUSH"
     }, {
       "hash" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330",
       "triggerID" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830",
       "triggerID" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2f8c0835c87a73417c3c0c05f1d515b61acba352 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830) 
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2f8c083) into [master](https://codecov.io/gh/apache/hudi/commit/b4562e86e4b58d6151fdeea12e727b8c8881a213?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b4562e8) will **decrease** coverage by `32.01%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3128       +/-   ##
   =============================================
   - Coverage     47.63%   15.61%   -32.02%     
   + Complexity     5506      489     -5017     
   =============================================
     Files           930      281      -649     
     Lines         41275    11627    -29648     
     Branches       4138      952     -3186     
   =============================================
   - Hits          19661     1816    -17845     
   + Misses        19867     9650    -10217     
   + Partials       1747      161     -1586     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `58.57% <ø> (-0.04%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...ain/java/org/apache/hudi/io/HoodieMergeHandle.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZU1lcmdlSGFuZGxlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...pache/hudi/client/utils/ConcatenatingIterator.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC91dGlscy9Db25jYXRlbmF0aW5nSXRlcmF0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...che/hudi/config/HoodieMetricsPrometheusConfig.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVNZXRyaWNzUHJvbWV0aGV1c0NvbmZpZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [717 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [b4562e8...2f8c083](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   ## CI report:
   
   * 2f8c0835c87a73417c3c0c05f1d515b61acba352 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830) 
   * 5f8d112453cf0c9fc1f9f4dcb081f6f82459ca4e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   ## CI report:
   
   * 973773c06c47af2ed5410c0938e690ea3aad7f48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330) 
   





[GitHub] [hudi] vinothchandar commented on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-869760645


   @nsivabalan do you want to take this home? 





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2f8c083) into [master](https://codecov.io/gh/apache/hudi/commit/b4562e86e4b58d6151fdeea12e727b8c8881a213?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b4562e8) will **decrease** coverage by `20.23%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3128       +/-   ##
   =============================================
   - Coverage     47.63%   27.39%   -20.24%     
   + Complexity     5506     1287     -4219     
   =============================================
     Files           930      381      -549     
     Lines         41275    15115    -26160     
     Branches       4138     1305     -2833     
   =============================================
   - Hits          19661     4141    -15520     
   + Misses        19867    10674     -9193     
   + Partials       1747      300     -1447     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `21.03% <0.00%> (-13.55%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `58.57% <ø> (-0.04%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...ain/java/org/apache/hudi/io/HoodieMergeHandle.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZU1lcmdlSGFuZGxlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...pache/hudi/client/utils/ConcatenatingIterator.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC91dGlscy9Db25jYXRlbmF0aW5nSXRlcmF0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...che/hudi/config/HoodieMetricsPrometheusConfig.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVNZXRyaWNzUHJvbWV0aGV1c0NvbmZpZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [617 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2f8c083) into [master](https://codecov.io/gh/apache/hudi/commit/b4562e86e4b58d6151fdeea12e727b8c8881a213?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b4562e8) will **decrease** coverage by `44.75%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3128       +/-   ##
   ============================================
   - Coverage     47.63%   2.88%   -44.76%     
   + Complexity     5506      85     -5421     
   ============================================
     Files           930     281      -649     
     Lines         41275   11627    -29648     
     Branches       4138     952     -3186     
   ============================================
   - Hits          19661     335    -19326     
   + Misses        19867   11266     -8601     
   + Partials       1747      26     -1721     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.25% <ø> (-49.36%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...ain/java/org/apache/hudi/io/HoodieMergeHandle.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZU1lcmdlSGFuZGxlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [764 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   





[GitHub] [hudi] nsivabalan commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r659451616



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -80,22 +84,38 @@
   private final String baseFilePath;
 
   public ExternalSpillableMap(Long maxInMemorySizeInBytes, String baseFilePath, SizeEstimator<T> keySizeEstimator,
-      SizeEstimator<R> valueSizeEstimator) throws IOException {
+                              SizeEstimator<R> valueSizeEstimator) throws IOException {
+    this(maxInMemorySizeInBytes, baseFilePath, keySizeEstimator,
+        valueSizeEstimator, DiskMapType.DISK_MAP);
+  }
+
+  public ExternalSpillableMap(Long maxInMemorySizeInBytes, String baseFilePath, SizeEstimator<T> keySizeEstimator,
+                              SizeEstimator<R> valueSizeEstimator, DiskMapType diskMapType) throws IOException {
     this.inMemoryMap = new HashMap<>();
     this.baseFilePath = baseFilePath;
-    this.diskBasedMap = new DiskBasedMap<>(baseFilePath);
     this.maxInMemorySizeInBytes = (long) Math.floor(maxInMemorySizeInBytes * sizingFactorForInMemoryMap);
     this.currentInMemoryMapSize = 0L;
     this.keySizeEstimator = keySizeEstimator;
     this.valueSizeEstimator = valueSizeEstimator;
+    this.diskMapType = diskMapType;
   }
 
-  private DiskBasedMap<T, R> getDiskBasedMap() {
+  private SpillableDiskMap<T, R> getDiskBasedMap() {
     if (null == diskBasedMap) {
       synchronized (this) {
         if (null == diskBasedMap) {
           try {
-            diskBasedMap = new DiskBasedMap<>(baseFilePath);
+            switch (diskMapType) {
+              case ROCK_DB:
+                diskBasedMap = new SpillableRocksDBBasedMap<>(baseFilePath);
+                break;
+              case COMPRESSED_DISK_MAP:

Review comment:
       Compression is just a config, right? Is it necessary to introduce a new enum?
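To make the question concrete, here is a minimal sketch of the alternative it hints at: keep the enum for the storage backend only and pass compression as a separate flag. The class `SpillableMapConfigSketch`, the `compressionEnabled` parameter, and the `describeDiskMap` helper are hypothetical names used for illustration; only the `DISK_MAP` and `ROCK_DB` constants come from the diff above, and nothing here is confirmed to be what the PR ended up doing.

```java
// Illustrative only: models "compression as a config flag" next to a backend enum.
class SpillableMapConfigSketch {
  // Backend choice stays an enum; constants taken from the diff in this thread.
  enum DiskMapType { DISK_MAP, ROCK_DB }

  private final DiskMapType diskMapType;
  // Hypothetical flag replacing a dedicated COMPRESSED_DISK_MAP enum constant.
  private final boolean compressionEnabled;

  SpillableMapConfigSketch(DiskMapType diskMapType, boolean compressionEnabled) {
    this.diskMapType = diskMapType;
    this.compressionEnabled = compressionEnabled;
  }

  // Stand-in for the switch in getDiskBasedMap(): the flag only matters for DISK_MAP.
  String describeDiskMap() {
    switch (diskMapType) {
      case ROCK_DB:
        return "rocksdb";
      case DISK_MAP:
      default:
        return compressionEnabled ? "disk-map(compressed)" : "disk-map(plain)";
    }
  }
}
```

With this shape, adding another backend means one new enum constant, while toggling compression does not multiply the number of constants.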

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];
+    }
+
+    public byte[] compressBytes(final byte [] value) throws IOException {

Review comment:
       I assume these are not required to be thread-safe? Can you confirm?
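For context on why the question matters: an object that caches and reuses `ByteArrayOutputStream` buffers holds mutable state, so two threads calling into the same instance could interleave writes. One common way to keep the reuse while avoiding sharing is a per-thread instance via `ThreadLocal`. The sketch below is an assumption for illustration; `PerThreadBufferSketch`, `Scratch`, and `copyThroughScratch` are hypothetical names, and the thread does not say whether the PR takes this approach.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: each thread gets its own reusable scratch buffer.
class PerThreadBufferSketch {
  // Hypothetical holder for a reusable, non-thread-safe buffer.
  static final class Scratch {
    final ByteArrayOutputStream baos = new ByteArrayOutputStream(1024);
  }

  // withInitial lazily creates one Scratch per thread on first access.
  private static final ThreadLocal<Scratch> SCRATCH = ThreadLocal.withInitial(Scratch::new);

  // Writes the value through the calling thread's scratch buffer and returns a copy.
  static byte[] copyThroughScratch(byte[] value) {
    ByteArrayOutputStream baos = SCRATCH.get().baos;
    baos.reset();                        // reuse the buffer instead of reallocating
    baos.write(value, 0, value.length);  // this overload declares no checked exception
    return baos.toByteArray();
  }
}
```

If the map is only ever accessed from a single thread, the plain cached-buffer design is sufficient and the `ThreadLocal` adds nothing.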

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];
+    }
+
+    public byte[] compressBytes(final byte [] value) throws IOException {

Review comment:
       Do these need to be public?







[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
rmahindra123 commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r667254510



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];

Review comment:
       Yes, it is dynamic: the decompressed output goes to a ByteArrayOutputStream (decompressBaos), which grows as needed. decompressBuffer is just an intermediate buffer that reads from the input stream and writes to the output stream, i.e., decompressBaos. So 8192 is the maximum number of bytes read at a time.
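The buffer handling described here can be sketched as follows. This is a minimal illustration assuming `java.util.zip` deflate streams; the class name `CompressionSketch` and both method bodies are hypothetical, since the thread only shows the fields and the `compressBytes` signature, not the PR's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Illustrative only: reusable output streams plus a fixed-size copy buffer.
class CompressionSketch {
  static final int INITIAL_BUFFER_SIZE = 1048576; // same initial size as in the diff

  // Reused across calls; they grow on demand, so output size is not capped.
  private final ByteArrayOutputStream compressBaos = new ByteArrayOutputStream(INITIAL_BUFFER_SIZE);
  private final ByteArrayOutputStream decompressBaos = new ByteArrayOutputStream(INITIAL_BUFFER_SIZE);
  // Fixed-size intermediate buffer: at most 8192 bytes move per read call.
  private final byte[] decompressBuffer = new byte[8192];

  byte[] compressBytes(final byte[] value) throws IOException {
    compressBaos.reset();
    // Closing the deflater stream flushes the final compressed block.
    // Closing a ByteArrayOutputStream has no effect, so it stays reusable.
    try (DeflaterOutputStream dos = new DeflaterOutputStream(compressBaos)) {
      dos.write(value);
    }
    return compressBaos.toByteArray();
  }

  byte[] decompressBytes(final byte[] bytes) throws IOException {
    decompressBaos.reset();
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes))) {
      int len;
      // Copy chunk by chunk until the inflater reports end of stream.
      while ((len = in.read(decompressBuffer)) != -1) {
        decompressBaos.write(decompressBuffer, 0, len);
      }
    }
    return decompressBaos.toByteArray();
  }
}
```

Because both streams are reset and reused, an instance like this carries mutable state between calls and should not be shared across threads without extra coordination.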







[GitHub] [hudi] nsivabalan merged pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #3128:
URL: https://github.com/apache/hudi/pull/3128


   





[GitHub] [hudi] nsivabalan commented on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-870849822


   sure, will take care of this





[GitHub] [hudi] hudi-bot commented on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   ## CI report:
   
   * 264aaba2e451e1579fd1e3c4fc378ad7563ce3bd UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328",
       "triggerID" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "triggerType" : "PUSH"
     }, {
       "hash" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330",
       "triggerID" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 973773c06c47af2ed5410c0938e690ea3aad7f48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330) 
   * 2f8c0835c87a73417c3c0c05f1d515b61acba352 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r668901207



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##########
@@ -290,6 +290,11 @@
       .defaultValue(ExternalSpillableMap.DiskMapType.BITCASK)
       .withDocumentation("Enable usage of either BITCASK or ROCKS_DB as disk map for External Spillable Map");
 
+  public static final ConfigProperty<Boolean> DISK_MAP_BITCASK_COMPRESSION_ENABLED = ConfigProperty
+      .key("hoodie.diskmap.bitcask.enabled")
+      .defaultValue(false)

Review comment:
       Sorry, may I know why the property name does not include "compression"? I was expecting something like "hoodie.diskmap.bitcask.compression.enabled".

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java
##########
@@ -399,4 +419,47 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class CompressionHandler implements Serializable {
+    private static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+    private static final int DECOMPRESS_INTERMEDIATE_BUFFER_SIZE = 8192;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressIntermediateBuffer;
+
+    CompressionHandler() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressIntermediateBuffer = new byte[DECOMPRESS_INTERMEDIATE_BUFFER_SIZE];
+    }
+
+    private byte[] compressBytes(final byte[] value) throws IOException {
+      compressBaos.reset();
+      Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
+      DeflaterOutputStream dos = new DeflaterOutputStream(compressBaos, deflater);
+      try {
+        dos.write(value);
+      } finally {
+        dos.close();
+        deflater.end();
+      }
+      return compressBaos.toByteArray();
+    }
+
+    private byte[] decompressBytes(final byte[] bytes) throws IOException {
+      decompressBaos.reset();
+      InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes));
+      try {
+        int len;
+        while ((len = in.read(decompressIntermediateBuffer)) > 0) {

Review comment:
       Is decompressIntermediateBuffer overwritten on every read?

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java
##########
@@ -108,7 +116,7 @@ public ExternalSpillableMap(Long maxInMemorySizeInBytes, String baseFilePath, Si
                 break;
               case BITCASK:
               default:
-                diskBasedMap = new BitCaskDiskMap<>(baseFilePath);
+                diskBasedMap = isCompressionEnabled ? (new BitCaskDiskMap<>(baseFilePath, true)) : (new BitCaskDiskMap<>(baseFilePath, false));

Review comment:
       This can be simplified to `new BitCaskDiskMap<>(baseFilePath, isCompressionEnabled)`.
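
The suggestion above relies on both branches of the ternary calling the same constructor with a literal that equals the condition. A minimal sketch of that equivalence follows; `DiskMapStub` is a hypothetical stand-in for `BitCaskDiskMap`, not the Hudi class.

```java
final class TernarySimplificationSketch {

  // Hypothetical stand-in for BitCaskDiskMap; it only records its constructor arguments.
  static final class DiskMapStub {
    final String baseFilePath;
    final boolean compressionEnabled;

    DiskMapStub(String baseFilePath, boolean compressionEnabled) {
      this.baseFilePath = baseFilePath;
      this.compressionEnabled = compressionEnabled;
    }
  }

  // Shape used in the diff: each branch passes a literal matching the condition.
  static DiskMapStub viaTernary(String baseFilePath, boolean isCompressionEnabled) {
    return isCompressionEnabled
        ? new DiskMapStub(baseFilePath, true)
        : new DiskMapStub(baseFilePath, false);
  }

  // Shape suggested in the review: pass the flag straight through.
  static DiskMapStub viaFlag(String baseFilePath, boolean isCompressionEnabled) {
    return new DiskMapStub(baseFilePath, isCompressionEnabled);
  }

  public static void main(String[] args) {
    for (boolean flag : new boolean[] {true, false}) {
      boolean same = viaTernary("/tmp/spill", flag).compressionEnabled
          == viaFlag("/tmp/spill", flag).compressionEnabled;
      System.out.println("flag=" + flag + " equivalent=" + same);
    }
  }
}
```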

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
##########
@@ -290,6 +290,11 @@
       .defaultValue(ExternalSpillableMap.DiskMapType.BITCASK)
       .withDocumentation("Enable usage of either BITCASK or ROCKS_DB as disk map for External Spillable Map");
 
+  public static final ConfigProperty<Boolean> DISK_MAP_BITCASK_COMPRESSION_ENABLED = ConfigProperty
+      .key("hoodie.diskmap.bitcask.enabled")
+      .defaultValue(false)

Review comment:
       Also, why not enable this by default?
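
To make the naming and default-value questions concrete, here is a rough sketch of how such a flag resolves when the key is absent versus set. It uses plain `java.util.Properties` rather than Hudi's `ConfigProperty`, and the key string is the one proposed in the review comment, which is an assumption about the final name.

```java
import java.util.Properties;

final class DiskMapCompressionFlagSketch {

  // Key as proposed in the review comment; assumed, not confirmed as the merged name.
  static final String KEY = "hoodie.diskmap.bitcask.compression.enabled";

  // Default shown in the diff under review.
  static final boolean DEFAULT_VALUE = false;

  // Falls back to DEFAULT_VALUE when the key is not set.
  static boolean isCompressionEnabled(Properties props) {
    return Boolean.parseBoolean(props.getProperty(KEY, String.valueOf(DEFAULT_VALUE)));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println("unset: " + isCompressionEnabled(props));
    props.setProperty(KEY, "true");
    System.out.println("set:   " + isCompressionEnabled(props));
  }
}
```

With a default of `false`, existing jobs keep the uncompressed behaviour unless they opt in; flipping the default would change that for every user of the BITCASK disk map.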







[GitHub] [hudi] nsivabalan commented on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-867784936


   Looks like there are some checkstyle issues. Can you please look into them?





[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328",
       "triggerID" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "triggerType" : "PUSH"
     }, {
       "hash" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 264aaba2e451e1579fd1e3c4fc378ad7563ce3bd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328) 
   * 973773c06c47af2ed5410c0938e690ea3aad7f48 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (264aaba) into [master](https://codecov.io/gh/apache/hudi/commit/c08fbb4268ee4b227452fd27d5e6ba322eeef00e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c08fbb4) will **increase** coverage by `13.05%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3128       +/-   ##
   =============================================
   + Coverage     46.01%   59.07%   +13.05%     
   + Complexity     5306     1181     -4125     
   =============================================
     Files           911      162      -749     
     Lines         39476     6365    -33111     
     Branches       4254      670     -3584     
   =============================================
   - Hits          18166     3760    -14406     
   + Misses        19456     2333    -17123     
   + Partials       1854      272     -1582     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `66.86% <ø> (+36.41%)` | :arrow_up: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `49.88% <ø> (-8.50%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | [...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh) | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | [...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-71.82%)` | :arrow_down: |
   | [...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh) | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | `40.69% <0.00%> (-23.84%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/HoodieSnapshotCopier.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90Q29waWVyLmphdmE=) | `0.00% <0.00%> (-13.80%)` | :arrow_down: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `86.36% <0.00%> (-0.91%)` | :arrow_down: |
   | [...e/hudi/exception/HoodieCorruptedDataException.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUNvcnJ1cHRlZERhdGFFeGNlcHRpb24uamF2YQ==) | | |
   | [...hudi/table/action/commit/AbstractDeleteHelper.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9jb21taXQvQWJzdHJhY3REZWxldGVIZWxwZXIuamF2YQ==) | | |
   | ... and [742 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c08fbb4...264aaba](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
rmahindra123 commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r669296725



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java
##########
@@ -399,4 +419,47 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class CompressionHandler implements Serializable {
+    private static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+    private static final int DECOMPRESS_INTERMEDIATE_BUFFER_SIZE = 8192;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressIntermediateBuffer;
+
+    CompressionHandler() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressIntermediateBuffer = new byte[DECOMPRESS_INTERMEDIATE_BUFFER_SIZE];
+    }
+
+    private byte[] compressBytes(final byte[] value) throws IOException {
+      compressBaos.reset();
+      Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
+      DeflaterOutputStream dos = new DeflaterOutputStream(compressBaos, deflater);
+      try {
+        dos.write(value);
+      } finally {
+        dos.close();
+        deflater.end();
+      }
+      return compressBaos.toByteArray();
+    }
+
+    private byte[] decompressBytes(final byte[] bytes) throws IOException {
+      decompressBaos.reset();
+      InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes));
+      try {
+        int len;
+        while ((len = in.read(decompressIntermediateBuffer)) > 0) {

Review comment:
       Yes, it is only used as a scratch buffer to read from the input stream and write to the output stream. I also ran this by Vinoth, and he is fine with it.
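
The buffer-reuse pattern discussed here can be sketched roughly as below. This is a simplified, standalone approximation and not the code from the PR: the class name and buffer sizes are illustrative, and `IOException` is wrapped in `UncheckedIOException` purely to keep the example short. Each `read` overwrites the scratch buffer from index 0, and only the first `len` bytes are copied out, so stale bytes from an earlier read never reach the result.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

final class BufferReuseSketch {
  // Reused across calls; reset() discards previous content but keeps the backing array.
  private final ByteArrayOutputStream decompressBaos = new ByteArrayOutputStream(1024);
  // Scratch buffer: overwritten by every read, never read past 'len'.
  private final byte[] intermediate = new byte[8192];

  static byte[] compress(byte[] value) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
    try (DeflaterOutputStream dos = new DeflaterOutputStream(baos, deflater)) {
      dos.write(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      // A caller-supplied Deflater is not released by the stream, so end it here.
      deflater.end();
    }
    return baos.toByteArray();
  }

  byte[] decompress(byte[] bytes) {
    decompressBaos.reset();
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes))) {
      int len;
      while ((len = in.read(intermediate)) > 0) {
        // Copy only the bytes filled by this read.
        decompressBaos.write(intermediate, 0, len);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return decompressBaos.toByteArray();
  }

  public static void main(String[] args) {
    byte[] payload = new byte[20000]; // larger than the scratch buffer, so several reads occur
    for (int i = 0; i < payload.length; i++) {
      payload[i] = (byte) (i % 7);
    }
    BufferReuseSketch sketch = new BufferReuseSketch();
    byte[] restored = sketch.decompress(compress(payload));
    System.out.println("round trip ok: " + Arrays.equals(payload, restored));
  }
}
```

One consequence of sharing the buffers on the instance is that a single handler should not be used from multiple threads at once without external synchronization.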







[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328",
       "triggerID" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "triggerType" : "PUSH"
     }, {
       "hash" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330",
       "triggerID" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830",
       "triggerID" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 973773c06c47af2ed5410c0938e690ea3aad7f48 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330) 
   * 2f8c0835c87a73417c3c0c05f1d515b61acba352 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
rmahindra123 commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r667255510



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];
+    }
+
+    public byte[] compressBytes(final byte [] value) throws IOException {
+      compressBaos.reset();

Review comment:
       Sorry, I did not follow here.










[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115









[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5f8d112) into [master](https://codecov.io/gh/apache/hudi/commit/01ad449ad68d9c77ee1493fcdf833df53df6106a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (01ad449) will **decrease** coverage by `36.43%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3128       +/-   ##
   =============================================
   - Coverage     52.19%   15.75%   -36.44%     
   + Complexity     2659      493     -2166     
   =============================================
     Files           335      284       -51     
     Lines         14981    11830     -3151     
     Branches       1505      982      -523     
   =============================================
   - Hits           7819     1864     -5955     
   - Misses         6536     9803     +3267     
   + Partials        626      163      -463     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (∅)` | |
   | hudicommon | `?` | |
   | hudihadoopmr | `?` | |
   | hudisync | `4.85% <ø> (?)` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <ø> (-10.39%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ient/http/HoodieWriteCommitHttpCallbackClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL2NsaWVudC9odHRwL0hvb2RpZVdyaXRlQ29tbWl0SHR0cENhbGxiYWNrQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...udi/callback/util/HoodieCommitCallbackFactory.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL3V0aWwvSG9vZGllQ29tbWl0Q2FsbGJhY2tGYWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...a/org/apache/hudi/client/AbstractHoodieClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdEhvb2RpZUNsaWVudC5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [.../apache/hudi/client/AbstractHoodieWriteClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdEhvb2RpZVdyaXRlQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...va/org/apache/hudi/client/AsyncCleanerService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9Bc3luY0NsZWFuZXJTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/hudi/client/CompactionAdminClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9Db21wYWN0aW9uQWRtaW5DbGllbnQuamF2YQ==) | `0.00% <ø> (ø)` | |
   | ... and [622 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [f0a2f37...5f8d112](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328",
       "triggerID" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "triggerType" : "PUSH"
     }, {
       "hash" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330",
       "triggerID" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830",
       "triggerID" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5f8d112453cf0c9fc1f9f4dcb081f6f82459ca4e",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=899",
       "triggerID" : "5f8d112453cf0c9fc1f9f4dcb081f6f82459ca4e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5f8d112453cf0c9fc1f9f4dcb081f6f82459ca4e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=899) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115









[GitHub] [hudi] hudi-bot edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865345707


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=328",
       "triggerID" : "264aaba2e451e1579fd1e3c4fc378ad7563ce3bd",
       "triggerType" : "PUSH"
     }, {
       "hash" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=330",
       "triggerID" : "973773c06c47af2ed5410c0938e690ea3aad7f48",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830",
       "triggerID" : "2f8c0835c87a73417c3c0c05f1d515b61acba352",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5f8d112453cf0c9fc1f9f4dcb081f6f82459ca4e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=899",
       "triggerID" : "5f8d112453cf0c9fc1f9f4dcb081f6f82459ca4e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2f8c0835c87a73417c3c0c05f1d515b61acba352 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=830) 
   * 5f8d112453cf0c9fc1f9f4dcb081f6f82459ca4e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=899) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5f8d112) into [master](https://codecov.io/gh/apache/hudi/commit/01ad449ad68d9c77ee1493fcdf833df53df6106a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (01ad449) will **decrease** coverage by `6.05%`.
   > The diff coverage is `37.44%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3128      +/-   ##
   ============================================
   - Coverage     52.19%   46.14%   -6.06%     
   - Complexity     2659     4666    +2007     
   ============================================
     Files           335      838     +503     
     Lines         14981    35596   +20615     
     Branches       1505     3498    +1993     
   ============================================
   + Hits           7819    16424    +8605     
   - Misses         6536    17733   +11197     
   - Partials        626     1439     +813     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <21.39%> (+1.14%)` | :arrow_up: |
   | hudiclient | `34.47% <39.38%> (∅)` | |
   | hudicommon | `48.63% <ø> (-6.11%)` | :arrow_down: |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `67.21% <ø> (?)` | |
   | hudisync | `55.73% <ø> (?)` | |
   | huditimelineservice | `64.07% <ø> (-1.23%)` | :arrow_down: |
   | hudiutilities | `59.26% <ø> (-10.39%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...a/org/apache/hudi/cli/HoodieTableHeaderFields.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL0hvb2RpZVRhYmxlSGVhZGVyRmllbGRzLmphdmE=) | `0.00% <ø> (ø)` | |
   | [...rg/apache/hudi/cli/commands/SavepointsCommand.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL1NhdmVwb2ludHNDb21tYW5kLmphdmE=) | `13.84% <0.00%> (-0.44%)` | :arrow_down: |
   | [...org/apache/hudi/cli/utils/InputStreamConsumer.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL3V0aWxzL0lucHV0U3RyZWFtQ29uc3VtZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ain/scala/org/apache/hudi/cli/DedupeSparkJob.scala](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9EZWR1cGVTcGFya0pvYi5zY2FsYQ==) | `0.00% <0.00%> (ø)` | |
   | [.../main/scala/org/apache/hudi/cli/SparkHelpers.scala](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL2NsaS9TcGFya0hlbHBlcnMuc2NhbGE=) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ient/http/HoodieWriteCommitHttpCallbackClient.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL2NsaWVudC9odHRwL0hvb2RpZVdyaXRlQ29tbWl0SHR0cENhbGxiYWNrQ2xpZW50LmphdmE=) | `51.35% <0.00%> (ø)` | |
   | [...udi/callback/util/HoodieCommitCallbackFactory.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NhbGxiYWNrL3V0aWwvSG9vZGllQ29tbWl0Q2FsbGJhY2tGYWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [862 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [f0a2f37...5f8d112](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
rmahindra123 commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r667259394



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestBitCaskDiskMap.java
##########
@@ -66,27 +68,33 @@ public void setup() {
     initPath();
   }
 
-  @Test
-  public void testSimpleInsert() throws IOException, URISyntaxException {
-    BitCaskDiskMap records = new BitCaskDiskMap<>(basePath);
+  @ParameterizedTest
+  @ValueSource(booleans = {false, true})
+  public void testSimpleInsert(boolean isCompressionEnabled) throws IOException, URISyntaxException {
+    BitCaskDiskMap records = new BitCaskDiskMap<>(basePath, isCompressionEnabled);
     List<IndexedRecord> iRecords = SchemaTestUtil.generateHoodieTestRecords(0, 100);
-    ((GenericRecord) iRecords.get(0)).get(HoodieRecord.COMMIT_TIME_METADATA_FIELD).toString();
     List<String> recordKeys = SpillableMapTestUtils.upsertRecords(iRecords, records);
 
+    Map<String, IndexedRecord> originalRecords = iRecords.stream()
+        .collect(Collectors.toMap(k -> ((GenericRecord) k).get(HoodieRecord.RECORD_KEY_METADATA_FIELD).toString(), v -> v));
+
     // make sure records have spilled to disk
     assertTrue(records.sizeOfFileOnDiskInBytes() > 0);
     Iterator<HoodieRecord<? extends HoodieRecordPayload>> itr = records.iterator();
-    List<HoodieRecord> oRecords = new ArrayList<>();
     while (itr.hasNext()) {
       HoodieRecord<? extends HoodieRecordPayload> rec = itr.next();
-      oRecords.add(rec);
       assert recordKeys.contains(rec.getRecordKey());
+      IndexedRecord originalRecord = originalRecords.get(rec.getRecordKey());
+      HoodieAvroPayload payload = (HoodieAvroPayload) rec.getData();
+      Option<IndexedRecord> value = payload.getInsertValue(HoodieAvroUtils.addMetadataFields(getSimpleSchema()));
+      assertEquals(originalRecord, value.get());

Review comment:
       Added a check on the value, which should exercise compression/decompression.
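
The value check described above amounts to a compress/decompress round trip. A minimal standalone sketch of that idea follows. It assumes the JDK's `java.util.zip.Deflater` and `Inflater` rather than whatever compressor sits behind `DISK_COMPRESSION_REF` in the PR (that class is not shown in this thread), and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not the PR's compression class.
class CompressionRoundTripSketch {

  // Compress a byte array with the JDK Deflater.
  static byte[] compress(byte[] input) {
    Deflater deflater = new Deflater(Deflater.BEST_SPEED);
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    while (!deflater.finished()) {
      int n = deflater.deflate(buffer);
      out.write(buffer, 0, n);
    }
    deflater.end();
    return out.toByteArray();
  }

  // Decompress bytes produced by compress(); malformed input is rejected.
  static byte[] decompress(byte[] input) {
    Inflater inflater = new Inflater();
    inflater.setInput(input);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buffer);
        // No progress and not finished means the input is truncated or needs a dictionary.
        if (n == 0 && !inflater.finished()
            && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IllegalArgumentException("truncated or unsupported input");
        }
        out.write(buffer, 0, n);
      }
    } catch (DataFormatException e) {
      throw new IllegalArgumentException("input is not valid deflate data", e);
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    // Build a repetitive payload, similar in spirit to serialized records.
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < 100; i++) {
      sb.append("record-key:record-value;");
    }
    byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);
    byte[] restored = decompress(compress(original));
    System.out.println("round trip ok: " + Arrays.equals(original, restored));
  }
}
```

Comparing the restored value against the original, as the updated test does with `assertEquals(originalRecord, value.get())`, is what catches a broken read path; checking only that keys are present would not.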







[GitHub] [hudi] nsivabalan commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r669909750



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/BitCaskDiskMap.java
##########
@@ -188,21 +204,25 @@ public R get(Object key) {
   }
 
   private R get(ValueMetadata entry) {
-    return get(entry, getRandomAccessFile());
+    return get(entry, getRandomAccessFile(), isCompressionEnabled);
   }
 
-  public static <R> R get(ValueMetadata entry, RandomAccessFile file) {
+  public static <R> R get(ValueMetadata entry, RandomAccessFile file, boolean isCompressionEnabled) {
     try {
-      return SerializationUtils
-          .deserialize(SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue()));
+      byte[] bytesFromDisk = SpillableMapUtils.readBytesFromDisk(file, entry.getOffsetOfValue(), entry.getSizeOfValue());
+      if (isCompressionEnabled) {
+        return SerializationUtils.deserialize(DISK_COMPRESSION_REF.get().decompressBytes(bytesFromDisk));
+      }

Review comment:
       Not required to fix in this patch, but something to keep in mind: it would be good to have an explicit else block for line 216. This "if" block is just one line, so it's fine. But if it were a large "if" block, a reader/dev might wonder whether some code path does not return from within the "if" block, and whether that is why there is a return outside of it. 
   So, whenever you have an if/else, try to always add the else block explicitly. 
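
To illustrate the suggestion, here is a small sketch of a read path written with an explicit else. It is not the PR's code: `readValue`, `DROP_PREFIX`, and the class name are hypothetical, and the "decompressor" is a toy that drops a one-byte prefix so the example has no Hudi dependency.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical sketch; the names below do not come from the PR.
class ExplicitElseSketch {

  // Toy stand-in for a decompressor: strips a one-byte marker prefix.
  static final UnaryOperator<byte[]> DROP_PREFIX =
      bytes -> Arrays.copyOfRange(bytes, 1, bytes.length);

  // Both outcomes are spelled out, so it is obvious that every path returns.
  static String readValue(byte[] bytesFromDisk, boolean isCompressionEnabled,
                          UnaryOperator<byte[]> decompressor) {
    if (isCompressionEnabled) {
      return new String(decompressor.apply(bytesFromDisk), StandardCharsets.UTF_8);
    } else {
      return new String(bytesFromDisk, StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) {
    byte[] plain = "abc".getBytes(StandardCharsets.UTF_8);
    byte[] marked = "Xabc".getBytes(StandardCharsets.UTF_8);
    System.out.println(readValue(plain, false, DROP_PREFIX));
    System.out.println(readValue(marked, true, DROP_PREFIX));
  }
}
```

With both branches written out, a reader does not have to scan a long "if" body to confirm that it always returns before the trailing statement.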

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java
##########
@@ -200,7 +200,7 @@ protected void initializeIncomingRecordsMap() {
       LOG.info("MaxMemoryPerPartitionMerge => " + memoryForMerge);
       this.keyToNewRecords = new ExternalSpillableMap<>(memoryForMerge, config.getSpillableMapBasePath(),
           new DefaultSizeEstimator(), new HoodieRecordSizeEstimator(tableSchema),
-          config.getSpillableDiskMapType());
+          config.getSpillableDiskMapType(), config.isBitCaskDiskMapCompressionEnabled());

Review comment:
       Do you want to make the same change in HoodieMergedLogRecordScanner as well? Or is that planned for a follow-up PR?







[GitHub] [hudi] rmahindra123 commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
rmahindra123 commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r670094851



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java
##########
@@ -200,7 +200,7 @@ protected void initializeIncomingRecordsMap() {
       LOG.info("MaxMemoryPerPartitionMerge => " + memoryForMerge);
       this.keyToNewRecords = new ExternalSpillableMap<>(memoryForMerge, config.getSpillableMapBasePath(),
           new DefaultSizeEstimator(), new HoodieRecordSizeEstimator(tableSchema),
-          config.getSpillableDiskMapType());
+          config.getSpillableDiskMapType(), config.isBitCaskDiskMapCompressionEnabled());

Review comment:
       Good point. This will be done in a follow-up PR: https://issues.apache.org/jira/browse/HUDI-2044







[GitHub] [hudi] codecov-commenter edited a comment on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-865384115


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3128](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (973773c) into [master](https://codecov.io/gh/apache/hudi/commit/c08fbb4268ee4b227452fd27d5e6ba322eeef00e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c08fbb4) will **increase** coverage by `3.86%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3128/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3128      +/-   ##
   ============================================
   + Coverage     46.01%   49.88%   +3.86%     
   + Complexity     5306      394    -4912     
   ============================================
     Files           911       66     -845     
     Lines         39476     2919   -36557     
     Branches       4254      318    -3936     
   ============================================
   - Hits          18166     1456   -16710     
   + Misses        19456     1327   -18129     
   + Partials       1854      136    -1718     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `?` | |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `49.88% <ø> (-8.50%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `5.17% <0.00%> (-83.63%)` | :arrow_down: |
   | [...hudi/utilities/schema/JdbcbasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9KZGJjYmFzZWRTY2hlbWFQcm92aWRlci5qYXZh) | `0.00% <0.00%> (-72.23%)` | :arrow_down: |
   | [...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-71.82%)` | :arrow_down: |
   | [...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh) | `0.00% <0.00%> (-66.67%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | `40.69% <0.00%> (-23.84%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/HoodieSnapshotCopier.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90Q29waWVyLmphdmE=) | `0.00% <0.00%> (-13.80%)` | :arrow_down: |
   | [...e/hudi/sink/transform/RowDataToHoodieFunction.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3RyYW5zZm9ybS9Sb3dEYXRhVG9Ib29kaWVGdW5jdGlvbi5qYXZh) | | |
   | [...rg/apache/hudi/cli/commands/HoodieSyncCommand.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL0hvb2RpZVN5bmNDb21tYW5kLmphdmE=) | | |
   | [...org/apache/hudi/util/StringToRowDataConverter.java](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmluZ1RvUm93RGF0YUNvbnZlcnRlci5qYXZh) | | |
   | ... and [836 more](https://codecov.io/gh/apache/hudi/pull/3128/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c08fbb4...973773c](https://codecov.io/gh/apache/hudi/pull/3128?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] vinothchandar commented on pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#issuecomment-869760645


   @nsivabalan do you want to take this home? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on a change in pull request #3128: [HUDI-2029] Implement compression for DiskBasedMap in Spillable Map

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3128:
URL: https://github.com/apache/hudi/pull/3128#discussion_r659870711



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];

Review comment:
       pull this into a static final?

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {

Review comment:
       rename: `Compressor` or `CompressionHandler`

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];

Review comment:
       What if a value is larger than 8192 bytes? Would it be better to use something dynamic?
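
A minimal sketch of one way to address the question above, assuming the decompression path reads through a `java.util.zip.InflaterInputStream` (the method bodies of the PR are not shown in this hunk, so this is an assumption). With a read loop, the 8192-byte array acts only as a per-read chunk, and the `ByteArrayOutputStream` grows as needed, so values larger than 8192 bytes still round-trip. The class name `ChunkedDecompressSketch` and the constant `DECOMPRESS_CHUNK_SIZE` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class ChunkedDecompressSketch {
  // Hypothetical constant name; 8192 matches the literal in the diff above.
  public static final int DECOMPRESS_CHUNK_SIZE = 8192;

  // Compresses a value with the JDK's DEFLATE stream (assumed codec).
  public static byte[] compress(byte[] value) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (DeflaterOutputStream dos = new DeflaterOutputStream(baos)) {
      dos.write(value);
    }
    return baos.toByteArray();
  }

  // Inflates in fixed-size chunks; the output stream grows dynamically,
  // so the chunk size does not bound the size of the value.
  public static byte[] decompress(byte[] compressed) throws IOException {
    byte[] chunk = new byte[DECOMPRESS_CHUNK_SIZE];
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
      int len;
      while ((len = in.read(chunk)) != -1) {
        baos.write(chunk, 0, len);
      }
    }
    return baos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // A value well beyond 8192 bytes.
    byte[] value = new byte[100_000];
    for (int i = 0; i < value.length; i++) {
      value[i] = (byte) (i % 251);
    }
    byte[] roundTrip = decompress(compress(value));
    System.out.println("round trip ok: " + Arrays.equals(value, roundTrip));
  }
}
```

If the PR's code instead makes a single `read` into the fixed array, values beyond the array size would be truncated, which is the case the comment asks about.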

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/collection/DiskBasedMap.java
##########
@@ -395,4 +417,48 @@ public int compareTo(ValueMetadata o) {
       return Long.compare(this.offsetOfValue, o.offsetOfValue);
     }
   }
+
+  private static class DiskCompressionInstance implements Serializable {
+    public static final int DISK_COMPRESSION_INITIAL_BUFFER_SIZE = 1048576;
+
+    // Caching ByteArrayOutputStreams to avoid recreating it for every operation
+    private final ByteArrayOutputStream compressBaos;
+    private final ByteArrayOutputStream decompressBaos;
+    private final byte[] decompressBuffer;
+
+    DiskCompressionInstance() {
+      compressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBaos = new ByteArrayOutputStream(DISK_COMPRESSION_INITIAL_BUFFER_SIZE);
+      decompressBuffer = new byte[8192];
+    }
+
+    public byte[] compressBytes(final byte [] value) throws IOException {
+      compressBaos.reset();

Review comment:
       you may have to handle some resizing logic here w.r.t the 8MB limit
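
A rough sketch of what such resizing logic could look like, under stated assumptions. `ByteArrayOutputStream.reset()` only sets the written count back to zero and keeps the internal array, so a cached stream that once held a very large value keeps that memory. The class `BoundedReusableBuffer`, the method `acquire()`, and the constant `MAX_RETAINED_SIZE` are hypothetical names; the 8 MiB cap simply mirrors the figure mentioned in the comment and is not taken from the PR code.

```java
import java.io.ByteArrayOutputStream;

public class BoundedReusableBuffer {
  // Both sizes are illustrative assumptions, not values confirmed by the PR.
  public static final int INITIAL_SIZE = 1 << 20;      // 1 MiB, i.e. 1048576 as in the diff
  public static final int MAX_RETAINED_SIZE = 8 << 20; // 8 MiB, assumed cap

  private ByteArrayOutputStream baos = new ByteArrayOutputStream(INITIAL_SIZE);

  // Returns an empty stream for the next operation. The size of the previous
  // write is used as a proxy for capacity, because the internal array of
  // ByteArrayOutputStream is not publicly accessible.
  public ByteArrayOutputStream acquire() {
    if (baos.size() > MAX_RETAINED_SIZE) {
      // Drop the oversized buffer so it can be garbage collected.
      baos = new ByteArrayOutputStream(INITIAL_SIZE);
    } else {
      // Reuse the existing buffer; reset() keeps its capacity.
      baos.reset();
    }
    return baos;
  }
}
```

A `compressBytes` implementation could then call `acquire()` in place of `compressBaos.reset()`; whether that trade-off is worthwhile depends on how often oversized values occur.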



