You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/06/16 13:07:20 UTC

[GitHub] [hudi] baobaoyeye opened a new pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

baobaoyeye opened a new pull request #1739:
URL: https://github.com/apache/hudi/pull/1739


   ## What is the purpose of the pull request
   
   This pull request cleaned some rolling stat metadata's abandoned operation, when commit
   and clean up related operations in unit tests
   
   ## Brief change log
   
     - Remove Rolling Stat management from Hudi Writer
   
   ## Verify this pull request
   
   This pull request is already covered by existing tests, such as
    - *TestHoodieClientOnCopyOnWriteStorage.java*
    - *TestHoodieMergeOnReadTable.java*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] bvaradar merged pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
bvaradar merged pull request #1739:
URL: https://github.com/apache/hudi/pull/1739


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] baobaoyeye commented on pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
baobaoyeye commented on pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#issuecomment-647296501


   @nsivabalan fixed
   
   > @baobaoyeye : can you fix the commit message (may be same as desc). I will merge once we you are done.
   
   fixed,thanks


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] baobaoyeye commented on a change in pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
baobaoyeye commented on a change in pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#discussion_r447956376



##########
File path: hudi-client/src/main/java/org/apache/hudi/table/action/commit/BaseCommitActionExecutor.java
##########
@@ -231,15 +231,7 @@ protected void finalizeWrite(String instantTime, List<HoodieWriteStat> stats, Ho
   }
 
   private void updateMetadataAndRollingStats(HoodieCommitMetadata metadata, List<HoodieWriteStat> writeStats) {
-    // 1. Look up the previous compaction/commit and get the HoodieCommitMetadata from there.
-    // 2. Now, first read the existing rolling stats and merge with the result of current metadata.
-
-    // Need to do this on every commit (delta or commit) to support COW and MOR.
-    for (HoodieWriteStat stat : writeStats) {
-      String partitionPath = stat.getPartitionPath();
-      // TODO: why is stat.getPartitionPath() null at times here.
-      metadata.addWriteStat(partitionPath, stat);
-    }
+    writeStats.forEach(stat -> metadata.addWriteStat(stat.getPartitionPath(), stat));

Review comment:
       Good suggestion, fixed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#issuecomment-647165636


   @baobaoyeye : can you fix the commit message (may be same as desc). I will merge once we you are done. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#issuecomment-644838726


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=h1) Report
   > Merging [#1739](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/043eb564c2c7e4578237b5d0f2bbad8276db51f4&el=desc) will **decrease** coverage by `0.25%`.
   > The diff coverage is `50.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1739/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1739      +/-   ##
   ============================================
   - Coverage     18.14%   17.89%   -0.26%     
   + Complexity      859      844      -15     
   ============================================
     Files           352      352              
     Lines         15414    15388      -26     
     Branches       1526     1523       -3     
   ============================================
   - Hits           2797     2753      -44     
   - Misses        12259    12281      +22     
   + Partials        358      354       -4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [.../table/action/commit/BaseCommitActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9CYXNlQ29tbWl0QWN0aW9uRXhlY3V0b3IuamF2YQ==) | `47.27% <0.00%> (+1.25%)` | `14.00 <0.00> (ø)` | |
   | [.../apache/hudi/client/AbstractHoodieWriteClient.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0Fic3RyYWN0SG9vZGllV3JpdGVDbGllbnQuamF2YQ==) | `51.80% <100.00%> (-1.03%)` | `12.00 <1.00> (-1.00)` | |
   | [...e/hudi/common/model/HoodieRollingStatMetadata.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZVJvbGxpbmdTdGF0TWV0YWRhdGEuamF2YQ==) | `0.00% <0.00%> (-46.16%)` | `0.00% <0.00%> (-7.00%)` | |
   | [...rg/apache/hudi/common/model/HoodieRollingStat.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZVJvbGxpbmdTdGF0LmphdmE=) | `0.00% <0.00%> (-42.86%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `50.76% <0.00%> (-1.54%)` | `27.00% <0.00%> (-1.00%)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=footer). Last update [043eb56...a9e3ec3](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#issuecomment-647800356


   @n3nash can you take a quick second pass here? and we can merge?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] baobaoyeye commented on a change in pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
baobaoyeye commented on a change in pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#discussion_r447956245



##########
File path: hudi-client/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
##########
@@ -177,44 +175,7 @@ protected void finalizeWrite(HoodieTable<T> table, String instantTime, List<Hood
 
   private void updateMetadataAndRollingStats(String actionType, HoodieCommitMetadata metadata,

Review comment:
       Good suggestion, fixed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] baobaoyeye commented on pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
baobaoyeye commented on pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#issuecomment-652117383


   > @baobaoyeye : Thanks a lot for contributing this PR. I have few minor suggestions in cleaning up the code further. Can you please address then and let us know. We will merge it then.
   Good suggestion, fixed
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] baobaoyeye commented on a change in pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
baobaoyeye commented on a change in pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#discussion_r447956502



##########
File path: hudi-client/src/test/java/org/apache/hudi/client/TestHoodieClientOnCopyOnWriteStorage.java
##########
@@ -849,7 +848,7 @@ public void testCommitWritesRelativePaths() throws Exception {
   }
 
   /**
-   * Test to ensure commit metadata points to valid files.
+   * Test to ensure commit metadata points to valid files.10.
    */
   @Test
   public void testRollingStatsInMetadata() throws Exception {

Review comment:
       fixed

##########
File path: hudi-client/src/test/java/org/apache/hudi/table/TestHoodieMergeOnReadTable.java
##########
@@ -1094,14 +1093,10 @@ public void testRollingStatsInMetadata() throws Exception {
       HoodieCommitMetadata metadata = HoodieCommitMetadata.fromBytes(
           table.getActiveTimeline().getInstantDetails(table.getActiveTimeline().getDeltaCommitTimeline().lastInstant().get()).get(),
           HoodieCommitMetadata.class);
-      HoodieRollingStatMetadata rollingStatMetadata = HoodieCommitMetadata.fromBytes(
-          metadata.getExtraMetadata().get(HoodieRollingStatMetadata.ROLLING_STAT_METADATA_KEY).getBytes(),
-          HoodieRollingStatMetadata.class);
       int inserts = 0;
-      for (Map.Entry<String, Map<String, HoodieRollingStat>> pstat : rollingStatMetadata.getPartitionToRollingStats()
-          .entrySet()) {
-        for (Map.Entry<String, HoodieRollingStat> stat : pstat.getValue().entrySet()) {
-          inserts += stat.getValue().getInserts();
+      for (Map.Entry<String, List<HoodieWriteStat>> pstat : metadata.getPartitionToWriteStats().entrySet()) {

Review comment:
       fixed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter commented on pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#issuecomment-644838726


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=h1) Report
   > Merging [#1739](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/043eb564c2c7e4578237b5d0f2bbad8276db51f4&el=desc) will **decrease** coverage by `0.25%`.
   > The diff coverage is `50.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1739/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1739      +/-   ##
   ============================================
   - Coverage     18.14%   17.89%   -0.26%     
   + Complexity      859      844      -15     
   ============================================
     Files           352      352              
     Lines         15414    15388      -26     
     Branches       1526     1523       -3     
   ============================================
   - Hits           2797     2753      -44     
   - Misses        12259    12281      +22     
   + Partials        358      354       -4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [.../table/action/commit/BaseCommitActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9CYXNlQ29tbWl0QWN0aW9uRXhlY3V0b3IuamF2YQ==) | `47.27% <0.00%> (+1.25%)` | `14.00 <0.00> (ø)` | |
   | [.../apache/hudi/client/AbstractHoodieWriteClient.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0Fic3RyYWN0SG9vZGllV3JpdGVDbGllbnQuamF2YQ==) | `51.80% <100.00%> (-1.03%)` | `12.00 <1.00> (-1.00)` | |
   | [...e/hudi/common/model/HoodieRollingStatMetadata.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZVJvbGxpbmdTdGF0TWV0YWRhdGEuamF2YQ==) | `0.00% <0.00%> (-46.16%)` | `0.00% <0.00%> (-7.00%)` | |
   | [...rg/apache/hudi/common/model/HoodieRollingStat.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZVJvbGxpbmdTdGF0LmphdmE=) | `0.00% <0.00%> (-42.86%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/1739/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `50.76% <0.00%> (-1.54%)` | `27.00% <0.00%> (-1.00%)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=footer). Last update [043eb56...a9e3ec3](https://codecov.io/gh/apache/hudi/pull/1739?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] baobaoyeye removed a comment on pull request #1739: [HUDI-760]Remove Rolling Stat management from Hudi Writer

Posted by GitBox <gi...@apache.org>.
baobaoyeye removed a comment on pull request #1739:
URL: https://github.com/apache/hudi/pull/1739#issuecomment-652117383


   > @baobaoyeye : Thanks a lot for contributing this PR. I have few minor suggestions in cleaning up the code further. Can you please address then and let us know. We will merge it then.
   Good suggestion, fixed
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org