Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/01/13 08:39:59 UTC

[GitHub] [incubator-hudi] liujianhuiouc opened a new pull request #1216: [HUDI-525]

liujianhuiouc opened a new pull request #1216: [HUDI-525]
URL: https://github.com/apache/incubator-hudi/pull/1216
 
 
   Add insert info (the number of inserted records) to the HoodieCommitMetadata.
   Because the file ID is not yet known, it is set to an empty string.
   
   Testing result:
   ```
   
     "partitionToWriteStats" : {
       "americas/brazil/sao_paulo" : [ {
         "fileId" : "",
         "path" : null,
         "prevCommit" : null,
         "numWrites" : 0,
         "numDeletes" : 0,
         "numUpdateWrites" : 0,
         "numInserts" : 3,
         "totalWriteBytes" : 0,
         "totalWriteErrors" : 0,
         "tempPath" : null,
         "partitionPath" : null,
         "totalLogRecords" : 0,
         "totalLogFilesCompacted" : 0,
         "totalLogSizeCompacted" : 0,
         "totalUpdatedRecordsCompacted" : 0,
         "totalLogBlocks" : 0,
         "totalCorruptLogBlock" : 0,
         "totalRollbackBlocks" : 0,
         "fileSizeInBytes" : 0
       } ],
   ```
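
A rough sketch of how a per-partition, insert-only stat like the one in the output above could be assembled. `InflightInsertStats`, `InsertStat`, and `fromInsertCounts` are hypothetical stand-ins written for illustration, not Hudi classes; the actual change works with `HoodieWriteStat` and `WorkloadStat`, and the map of insert counts here is an assumed simplification of the workload profile.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InflightInsertStats {
    // Hypothetical stand-in for HoodieWriteStat; only the fields used here.
    static final class InsertStat {
        String fileId;    // stays null unless explicitly set
        long numInserts;
    }

    // Builds partition path -> [stat] from partition path -> planned insert count.
    static Map<String, List<InsertStat>> fromInsertCounts(Map<String, Long> insertsPerPartition) {
        Map<String, List<InsertStat>> result = new LinkedHashMap<>();
        insertsPerPartition.forEach((partition, inserts) -> {
            InsertStat stat = new InsertStat();
            stat.numInserts = inserts;
            stat.fileId = ""; // file ID is not known before the write, so use an empty string
            result.computeIfAbsent(partition, p -> new ArrayList<>()).add(stat);
        });
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("americas/brazil/sao_paulo", 3L);
        InsertStat stat = fromInsertCounts(counts).get("americas/brazil/sao_paulo").get(0);
        System.out.println(stat.numInserts + " inserts, fileId='" + stat.fileId + "'");
    }
}
```

Each partition gets one stat entry carrying only the insert count; every other counter keeps its default of zero, which matches the shape of the JSON above.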

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] liujianhuiouc commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
liujianhuiouc commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-583253681
 
 
   @n3nash Sorry for the late reply. I have fixed that by using HoodieWriteStat.NULL_COMMIT.


[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#discussion_r368082200
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##########
 @@ -428,6 +428,12 @@ private void saveWorkloadProfileMetadataToInflight(WorkloadProfile profile, Hood
       HoodieCommitMetadata metadata = new HoodieCommitMetadata();
       profile.getPartitionPaths().forEach(path -> {
         WorkloadStat partitionStat = profile.getWorkloadStat(path.toString());
+        HoodieWriteStat insertStat = new HoodieWriteStat();
+        insertStat.setNumInserts(partitionStat.getNumInserts());
+        insertStat.setFileId("");
 
 Review comment:
   Can we just leave them unset? They should just be null.


[GitHub] [incubator-hudi] liujianhuiouc commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
liujianhuiouc commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-574522300
 
 
   Sometimes we want to know the number of records in a bulk insert when the inflight state did not successfully transition to completed.


[GitHub] [incubator-hudi] liujianhuiouc closed pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
liujianhuiouc closed pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216
 
 
   


[GitHub] [incubator-hudi] n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-581707851
 
 
   @liujianhuiouc do you need any inputs here ?


[GitHub] [incubator-hudi] leesf commented on issue #1216: [HUDI-525]

Posted by GitBox <gi...@apache.org>.
leesf commented on issue #1216: [HUDI-525]
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-573590408
 
 
   Thanks for opening the contribution, @liujianhuiouc! Would you please change the title and follow the guide at https://hudi.apache.org/contributing.html#life-of-a-contributor?


[GitHub] [incubator-hudi] n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-578903392
 
 
   @liujianhuiouc Could you please address the last comment ? We can then merge this PR


[GitHub] [incubator-hudi] liujianhuiouc commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
liujianhuiouc commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-586070652
 
 
   @n3nash I don't have a definite use case. However, the fields related to updates are already in the metadata, so I think the insert info should be in that metadata as well.


[GitHub] [incubator-hudi] n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-573809598
 
 
   @liujianhuiouc What functionality are we going to enhance by adding this information to the inflight workload profile metadata ?


[GitHub] [incubator-hudi] n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#issuecomment-585358877
 
 
   @liujianhuiouc On second thought, do you have a definite use case for knowing the number of inserts during a failed write? Since there are no fileIds, this information is actually useless to Hudi. This intermediate commit metadata is meant for internal use by Hudi and not by external consumers, so I don't see the value in this. WDYT?


[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#discussion_r369936144
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##########
 @@ -428,6 +428,12 @@ private void saveWorkloadProfileMetadataToInflight(WorkloadProfile profile, Hood
       HoodieCommitMetadata metadata = new HoodieCommitMetadata();
       profile.getPartitionPaths().forEach(path -> {
         WorkloadStat partitionStat = profile.getWorkloadStat(path.toString());
+        HoodieWriteStat insertStat = new HoodieWriteStat();
+        insertStat.setNumInserts(partitionStat.getNumInserts());
+        insertStat.setFileId("");
 
 Review comment:
   Okay. For setPrevCommit, please use HoodieWriteStat.NULL_COMMIT.


[GitHub] [incubator-hudi] liujianhuiouc commented on a change in pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight

Posted by GitBox <gi...@apache.org>.
liujianhuiouc commented on a change in pull request #1216: [HUDI-525] lack of insert info in delta_commit inflight
URL: https://github.com/apache/incubator-hudi/pull/1216#discussion_r368354556
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##########
 @@ -428,6 +428,12 @@ private void saveWorkloadProfileMetadataToInflight(WorkloadProfile profile, Hood
       HoodieCommitMetadata metadata = new HoodieCommitMetadata();
       profile.getPartitionPaths().forEach(path -> {
         WorkloadStat partitionStat = profile.getWorkloadStat(path.toString());
+        HoodieWriteStat insertStat = new HoodieWriteStat();
+        insertStat.setNumInserts(partitionStat.getNumInserts());
+        insertStat.setFileId("");
 
 Review comment:
   Leaving them unset makes the test fail with an NPE.
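
The thread does not say where the NPE is thrown. As one plausible illustration only, any code that groups stats by file ID with `Collectors.groupingBy` fails on a null key, while an empty string is accepted. `NullFileIdDemo`, `Stat`, `insertsByFileId`, and `throwsNpe` are hypothetical names for this sketch and are not taken from Hudi.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NullFileIdDemo {
    // Hypothetical stand-in for a write stat.
    static final class Stat {
        final String fileId;
        final long numInserts;

        Stat(String fileId, long numInserts) {
            this.fileId = fileId;
            this.numInserts = numInserts;
        }

        String getFileId() {
            return fileId;
        }
    }

    // Sums inserts per file ID. Collectors.groupingBy rejects null keys
    // with a NullPointerException.
    static Map<String, Long> insertsByFileId(List<Stat> stats) {
        return stats.stream()
                .collect(Collectors.groupingBy(Stat::getFileId,
                        Collectors.summingLong(s -> s.numInserts)));
    }

    // Returns true if grouping the given stats throws an NPE.
    static boolean throwsNpe(List<Stat> stats) {
        try {
            insertsByFileId(stats);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null fileId throws NPE: " + throwsNpe(Arrays.asList(new Stat(null, 3))));
        System.out.println("empty fileId throws NPE: " + throwsNpe(Arrays.asList(new Stat("", 3))));
    }
}
```

Under this assumption, a non-null placeholder such as `""` keeps such consumers working even though the real file ID is unknown.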
