Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/27 02:57:16 UTC

[GitHub] [hudi] waywtdcc opened a new pull request, #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

waywtdcc opened a new pull request, #7075:
URL: https://github.com/apache/hudi/pull/7075

   …tch mode
   
   ### Change Logs
   
   Support writing tasks independently in Flink batch mode. JIRA: https://issues.apache.org/jira/browse/HUDI-5100
   
   ### Impact
   
   Supports writing tasks independently in Flink batch mode.
   
   ### Risk level (write none, low medium or high below)
   
   medium
   
   ### Documentation Update
   
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7075:
URL: https://github.com/apache/hudi/pull/7075#issuecomment-1292938809

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4e7c3106700d9100a3d91b90995260e6e6fabee2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4e7c3106700d9100a3d91b90995260e6e6fabee2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4e7c3106700d9100a3d91b90995260e6e6fabee2 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #7075:
URL: https://github.com/apache/hudi/pull/7075#issuecomment-1585983760

   Not yet; I hope we can address this issue for the 0.14.0 release.




[GitHub] [hudi] liufangqi commented on pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by "liufangqi (via GitHub)" <gi...@apache.org>.
liufangqi commented on PR #7075:
URL: https://github.com/apache/hudi/pull/7075#issuecomment-1584230719

   @danny0405 Hello, have we found a reasonable way to solve this issue? We are now facing this case too: in Flink batch mode, we can't start writing until we have obtained all of the resources.




[GitHub] [hudi] danny0405 commented on a diff in pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #7075:
URL: https://github.com/apache/hudi/pull/7075#discussion_r1006392805


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/meta/CkpMetadata.java:
##########
@@ -132,6 +132,20 @@ private void clean(String newInstant) {
     }
   }
 
+  /**
+   * start a checkpoint commit message.
+   *
+   * @param instant The start commit instant
+   */
+  public void startCommitInstant(String instant) {

Review Comment:
   The reason we do not support this is that, in partial mode, we have no idea how many tasks are running, so we have no idea when to commit the metadata.
   
   One solution is to commit the metadata in the commit sink, but that is a different mechanism and we would need to rethink it.
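
To illustrate the concern above, here is a minimal sketch, not Hudi code: the class `WriteAckTracker` and its methods are hypothetical names introduced only for this example. It assumes the committer knows the total number of write subtasks up front, which is exactly the piece of information that is missing in partial mode.

```java
import java.util.BitSet;

/** Hypothetical helper (not part of Hudi): tracks which write subtasks have reported. */
final class WriteAckTracker {
  // Total number of write subtasks; assumed to be known before any task starts.
  private final int parallelism;
  // One bit per subtask, set once that subtask has reported its write result.
  private final BitSet acked;

  WriteAckTracker(int parallelism) {
    if (parallelism <= 0) {
      throw new IllegalArgumentException("parallelism must be positive");
    }
    this.parallelism = parallelism;
    this.acked = new BitSet(parallelism);
  }

  /** Records the report of one subtask; returns true once every subtask has reported. */
  boolean ack(int subtaskId) {
    if (subtaskId < 0 || subtaskId >= parallelism) {
      throw new IndexOutOfBoundsException("unknown subtask: " + subtaskId);
    }
    acked.set(subtaskId); // idempotent: a repeated report does not count twice
    return acked.cardinality() == parallelism;
  }
}
```

The commit can only be triggered when `ack` returns true, and that check depends on `parallelism`. If tasks are scheduled independently and the number of running tasks is not known, there is no value to compare against.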





[GitHub] [hudi] waywtdcc commented on a diff in pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by GitBox <gi...@apache.org>.
waywtdcc commented on code in PR #7075:
URL: https://github.com/apache/hudi/pull/7075#discussion_r1006419928


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/meta/CkpMetadata.java:
##########
@@ -132,6 +132,20 @@ private void clean(String newInstant) {
     }
   }
 
+  /**
+   * start a checkpoint commit message.
+   *
+   * @param instant The start commit instant
+   */
+  public void startCommitInstant(String instant) {

Review Comment:
   I think the task only needs to be blocked while an instant is being committed. However, using inflight to represent both the running instant and the committing instant, instead of recording a separate state at commit time, is not accurate.





[GitHub] [hudi] hudi-bot commented on pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7075:
URL: https://github.com/apache/hudi/pull/7075#issuecomment-1292941916

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4e7c3106700d9100a3d91b90995260e6e6fabee2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12612",
       "triggerID" : "4e7c3106700d9100a3d91b90995260e6e6fabee2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4e7c3106700d9100a3d91b90995260e6e6fabee2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12612) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] waywtdcc commented on a diff in pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by GitBox <gi...@apache.org>.
waywtdcc commented on code in PR #7075:
URL: https://github.com/apache/hudi/pull/7075#discussion_r1006419928


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/meta/CkpMetadata.java:
##########
@@ -132,6 +132,20 @@ private void clean(String newInstant) {
     }
   }
 
+  /**
+   * start a checkpoint commit message.
+   *
+   * @param instant The start commit instant
+   */
+  public void startCommitInstant(String instant) {

Review Comment:
   I think the task only needs to be blocked while an instant is committing, not while it is running. Previously, however, there was no distinction between a running instant and a committing instant: inflight was used to represent both states, which is not accurate enough.
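
The distinction described above can be sketched as follows. This is one reading of the comment, not the PR's implementation: `InstantPhase` and `BlockingPolicy` are hypothetical names, and Hudi's real timeline states are not modeled here.

```java
/** Hypothetical instant states, introduced only for this illustration. */
enum InstantPhase { RUNNING, COMMITTING, COMPLETED }

/** Hypothetical blocking policies for a write task inspecting the previous instant. */
final class BlockingPolicy {
  private BlockingPolicy() {}

  /** Coarse view: RUNNING and COMMITTING both count as "inflight", so the task waits for either. */
  static boolean mustWaitCoarse(InstantPhase phase) {
    return phase != InstantPhase.COMPLETED;
  }

  /** Finer view: the task waits only while the instant is actually being committed. */
  static boolean mustWaitFine(InstantPhase phase) {
    return phase == InstantPhase.COMMITTING;
  }
}
```

The two policies differ only for `RUNNING`: the coarse policy blocks, while the finer one lets the task proceed.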





[GitHub] [hudi] waywtdcc commented on a diff in pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by GitBox <gi...@apache.org>.
waywtdcc commented on code in PR #7075:
URL: https://github.com/apache/hudi/pull/7075#discussion_r1006418655


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/meta/CkpMetadata.java:
##########
@@ -132,6 +132,20 @@ private void clean(String newInstant) {
     }
   }
 
+  /**
+   * start a checkpoint commit message.
+   *
+   * @param instant The start commit instant
+   */
+  public void startCommitInstant(String instant) {

Review Comment:
   This method does not affect the metadata submission at the end of the commit. It only records a status that marks the start of the commit.





[GitHub] [hudi] hudi-bot commented on pull request #7075: [HUDI-5100][flink]Support writing tasks independently in the flink batch mode

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #7075:
URL: https://github.com/apache/hudi/pull/7075#issuecomment-1293142529

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4e7c3106700d9100a3d91b90995260e6e6fabee2",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12612",
       "triggerID" : "4e7c3106700d9100a3d91b90995260e6e6fabee2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4e7c3106700d9100a3d91b90995260e6e6fabee2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12612) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

