Posted to commits@hudi.apache.org by "XuQianJin-Stars (via GitHub)" <gi...@apache.org> on 2023/03/15 03:20:24 UTC

[GitHub] [hudi] XuQianJin-Stars opened a new pull request, #8188: [MINOR] Improve instantToWrite

XuQianJin-Stars opened a new pull request, #8188:
URL: https://github.com/apache/hudi/pull/8188

   
   ![image](https://user-images.githubusercontent.com/10494131/225196750-64047345-9743-48fd-b146-88389907950f.png)
   
   
   ### Change Logs
   
   NA
   
   ### Impact
   
   NA
   
   ### Risk level (write none, low, medium or high below)
   
   none
   
   ### Documentation Update
   
   NA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hbgstc123 commented on a diff in pull request #8188: [MINOR] Improve instantToWrite

Posted by "hbgstc123 (via GitHub)" <gi...@apache.org>.
hbgstc123 commented on code in PR #8188:
URL: https://github.com/apache/hudi/pull/8188#discussion_r1137370673


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/TimeWait.java:
##########
@@ -54,10 +54,11 @@ public static Builder builder() {
   public void waitFor() {
     try {
       if (waitingTime > timeout) {
-        throw new HoodieException("Timeout(" + waitingTime + "ms) while waiting for " + action);
+        LOG.warn("Timeout(" + waitingTime + "ms) while waiting for " + action);

Review Comment:
   I have a question: when `waitingTime > timeout`, `waitFor()` will return without sleeping. Will `AbstractStreamWriteFunction::instantToWrite` then busy-loop while waiting for a new `commit.requested` instant?
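
The concern raised in this question can be illustrated with a minimal sketch. `SketchTimeWait` below is a hypothetical, simplified stand-in and not Hudi's `TimeWait`: the class name, the `failOnTimeout` switch, the `callsWithoutSleep` counter, and the use of `IllegalStateException` in place of `HoodieException` are all assumptions made for illustration. It also assumes, as the comment describes, that the warn-only branch returns without sleeping.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical, simplified stand-in for a polling helper with a timeout.
 * This is NOT Hudi's TimeWait; all names here are assumptions for illustration.
 */
final class SketchTimeWait {
  private final long timeoutMs;        // threshold on the accumulated waiting time
  private final long intervalMs;       // sleep duration per call
  private final String action;         // only used in messages
  private final boolean failOnTimeout; // true: throw on timeout; false: warn and return

  private long waitingTimeMs = 0L;
  private int callsWithoutSleep = 0;

  SketchTimeWait(long timeoutMs, long intervalMs, String action, boolean failOnTimeout) {
    this.timeoutMs = timeoutMs;
    this.intervalMs = intervalMs;
    this.action = action;
    this.failOnTimeout = failOnTimeout;
  }

  /** Sleeps for one interval unless the accumulated waiting time already exceeds the timeout. */
  void waitFor() {
    if (waitingTimeMs > timeoutMs) {
      String msg = "Timeout(" + waitingTimeMs + "ms) while waiting for " + action;
      if (failOnTimeout) {
        throw new IllegalStateException(msg);
      }
      // Assumed warn-only behaviour: log and return immediately, without sleeping.
      System.err.println("WARN " + msg);
      callsWithoutSleep++;
      return;
    }
    try {
      TimeUnit.MILLISECONDS.sleep(intervalMs);
      waitingTimeMs += intervalMs;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status for the caller
      throw new IllegalStateException("Interrupted while waiting for " + action, e);
    }
  }

  long waitingTimeMs() {
    return waitingTimeMs;
  }

  int callsWithoutSleep() {
    return callsWithoutSleep;
  }

  public static void main(String[] args) {
    // A caller that keeps polling, as a retry loop would.
    SketchTimeWait warnOnly = new SketchTimeWait(20L, 10L, "instant initialize", false);
    for (int i = 0; i < 10; i++) {
      warnOnly.waitFor();
    }
    System.out.println("slept for " + warnOnly.waitingTimeMs() + "ms, then "
        + warnOnly.callsWithoutSleep() + " calls returned without sleeping");
  }
}
```

In the warn-only mode, every call made after the threshold is crossed returns immediately, so a caller that loops on `waitFor()` no longer has any back-off. With `failOnTimeout` set to `true`, the same loop ends with an exception instead.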





[GitHub] [hudi] XuQianJin-Stars commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "XuQianJin-Stars (via GitHub)" <gi...@apache.org>.
XuQianJin-Stars commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469578788

   @hudi-bot run azure




[GitHub] [hudi] SteNicholas commented on a diff in pull request #8188: [MINOR] Improve instantToWrite

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on code in PR #8188:
URL: https://github.com/apache/hudi/pull/8188#discussion_r1138038570


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/BulkInsertWriteFunction.java:
##########
@@ -195,7 +200,7 @@ private String instantToWrite() {
     // waits for the checkpoint notification until the checkpoint timeout threshold hits.
     TimeWait timeWait = TimeWait.builder()
         .timeout(config.getLong(FlinkOptions.WRITE_COMMIT_ACK_TIMEOUT))
-        .action("instant initialize")
+        .action("BulkInsertWrite instant initialize")

Review Comment:
   Is there any need to emphasize `BulkInsertWrite instant initialize`?





[GitHub] [hudi] hudi-bot commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469833476

   ## CI report:
   
   * 304084d5216e81b9400d53a16fe9a5b8d50fe3ac Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15727) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15730) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] SteNicholas commented on a diff in pull request #8188: [MINOR] Improve instantToWrite

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on code in PR #8188:
URL: https://github.com/apache/hudi/pull/8188#discussion_r1138039079


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/TimeWait.java:
##########
@@ -67,7 +68,7 @@ public void waitFor() {
    * Builder.
    */
   public static class Builder {
-    private long timeout = 5 * 60 * 1000L; // default 5 minutes
+    private long timeout = 10 * 60 * 1000L; // default 10 minutes

Review Comment:
   Why does this change the default timeout?





[GitHub] [hudi] XuQianJin-Stars commented on a diff in pull request #8188: [MINOR] Improve instantToWrite

Posted by "XuQianJin-Stars (via GitHub)" <gi...@apache.org>.
XuQianJin-Stars commented on code in PR #8188:
URL: https://github.com/apache/hudi/pull/8188#discussion_r1138216544


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/TimeWait.java:
##########
@@ -54,10 +54,11 @@ public static Builder builder() {
   public void waitFor() {
     try {
       if (waitingTime > timeout) {
-        throw new HoodieException("Timeout(" + waitingTime + "ms) while waiting for " + action);
+        LOG.warn("Timeout(" + waitingTime + "ms) while waiting for " + action);

Review Comment:
   > I have a question: when `waitingTime > timeout`, `waitFor()` will return without sleeping. Will `AbstractStreamWriteFunction::instantToWrite` then busy-loop while waiting for a new `commit.requested` instant?
   
   With high parallelism, the cost of a job restart is also very high.





[GitHub] [hudi] hudi-bot commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469584388

   ## CI report:
   
   * 304084d5216e81b9400d53a16fe9a5b8d50fe3ac Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15727) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15730) 
   




[GitHub] [hudi] SteNicholas commented on a diff in pull request #8188: [MINOR] Improve instantToWrite

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on code in PR #8188:
URL: https://github.com/apache/hudi/pull/8188#discussion_r1138037717


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/TimeWait.java:
##########
@@ -54,10 +54,11 @@ public static Builder builder() {
   public void waitFor() {
     try {
       if (waitingTime > timeout) {
-        throw new HoodieException("Timeout(" + waitingTime + "ms) while waiting for " + action);
+        LOG.warn("Timeout(" + waitingTime + "ms) while waiting for " + action);

Review Comment:
   @hbgstc123 +1. If the wait times out, throwing an exception makes sense to me. Otherwise, how would the timeout work?





[GitHub] [hudi] hudi-bot commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469376914

   ## CI report:
   
   * f661159b0ca739e783dcaa4901b84ea48a0c66c6 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15725) 
   * 304084d5216e81b9400d53a16fe9a5b8d50fe3ac Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15727) 
   




[GitHub] [hudi] XuQianJin-Stars closed pull request #8188: [MINOR] Improve instantToWrite

Posted by "XuQianJin-Stars (via GitHub)" <gi...@apache.org>.
XuQianJin-Stars closed pull request #8188: [MINOR] Improve instantToWrite
URL: https://github.com/apache/hudi/pull/8188




[GitHub] [hudi] hudi-bot commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469372212

   ## CI report:
   
   * f661159b0ca739e783dcaa4901b84ea48a0c66c6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15725) 
   * 304084d5216e81b9400d53a16fe9a5b8d50fe3ac UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469273055

   ## CI report:
   
   * f661159b0ca739e783dcaa4901b84ea48a0c66c6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15725) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469265889

   ## CI report:
   
   * f661159b0ca739e783dcaa4901b84ea48a0c66c6 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #8188: [MINOR] Improve instantToWrite

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8188:
URL: https://github.com/apache/hudi/pull/8188#issuecomment-1469563503

   ## CI report:
   
   * 304084d5216e81b9400d53a16fe9a5b8d50fe3ac Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15727) 
   

