Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/27 02:02:08 UTC

[GitHub] [hudi] yihua opened a new pull request, #6802: [HUDI-4924] Auto-tune dedup parallelism

yihua opened a new pull request, #6802:
URL: https://github.com/apache/hudi/pull/6802

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   **Risk level: none | low | medium | high**
   
   _Choose one. If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260176698

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11808",
       "triggerID" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a7e22e5456a947c9f8ad000d47f4b32bddf1937c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761) 
   * 4b207ae3989df07ab53e8bf69ed3c65ffc818270 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11808) 
   * f29b7651130535f6212f2f3917ad6d48800710cb UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1259079157

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a7e22e5456a947c9f8ad000d47f4b32bddf1937c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260243028

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11808",
       "triggerID" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812",
       "triggerID" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4b207ae3989df07ab53e8bf69ed3c65ffc818270 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11808) 
   * f29b7651130535f6212f2f3917ad6d48800710cb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260870696

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812",
       "triggerID" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f29b7651130535f6212f2f3917ad6d48800710cb Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1258878758

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a7e22e5456a947c9f8ad000d47f4b32bddf1937c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1258875051

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a7e22e5456a947c9f8ad000d47f4b32bddf1937c UNKNOWN
   


[GitHub] [hudi] yihua commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1636937250

   > May I ask if there is any way we can improve it? Thanks
   
   Sorry for missing this message. I think we need to revert this change.




[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260173350

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a7e22e5456a947c9f8ad000d47f4b32bddf1937c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761) 
   * 4b207ae3989df07ab53e8bf69ed3c65ffc818270 UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260533244

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f29b7651130535f6212f2f3917ad6d48800710cb UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260179805

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11808",
       "triggerID" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4b207ae3989df07ab53e8bf69ed3c65ffc818270 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11808) 
   * f29b7651130535f6212f2f3917ad6d48800710cb UNKNOWN
   


[GitHub] [hudi] nsivabalan commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260162593

   I have rebased and pushed.




[GitHub] [hudi] TengHuo commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by "TengHuo (via GitHub)" <gi...@apache.org>.
TengHuo commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1455802492

   Hi @yihua 
   
   We recently found an issue in our DeltaStreamer pipeline. After we upgraded from 0.10 to 0.12, our Kafka-to-Hudi DeltaStreamer pipeline runs slower than before. After checking, we noticed that the slowdown is caused by this slow stage: `Building workload profile`.
   
   <img width="1782" alt="slow_build_workload_profile" src="https://user-images.githubusercontent.com/7539060/223072527-ea56fae5-d2d3-4843-a3f9-8db008c8f7bb.png">
   
   The parallelism of this stage was 10, which comes from this line: https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieWriteHelper.java#L64
   
   So even if we set the config `hoodie.upsert.shuffle.parallelism` to 1000, the setting is ignored in favor of the parallelism of the input records, which is the number of Kafka topic partitions.
   
   May I ask if there is any way we can improve it? Thanks
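
To make the arithmetic in the report above concrete, here is a minimal sketch. It assumes the dedup step picks the smaller of the input partition count and the configured shuffle parallelism, with a floor of 1; that rule is inferred from the behavior described in this comment, and the function name `effective_dedup_parallelism` is hypothetical, not a Hudi API.

```python
def effective_dedup_parallelism(input_partitions: int, configured_parallelism: int) -> int:
    """Hypothetical model: the configured parallelism is capped by the input partition count."""
    return max(1, min(input_partitions, configured_parallelism))


# Numbers from the report: 10 Kafka topic partitions, config set to 1000.
print(effective_dedup_parallelism(10, 1000))  # prints 10
```

Under this assumption, raising `hoodie.upsert.shuffle.parallelism` above the number of Kafka partitions has no effect on the stage, which matches the parallelism of 10 seen for `Building workload profile`.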
   




[GitHub] [hudi] TengHuo commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by "TengHuo (via GitHub)" <gi...@apache.org>.
TengHuo commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1647106218

   > > May I ask if there is any way we can improve it? Thanks
   > 
   > Sorry to miss this message. I think we need to revert this change.
   
   Thanks @yihua 
   
   Yeah, agreed, we have reverted it in our internal version. Is there a better idea for implementing auto-tune?
   
   We can help if there is a solid solution.




[GitHub] [hudi] nsivabalan merged pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
nsivabalan merged PR #6802:
URL: https://github.com/apache/hudi/pull/6802




[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260539975

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812",
       "triggerID" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f29b7651130535f6212f2f3917ad6d48800710cb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1260416284

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11761",
       "triggerID" : "a7e22e5456a947c9f8ad000d47f4b32bddf1937c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11808",
       "triggerID" : "4b207ae3989df07ab53e8bf69ed3c65ffc818270",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812",
       "triggerID" : "f29b7651130535f6212f2f3917ad6d48800710cb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f29b7651130535f6212f2f3917ad6d48800710cb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11812) 
   


[GitHub] [hudi] yihua commented on pull request #6802: [HUDI-4924] Auto-tune dedup parallelism

Posted by "yihua (via GitHub)" <gi...@apache.org>.
yihua commented on PR #6802:
URL: https://github.com/apache/hudi/pull/6802#issuecomment-1647215573

   > > > May I ask if there is any way we can improve it? Thanks
   > > 
   > > 
   > > Sorry to miss this message. I think we need to revert this change.
   > 
   > Thanks @yihua
   > 
   > Yeah, agree, we have reverted it in our internal version. I want to know if there is any better idea for implementing auto-tune?
   > 
   > We can help if there is a solid solution.
   
   I think we need to make sure that the dedup parallelism is applied only to the dedup stage and does not affect subsequent stages. That may require better parallelism control, such as repartitioning with the right parallelism before workload profiling.
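
The idea in this comment can be sketched with plain Python lists standing in for engine partitions. This is an illustration under stated assumptions, not Hudi code: the names `dedup`, `repartition`, and `prepare_for_profiling` are hypothetical, the auto-tuned value is assumed to be the smaller of the input partition count and the configured parallelism, and a round-robin spread stands in for a real shuffle.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, int]  # (record key, ordering value); a stand-in for a real record


def dedup(partitions: List[List[Record]], parallelism: int) -> List[List[Record]]:
    """Keep the record with the largest ordering value per key, using `parallelism` buckets."""
    buckets: List[Dict[str, Record]] = [dict() for _ in range(parallelism)]
    for part in partitions:
        for key, order in part:
            bucket = buckets[hash(key) % parallelism]
            if key not in bucket or order > bucket[key][1]:
                bucket[key] = (key, order)
    return [list(b.values()) for b in buckets]


def repartition(partitions: List[List[Record]], parallelism: int) -> List[List[Record]]:
    """Spread records round-robin over `parallelism` partitions."""
    out: List[List[Record]] = [[] for _ in range(parallelism)]
    i = 0
    for part in partitions:
        for rec in part:
            out[i % parallelism].append(rec)
            i += 1
    return out


def prepare_for_profiling(partitions: List[List[Record]], configured: int) -> List[List[Record]]:
    # Assumed auto-tune rule: the tuned value is used for the dedup stage only.
    dedup_parallelism = max(1, min(len(partitions), configured))
    deduped = dedup(partitions, dedup_parallelism)
    # Restore the configured parallelism before the next stage runs.
    return repartition(deduped, configured)
```

With two input partitions and a configured parallelism of 4, the dedup runs over 2 buckets, while the stage that follows sees 4 partitions.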

