Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/25 03:12:11 UTC

[GitHub] [hudi] boneanxs opened a new pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

boneanxs opened a new pull request #4905:
URL: https://github.com/apache/hudi/pull/4905


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
     - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057040845


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057570906


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6485",
       "triggerID" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   * f648cc8727f9986deed050c6e2322eed23c1719b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6485) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050507607


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050481085


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3877df2a9de100a10e1e4876e02cf28b78071e85 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056978792


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057100672


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056975375


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056978792


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056972239


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057570906


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6485",
       "triggerID" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   * f648cc8727f9986deed050c6e2322eed23c1719b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6485) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050485965


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3877df2a9de100a10e1e4876e02cf28b78071e85 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289) 
   *  Unknown: [CANCELED](TBD) 
   * a1d4192d50bafaaf3419f286c147985c41e633c7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050479530


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3877df2a9de100a10e1e4876e02cf28b78071e85 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan merged pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #4905:
URL: https://github.com/apache/hudi/pull/4905


   





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057569392


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   * f648cc8727f9986deed050c6e2322eed23c1719b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] boneanxs commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
boneanxs commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1054044282


   > parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()) while instantiating writeConfig may or may not work. Even though we have added alternate keys, not sure what exactly gets returned as part of DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key().
   
   I think `DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()` returns `hoodie.clustering.async.enabled`.
   
   Also, now that https://github.com/apache/hudi/pull/4828 has landed, there are two situations from the user's side:
   
   1. If users enable clustering via `DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key` or `HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE.key`, the clustering service is enabled, because the config map stores `hoodie.clustering.async.enabled -> true`:
   ```scala
   df.option(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key, true)
   // or
   df.option(HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE.key, true)
   ```
   
   2. But if users enable clustering by writing the key string directly (that is, `hoodie.datasource.clustering.async.enable` or `hoodie.clustering.async.enabled`), the behavior differs:
   
   ```scala
   // Will not start clustering, because the check
   // parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean) in method
   // isAsyncClusteringEnabled looks up the key hoodie.clustering.async.enabled.
   df.option("hoodie.datasource.clustering.async.enable", true)
   // will work
   df.option("hoodie.clustering.async.enabled", true)
   ```
   From `DataSourceUtils`:
   
   ```java
   public static HoodieWriteConfig createHoodieConfig(String schemaStr, String basePath,
         String tblName, Map<String, String> parameters) {
   
       boolean asyncClusteringEnabled = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()));
   
   return builder.forTable(tblName)
           .withCompactionConfig(HoodieCompactionConfig.newBuilder()
               .withPayloadClass(parameters.get(DataSourceWriteOptions.PAYLOAD_CLASS_NAME().key()))
               .withInlineCompaction(inlineCompact).build())
           .withClusteringConfig(HoodieClusteringConfig.newBuilder()
               .withInlineClustering(inlineClusteringEnabled)
               .withAsyncClustering(asyncClusteringEnabled).build())
           .withPayloadConfig(HoodiePayloadConfig.newBuilder().withPayloadOrderingField(parameters.get(DataSourceWriteOptions.PRECOMBINE_FIELD().key()))
               .build())
           // override above with Hoodie configs specified as options.
           .withProps(parameters).build();
   ```
   If we set `df.option("hoodie.datasource.clustering.async.enable", true)`, `HoodieWriteConfig` ends up containing both `hoodie.datasource.clustering.async.enable -> true` and `hoodie.clustering.async.enabled -> false`. The check uses `parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key())`, which is equivalent to `parameters.get("hoodie.clustering.async.enabled")` and of course evaluates to false. At the same time, `.withProps(parameters).build()` saves all parameters to `HoodieWriteConfig`, so `hoodie.datasource.clustering.async.enable -> true` is saved as well. Because `HoodieWriteConfig.isAsyncClusteringEnabled` checks the key `hoodie.clustering.async.enabled` first and only then the alternative key `hoodie.datasource.clustering.async.enable`, the result is false.
   
   From my understanding, we can remove these parameter-based checks in this PR (they look unnecessary anyway) and rely only on `HoodieWriteConfig.isAsyncClusteringEnabled` to decide whether async clustering is enabled. That way, all four usages work: `DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key`, `HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE().key`, `hoodie.datasource.clustering.async.enable`, and `hoodie.clustering.async.enabled`.
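
The shadowing effect described in this comment can be reproduced with a minimal, self-contained sketch. It is not Hudi code: the class `AltKeyLookupSketch` and its methods are hypothetical stand-ins, and the lookup order (primary key first, alternative key only when the primary is absent) is assumed from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a config lookup with one alternative key; not Hudi's actual implementation.
class AltKeyLookupSketch {
    static final String PRIMARY_KEY = "hoodie.clustering.async.enabled";
    static final String ALTERNATIVE_KEY = "hoodie.datasource.clustering.async.enable";

    // Uses the primary key if it is present; otherwise falls back to the alternative key.
    static boolean isEnabled(Map<String, String> props) {
        if (props.containsKey(PRIMARY_KEY)) {
            return Boolean.parseBoolean(props.get(PRIMARY_KEY));
        }
        return Boolean.parseBoolean(props.get(ALTERNATIVE_KEY));
    }

    // Mimics the flow described above: derive the flag from the primary key only,
    // store it explicitly, then copy every user-supplied parameter on top.
    static Map<String, String> buildWithExplicitCheck(Map<String, String> parameters) {
        Map<String, String> props = new HashMap<>();
        boolean asyncClusteringEnabled = Boolean.parseBoolean(parameters.get(PRIMARY_KEY));
        props.put(PRIMARY_KEY, Boolean.toString(asyncClusteringEnabled));
        props.putAll(parameters);
        return props;
    }

    // Sketch of the proposal: copy the parameters only and let isEnabled resolve the keys.
    static Map<String, String> buildWithoutExplicitCheck(Map<String, String> parameters) {
        return new HashMap<>(parameters);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put(ALTERNATIVE_KEY, "true");
        // The explicitly stored "false" under the primary key shadows the alternative key.
        System.out.println(isEnabled(buildWithExplicitCheck(params)));    // prints "false"
        // Without the explicit check, the alternative key is consulted.
        System.out.println(isEnabled(buildWithoutExplicitCheck(params))); // prints "true"
    }
}
```

In this sketch the first call yields false only because an explicit `false` was written under the primary key before the user parameters were copied in, which is the situation the comment attributes to `createHoodieConfig`.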
   





[GitHub] [hudi] nsivabalan commented on a change in pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#discussion_r818076410



##########
File path: hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/TestDataSourceUtils.java
##########
@@ -222,6 +224,26 @@ public void testCreateRDDCustomColumnsSortPartitionerWithValidPartitioner() thro
     assertThat(partitioner.isPresent(), is(true));
   }
 
+  @Test
+  public void testCreateHoodieConfigWithAsyncClustering() {
+    ArrayList<ImmutablePair<String, Boolean>> asyncClusteringKeyValues = new ArrayList<>(4);
+    asyncClusteringKeyValues.add(new ImmutablePair(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key(), true));
+    asyncClusteringKeyValues.add(new ImmutablePair(HoodieClusteringConfig.ASYNC_CLUSTERING_ENABLE.key(), true));
+    asyncClusteringKeyValues.add(new ImmutablePair("hoodie.datasource.clustering.async.enable", true));
+    asyncClusteringKeyValues.add(new ImmutablePair("hoodie.clustering.async.enabled", true));
+
+    asyncClusteringKeyValues.stream().forEach(pair -> {
+      HashMap<String, String> params = new HashMap<>(3);
+      params.put(DataSourceWriteOptions.TABLE_TYPE().key(), DataSourceWriteOptions.TABLE_TYPE().defaultValue());
+      params.put(DataSourceWriteOptions.PAYLOAD_CLASS_NAME().key(),
+              DataSourceWriteOptions.PAYLOAD_CLASS_NAME().defaultValue());
+      params.put(pair.left, pair.right.toString());
+      HoodieWriteConfig hoodieConfig = DataSourceUtils
+              .createHoodieConfig(avroSchemaString, config.getBasePath(), "test", params);
+      assertEquals(pair.right, hoodieConfig.isAsyncClusteringEnabled());
+    });

Review comment:
       Can you also add an assertion for this?
   ```
   HoodieClusteringConfig.from(props).isAsyncClusteringEnabled()
   ```
   This is what we use in the DeltaStreamer code path. It should work, but it would be good to cover it in tests as well.







[GitHub] [hudi] hudi-bot commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050479530


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3877df2a9de100a10e1e4876e02cf28b78071e85 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050507607


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan edited a comment on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1053612275


   I spent some time going over all usages of both configs, and here is my understanding.
   
   Let's see what the status was prior to merging https://github.com/apache/hudi/pull/4828.
   
   For simplicity, let's talk about one config, i.e. enabling async clustering.
   For this, we have one config at the write client layer (hoodie.clustering.async.enabled) and one at the data source layer (hoodie.datasource.clustering.async.enable). Let's call them writeClientAsyncConfig and dataSourceAsyncConfig for the purpose of this explanation.
   
   DeltaStreamer flow:
   Here, dataSourceAsyncConfig is never in the picture. We check whether writeClientAsyncConfig is set and act on it. No confusion here.
   
   Spark Datasource flow:
   We expect users to set dataSourceAsyncConfig in this case.
   Within HoodieSparkSqlWriter, while configuring/instantiating HoodieWriteConfig, Hudi sets writeClientAsyncConfig based on the value of dataSourceAsyncConfig.
   
   Excerpt from DataSourceUtils 
   ```
   
   public static HoodieWriteConfig createHoodieConfig(String schemaStr, String basePath,
         String tblName, Map<String, String> parameters) {
   .
   .
       boolean asyncClusteringEnabled = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()));
   .
   HoodieWriteConfig.newBuilder().... 
   .withClusteringConfig(HoodieClusteringConfig.newBuilder()
               .withAsyncClustering(asyncClusteringEnabled).build())
   .
   ```
   
   So, one bug I see here: if someone sets writeClientAsyncConfig with a Spark datasource write, we probably will not honor it, because we explicitly check the value of dataSourceAsyncConfig and set writeClientAsyncConfig in HoodieWriteConfig while building the write config.
   
   We have another method in HoodieSparkSqlWriter where we check if asyncClustering is enabled. 
   ```
     private def isAsyncClusteringEnabled(client: SparkRDDWriteClient[HoodieRecordPayload[Nothing]],
                                          parameters: Map[String, String]): Boolean = {
       log.info(s"Config.asyncClusteringEnabled ? ${client.getConfig.isAsyncClusteringEnabled}")
       asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled &&
         parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean)
     }
   ```
   
   This is called only in the HoodieStreaming sink flow. Anyway,
   let's see what happens if:
   a) The user sets just dataSourceAsyncConfig.
   writeConfig gets instantiated properly, so client.getConfig.isAsyncClusteringEnabled will be true and parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key) will also be true,
   and we should be good here.
   b) The user sets just writeClientAsyncConfig.
   writeConfig instantiation will have a gap here, so client.getConfig.isAsyncClusteringEnabled will return false.
   
   
   Now, let's revisit the status with the latest master (i.e. after landing https://github.com/apache/hudi/pull/4828).
   
   No change in DeltaStreamer flow.
   
   Spark DataSource flow: 
   Excerpt from DataSourceUtils 
   ```
   
   public static HoodieWriteConfig createHoodieConfig(String schemaStr, String basePath,
         String tblName, Map<String, String> parameters) {
   .
   .
       boolean asyncClusteringEnabled = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()));
   .
   HoodieWriteConfig.newBuilder().... 
   .withClusteringConfig(HoodieClusteringConfig.newBuilder()
               .withAsyncClustering(asyncClusteringEnabled).build())
   .
   ```
   dataSourceAsyncConfig (DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()) now refers to the same config key as writeClientAsyncConfig. So, essentially, the previously used dataSourceAsyncConfig does not exist anymore.
   
   The other method of interest is:
   ```
     private def isAsyncClusteringEnabled(client: SparkRDDWriteClient[HoodieRecordPayload[Nothing]],
                                          parameters: Map[String, String]): Boolean = {
       log.info(s"Config.asyncClusteringEnabled ? ${client.getConfig.isAsyncClusteringEnabled}")
       asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled &&
         parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean)
     }
   ```
   
   Now, let's look at both options.
   a) The user sets just the old dataSourceAsyncConfig.
   parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()) while instantiating writeConfig may or may not work. Even though we have added alternate keys, not sure what exactly gets returned as part of DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key().
   Assuming both keys work,
   within isAsyncClusteringEnabled, client.getConfig.isAsyncClusteringEnabled will be true, and parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key) will also be true.
   But if the old dataSourceAsyncConfig does not get picked up by parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key), there will be a gap.
   b) The user sets writeClientAsyncConfig.
   parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key) will refer to writeClientAsyncConfig, so writeConfig instantiation should be good, and isAsyncClusteringEnabled() should also work.
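
The possible gap in option (a) can be illustrated with a small Java sketch of the three-part check. This is not the actual HoodieSparkSqlWriter code: the class `StreamingSinkCheckSketch` is hypothetical, and it assumes that `DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key` resolves to `hoodie.clustering.async.enabled` on the latest master.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical Java rendering of the three-part check; not the actual HoodieSparkSqlWriter code.
class StreamingSinkCheckSketch {
    // Assumption: the key that DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key resolves to after #4828.
    static final String RESOLVED_KEY = "hoodie.clustering.async.enabled";
    static final String OLD_DATASOURCE_KEY = "hoodie.datasource.clustering.async.enable";

    static boolean isAsyncClusteringEnabled(boolean triggerFnDefined,
                                            boolean writeConfigEnabled,
                                            Map<String, String> parameters) {
        // Rough equivalent of parameters.get(key).exists(r => r.toBoolean):
        // a key that is missing from the map counts as false.
        boolean parameterEnabled = Optional.ofNullable(parameters.get(RESOLVED_KEY))
                .map(Boolean::parseBoolean)
                .orElse(false);
        return triggerFnDefined && writeConfigEnabled && parameterEnabled;
    }

    public static void main(String[] args) {
        Map<String, String> oldKeyOnly = new HashMap<>();
        oldKeyOnly.put(OLD_DATASOURCE_KEY, "true");
        // Even if the write config reports true, the raw parameter lookup misses the old key.
        System.out.println(isAsyncClusteringEnabled(true, true, oldKeyOnly));  // prints "false"

        Map<String, String> newKeyOnly = new HashMap<>();
        newKeyOnly.put(RESOLVED_KEY, "true");
        System.out.println(isAsyncClusteringEnabled(true, true, newKeyOnly));  // prints "true"
    }
}
```

Under these assumptions, the raw map lookup is the only part of the conjunction that cannot see the old key, which is why dropping it in favor of the write config check alone would close the gap.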
   
   
   @codope : I am wondering whether we should revert the old patch. Do you remember why we introduced a new datasource config in the first place instead of re-using writeClientAsyncConfig for enabling clustering?





[GitHub] [hudi] boneanxs commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
boneanxs commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050484445


   @hudi-bot run azure





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056975375


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056972239


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057040845


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475) 
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050505944


   @boneanxs : Can I get some more clarity around the issue?
   Do you mean that if you set "hoodie.datasource.clustering.async.enable=true" with Spark datasource writes, clustering gets executed inline? Or do you mean that clustering does not get executed at all?
   





[GitHub] [hudi] hudi-bot commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050485965


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3877df2a9de100a10e1e4876e02cf28b78071e85 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289) 
   *  Unknown: [CANCELED](TBD) 
   * a1d4192d50bafaaf3419f286c147985c41e633c7 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050532697


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050532697


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057569392


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   * f648cc8727f9986deed050c6e2322eed23c1719b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057609983


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6485",
       "triggerID" : "f648cc8727f9986deed050c6e2322eed23c1719b",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f648cc8727f9986deed050c6e2322eed23c1719b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6485) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050481085


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 3877df2a9de100a10e1e4876e02cf28b78071e85 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050487606


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050487606


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1053612275


   I spent some time going over all usages of both configs, and here is my understanding. 
   
   Let's see what the status was prior to merging https://github.com/apache/hudi/pull/4828
   
   For simplicity, let's talk about one config, i.e. enabling async clustering. 
   For this, we have one config at the write client layer (hoodie.clustering.async.enabled) and one at the datasource layer (hoodie.datasource.clustering.async.enable). Let's call them writeClientAsyncConfig and dataSourceAsyncConfig respectively for the purposes of this explanation. 
   
   DeltaStreamer flow:
   Here, dataSourceAsyncConfig is never in the picture. We check whether writeClientAsyncConfig is set and act on it. No confusion here.
   
   Spark Datasource flow: 
   We expect users to set dataSourceAsyncConfig in this case. 
   Within HoodieSparkSqlWriter, while configuring/instantiating the HoodieWriteConfig, Hudi sets writeClientAsyncConfig based on the value of dataSourceAsyncConfig. 
   
   Excerpt from DataSourceUtils 
   ```
   
   public static HoodieWriteConfig createHoodieConfig(String schemaStr, String basePath,
         String tblName, Map<String, String> parameters) {
   .
   .
       boolean asyncClusteringEnabled = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()));
   .
   HoodieWriteConfig.newBuilder().... 
   .withClusteringConfig(HoodieClusteringConfig.newBuilder()
               .withAsyncClustering(asyncClusteringEnabled).build())
   .
   ```
   
   So, one bug I see here: if someone sets writeClientAsyncConfig with a Spark datasource write, we may not honor it, because we explicitly check the value of dataSourceAsyncConfig and use it to set writeClientAsyncConfig in HoodieWriteConfig while building the write config.
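
The gap described above can be reproduced with a plain map, without any Hudi classes. This is only a minimal sketch: the class `AsyncClusteringConfigGap` and the method `deriveAsyncClustering` are hypothetical names, and the two key strings are the ones quoted in this thread. It relies on the fact that `Boolean.parseBoolean(null)` returns `false`, so a missing datasource key silently disables the feature.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup done while building the write config.
class AsyncClusteringConfigGap {
    // Key names as quoted in this thread (pre-#4828 behaviour).
    static final String DATASOURCE_KEY = "hoodie.datasource.clustering.async.enable";
    static final String WRITE_CLIENT_KEY = "hoodie.clustering.async.enabled";

    // Mirrors the excerpt: only the datasource-level key is consulted.
    static boolean deriveAsyncClustering(Map<String, String> parameters) {
        // parameters.get(...) is null when the key is absent; parseBoolean(null) is false.
        return Boolean.parseBoolean(parameters.get(DATASOURCE_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> onlyDatasourceKey = new HashMap<>();
        onlyDatasourceKey.put(DATASOURCE_KEY, "true");

        Map<String, String> onlyWriteClientKey = new HashMap<>();
        onlyWriteClientKey.put(WRITE_CLIENT_KEY, "true");

        System.out.println(deriveAsyncClustering(onlyDatasourceKey));  // prints true
        System.out.println(deriveAsyncClustering(onlyWriteClientKey)); // prints false
    }
}
```

The second call shows the case in question: the user asked for async clustering through the write client key, but the derived value is `false`.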
   
   We have another method in HoodieSparkSqlWriter where we check if asyncClustering is enabled. 
   ```
     private def isAsyncClusteringEnabled(client: SparkRDDWriteClient[HoodieRecordPayload[Nothing]],
                                          parameters: Map[String, String]): Boolean = {
       log.info(s"Config.asyncClusteringEnabled ? ${client.getConfig.isAsyncClusteringEnabled}")
       asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled &&
         parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean)
     }
   ```
   
   This is called only in the HoodieStreaming sink flow. Anyway, 
   let's see what happens if 
   a) The user sets just dataSourceAsyncConfig. 
   The writeConfig gets instantiated properly, so client.getConfig.isAsyncClusteringEnabled will be true and parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key) will also be true, 
   and we should be good here. 
   b) The user sets just writeClientAsyncConfig. 
   The writeConfig instantiation will have a gap here, so client.getConfig.isAsyncClusteringEnabled will return false. 
   
   
   Now, let's revisit the status on latest master (i.e. after landing https://github.com/apache/hudi/pull/4828) 
   
   No change in DeltaStreamer flow.
   
   Spark DataSource flow: 
   Excerpt from DataSourceUtils 
   ```
   
   public static HoodieWriteConfig createHoodieConfig(String schemaStr, String basePath,
         String tblName, Map<String, String> parameters) {
   .
   .
       boolean asyncClusteringEnabled = Boolean.parseBoolean(parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()));
   .
   HoodieWriteConfig.newBuilder().... 
   .withClusteringConfig(HoodieClusteringConfig.newBuilder()
               .withAsyncClustering(asyncClusteringEnabled).build())
   .
   ```
   dataSourceAsyncConfig (DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()) now refers to the same config key as writeClientAsyncConfig. So, essentially, the previously used dataSourceAsyncConfig key no longer exists as a separate key. 
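
To illustrate why the old key can get lost once both constants point at the same key, here is a small sketch of a plain lookup next to a lookup that also consults alternate keys. The class `AlternateKeyLookup` and both methods are hypothetical and do not reproduce Hudi's own config classes; whether a given Hudi call site resolves alternate keys is exactly what needs checking.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of primary-key vs. alternate-key lookup.
class AlternateKeyLookup {
    static final String PRIMARY_KEY = "hoodie.clustering.async.enabled";
    // Assumed here to be registered as a legacy alternative of the primary key.
    static final List<String> ALTERNATE_KEYS =
        List.of("hoodie.datasource.clustering.async.enable");

    // A plain map lookup only ever sees the primary key.
    static boolean plainLookup(Map<String, String> parameters) {
        return Boolean.parseBoolean(parameters.get(PRIMARY_KEY));
    }

    // Falls back to the alternate keys when the primary key is absent.
    static Optional<String> resolve(Map<String, String> parameters) {
        String value = parameters.get(PRIMARY_KEY);
        if (value != null) {
            return Optional.of(value);
        }
        for (String alternate : ALTERNATE_KEYS) {
            String legacy = parameters.get(alternate);
            if (legacy != null) {
                return Optional.of(legacy);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> legacyOnly =
            Map.of("hoodie.datasource.clustering.async.enable", "true");
        System.out.println(plainLookup(legacyOnly));                               // prints false
        System.out.println(resolve(legacyOnly).map(Boolean::parseBoolean).orElse(false)); // prints true
    }
}
```

With only the legacy key set, the plain lookup misses the value while the alternate-aware lookup finds it, which is why the outcome depends on which style of lookup each call site uses.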
   
   The other method of interest is 
   ```
     private def isAsyncClusteringEnabled(client: SparkRDDWriteClient[HoodieRecordPayload[Nothing]],
                                          parameters: Map[String, String]): Boolean = {
       log.info(s"Config.asyncClusteringEnabled ? ${client.getConfig.isAsyncClusteringEnabled}")
       asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled &&
         parameters.get(DataSourceOptions.ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean)
     }
   ```
   
   Now, let's look at both options. 
   a) The user sets just the old dataSourceAsyncConfig. 
   parameters.get(DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key()) while instantiating the writeConfig may or may not work. Even though we have added alternate keys, I am not sure what exactly gets returned by DataSourceWriteOptions.ASYNC_CLUSTERING_ENABLE().key(). 
   Assuming both keys work, within isAsyncClusteringEnabled: client.getConfig.isAsyncClusteringEnabled will be true, 





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056969010


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1056969010


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a1d4192d50bafaaf3419f286c147985c41e633c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292) 
   * 0b64d4695ec538c4a8d56a2be0cd81aca98f077e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot removed a comment on pull request #4905: [HUDI-3548] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
hudi-bot removed a comment on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1057100672


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6289",
       "triggerID" : "3877df2a9de100a10e1e4876e02cf28b78071e85",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6292",
       "triggerID" : "a1d4192d50bafaaf3419f286c147985c41e633c7",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1050484445",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6475",
       "triggerID" : "0b64d4695ec538c4a8d56a2be0cd81aca98f077e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477",
       "triggerID" : "a025412a3d2bb1a9010944a72c4ee3b558e8c49c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a025412a3d2bb1a9010944a72c4ee3b558e8c49c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=6477) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] boneanxs commented on pull request #4905: [WIP] Fix if user specify key "hoodie.datasource.clustering.async.enable" directly, async clustering not work

Posted by GitBox <gi...@apache.org>.
boneanxs commented on pull request #4905:
URL: https://github.com/apache/hudi/pull/4905#issuecomment-1050599665


   > @boneanxs : Can I get some more clarity on the issue? Do you mean that if you set "hoodie.datasource.clustering.async.enable=true" with Spark datasource writes, clustering gets executed inline? Or do you mean that clustering does not get executed at all?
   
   Thanks for your reply. Since this [pr](https://github.com/apache/hudi/pull/4828) synced the datasource clustering configs with HoodieClusteringConfig, if we now enable async clustering as follows:
   
   ```scala
   df.option("hoodie.datasource.clustering.async.enable", true)
   ```
   
   async clustering is not enabled. This is because of the following check: https://github.com/apache/hudi/pull/4905/files#diff-8bda4b2174721fd642a5435282834e5d796a320c1d9e1366b27be86bd548d48aL729
   
   ```scala
   asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled &&
         parameters.get(ASYNC_CLUSTERING_ENABLE.key).exists(r => r.toBoolean)
   ```
   This check uses `ASYNC_CLUSTERING_ENABLE.key`, which is now `hoodie.clustering.async.enabled`, so a value set under `hoodie.datasource.clustering.async.enable` is not found in `parameters`. I think this part of the check can be removed, so that both `hoodie.clustering.async.enabled` and `hoodie.datasource.clustering.async.enable` can be used to enable the async clustering service.
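
As a rough illustration of the difference, the check can be modelled in plain Java. This is a sketch under assumptions, not the Hudi code: the class `AsyncClusteringCheck` and both methods are hypothetical, the two boolean arguments stand in for `asyncClusteringTriggerFnDefined` and `client.getConfig.isAsyncClusteringEnabled`, and it assumes the write config already resolved the old key to `true`.

```java
import java.util.Map;

// Hypothetical model of the check before and after the proposed change.
class AsyncClusteringCheck {
    // The key that ASYNC_CLUSTERING_ENABLE.key resolves to after #4828, per this thread.
    static final String ASYNC_CLUSTERING_ENABLE_KEY = "hoodie.clustering.async.enabled";

    // Current shape: additionally requires the raw parameters to contain the new key.
    static boolean currentCheck(boolean triggerFnDefined,
                                boolean writeConfigEnabled,
                                Map<String, String> parameters) {
        String raw = parameters.get(ASYNC_CLUSTERING_ENABLE_KEY);
        return triggerFnDefined && writeConfigEnabled
            && raw != null && Boolean.parseBoolean(raw);
    }

    // Proposed shape: trust the write config and drop the raw parameter lookup.
    static boolean proposedCheck(boolean triggerFnDefined, boolean writeConfigEnabled) {
        return triggerFnDefined && writeConfigEnabled;
    }

    public static void main(String[] args) {
        // The user only set the old datasource-level key.
        Map<String, String> parameters =
            Map.of("hoodie.datasource.clustering.async.enable", "true");

        System.out.println(currentCheck(true, true, parameters)); // prints false
        System.out.println(proposedCheck(true, true));            // prints true
    }
}
```

Under these assumptions the current check fails only because of the raw lookup on the new key, while the simplified check follows whatever the write config resolved.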
   
   Also, should we sync the compaction configurations in the same way as clustering? I found that `HoodieCompactionConfig` only uses `hoodie.compact.inline` to trigger synchronous (inline) compaction, while `DataSourceWriteOptions` introduces `ASYNC_COMPACT_ENABLE` to enable async compaction. I wonder whether we should move `ASYNC_COMPACT_ENABLE` to `HoodieCompactionConfig`.

