Posted to commits@hudi.apache.org by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org> on 2023/04/20 02:51:40 UTC

[GitHub] [hudi] zhuanshenbsj1 opened a new pull request, #8505: Spark offline compaction/Clustering Job will do clean like Flink job

zhuanshenbsj1 opened a new pull request, #8505:
URL: https://github.com/apache/hudi/pull/8505

   ### Change Logs
   
   Adjust the cleaning step in SparkRDDWriteClient#cluster and SparkRDDWriteClient#compact: when ASYNC_CLEAN is true, the clean runs asynchronously and is started in preWrite(); otherwise it runs synchronously in autoCleanOnCommit().
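
The ordering described above can be sketched as a small, self-contained model. This is not Hudi code: the class `CleanSchedulingSketch`, the method `plan`, and the step strings are hypothetical names chosen for illustration, and the assumption that the synchronous clean runs right after the commit is inferred only from the name `autoCleanOnCommit()`.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative model only. None of these names are Hudi APIs; they
 * mirror the behaviour described in the change log above.
 */
public class CleanSchedulingSketch {

  /**
   * Returns the ordered steps an offline compaction/clustering call
   * would take for the given ASYNC_CLEAN setting (hypothetical model).
   */
  public static List<String> plan(boolean asyncClean) {
    List<String> steps = new ArrayList<>();
    if (asyncClean) {
      // Asynchronous mode: the clean is kicked off before the table service runs.
      steps.add("preWrite: start asynchronous clean");
    }
    steps.add("run compaction or clustering");
    steps.add("commit");
    if (!asyncClean) {
      // Synchronous mode: the clean runs as part of the post-commit hook.
      steps.add("autoCleanOnCommit: run synchronous clean");
    }
    return steps;
  }

  public static void main(String[] args) {
    for (boolean asyncClean : new boolean[] {true, false}) {
      System.out.println("asyncClean=" + asyncClean + " -> " + plan(asyncClean));
    }
  }
}
```

Under this model, exactly one of the two branches schedules a clean for a given setting, which is the behaviour the change aims to give the Spark offline jobs.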
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   ### Risk level (write none, low, medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1535701206

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 8539f134f599512b4fd9c6e9a8bcac8172f7094d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16687) 
   * 4fc5fb7659adab4b978a8477f48b262b54733596 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16850) 
   * d05e517e595e292974ec4d3a6dfbb23537ecae81 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1545337327

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1545371213

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   * 3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522012414

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 25c1856f6baa94428c65d3dd03caf04ae19bad52 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616) 
   * 7cf81113aeac802b3ba35723627ebcb6f6943453 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16647) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1519363195

   ## CI report:
   
   * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475) 
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 787a543839c6bc021b5ee98a99faa373b399f8e6 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520059484

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 787a543839c6bc021b5ee98a99faa373b399f8e6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593) 
   * 8670f7026c22101c7cd7ab4627b1b07d5b9cd991 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522668875

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 46f712faba3339cae98f6462bec8044ba0b01839 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16615) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16661) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1519367667

   ## CI report:
   
   * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475) 
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 787a543839c6bc021b5ee98a99faa373b399f8e6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593) 
   




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1174871236


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieCompactor.java:
##########
@@ -0,0 +1,177 @@
+package org.apache.hudi.utilities;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.client.WriteStatus;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.model.HoodieWriteStat;
+import org.apache.hudi.common.table.HoodieTableConfig;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.common.table.timeline.HoodieTimeline;
+import org.apache.hudi.common.testutils.HoodieTestDataGenerator;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieLayoutConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
+import org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner;
+import org.apache.hudi.table.storage.HoodieStorageLayout;
+import org.apache.hudi.utilities.testutils.UtilitiesTestBase;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.spark.api.java.JavaRDD;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+import java.util.stream.Collectors;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+public class TestHoodieCompactor extends UtilitiesTestBase {
+  private static final Logger LOG = LoggerFactory.getLogger(UtilitiesTestBase.class);
+  private HoodieTestDataGenerator dataGen;
+  private SparkRDDWriteClient client;
+  private HoodieTableMetaClient metaClient;
+
+  @BeforeAll
+  public static void initClass() throws Exception {
+    UtilitiesTestBase.initTestServices(false, false, false);
+  }
+
+  @BeforeEach
+  public void setup() {
+    dataGen = new HoodieTestDataGenerator();
+  }
+
+  protected HoodieCompactor initialHoodieCompactorSyncClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute) {
+    HoodieCompactor.Config compactionConfig = buildHoodieCompactionUtilConfig(tableBasePath,
+              runSchedule, scheduleAndExecute);
+    List<String> configs = new ArrayList<>();

Review Comment:
   There is already an existing `TestHoodieCompactor` in the project.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1536327041

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * d05e517e595e292974ec4d3a6dfbb23537ecae81 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16854) 
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * fbd341163585d108ea9ca4b93f11f7646c057f50 UNKNOWN
   


[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1189576582


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieClusteringJob.java:
##########
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.offlinejob;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieClusteringConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.utilities.HoodieClusteringJob;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+
+public class TestOfflineHoodieClusteringJob extends HoodieOfflineJobTestBase {
+
+  protected HoodieClusteringJob initialHoodieClusteringJobClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute,
+                                                              Boolean isAutoClean) {
+    HoodieClusteringJob.Config clusterConfig = buildHoodieClusteringUtilConfig(tableBasePath,

Review Comment:
   Can we move the tests into something like `TestHoodieDeltaStreamer`?





[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185884599


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -269,13 +272,14 @@ private int doCompact(JavaSparkContext jsc) throws Exception {
         }
       }
       HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = client.compact(cfg.compactionInstantTime);
+      cleanAfterCompact(client);
       return UtilHelpers.handleErrors(compactionMetadata.getCommitMetadata().get(), cfg.compactionInstantTime);
     }
   }
 
   private Option<String> doSchedule(JavaSparkContext jsc) {
     try (SparkRDDWriteClient client =
-             UtilHelpers.createHoodieClient(jsc, cfg.basePath, "", cfg.parallelism, Option.of(cfg.strategyClassName), props)) {
+             UtilHelpers.createHoodieClient(jsc, cfg.basePath, "", cfg.parallelism, Option.of(cfg.strategyClassName), props, cfg.asyncSerivceEanble)) {
 
       if (StringUtils.isNullOrEmpty(cfg.compactionInstantTime)) {
         LOG.warn("No instant time is provided for scheduling compaction.");

Review Comment:
   Why can't we just set the props correctly first? Then there is no need to define another new method.
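
One way to read this suggestion, sketched with plain `java.util.Properties` and no Hudi classes: flip the cleaning flag in the job's props before they are handed to the existing `UtilHelpers.createHoodieClient(...)` call, so that no extra overload is needed. The class name `CleanPropsSketch`, the helper `withSyncCleanOnly`, and the literal keys `hoodie.clean.automatic` and `hoodie.clean.async` are assumptions made for illustration; check `HoodieCleanConfig` for the exact keys in your Hudi version.

```java
import java.util.Properties;

public class CleanPropsSketch {
  // Assumed config keys (hypothetical here); verify against HoodieCleanConfig.
  static final String AUTO_CLEAN_KEY = "hoodie.clean.automatic";
  static final String ASYNC_CLEAN_KEY = "hoodie.clean.async";

  /**
   * Returns a copy of the job props with async cleaning switched off,
   * so a write client built from the copy would clean synchronously.
   * The caller's props are left untouched.
   */
  static Properties withSyncCleanOnly(Properties jobProps) {
    Properties copy = new Properties();
    copy.putAll(jobProps);
    copy.setProperty(ASYNC_CLEAN_KEY, "false");
    return copy;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(AUTO_CLEAN_KEY, "true");
    props.setProperty(ASYNC_CLEAN_KEY, "true");

    // The adjusted props would then be passed to the existing factory call.
    Properties adjusted = withSyncCleanOnly(props);
    System.out.println(ASYNC_CLEAN_KEY + "=" + adjusted.getProperty(ASYNC_CLEAN_KEY));
  }
}
```

Copying the props instead of mutating them keeps the caller's configuration intact, which matters if the same `props` object is reused for both scheduling and execution.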





[GitHub] [hudi] danny0405 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546513645

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1515651524

   ## CI report:
   
   * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475) 
   


[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185742751


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
       LOG.info("The schedule instant time is " + instantTime.get());
       LOG.info("Step 2: Do cluster");
       Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+      cleanAfterCluster(client);
       return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
     }
   }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   > We only need to add a sync cleaning if it is **enabled**; does that make sense to you?
   
   
   I originally intended to stay consistent with the Flink job and keep both cleaning modes. I have now adjusted it to keep only the synchronous cleaning mode.
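
The condition in the hunk above can be read as a small decision table. The sketch below is a stand-alone illustration, not the PR's code: `CleanDecisionSketch`, `CleanMode`, and `decide` are hypothetical names, and the two booleans stand in for `client.getConfig().isAutoClean()` and `client.getConfig().isAsyncClean()`.

```java
public class CleanDecisionSketch {
  // Hypothetical labels for the three outcomes implied by the quoted condition.
  enum CleanMode { NONE, SYNC_AFTER_JOB, LEFT_TO_ASYNC_SERVICE }

  /**
   * Mirrors `isAutoClean() && !isAsyncClean()`: an explicit synchronous clean
   * is only needed when auto clean is on and async clean is off.
   */
  static CleanMode decide(boolean autoClean, boolean asyncClean) {
    if (!autoClean) {
      return CleanMode.NONE;
    }
    return asyncClean ? CleanMode.LEFT_TO_ASYNC_SERVICE : CleanMode.SYNC_AFTER_JOB;
  }

  public static void main(String[] args) {
    for (boolean auto : new boolean[] {false, true}) {
      for (boolean async : new boolean[] {false, true}) {
        System.out.println("autoClean=" + auto + ", asyncClean=" + async + " -> " + decide(auto, async));
      }
    }
  }
}
```

Keeping only the synchronous branch, as described in the reply, corresponds to the `SYNC_AFTER_JOB` case; the other two cases need no explicit clean call from the job.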





[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1191855786


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieCompactor.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.offlinejob;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieLayoutConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
+import org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner;
+import org.apache.hudi.table.storage.HoodieStorageLayout;
+import org.apache.hudi.utilities.HoodieCompactor;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+
+public class TestOfflineHoodieCompactor extends HoodieOfflineJobTestBase {
+
+  protected HoodieCompactor initialHoodieCompactorClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute,
+                     Boolean isAutoClean) {

Review Comment:
   Yeah, that's true; just moving the test into it should be fine.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1536607278

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * fbd341163585d108ea9ca4b93f11f7646c057f50 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16870) 
   


[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1537076285

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * fbd341163585d108ea9ca4b93f11f7646c057f50 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16870) 
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546643800

   > @zhuanshenbsj1 Hi, can you rebase with the latest master and re-trigger the Azur




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1515645860

   ## CI report:
   
   * 55007361a8c01779a883cee54ecf45ce94e25dce UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176146542


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   This is not correct; we need to wait for the async table service to finish.





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1175193258


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHoodieCompactor.java:
##########
@@ -0,0 +1,177 @@
+package org.apache.hudi.utilities;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.client.WriteStatus;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.model.HoodieWriteStat;
+import org.apache.hudi.common.table.HoodieTableConfig;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.common.table.timeline.HoodieTimeline;
+import org.apache.hudi.common.testutils.HoodieTestDataGenerator;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieLayoutConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
+import org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner;
+import org.apache.hudi.table.storage.HoodieStorageLayout;
+import org.apache.hudi.utilities.testutils.UtilitiesTestBase;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.spark.api.java.JavaRDD;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+import java.util.stream.Collectors;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+public class TestHoodieCompactor extends UtilitiesTestBase {
+  private static final Logger LOG = LoggerFactory.getLogger(UtilitiesTestBase.class);
+  private HoodieTestDataGenerator dataGen;
+  private SparkRDDWriteClient client;
+  private HoodieTableMetaClient metaClient;
+
+  @BeforeAll
+  public static void initClass() throws Exception {
+    UtilitiesTestBase.initTestServices(false, false, false);
+  }
+
+  @BeforeEach
+  public void setup() {
+    dataGen = new HoodieTestDataGenerator();
+  }
+
+  protected HoodieCompactor initialHoodieCompactorSyncClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute) {
+    HoodieCompactor.Config compactionConfig = buildHoodieCompactionUtilConfig(tableBasePath,
+              runSchedule, scheduleAndExecute);
+    List<String> configs = new ArrayList<>();

Review Comment:
   Rename to TestOfflineHoodieCompactor.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1523610285

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   * 4dad96ba54827548c95059d12b7d5d5cdcc0c1a4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673) 
   * 8539f134f599512b4fd9c6e9a8bcac8172f7094d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Without this change, if ASYNC_CLEAN = true, AsyncCleanerService is used to clean in the offline job. In my unit tests for the offline job, if the compact/cluster job completes before the async cleaning job does, BaseHoodieTableServiceClient.close() forcibly shuts down the asynchronous cleaning job, which causes an interrupt exception.
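
The race described above can be reproduced outside Hudi with plain `java.util.concurrent`. The following is a minimal sketch, not Hudi code: the class `AsyncCleanRaceSketch`, the method `runCleaner`, and the 200 ms sleep are hypothetical stand-ins. `shutdownNow()` plays the role of the forced close, and `Future.get()` plays the role of waiting for the async service before closing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncCleanRaceSketch {

  /**
   * Starts a fake background "cleaner" and then force-closes its executor.
   * Returns true if the cleaner finished its work, false if it was interrupted.
   */
  static boolean runCleaner(boolean waitBeforeClose) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);

    Future<Boolean> cleaner = executor.submit(() -> {
      started.countDown();
      try {
        Thread.sleep(200); // hypothetical stand-in for the cleaning work
        return true;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag
        return false;
      }
    });

    started.await(); // make sure the cleaner is running before we close

    if (waitBeforeClose) {
      cleaner.get(); // analogous to waiting for the async service to complete
    }

    executor.shutdownNow(); // analogous to a forced close: interrupts running tasks
    return cleaner.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("close without waiting, cleaner finished: " + runCleaner(false));
    System.out.println("wait, then close, cleaner finished: " + runCleaner(true));
  }
}
```

In this sketch the cleaner only completes when the caller waits on it before shutting the executor down, which is the ordering the added `waitForAsyncServiceCompletion()` call is meant to enforce.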





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1177963180


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -269,6 +269,7 @@ private int doCompact(JavaSparkContext jsc) throws Exception {
         }
       }
       HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = client.compact(cfg.compactionInstantTime);
+      cleanAfterCompact(client);
       return UtilHelpers.handleErrors(compactionMetadata.getCommitMetadata().get(), cfg.compactionInstantTime);

Review Comment:
   How about adding an asyncEnable config, like the Flink offline job has?





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1545307813

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   * 3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520277035

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 31b1aed974e2f1925fd81380c118a2fba91e23fa Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613) 
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 25c1856f6baa94428c65d3dd03caf04ae19bad52 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185642315


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
       LOG.info("The schedule instant time is " + instantTime.get());
       LOG.info("Step 2: Do cluster");
       Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+      cleanAfterCluster(client);
       return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
     }
   }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   > I think we need to trigger a sync clean if it is enabled.
   
   If isAsyncClean is enabled, the Spark offline job starts an async clean in prewrite, like the Flink job does. So a synchronous clean is added only when isAsyncClean is disabled.
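
The decision described above can be written down as a small truth table. This is a sketch under assumptions, not the code in this PR: `OfflineCleanModeSketch`, `CleanMode`, and `resolve` are hypothetical names, the two booleans stand for the values of `isAutoClean()` and `isAsyncClean()` from the diff, and it assumes that auto clean gates the async path as well as the sync one.

```java
public class OfflineCleanModeSketch {

  /** Hypothetical labels for what the offline job would do about cleaning. */
  enum CleanMode {
    NONE,                     // auto clean disabled: no cleaning is triggered
    ASYNC_IN_PREWRITE,        // async cleaner started in prewrite, awaited before close
    SYNC_AFTER_TABLE_SERVICE  // synchronous clean after compact/cluster finishes
  }

  /**
   * Maps the two config flags to a clean mode.
   * Assumption: auto clean must be on for either kind of cleaning to run.
   */
  static CleanMode resolve(boolean autoClean, boolean asyncClean) {
    if (!autoClean) {
      return CleanMode.NONE;
    }
    return asyncClean ? CleanMode.ASYNC_IN_PREWRITE : CleanMode.SYNC_AFTER_TABLE_SERVICE;
  }

  public static void main(String[] args) {
    boolean[] flags = {false, true};
    for (boolean autoClean : flags) {
      for (boolean asyncClean : flags) {
        System.out.println("autoClean=" + autoClean + ", asyncClean=" + asyncClean
            + " -> " + resolve(autoClean, asyncClean));
      }
    }
  }
}
```

The `SYNC_AFTER_TABLE_SERVICE` branch corresponds to the `isAutoClean() && !isAsyncClean()` condition in the diff; the other two branches are the assumed complement.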





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1537140908

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1186682757


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieCompactor.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.offlinejob;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieLayoutConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
+import org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner;
+import org.apache.hudi.table.storage.HoodieStorageLayout;
+import org.apache.hudi.utilities.HoodieCompactor;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+
+public class TestOfflineHoodieCompactor extends HoodieOfflineJobTestBase {
+
+  protected HoodieCompactor initialHoodieCompactorClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute,
+                     Boolean isAutoClean) {
+    HoodieCompactor.Config compactionConfig = buildHoodieCompactionUtilConfig(tableBasePath,
+              runSchedule, scheduleAndExecute);
+    List<String> configs = new ArrayList<>();
+    configs.add(String.format("%s=%s", HoodieCleanConfig.AUTO_CLEAN.key(), isAutoClean));
+    configs.add(String.format("%s=%s", HoodieCleanConfig.CLEANER_COMMITS_RETAINED.key(), 1));
+    configs.add(String.format("%s=%s", HoodieCompactionConfig.INLINE_COMPACT_NUM_DELTA_COMMITS.key(), 1));
+    compactionConfig.configs.addAll(configs);
+    return new HoodieCompactor(jsc, compactionConfig);
+  }
+
+  private HoodieCompactor.Config  buildHoodieCompactionUtilConfig(String basePath,
+                                                                  Boolean runSchedule,
+                                                                  String runningMode) {
+    HoodieCompactor.Config config = new HoodieCompactor.Config();
+    config.basePath = basePath;
+    config.runSchedule = runSchedule;
+    config.runningMode = runningMode;
+    config.configs.add("hoodie.metadata.enable=false");
+    return config;
+  }
+
+  @Test
+  public void testHoodieCompactorWithClean() throws Exception {
+    String tableBasePath = basePath + "/asyncCompaction";
+    Properties props = getPropertiesForKeyGen(true);
+    HoodieWriteConfig config = HoodieWriteConfig.newBuilder()
+        .forTable("asyncCompaction")
+        .withPath(tableBasePath)
+        .withSchema(TRIP_EXAMPLE_SCHEMA)
+        .withParallelism(2, 2)
+        .withMetadataConfig(HoodieMetadataConfig.newBuilder().enable(false).build())
+        .withAutoCommit(false)
+        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
+          .withInlineCompaction(false).withScheduleInlineCompaction(false).build())
+        .withStorageConfig(HoodieStorageConfig.newBuilder()
+          .logFileMaxSize(1024).build())
+        .withCleanConfig(HoodieCleanConfig.newBuilder()
+          .withCleanerPolicy(HoodieCleaningPolicy.KEEP_LATEST_COMMITS)
+          .withAutoClean(false).withAsyncClean(false).build())
+        .withLayoutConfig(HoodieLayoutConfig.newBuilder()
+          .withLayoutType(HoodieStorageLayout.LayoutType.BUCKET.name())
+          .withLayoutPartitioner(SparkBucketIndexPartitioner.class.getName()).build())
+        .withIndexConfig(HoodieIndexConfig.newBuilder().fromProperties(props).withIndexType(HoodieIndex.IndexType.BUCKET).withBucketNum("1").build())
+        .build();
+    props.putAll(config.getProps());
+    Properties metaClientProps = HoodieTableMetaClient.withPropertyBuilder()
+        .setTableType(HoodieTableType.MERGE_ON_READ)
+        .setPayloadClass(HoodieAvroPayload.class)
+        .fromProperties(props)
+        .build();
+
+    metaClient =  HoodieTableMetaClient.initTableAndGetMetaClient(jsc.hadoopConfiguration(), tableBasePath, metaClientProps);
+    client = new SparkRDDWriteClient(context, config);
+
+    writeData(true, HoodieActiveTimeline.createNewInstantTime(), 100, true);
+    writeData(true, HoodieActiveTimeline.createNewInstantTime(), 100, true);

Review Comment:
   Do we have an existing test class that these tests can be added to?





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Without this change, if ASYNC_CLEAN = true, the offline job uses AsyncCleanerService to clean. In my unit tests for the offline job, when the compaction/clustering job finishes before the async cleaning job does, BaseHoodieTableServiceClient.close() forcibly shuts down the async cleaning job.
   So I added this wait, which makes the whole task wait for cleaning to complete before it exits gracefully.
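
The race described above can be reproduced outside Hudi with a plain executor. This is a sketch under stated assumptions: `AsyncCleanWaitSketch` and `runJob` are hypothetical names, the sleeping task stands in for the async cleaner, `Future.get()` stands in for `waitForAsyncServiceCompletion()`, and `shutdownNow()` stands in for a `close()` that force-stops the cleaner.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only: not Hudi code.
final class AsyncCleanWaitSketch {

  // Returns true if the background "clean" finished before the forced shutdown.
  static boolean runJob(boolean waitForClean) {
    ExecutorService cleaner = Executors.newSingleThreadExecutor();
    AtomicBoolean cleanFinished = new AtomicBoolean(false);
    Future<?> cleanTask = cleaner.submit(() -> {
      try {
        Thread.sleep(500); // stand-in for deleting old file versions
        cleanFinished.set(true);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // interrupted by the forced shutdown
      }
    });
    try {
      // The "compaction/clustering" work is assumed to finish immediately here.
      if (waitForClean) {
        cleanTask.get(); // block until the cleaner is done
      }
      cleaner.shutdownNow(); // force-stop whatever is still running
      cleaner.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    }
    return cleanFinished.get();
  }
}
```

Calling `runJob(false)` shuts the executor down while the clean is still sleeping, so the clean is normally interrupted; `runJob(true)` lets it finish first.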





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522447941

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 7cf81113aeac802b3ba35723627ebcb6f6943453 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16647) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520170545

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 8670f7026c22101c7cd7ab4627b1b07d5b9cd991 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610) 
   * 31b1aed974e2f1925fd81380c118a2fba91e23fa Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613) 
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522992882

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   * 4dad96ba54827548c95059d12b7d5d5cdcc0c1a4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673) 
   




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1177963180


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -269,6 +269,7 @@ private int doCompact(JavaSparkContext jsc) throws Exception {
         }
       }
       HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = client.compact(cfg.compactionInstantTime);
+      cleanAfterCompact(client);
       return UtilHelpers.handleErrors(compactionMetadata.getCommitMetadata().get(), cfg.compactionInstantTime);

Review Comment:
   How about adding an asyncEnable config, as in the Flink offline job, with a default of false?





[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185883439


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -112,6 +112,9 @@ public static class Config implements Serializable {
         splitter = IdentitySplitter.class)
     public List<String> configs = new ArrayList<>();
 
+    // disable async-service in offline job
+    public Boolean asyncSerivceEanble = false;
+

Review Comment:
   Rename `asyncSerivceEanble` to `asyncCleaningEnabled`, and make it false by default.
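
One way such a flag could be wired in, as a sketch only: the job-level flag defaults to false and, when it is off, overrides the async-clean setting passed in by the user. `OfflineJobConfigSketch`, its nested `Config` and `applyCleaningMode` are hypothetical, and the property key `hoodie.clean.async` is assumed to be the async-clean switch; the PR may handle this differently.

```java
import java.util.Properties;

// Hypothetical illustration only: not the HoodieCompactor.Config from the diff.
final class OfflineJobConfigSketch {

  static final class Config {
    // Disabled by default, as suggested in the review.
    boolean asyncCleaningEnabled = false;
  }

  // Returns a copy of the user properties with async cleaning forced off
  // unless the job-level flag explicitly enables it.
  static Properties applyCleaningMode(Config cfg, Properties userProps) {
    Properties effective = new Properties();
    effective.putAll(userProps);
    if (!cfg.asyncCleaningEnabled) {
      effective.setProperty("hoodie.clean.async", "false"); // assumed key name
    }
    return effective;
  }
}
```

Copying the properties instead of mutating them keeps the caller's configuration unchanged, which makes the override easy to test.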





[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1189574078


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieCompactor.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.offlinejob;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieLayoutConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
+import org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner;
+import org.apache.hudi.table.storage.HoodieStorageLayout;
+import org.apache.hudi.utilities.HoodieCompactor;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+
+public class TestOfflineHoodieCompactor extends HoodieOfflineJobTestBase {
+
+  protected HoodieCompactor initialHoodieCompactorClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute,
+                     Boolean isAutoClean) {

Review Comment:
   Can we move the tests to `TestHoodieCompactor`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1172047827


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java:
##########
@@ -292,7 +292,9 @@ protected void completeCompaction(HoodieCommitMetadata metadata,
   protected HoodieWriteMetadata<JavaRDD<WriteStatus>> compact(String compactionInstantTime, boolean shouldComplete) {
     HoodieSparkTable<T> table = HoodieSparkTable.create(config, context);
     preWrite(compactionInstantTime, WriteOperationType.COMPACT, table.getMetaClient());
-    return tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    autoCleanOnCommit();
+    return compactionMetadata;

Review Comment:
   From the context, we cannot tell whether this code is triggered from an offline job or an online async task.
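
One way to read this concern: `compact()` on the write client is shared by the offline job and the online async task, so a clean call placed inside it would run for both callers. The sketch below is a toy model, not Hudi code. `ToyClient`, `runOfflineJob`, and `runOnlineTask` are hypothetical names introduced only for illustration; the sketch shows the clean step living in the offline entry point instead of in the shared method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; none of these types exist in Hudi.
class CallerSpecificCleanSketch {

  // Stand-in for a write client; it only records which actions ran.
  static class ToyClient {
    final List<String> actions = new ArrayList<>();

    void compact() {
      actions.add("compact");
    }

    void clean() {
      actions.add("clean");
    }
  }

  // Offline entry point: it knows it is offline, so it owns the follow-up clean.
  static void runOfflineJob(ToyClient client) {
    client.compact();
    client.clean();
  }

  // Online async task: reuses compact() without the extra clean.
  static void runOnlineTask(ToyClient client) {
    client.compact();
  }

  public static void main(String[] args) {
    ToyClient offline = new ToyClient();
    runOfflineJob(offline);
    System.out.println("offline: " + offline.actions);

    ToyClient online = new ToyClient();
    runOnlineTask(online);
    System.out.println("online: " + online.actions);
  }
}
```

With this split, the shared `compact()` stays free of caller-specific side effects, which seems to be the direction the later `HoodieCompactor` diff in this thread takes.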





[GitHub] [hudi] danny0405 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1545118849

   [6106.patch.zip](https://github.com/apache/hudi/files/11459135/6106.patch.zip)
   Thanks for the contribution. I have reviewed it and created a patch.




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1174758600


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java:
##########
@@ -292,7 +292,9 @@ protected void completeCompaction(HoodieCommitMetadata metadata,
   protected HoodieWriteMetadata<JavaRDD<WriteStatus>> compact(String compactionInstantTime, boolean shouldComplete) {
     HoodieSparkTable<T> table = HoodieSparkTable.create(config, context);
     preWrite(compactionInstantTime, WriteOperationType.COMPACT, table.getMetaClient());
-    return tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    autoCleanOnCommit();
+    return compactionMetadata;

Review Comment:
   Moved the clean operation to the offline job and added a unit test.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522538020

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 46f712faba3339cae98f6462bec8044ba0b01839 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16615) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1177963180


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -269,6 +269,7 @@ private int doCompact(JavaSparkContext jsc) throws Exception {
         }
       }
       HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = client.compact(cfg.compactionInstantTime);
+      cleanAfterCompact(client);
       return UtilHelpers.handleErrors(compactionMetadata.getCommitMetadata().get(), cfg.compactionInstantTime);

Review Comment:
   How about adding an asyncEnable config, like the Flink offline job has?





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1191848563


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieCompactor.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.offlinejob;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieLayoutConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
+import org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner;
+import org.apache.hudi.table.storage.HoodieStorageLayout;
+import org.apache.hudi.utilities.HoodieCompactor;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+
+public class TestOfflineHoodieCompactor extends HoodieOfflineJobTestBase {
+
+  protected HoodieCompactor initialHoodieCompactorClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute,
+                     Boolean isAutoClean) {

Review Comment:
   > Can we move the tests to `TestHoodieCompactor`?
   
   That test class belongs to the hudi-spark-client project (not hudi-utilities) and is mainly used to test online compaction.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546654893

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   * 3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70 UNKNOWN
   * 091fe00b809f6d0f47568eddc9a43fc4266afb6e UNKNOWN
   




[GitHub] [hudi] danny0405 closed pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 closed pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job
URL: https://github.com/apache/hudi/pull/8505




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185642208


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
       LOG.info("The schedule instant time is " + instantTime.get());
       LOG.info("Step 2: Do cluster");
       Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+      cleanAfterCluster(client);
       return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
     }
   }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   > I think we need to trigger a sync clean if it is enabled.
   
   If isAsyncClean is enabled, the Spark offline job starts an async clean in preWrite, like the Flink job does. So a synchronous clean is added only when isAsyncClean is disabled.
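
The rule described in this reply can be sketched in plain Java. This is only an illustration: `CleanMode` and `decide` are hypothetical names, the two booleans stand in for `isAutoClean()` and `isAsyncClean()` from the quoted diff, and the sketch assumes that async cleaning only matters when auto clean is on. It does not call any Hudi API.

```java
// Hypothetical sketch; models only the flag logic discussed above, not Hudi itself.
class CleanDecisionSketch {

  enum CleanMode { NONE, ASYNC_IN_PREWRITE, SYNC_AFTER_SERVICE }

  // autoClean and asyncClean stand in for isAutoClean() and isAsyncClean()
  // as used in the quoted cleanAfterCluster(...) diff.
  static CleanMode decide(boolean autoClean, boolean asyncClean) {
    if (!autoClean) {
      // Assumption: with auto clean off, neither path triggers a clean.
      return CleanMode.NONE;
    }
    return asyncClean
        ? CleanMode.ASYNC_IN_PREWRITE     // described above as started during preWrite
        : CleanMode.SYNC_AFTER_SERVICE;   // explicit clean after cluster/compact
  }

  public static void main(String[] args) {
    System.out.println(decide(true, false));
    System.out.println(decide(true, true));
    System.out.println(decide(false, true));
  }
}
```

The `SYNC_AFTER_SERVICE` branch corresponds to the `isAutoClean() && !isAsyncClean()` condition in the diff; the other two branches are where the offline job itself adds no extra clean call.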





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1535751794

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 4fc5fb7659adab4b978a8477f48b262b54733596 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16850) 
   * d05e517e595e292974ec4d3a6dfbb23537ecae81 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16854) 
   




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1186641929


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -295,4 +301,11 @@ private String getSchemaFromLatestInstant() throws Exception {
     Schema schema = schemaUtil.getTableAvroSchema(false);
     return schema.toString();
   }
+
+  private void cleanAfterCompact(SparkRDDWriteClient client) {
+    if (client.getConfig().isAutoClean()) {
+      LOG.info("Start to clean synchronously.");

Review Comment:
   Can the user set up auto clean manually? If not, it is always true.
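
For context on this question, here is a minimal sketch of how such a flag could be read with a default. To the best of my knowledge, `hoodie.clean.automatic` (default `true`) and `hoodie.clean.async` (default `false`) are the Hudi properties behind `isAutoClean()` and `isAsyncClean()`, but treat both the key names and the defaults as assumptions to verify against the Hudi version in use. The class and method names are hypothetical, and the code uses only `java.util.Properties`.

```java
import java.util.Properties;

// Hypothetical sketch; key names and defaults are assumptions, not confirmed by this thread.
class CleanFlagsSketch {

  static final String AUTO_CLEAN_KEY = "hoodie.clean.automatic";
  static final String ASYNC_CLEAN_KEY = "hoodie.clean.async";

  // Falls back to "true" when the user did not set the key.
  static boolean isAutoClean(Properties props) {
    return Boolean.parseBoolean(props.getProperty(AUTO_CLEAN_KEY, "true"));
  }

  // Falls back to "false" when the user did not set the key.
  static boolean isAsyncClean(Properties props) {
    return Boolean.parseBoolean(props.getProperty(ASYNC_CLEAN_KEY, "false"));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println("default auto clean: " + isAutoClean(props));

    // A user-supplied override, e.g. from a properties file passed to the job.
    props.setProperty(AUTO_CLEAN_KEY, "false");
    System.out.println("overridden auto clean: " + isAutoClean(props));
  }
}
```

If the offline job forwards user-supplied properties into its write config, the flag would be user-controllable and the `isAutoClean()` guard would not always be true; whether `HoodieCompactor` does so is not shown in this thread.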





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1536315131

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * d05e517e595e292974ec4d3a6dfbb23537ecae81 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16854) 
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546515757

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   * 3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70 UNKNOWN
   




[GitHub] [hudi] danny0405 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546628584

   @zhuanshenbsj1 Hi, can you rebase onto the latest master and re-trigger the Azure CI tests?




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1191850116


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestOfflineHoodieCompactor.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.offlinejob;
+
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.config.HoodieStorageConfig;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieCleaningPolicy;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.config.HoodieCleanConfig;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieLayoutConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.index.HoodieIndex;
+import org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner;
+import org.apache.hudi.table.storage.HoodieStorageLayout;
+import org.apache.hudi.utilities.HoodieCompactor;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+
+import static org.apache.hudi.common.testutils.HoodieTestDataGenerator.TRIP_EXAMPLE_SCHEMA;
+
+public class TestOfflineHoodieCompactor extends HoodieOfflineJobTestBase {
+
+  protected HoodieCompactor initialHoodieCompactorClean(String tableBasePath, Boolean runSchedule, String scheduleAndExecute,
+                     Boolean isAutoClean) {

Review Comment:
   There does not seem to have been a test class for org.apache.hudi.utilities.HoodieCompactor before this change.





[GitHub] [hudi] zhuanshenbsj1 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546644566

   rebase 




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1535694337

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 8539f134f599512b4fd9c6e9a8bcac8172f7094d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16687) 
   * 4fc5fb7659adab4b978a8477f48b262b54733596 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16850) 
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1537074747

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * fbd341163585d108ea9ca4b93f11f7646c057f50 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16870) 
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1535677531

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475",
       "triggerID" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593",
       "triggerID" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610",
       "triggerID" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "triggerType" : "PUSH"
     }, {
       "hash" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613",
       "triggerID" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "triggerType" : "PUSH"
     }, {
       "hash" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "triggerType" : "PUSH"
     }, {
       "hash" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616",
       "triggerID" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16647",
       "triggerID" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "triggerType" : "PUSH"
     }, {
       "hash" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16615",
       "triggerID" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "triggerType" : "PUSH"
     }, {
       "hash" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16661",
       "triggerID" : "1522646175",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673",
       "triggerID" : "1522901577",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673",
       "triggerID" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1522901577",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "8539f134f599512b4fd9c6e9a8bcac8172f7094d",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16687",
       "triggerID" : "8539f134f599512b4fd9c6e9a8bcac8172f7094d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4fc5fb7659adab4b978a8477f48b262b54733596",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4fc5fb7659adab4b978a8477f48b262b54733596",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 8539f134f599512b4fd9c6e9a8bcac8172f7094d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16687) 
   * 4fc5fb7659adab4b978a8477f48b262b54733596 UNKNOWN
   


[GitHub] [hudi] zhuanshenbsj1 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522901577

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522000257

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475",
       "triggerID" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593",
       "triggerID" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610",
       "triggerID" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "triggerType" : "PUSH"
     }, {
       "hash" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613",
       "triggerID" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "triggerType" : "PUSH"
     }, {
       "hash" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "triggerType" : "PUSH"
     }, {
       "hash" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616",
       "triggerID" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 25c1856f6baa94428c65d3dd03caf04ae19bad52 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616) 
   * 7cf81113aeac802b3ba35723627ebcb6f6943453 UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522917865

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475",
       "triggerID" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593",
       "triggerID" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610",
       "triggerID" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "triggerType" : "PUSH"
     }, {
       "hash" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613",
       "triggerID" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "triggerType" : "PUSH"
     }, {
       "hash" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "triggerType" : "PUSH"
     }, {
       "hash" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616",
       "triggerID" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16647",
       "triggerID" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "triggerType" : "PUSH"
     }, {
       "hash" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16615",
       "triggerID" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "triggerType" : "PUSH"
     }, {
       "hash" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16661",
       "triggerID" : "1522646175",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1522901577",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "",
       "status" : "CANCELED",
       "url" : "TBD",
       "triggerID" : "1522901577",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   * 4dad96ba54827548c95059d12b7d5d5cdcc0c1a4 UNKNOWN
   


[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1182216471


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
       LOG.info("The schedule instant time is " + instantTime.get());
       LOG.info("Step 2: Do cluster");
       Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+      cleanAfterCluster(client);
       return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
     }
   }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   I think we need to trigger a sync clean if it is enabled.
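
To make the condition under review concrete, here is a minimal sketch of the decision it encodes, written as a pure function. `CleanDecisionSketch`, `CleanAction`, and `afterTableService` are hypothetical names used only for illustration and are not Hudi APIs; the two booleans stand in for `isAutoClean()` and `isAsyncClean()` from the hunk above. The body of the `if` is not shown in the hunk, so the `RUN_SYNC` outcome is an assumption based on the PR description.

```java
final class CleanDecisionSketch {
  /** Hypothetical outcomes for the clean step that follows clustering or compaction. */
  enum CleanAction { NONE, WAIT_FOR_ASYNC, RUN_SYNC }

  /**
   * Chooses the clean step from two flags.
   * autoClean stands in for isAutoClean(), asyncClean for isAsyncClean().
   */
  static CleanAction afterTableService(boolean autoClean, boolean asyncClean) {
    if (!autoClean) {
      return CleanAction.NONE; // cleaning disabled: nothing to do
    }
    // Assumption: an async clean was already started earlier, so only wait for it;
    // otherwise run the clean synchronously now.
    return asyncClean ? CleanAction.WAIT_FOR_ASYNC : CleanAction.RUN_SYNC;
  }
}
```

The `RUN_SYNC` case corresponds to the condition `isAutoClean() && !isAsyncClean()` in the hunk.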





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522508500

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475",
       "triggerID" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593",
       "triggerID" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610",
       "triggerID" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "triggerType" : "PUSH"
     }, {
       "hash" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613",
       "triggerID" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "triggerType" : "PUSH"
     }, {
       "hash" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "triggerType" : "PUSH"
     }, {
       "hash" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616",
       "triggerID" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16647",
       "triggerID" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "triggerType" : "PUSH"
     }, {
       "hash" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 7cf81113aeac802b3ba35723627ebcb6f6943453 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16647) 
   * 46f712faba3339cae98f6462bec8044ba0b01839 UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520240215

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475",
       "triggerID" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593",
       "triggerID" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610",
       "triggerID" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "triggerType" : "PUSH"
     }, {
       "hash" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613",
       "triggerID" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "triggerType" : "PUSH"
     }, {
       "hash" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "triggerType" : "PUSH"
     }, {
       "hash" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 31b1aed974e2f1925fd81380c118a2fba91e23fa Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613) 
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   


[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Without this change, if ASYNC_CLEAN = true, AsyncCleanerService is used to do the clean in the offline job. In my unit tests for the offline job, if the compact/cluster job completes before the asynchronous cleaning job does, BaseHoodieTableServiceClient.close() forces the asynchronous cleaning job to shut down, which causes an interrupt exception and ends the clean early.
   So I added this wait, which makes the entire task wait for the clean to complete before exiting gracefully.
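
The race described above can be reproduced in isolation with plain `java.util.concurrent`, without any Hudi classes. The following is a minimal sketch under stated assumptions: `AsyncCleanSketch` and `runJob` are hypothetical names, the sleeping task stands in for the asynchronous clean, `Future.get()` plays the role of waiting for the async service to complete, and `shutdownNow()` plays the role of a `close()` that forcibly stops the cleaner.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class AsyncCleanSketch {
  /** Simulates one offline job; returns true if the background clean ran to completion. */
  static boolean runJob(boolean waitForClean) {
    ExecutorService cleanerPool = Executors.newSingleThreadExecutor();
    AtomicBoolean cleanFinished = new AtomicBoolean(false);
    CountDownLatch cleanStarted = new CountDownLatch(1);
    try {
      Future<?> clean = cleanerPool.submit(() -> {
        cleanStarted.countDown();
        try {
          Thread.sleep(300); // stands in for deleting old file versions
          cleanFinished.set(true);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // the clean was cut short
        }
      });
      cleanStarted.await();
      // The simulated compaction/clustering "completes" here, before the clean does.
      if (waitForClean) {
        clean.get(); // block until the clean has finished
      }
      cleanerPool.shutdownNow(); // forced stop: interrupts a clean that is still running
      cleanerPool.awaitTermination(5, TimeUnit.SECONDS);
      return cleanFinished.get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}
```

In this sketch, `runJob(false)` shuts the pool down while the clean is still sleeping, so the clean is interrupted; `runJob(true)` waits first, so the clean finishes before shutdown.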





[GitHub] [hudi] zhuanshenbsj1 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1545260800

   > [6106.patch.zip](https://github.com/apache/hudi/files/11459135/6106.patch.zip) Thanks for the contribution, I have reviewed and created a patch~
   
   Done.
   




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546656666

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475",
       "triggerID" : "55007361a8c01779a883cee54ecf45ce94e25dce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f7c73e83812258b53b979afbd6d465e9066b801f",
       "triggerType" : "PUSH"
     }, {
       "hash" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593",
       "triggerID" : "787a543839c6bc021b5ee98a99faa373b399f8e6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610",
       "triggerID" : "8670f7026c22101c7cd7ab4627b1b07d5b9cd991",
       "triggerType" : "PUSH"
     }, {
       "hash" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613",
       "triggerID" : "31b1aed974e2f1925fd81380c118a2fba91e23fa",
       "triggerType" : "PUSH"
     }, {
       "hash" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "269fad02a5346121e823a15c9804e2e63eb16c30",
       "triggerType" : "PUSH"
     }, {
       "hash" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "442430f680316bdfefc27c4aca9f7cd94e95373c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616",
       "triggerID" : "25c1856f6baa94428c65d3dd03caf04ae19bad52",
       "triggerType" : "PUSH"
     }, {
       "hash" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16647",
       "triggerID" : "7cf81113aeac802b3ba35723627ebcb6f6943453",
       "triggerType" : "PUSH"
     }, {
       "hash" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16615",
       "triggerID" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "triggerType" : "PUSH"
     }, {
       "hash" : "46f712faba3339cae98f6462bec8044ba0b01839",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16661",
       "triggerID" : "1522646175",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673",
       "triggerID" : "1522901577",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673",
       "triggerID" : "4dad96ba54827548c95059d12b7d5d5cdcc0c1a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "",
       "status" : "DELETED",
       "url" : "TBD",
       "triggerID" : "1522901577",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "8539f134f599512b4fd9c6e9a8bcac8172f7094d",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16687",
       "triggerID" : "8539f134f599512b4fd9c6e9a8bcac8172f7094d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4fc5fb7659adab4b978a8477f48b262b54733596",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16850",
       "triggerID" : "4fc5fb7659adab4b978a8477f48b262b54733596",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d05e517e595e292974ec4d3a6dfbb23537ecae81",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16854",
       "triggerID" : "d05e517e595e292974ec4d3a6dfbb23537ecae81",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e69e498a02b9f0bc95955ef99bacf6cc286b84e5",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e69e498a02b9f0bc95955ef99bacf6cc286b84e5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "fbd341163585d108ea9ca4b93f11f7646c057f50",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16870",
       "triggerID" : "fbd341163585d108ea9ca4b93f11f7646c057f50",
       "triggerType" : "PUSH"
     }, {
       "hash" : "74c778482b03b2e57ad342e36d11d27a5c08a70c",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888",
       "triggerID" : "74c778482b03b2e57ad342e36d11d27a5c08a70c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70",
       "triggerType" : "PUSH"
     }, {
       "hash" : "3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1545337327",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1546513645",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1546647092",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "091fe00b809f6d0f47568eddc9a43fc4266afb6e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17049",
       "triggerID" : "091fe00b809f6d0f47568eddc9a43fc4266afb6e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   * 3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70 UNKNOWN
   * 091fe00b809f6d0f47568eddc9a43fc4266afb6e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17049) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546653455

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 74c778482b03b2e57ad342e36d11d27a5c08a70c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16888) 
   * 3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70 UNKNOWN
   


[GitHub] [hudi] zhuanshenbsj1 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546647092

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546705209

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * 3ff0e0afdb20e6eec1bd2b7494e0e04790e22b70 UNKNOWN
   * 091fe00b809f6d0f47568eddc9a43fc4266afb6e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=17049) 
   


[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1523623798

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 4dad96ba54827548c95059d12b7d5d5cdcc0c1a4 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673) 
   * 8539f134f599512b4fd9c6e9a8bcac8172f7094d UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1524381631

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 8539f134f599512b4fd9c6e9a8bcac8172f7094d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16687) 
   


[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1186641821


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -112,6 +112,9 @@ public static class Config implements Serializable {
         splitter = IdentitySplitter.class)
     public List<String> configs = new ArrayList<>();
 
+    // disable async-service in offline job
+    public Boolean asyncSerivceEanble = false;
+

Review Comment:
   Do we need this member, given that it is hard-coded to false?





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1186656966


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -295,4 +301,11 @@ private String getSchemaFromLatestInstant() throws Exception {
     Schema schema = schemaUtil.getTableAvroSchema(false);
     return schema.toString();
   }
+
+  private void cleanAfterCompact(SparkRDDWriteClient client) {
+    if (client.getConfig().isAutoClean()) {
+      LOG.info("Start to clean synchronously.");

Review Comment:
   Auto-clean can be changed through the config parameter passed to the job constructor.





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185641155


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
       LOG.info("The schedule instant time is " + instantTime.get());
       LOG.info("Step 2: Do cluster");
       Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+      cleanAfterCluster(client);
       return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
     }
   }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   > What kind of async table service do we want to wait for here?
   
   Async cleaning and async archiving.
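
The guard in `cleanAfterCluster` above can be read as a three-way decision on the two flags it checks. The following is a minimal sketch of that reading, not the PR's implementation: `CleanDecision`, `Action`, and `decide` are hypothetical names introduced for illustration, and the behavior when async clean is enabled (leaving the work to the async service that the job waits on) is an assumption based on the PR description rather than something this hunk shows.

```java
// Hypothetical stand-in for the two config flags read in the diff above.
// This is not a Hudi class; it only models the branch condition.
final class CleanDecision {

  enum Action { SYNC_CLEAN, LEAVE_TO_ASYNC_SERVICE, NO_CLEAN }

  // Mirrors the condition `isAutoClean() && !isAsyncClean()` from the diff:
  // only that combination leads to a synchronous clean after the table service.
  static Action decide(boolean autoClean, boolean asyncClean) {
    if (!autoClean) {
      return Action.NO_CLEAN;
    }
    // Assumption: with async clean enabled, cleaning is left to the async service.
    return asyncClean ? Action.LEAVE_TO_ASYNC_SERVICE : Action.SYNC_CLEAN;
  }

  public static void main(String[] args) {
    // Print the decision for every combination of the two flags.
    for (boolean autoClean : new boolean[] {true, false}) {
      for (boolean asyncClean : new boolean[] {true, false}) {
        System.out.println("autoClean=" + autoClean + ", asyncClean=" + asyncClean
            + " -> " + decide(autoClean, asyncClean));
      }
    }
  }
}
```

Under this reading, disabling auto-clean in the job config skips cleaning entirely, regardless of the async flag.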





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1536027530

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * d05e517e595e292974ec4d3a6dfbb23537ecae81 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16854) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1182215868


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
       LOG.info("The schedule instant time is " + instantTime.get());
       LOG.info("Step 2: Do cluster");
       Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+      cleanAfterCluster(client);
       return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
     }
   }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   What kind of async table service do we want to wait for here?





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1535707725

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 4fc5fb7659adab4b978a8477f48b262b54733596 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16850) 
   * d05e517e595e292974ec4d3a6dfbb23537ecae81 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 closed pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 closed pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job
URL: https://github.com/apache/hudi/pull/8505




[GitHub] [hudi] danny0405 merged pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 merged PR #8505:
URL: https://github.com/apache/hudi/pull/8505




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520046881

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 787a543839c6bc021b5ee98a99faa373b399f8e6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593) 
   * 8670f7026c22101c7cd7ab4627b1b07d5b9cd991 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1519591347

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 787a543839c6bc021b5ee98a99faa373b399f8e6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1523703148

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 4dad96ba54827548c95059d12b7d5d5cdcc0c1a4 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16673) 
   * 8539f134f599512b4fd9c6e9a8bcac8172f7094d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16687) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Without this change, if ASYNC_CLEAN = true, AsyncCleanerService is used to run the clean in the offline job.
   ![image](https://user-images.githubusercontent.com/34104400/234192874-e369bead-cc4a-4c8e-ab0a-c4791c8bd0ef.png)
   In my unit tests for the offline job, if the compact/cluster job completes before the async cleaning job does, BaseHoodieTableServiceClient.close() forcibly shuts down the async cleaning job.
   <img width="372" alt="d0570721dc3b1983191e5451751c1815" src="https://user-images.githubusercontent.com/34104400/234192499-dabdbd5f-8df8-476f-9812-72151e5d6873.png">
   
   So I added this wait, which makes the whole job wait for the clean to complete before exiting gracefully.
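
The race described above can be reproduced with plain `java.util.concurrent`, independent of Hudi. In this rough sketch, `AsyncCleanWaitSketch` and `runJob` are hypothetical names, the 200 ms sleep stands in for the async clean, and `shutdownNow()` stands in for the forced shutdown that `close()` is reported to perform; none of this is taken from the Hudi code base.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class AsyncCleanWaitSketch {
    // Returns true if the simulated async clean ran to completion.
    static boolean runJob(boolean waitBeforeClose) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicBoolean cleanFinished = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);
        try {
            // Simulated async clean that outlives the main compact/cluster work.
            Future<?> asyncClean = executor.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(200);
                    cleanFinished.set(true);
                } catch (InterruptedException e) {
                    // Forced shutdown: the clean is abandoned part-way.
                    Thread.currentThread().interrupt();
                }
            });
            started.await(); // the clean is now in flight

            if (waitBeforeClose) {
                asyncClean.get(); // analogous to waiting for async services
            }
            // Analogous to a close() that force-stops outstanding services.
            executor.shutdownNow();
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return cleanFinished.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `runJob(false)` interrupts the clean before it finishes, while `runJob(true)` lets it complete before the executor is shut down.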





[GitHub] [hudi] zhuanshenbsj1 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522646175

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520260743

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 31b1aed974e2f1925fd81380c118a2fba91e23fa Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16613) 
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 25c1856f6baa94428c65d3dd03caf04ae19bad52 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1186146172


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -269,13 +272,14 @@ private int doCompact(JavaSparkContext jsc) throws Exception {
         }
       }
       HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = client.compact(cfg.compactionInstantTime);
+      cleanAfterCompact(client);
       return UtilHelpers.handleErrors(compactionMetadata.getCommitMetadata().get(), cfg.compactionInstantTime);
     }
   }
 
   private Option<String> doSchedule(JavaSparkContext jsc) {
     try (SparkRDDWriteClient client =
-             UtilHelpers.createHoodieClient(jsc, cfg.basePath, "", cfg.parallelism, Option.of(cfg.strategyClassName), props)) {
+             UtilHelpers.createHoodieClient(jsc, cfg.basePath, "", cfg.parallelism, Option.of(cfg.strategyClassName), props, cfg.asyncSerivceEanble)) {
 
       if (StringUtils.isNullOrEmpty(cfg.compactionInstantTime)) {
         LOG.warn("No instant time is provided for scheduling compaction.");

Review Comment:
   Removed this method and added the clean config to props.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1536385513

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * d05e517e595e292974ec4d3a6dfbb23537ecae81 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16854) 
   * e69e498a02b9f0bc95955ef99bacf6cc286b84e5 UNKNOWN
   * fbd341163585d108ea9ca4b93f11f7646c057f50 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16870) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1546821081

   The test failure is flaky: https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=17049&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=746585d8-b50a-55c3-26c5-517d93af9934&l=42927




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1172048066


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java:
##########
@@ -292,7 +292,9 @@ protected void completeCompaction(HoodieCommitMetadata metadata,
   protected HoodieWriteMetadata<JavaRDD<WriteStatus>> compact(String compactionInstantTime, boolean shouldComplete) {
     HoodieSparkTable<T> table = HoodieSparkTable.create(config, context);
     preWrite(compactionInstantTime, WriteOperationType.COMPACT, table.getMetaClient());
-    return tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = tableServiceClient.compact(compactionInstantTime, shouldComplete);
+    autoCleanOnCommit();
+    return compactionMetadata;

Review Comment:
   How about we just add the cleaning for the offline job then? And remember to add some tests for it.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520833987

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 25c1856f6baa94428c65d3dd03caf04ae19bad52 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16616) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176010790


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Why do we need this change?





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1519357440

   ## CI report:
   
   * 55007361a8c01779a883cee54ecf45ce94e25dce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16475) 
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1520154324

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 787a543839c6bc021b5ee98a99faa373b399f8e6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16593) 
   * 8670f7026c22101c7cd7ab4627b1b07d5b9cd991 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16610) 
   * 31b1aed974e2f1925fd81380c118a2fba91e23fa UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Without this change, if ASYNC_CLEAN = true, AsyncCleanerService is used to do the cleaning in the offline job. In my unit tests for the offline job, if the compact/cluster job completes before the async cleaning job does, BaseHoodieTableServiceClient.close() forces the asynchronous cleaning job to shut down, which causes an interrupt exception and ends the cleaning.
   So I added this wait so that the entire task waits for the cleaning to complete before exiting gracefully.
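
A minimal sketch of the race described above, using only `java.util.concurrent` rather than Hudi classes. The names `AsyncCleanWaitSketch` and `runJob` are hypothetical; `shutdownNow()` stands in for what `close()` is described as doing, and `Future.get()` stands in for `waitForAsyncServiceCompletion()`. It is an illustration under those assumptions, not the actual implementation.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for an offline job with a background cleaner.
// None of these names are Hudi classes.
class AsyncCleanWaitSketch {

  /** Runs a slow background "clean" and reports whether it finished. */
  static boolean runJob(boolean waitForCleaner) {
    ExecutorService cleanerPool = Executors.newSingleThreadExecutor();
    AtomicBoolean cleanFinished = new AtomicBoolean(false);

    Future<?> cleaner = cleanerPool.submit(() -> {
      try {
        Thread.sleep(300); // pretend the clean outlasts the compaction
        cleanFinished.set(true);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // clean aborted by shutdownNow()
      }
    });

    // ... compaction or clustering would complete here, before the cleaner ...

    try {
      if (waitForCleaner) {
        cleaner.get(); // block until the background clean is done
      }
      cleanerPool.shutdownNow(); // interrupts any task that is still running
      cleanerPool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException | ExecutionException e) {
      cleanerPool.shutdownNow();
      throw new IllegalStateException("job thread failed while waiting", e);
    }
    return cleanFinished.get();
  }

  public static void main(String[] args) {
    System.out.println("without wait, clean finished: " + runJob(false));
    System.out.println("with wait, clean finished: " + runJob(true));
  }
}
```

In this sketch the clean only completes when the job waits for it before shutting the pool down, which is the behavior the added wait is meant to guarantee.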





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Without this change, if ASYNC_CLEAN = true, AsyncCleanerService is used to do the cleaning in the offline job. In my unit tests for the offline job, if the compact/cluster job completes before the async cleaning job does, BaseHoodieTableServiceClient.close() forces the asynchronous cleaning job to shut down, which causes an interrupt exception.
   So I added this wait so that the entire task waits for the cleaning to complete before exiting gracefully.





[GitHub] [hudi] zhuanshenbsj1 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "zhuanshenbsj1 (via GitHub)" <gi...@apache.org>.
zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1186656966


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -295,4 +301,11 @@ private String getSchemaFromLatestInstant() throws Exception {
     Schema schema = schemaUtil.getTableAvroSchema(false);
     return schema.toString();
   }
+
+  private void cleanAfterCompact(SparkRDDWriteClient client) {
+    if (client.getConfig().isAutoClean()) {
+      LOG.info("Start to clean synchronously.");

Review Comment:
   Auto-clean can be changed through the config parameter passed to the job constructor, as the unit tests do. This is also reasonable, since it allows users to turn off cleaning manually.
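
A rough sketch of how such an override could flow from `key=value` job configs into the cleaning decision. The class and method names are hypothetical. The keys `hoodie.clean.automatic` and `hoodie.clean.async`, and their defaults of true and false, are my understanding of the Hudi clean configs and should be treated as assumptions here.

```java
import java.util.List;
import java.util.Properties;

// Hypothetical helper; not part of HoodieCompactor or HoodieClusteringJob.
class CleanConfigSketch {

  /** Turns "key=value" overrides into Properties, rejecting malformed entries. */
  static Properties toProps(List<String> overrides) {
    Properties props = new Properties();
    for (String kv : overrides) {
      int eq = kv.indexOf('=');
      if (eq <= 0) {
        throw new IllegalArgumentException("Expected key=value but got: " + kv);
      }
      props.setProperty(kv.substring(0, eq).trim(), kv.substring(eq + 1).trim());
    }
    return props;
  }

  /** True when auto-clean is on and async clean is off (assumed keys and defaults). */
  static boolean shouldCleanSynchronously(Properties props) {
    boolean autoClean = Boolean.parseBoolean(props.getProperty("hoodie.clean.automatic", "true"));
    boolean asyncClean = Boolean.parseBoolean(props.getProperty("hoodie.clean.async", "false"));
    return autoClean && !asyncClean;
  }
}
```

With no overrides the sketch cleans synchronously; passing `hoodie.clean.automatic=false` turns cleaning off, which is the manual opt-out mentioned above.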





[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185665864


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
       LOG.info("The schedule instant time is " + instantTime.get());
       LOG.info("Step 2: Do cluster");
       Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+      cleanAfterCluster(client);
       return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
     }
   }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   We only need to add a sync cleaning if it is **enabled**, does that make sense to you?





[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1185883641


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -116,6 +116,28 @@ public static class Config implements Serializable {
         + "(using the CLI parameter \"--props\") can also be passed command line using this parameter. This can be repeated",
         splitter = IdentitySplitter.class)
     public List<String> configs = new ArrayList<>();
+
+    // disable async-service in offline job
+    public Boolean asyncSerivceEanble = false;

Review Comment:
   Rename asyncSerivceEanble -> asyncCleaningEnabled, with false as the default.





[GitHub] [hudi] hudi-bot commented on pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8505:
URL: https://github.com/apache/hudi/pull/8505#issuecomment-1522928401

   ## CI report:
   
   * f7c73e83812258b53b979afbd6d465e9066b801f UNKNOWN
   * 269fad02a5346121e823a15c9804e2e63eb16c30 UNKNOWN
   * 442430f680316bdfefc27c4aca9f7cd94e95373c UNKNOWN
   * 4dad96ba54827548c95059d12b7d5d5cdcc0c1a4 UNKNOWN
   *  Unknown: [CANCELED](TBD) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #8505: [HUDI-6106] Spark offline compaction/Clustering Job will do clean like Flink job

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1177650137


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieCompactor.java:
##########
@@ -269,6 +269,7 @@ private int doCompact(JavaSparkContext jsc) throws Exception {
         }
       }
       HoodieWriteMetadata<JavaRDD<WriteStatus>> compactionMetadata = client.compact(cfg.compactionInstantTime);
+      cleanAfterCompact(client);
       return UtilHelpers.handleErrors(compactionMetadata.getCommitMetadata().get(), cfg.compactionInstantTime);

Review Comment:
   Can we just execute a synchronous cleaning for the offline compaction and clustering? It does not make sense to make the cleaning async because the whole job is executed as a batch.
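
For illustration, the suggested shape could look roughly like the sketch below: the batch job compacts, then cleans on the same thread when auto-clean is enabled. `FakeClient` is a hypothetical stand-in that only records call order; it is not `SparkRDDWriteClient`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a batch job that cleans synchronously after compaction.
class OfflineJobCleanSketch {

  /** Stand-in for the write client; records the order of operations. */
  static class FakeClient {
    final boolean autoClean;
    final List<String> calls = new ArrayList<>();

    FakeClient(boolean autoClean) {
      this.autoClean = autoClean;
    }

    void compact() {
      calls.add("compact");
    }

    void clean() {
      calls.add("clean");
    }
  }

  /** Compacts, then cleans in the same thread if auto-clean is enabled. */
  static List<String> runOfflineCompaction(FakeClient client) {
    client.compact();
    if (client.autoClean) {
      client.clean(); // synchronous: returns before the job exits
    }
    return client.calls;
  }
}
```

Because the clean runs inline, there is no background service to wait for or interrupt when the job closes the client.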


