Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/21 10:24:49 UTC

[GitHub] [hudi] TengHuo opened a new pull request, #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

TengHuo opened a new pull request, #6733:
URL: https://github.com/apache/hudi/pull/6733

   ### Change Logs
   
   1. Remove the marker-deletion code in `CompactionPlanOperator`, which could cause a corrupted parquet file issue if compaction tasks were cancelled
   2. Fix HUDI-4108 in another way: ignore the marker file if it already exists when creating it (see the sketch below)
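
   A minimal sketch of the idea in item 2, assuming a plain filesystem-backed marker; the method name and `markerFile` parameter are illustrative, not Hudi's actual `WriteMarkers` API:

   ```java
   import java.io.IOException;
   import java.nio.file.FileAlreadyExistsException;
   import java.nio.file.Files;
   import java.nio.file.Path;

   final class IdempotentMarkerSketch {
     /** Creates the marker, treating "already exists" as success instead of failing the task. */
     static void createMarkerIfAbsent(Path markerFile) throws IOException {
       Files.createDirectories(markerFile.getParent()); // marker dir may be missing on first write
       try {
         Files.createFile(markerFile);
       } catch (FileAlreadyExistsException ignored) {
         // A previous attempt for the same instant already created it; the marker only
         // records that this data file was started, so re-creation can be skipped safely.
       }
     }
   }
   ```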
   
   ### Impact
   
   No API changes; this is a minor change that fixes a bug.
   
   **Risk level: none**
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1296695533

   ## CI report:
   
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11719) 
   * 7bc936a46859a0f9e68ce60b812d5f889867307b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12674) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1253561385

   ## CI report:
   
   * c7c9984860b14b40d3f716f1fc1f16dc70f548b4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1298413907

   ## CI report:
   
   * 2830be0f2dae96c4adcb3717695f4e0474fbb84e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12703) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1009158080


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   The `HoodieMergeHandle.init` method calls `createMarkerFile` to create a marker file for each new data file written during compaction, so every marker file represents a new base file.
   
   https://github.com/apache/hudi/blob/efe553b327bc025d242afa37221a740dca9b1ea6/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java#L201
   
   The `HoodieTable.reconcileAgainstMarkers` method deletes all data files that have marker files but are not in `List<HoodieWriteStat> stats`.
   
   https://github.com/apache/hudi/blob/4f6f15c3c761621eaaa1b3b52e0c2841626afe53/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java#L674
   
   In summary, `WriteMarkers` tracks all data files we created in the current batch (and in previous failed batches for the same instant), while `List<HoodieWriteStat> stats` contains all data files we committed in the current batch.
   
   So, `Set(invalidDataPaths)` = `Set(data file in WriteMarkers)` - `Set(data file in List<HoodieWriteStat> stats)`
   
   In Hudi Flink online compaction, if anything goes wrong during compaction, it retries automatically (it won't restart the whole pipeline, only the compaction). **So if we delete the marker directory here, it becomes impossible to delete the files left by a previous failed compaction for the same instant, because all their marker files are gone.** These uncommitted data files will cause a `corrupted data file exception` later.
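
   A minimal sketch of that set difference, using illustrative names rather than Hudi's actual `reconcileAgainstMarkers` implementation:

   ```java
   import java.util.HashSet;
   import java.util.List;
   import java.util.Set;

   final class MarkerReconciliationSketch {
     /** Files that were started (have a marker) but never committed (no write stat). */
     static Set<String> invalidDataPaths(Set<String> pathsFromMarkers, List<String> committedPaths) {
       Set<String> invalid = new HashSet<>(pathsFromMarkers); // everything we started writing
       committedPaths.forEach(invalid::remove);               // minus everything we committed
       return invalid;                                        // leftovers from failed attempts, to be deleted
     }
   }
   ```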





[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1299743844

   ## CI report:
   
   * 2830be0f2dae96c4adcb3717695f4e0474fbb84e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12703) 
   * 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1298262157

   ## CI report:
   
   * 7bc936a46859a0f9e68ce60b812d5f889867307b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12674) 
   * 2830be0f2dae96c4adcb3717695f4e0474fbb84e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1296844632

   ## CI report:
   
   * 7bc936a46859a0f9e68ce60b812d5f889867307b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12674) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1257386460

   The CI pipeline failed because of a `Connection refused` issue; let me re-run it.




[GitHub] [hudi] danny0405 commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011085806


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   > rollback only deletes data files, but not deleting marker files
   
   Rollback would delete the marker dir; check the logic in `BaseRollbackActionExecutor#runRollback`.





[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011337483


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Reverted. @danny0405 please help review. Thanks.
   
   I still think deleting the marker folder before compaction is not safe, because it is not clear in [HUDI-4108](https://issues.apache.org/jira/browse/HUDI-4108) how that error happened, so it might mask other errors (like HUDI-4880, which we encountered).
   
   I will remove this part of the code in our internal Hudi version. If anything goes wrong with markers in the future, I will sync it in this PR.





[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1298268963

   ## CI report:
   
   * 7bc936a46859a0f9e68ce60b812d5f889867307b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12674) 
   * 2830be0f2dae96c4adcb3717695f4e0474fbb84e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12703) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] xushiyan commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
xushiyan commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1296612988

   @TengHuo please rebase master; there were some flaky test fixes




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1300010469

   ## CI report:
   
   * 2830be0f2dae96c4adcb3717695f4e0474fbb84e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12703) 
   * 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
   * 3247889d628032154dd3631f61dae116501c4384 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12726) 
   * d3d5a30845177e6a0fe981e2fee5b6600556da76 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1300190292

   ## CI report:
   
   * 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
   * 3247889d628032154dd3631f61dae116501c4384 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12726) 
   * d3d5a30845177e6a0fe981e2fee5b6600556da76 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12729) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1059396635


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Correct. `NonThrownExecutor` in `CompactFunction` uses an `Executors.newSingleThreadExecutor()` internally, so it handles events one by one.
   
   Totally agree with you: `collector.collect(...)` is not thread-safe; it would be better if `CompactFunction` implemented the `asyncInvoke` method of the `AsyncFunction` interface.
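
   A hedged sketch of what that could look like, with `String` standing in for the real plan/commit event types and `runCompaction` as a placeholder for the actual compaction call; the point is that Flink's `AsyncFunction` hands result emission to the framework via `ResultFuture`, so no user-owned thread ever touches the collector:

   ```java
   import java.util.Collections;
   import java.util.concurrent.CompletableFuture;
   import org.apache.flink.streaming.api.functions.async.AsyncFunction;
   import org.apache.flink.streaming.api.functions.async.ResultFuture;

   public class AsyncCompactSketch implements AsyncFunction<String, String> {
     @Override
     public void asyncInvoke(String planEvent, ResultFuture<String> resultFuture) {
       CompletableFuture
           .supplyAsync(() -> runCompaction(planEvent))                    // heavy work off the task thread
           .whenComplete((commitEvent, error) -> {
             if (error != null) {
               resultFuture.completeExceptionally(error);                  // surfaced to the framework
             } else {
               resultFuture.complete(Collections.singleton(commitEvent));  // framework-managed, thread-safe emission
             }
           });
     }

     private String runCompaction(String planEvent) {
       return planEvent + "-commit"; // placeholder for compaction + commit-event construction
     }
   }
   ```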





[GitHub] [hudi] TengHuo commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1299768328

   Something went wrong in the Maven build; it is not related to this PR.
   
   ```log
   Error:  Failed to execute goal on project hudi-utilities_2.12: Could not resolve dependencies for project org.apache.hudi:hudi-utilities_2.12:jar:0.13.0-SNAPSHOT: Failed to collect dependencies at io.confluent:kafka-avro-serializer:jar:5.3.4: Failed to read artifact descriptor for io.confluent:kafka-avro-serializer:jar:5.3.4: Could not transfer artifact io.confluent:kafka-avro-serializer:pom:5.3.4 from/to confluent (https://packages.confluent.io/maven/): transfer failed for https://packages.confluent.io/maven/io/confluent/kafka-avro-serializer/5.3.4/kafka-avro-serializer-5.3.4.pom: Connection reset -> [Help 1]
   ```




[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011532297


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactFunction.java:
##########
@@ -133,4 +133,14 @@ private HoodieWriteConfig reloadWriteConfig() throws Exception {
   public void setExecutor(NonThrownExecutor executor) {
     this.executor = executor;
   }
+
+  @Override
+  public void close() throws Exception {
+    if (this.asyncCompaction) {
+      this.executor.close();

Review Comment:
   Added





[GitHub] [hudi] TengHuo commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1296632077

   > @TengHuo please rebase master; there were some flaky test fixes
   
   sure, np, just rebased it to the latest master




[GitHub] [hudi] danny0405 commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1009129181


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   What's the problem here if we clean the markers before scheduling the compaction plan?





[GitHub] [hudi] danny0405 merged pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
danny0405 merged PR #6733:
URL: https://github.com/apache/hudi/pull/6733




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1300017520

   ## CI report:
   
   * 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
   * 3247889d628032154dd3631f61dae116501c4384 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12726) 
   * d3d5a30845177e6a0fe981e2fee5b6600556da76 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1010254943


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   cool, thanks for reviewing
   
   I'm testing whether there will be a duplicate marker issue when a compaction fails and a rollback is performed.
   
   As I understand it, rollback only deletes data files, not marker files, so if a compaction fails and retries, it should throw the same error as [HUDI-4108](https://issues.apache.org/jira/browse/HUDI-4108) with my current code.





[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011497088


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactFunction.java:
##########
@@ -133,4 +133,14 @@ private HoodieWriteConfig reloadWriteConfig() throws Exception {
   public void setExecutor(NonThrownExecutor executor) {
     this.executor = executor;
   }
+
+  @Override
+  public void close() throws Exception {
+    if (this.asyncCompaction) {
+      this.executor.close();

Review Comment:
   Got it, good idea





[GitHub] [hudi] littleeleventhwolf commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
littleeleventhwolf commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1059378282


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   > The leaked thread will do compaction after a task was destroyed, failed when this executor trying to call `collector.collect(new CompactionCommitEvent(...))` in `CompactFunction#processElement`.
   
   @TengHuo Does this mean that, in the case of a thread leak, async compaction may have thread-safety problems?





[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1059389241


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Yeah, there could be a thread leak: async compaction uses a single-thread executor to do the compaction work, and that thread might not be closed properly. The async compaction thread is designed to avoid blocking checkpoints when compaction runs in streaming pipeline mode.
   May I ask what safety concern you are referring to?





[GitHub] [hudi] littleeleventhwolf commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
littleeleventhwolf commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1059393990


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   > May I ask what safety concern you are referring to?
   
   `collector.collect(...)` is not a thread-safe method, so if multiple threads call it at the same time, the number of events sent by `compact_task` will not equal the number of events received by `Sink:compact_commit`.
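
   A minimal illustration of why the single-thread executor matters (illustrative code, not Hudi's): all writes to a non-thread-safe sink are funneled through one `Executors.newSingleThreadExecutor()`, mirroring what `NonThrownExecutor` gives `CompactFunction`, and the executor is closed on shutdown, matching the fix in this PR:

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;

   final class SerializedEmitSketch {
     // Stand-in for the collector: an unsynchronized list, unsafe under concurrent add().
     private final List<String> collected = new ArrayList<>();
     // One thread owns every emission, so add() calls can never interleave.
     private final ExecutorService emitter = Executors.newSingleThreadExecutor();

     void emit(String commitEvent) {
       emitter.execute(() -> collected.add(commitEvent));
     }

     void close() {
       emitter.shutdown(); // without this, a leaked thread could keep emitting after the task is destroyed
     }
   }
   ```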





[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1253736000

   ## CI report:
   
   * c7c9984860b14b40d3f716f1fc1f16dc70f548b4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11551) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1257399144

   ## CI report:
   
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11719) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254523155

   ## CI report:
   
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011326595


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Tested and checked the source code. Indeed, rollback deletes the marker dir at the end of `runRollback`, so there should be no marker folder in `.temp` after a rollback.
   
   https://github.com/apache/hudi/blob/6be2057376fb10e79ccb690757cb172a2ad48889/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/BaseRollbackActionExecutor.java#L118
   
   I can't reproduce the same error as in [HUDI-4108](https://issues.apache.org/jira/browse/HUDI-4108), and I don't know how it happened, so I have reverted the change to `CompactionPlanOperator.java` in this PR.
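   
   For reference, a rough sketch of that ordering (simplified and hypothetical, not the actual `BaseRollbackActionExecutor` code; only the `WriteMarkersFactory` call mirrors the API quoted elsewhere in this PR):
   
   ```java
   import org.apache.hudi.common.table.timeline.HoodieInstant;
   import org.apache.hudi.table.HoodieTable;
   import org.apache.hudi.table.marker.WriteMarkersFactory;
   
   // Sketch of the rollback ordering: data files are rolled back first,
   // then the marker dir of the instant is deleted, leaving `.temp` clean.
   public class RollbackOrderingSketch {
     void runRollback(HoodieTable<?, ?, ?, ?> table, HoodieInstant instant) {
       deleteLeftoverDataFiles(instant); // hypothetical helper
       WriteMarkersFactory
           .get(table.getConfig().getMarkersType(), table, instant.getTimestamp())
           .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());
     }
   
     private void deleteLeftoverDataFiles(HoodieInstant instant) {
       // placeholder for the real per-file rollback logic
     }
   }
   ```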





[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1010026246


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Yeah, you are right. I agree that there shouldn't be a duplicate marker file in the first place.
   
   A duplicate marker file implies a duplicate data file. So to fix this duplicate marker file issue (https://issues.apache.org/jira/browse/HUDI-4108), we have to find out how the duplicate marker file was generated. Otherwise, we should delete the marker file and the data file together instead of deleting only the marker file directory.
   
   My PR can't fix the duplicate marker file issue, as I haven't encountered the same problem as [HUDI-4108](https://issues.apache.org/jira/browse/HUDI-4108).
   
   As for the rollback logic in `CompactionPlanOperator#open` and `CompactionCommitSink#commitIfNecessary`, it should delete the leftover data files if anything goes wrong during compaction, but judging from our log files, it didn't work properly.
   
   Let me check why rollback is not working properly in our pipeline. I will reply here later.
   
   My idea is that we shouldn't delete the marker file directory here; it should be deleted by the rollback logic when the data files are deleted.





[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254613059

   ## CI report:
   
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1253567596

   ## CI report:
   
   * c7c9984860b14b40d3f716f1fc1f16dc70f548b4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11551) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254573395

   @hudi-bot run azure




[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1010027681


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Additionally, there was an `INFLIGHT` instant at the time the compaction failed, but rollback didn't delete the unfinished data files.
   
   I'm investigating this problem from the logs.





[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1300705250

   ## CI report:
   
   * 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
   * d3d5a30845177e6a0fe981e2fee5b6600556da76 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12729) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1299757977

   ## CI report:
   
   * 2830be0f2dae96c4adcb3717695f4e0474fbb84e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12703) 
   * 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
   * 3247889d628032154dd3631f61dae116501c4384 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1299898378

   ## CI report:
   
   * 2830be0f2dae96c4adcb3717695f4e0474fbb84e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12703) 
   * 861db5109feea40129392a38d17c10f84397d258 UNKNOWN
   * 3247889d628032154dd3631f61dae116501c4384 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12726) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1298233987

   Just reverted the code that ignored the duplicate marker error. The code will now throw an error if a duplicate marker file already exists.




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1296688526

   ## CI report:
   
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11719) 
   * 7bc936a46859a0f9e68ce60b812d5f889867307b UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1010230178


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   > To fix this issue, I added a `CompactFunction#close()` method that closes the executor properly and closes the write client if needed.
   
   I agree this is a valid fix.





[GitHub] [hudi] danny0405 commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011416142


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactFunction.java:
##########
@@ -133,4 +133,14 @@ private HoodieWriteConfig reloadWriteConfig() throws Exception {
   public void setExecutor(NonThrownExecutor executor) {
     this.executor = executor;
   }
+
+  @Override
+  public void close() throws Exception {
+    if (this.asyncCompaction) {
+      this.executor.close();

Review Comment:
   And we can fix the `ClusteringOperator#close` as well.





[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011337483


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Reverted. @danny0405 please help to review. Thanks.
   
   I still think deleting the marker folder before compaction is not safe, because it is not clear in [HUDI-4108](https://issues.apache.org/jira/browse/HUDI-4108) how the error happened, so doing it might mask some other errors.
   
   I will remove this part of the code in our internal Hudi version. If anything goes wrong with markers in the future, I will sync it in this PR.





[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1257486358

   ## CI report:
   
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11719) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1257386587

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254473535

   ## CI report:
   
   * c7c9984860b14b40d3f716f1fc1f16dc70f548b4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11551) 
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254476263

   ## CI report:
   
   * c7c9984860b14b40d3f716f1fc1f16dc70f548b4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11551) 
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6733:
URL: https://github.com/apache/hudi/pull/6733#issuecomment-1254618202

   ## CI report:
   
   * fa31786d3256e2d0a40ae3c1f874d8f32a45ce82 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11566) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1059397309


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   In some cases we encountered recently, events could be lost, e.g. https://github.com/apache/hudi/pull/7408





[GitHub] [hudi] danny0405 commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1009173949


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Did you notice that only `REQUESTED` instants are scheduled here? That means it is not a failed compaction (otherwise the state would be `INFLIGHT`). And we do have some rollback logic in `CompactionPlanOperator#open` and `CompactionCommitSink#commitIfNecessary`.
   
   We need to figure out how a marker dir can exist with a `REQUESTED` instant on the timeline first :)
   
   There is a question that needs to be answered here: a duplicated marker file also implies a duplicated data file (either complete or corrupted), and that duplicate data file would make the creation of the new data file fail with a file-exists exception. How could we resolve that?





[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1010190971


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Hi @danny0405 
   
   I just reviewed the log files from our pipeline.
   
   There is another issue I forgot to mention: the executor in `CompactFunction` wasn't closed properly, which causes a thread leak. The leaked thread keeps running compaction after the task has been destroyed, and then fails when the executor tries to call `collector.collect(new CompactionCommitEvent(...))` in `CompactFunction#processElement`.
   
   In a very rare case, it can leave an unfinished data file in HDFS that can't be deleted by a rollback, e.g. when the rollback is performed before the data file is created.
   
   To fix this issue, I added a `CompactFunction#close()` method that closes the executor properly and closes the write client if needed.
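   
   As a JDK-only illustration of the leak pattern (a minimal sketch; the real code uses Hudi's `NonThrownExecutor`, and `LeakyFunction` is a hypothetical name):
   
   ```java
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.concurrent.TimeUnit;
   
   // Sketch: if close() never shuts the pool down, the worker thread
   // outlives the Flink task and keeps writing after the task is destroyed.
   public class LeakyFunction implements AutoCloseable {
     private final ExecutorService executor = Executors.newSingleThreadExecutor();
   
     public void processElement(Runnable compaction) {
       executor.execute(compaction); // async compaction on a background thread
     }
   
     @Override
     public void close() throws Exception {
       // The fix: shut the executor down when the task is disposed so that
       // no background thread keeps running after teardown.
       executor.shutdown();
       if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
         executor.shutdownNow();
       }
     }
   }
   ```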
   
   This is the task manager log from our MOR pipeline
   
   ```log
   2022-09-19 07:08:23,132 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Received task compact_plan_generate (1/1)#32 (0ac61651b7df52113e9e8d457b4611a7), deploy into slot with allocation id 0eb063453441ba3ed8b582ba71268f01.
   2022-09-19 07:08:23,133 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_plan_generate (1/1)#32 (0ac61651b7df52113e9e8d457b4611a7) switched from CREATED to DEPLOYING.
   2022-09-19 07:08:23,133 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Loading JAR files for task compact_plan_generate (1/1)#32 (0ac61651b7df52113e9e8d457b4611a7) [DEPLOYING].
   2022-09-19 07:08:23,134 INFO  org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot 0eb063453441ba3ed8b582ba71268f01.
   2022-09-19 07:08:23,135 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_plan_generate (1/1)#32 (0ac61651b7df52113e9e8d457b4611a7) switched from DEPLOYING to INITIALIZING.
   2022-09-19 07:08:23,136 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Received task compact_task (1/32)#32 (d5c0456dc51cb2d8b93d723c40b95dfb), deploy into slot with allocation id 0eb063453441ba3ed8b582ba71268f01.
   2022-09-19 07:08:23,137 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_task (1/32)#32 (d5c0456dc51cb2d8b93d723c40b95dfb) switched from CREATED to DEPLOYING.
   2022-09-19 07:08:23,137 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Loading JAR files for task compact_task (1/32)#32 (d5c0456dc51cb2d8b93d723c40b95dfb) [DEPLOYING].
   2022-09-19 07:08:23,137 INFO  org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder [] - Operator state restore duration : 0 ms.
   2022-09-19 07:08:23,138 INFO  org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot c8091bbeb6786d2ef42e964d08eac2da.
   2022-09-19 07:08:23,139 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_task (1/32)#32 (d5c0456dc51cb2d8b93d723c40b95dfb) switched from DEPLOYING to INITIALIZING.
   2022-09-19 07:08:23,140 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Received task compact_task (16/32)#32 (758c1f13d16eee52a1088f04a092176d), deploy into slot with allocation id c8091bbeb6786d2ef42e964d08eac2da.
   2022-09-19 07:08:23,141 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_task (16/32)#32 (758c1f13d16eee52a1088f04a092176d) switched from CREATED to DEPLOYING.
   2022-09-19 07:08:23,141 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Loading JAR files for task compact_task (16/32)#32 (758c1f13d16eee52a1088f04a092176d) [DEPLOYING].
   2022-09-19 07:08:23,142 INFO  org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot 0eb063453441ba3ed8b582ba71268f01.
   2022-09-19 07:08:23,142 INFO  org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder [] - Operator state restore duration : 0 ms.
   2022-09-19 07:08:23,142 INFO  org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder [] - Operator state restore duration : 10 ms.
   2022-09-19 07:08:23,142 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_task (16/32)#32 (758c1f13d16eee52a1088f04a092176d) switched from DEPLOYING to INITIALIZING.
   2022-09-19 07:08:23,143 INFO  org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder [] - Operator state restore duration : 11 ms.
   2022-09-19 07:08:23,143 INFO  org.apache.hudi.util.ViewStorageProperties                   [] - Loading filesystem view storage properties from hdfs://.../.hoodie/.aux/view_storage_conf.properties
   2022-09-19 07:08:23,144 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Received task Sink: compact_commit (1/1)#32 (7157a69aa5013201f736bf71d88babed), deploy into slot with allocation id 0eb063453441ba3ed8b582ba71268f01.
   2022-09-19 07:08:23,144 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Sink: compact_commit (1/1)#32 (7157a69aa5013201f736bf71d88babed) switched from CREATED to DEPLOYING.
   2022-09-19 07:08:23,145 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Loading JAR files for task Sink: compact_commit (1/1)#32 (7157a69aa5013201f736bf71d88babed) [DEPLOYING].
   2022-09-19 07:08:23,146 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Sink: compact_commit (1/1)#32 (7157a69aa5013201f736bf71d88babed) switched from DEPLOYING to INITIALIZING.
   2022-09-19 07:08:23,146 INFO  org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder [] - Operator state restore duration : 0 ms.
   2022-09-19 07:08:23,146 INFO  org.apache.hudi.util.ViewStorageProperties                   [] - Loading filesystem view storage properties from hdfs://.../.hoodie/.aux/view_storage_conf.properties
   2022-09-19 07:08:23,147 INFO  org.apache.flink.runtime.state.DefaultOperatorStateBackendBuilder [] - Operator state restore duration : 0 ms.
   2022-09-19 07:08:23,149 INFO  org.apache.hudi.util.ViewStorageProperties                   [] - Loading filesystem view storage properties from hdfs://.../.hoodie/.aux/view_storage_conf.properties
   2022-09-19 07:08:23,150 INFO  org.apache.hudi.util.ViewStorageProperties                   [] - Loading filesystem view storage properties from hdfs://.../.hoodie/.aux/view_storage_conf.properties
   2022-09-19 07:08:23,196 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_task (1/32)#32 (d5c0456dc51cb2d8b93d723c40b95dfb) switched from INITIALIZING to RUNNING.
   2022-09-19 07:08:23,199 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_task (16/32)#32 (758c1f13d16eee52a1088f04a092176d) switched from INITIALIZING to RUNNING.
   2022-09-19 07:08:23,200 INFO  org.apache.hudi.sink.CleanFunction                           [] - exec sync clean with instant time 20220919070823199...
   2022-09-19 07:08:23,206 INFO  org.apache.hudi.util.CompactionUtil                          [] - Rollback the inflight compaction instant: [==>20220919020324533__compaction__INFLIGHT] for failover
   2022-09-19 07:08:23,217 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Sink: compact_commit (1/1)#32 (7157a69aa5013201f736bf71d88babed) switched from INITIALIZING to RUNNING.
   2022-09-19 07:08:23,308 INFO  org.apache.hudi.sink.common.AbstractStreamWriteFunction      [] - Send bootstrap write metadata event to coordinator, task[15].
   2022-09-19 07:08:23,309 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - bucket_write: hudi_mor (16/32)#32 (dc15bf98eb8675018312ca317099a005) switched from INITIALIZING to RUNNING.
   2022-09-19 07:08:23,318 INFO  org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase [] - Consumer subtask 15 will start reading 1 partitions with offsets in restored state: {KafkaTopicPartition{topic='...', partition=1}=6787076}
   2022-09-19 07:08:23,318 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - Source: TableSourceScan(...)#32 (389bb286011a261bfa9e111568f3fc49) switched from INITIALIZING to RUNNING.
   2022-09-19 07:08:24,030 INFO  org.apache.hudi.util.CompactionUtil                          [] - Rollback the inflight compaction instant: [==>20220919054842744__compaction__INFLIGHT] for failover
   2022-09-19 07:08:24,889 INFO  org.apache.flink.runtime.taskmanager.Task                    [] - compact_plan_generate (1/1)#32 (0ac61651b7df52113e9e8d457b4611a7) switched from INITIALIZING to RUNNING.
   ...deleted unrelated logs
   2022-09-19 07:08:41,700 WARN  org.apache.hadoop.hdfs.DataStreamer                          [] - DataStreamer Exception
   java.io.FileNotFoundException: File does not exist: /.../00000027-451d-4b7e-a213-eb71e5aa0979_15-32-31_20220919020324533.parquet (inode 8691986250) Holder DFSClient_NONMAPREDUCE does not have any open files.
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2867)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:521)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:161)
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2747)
   	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:918)
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:532)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
   	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1087)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1126)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1042)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2052)
   	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3059)
   
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_252]
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_252]
   	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_252]
   	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_252]
   	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
   	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
   	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1122)
   	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1944)
   	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1750)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:751)
   Caused by: org.apache.hadoop.ipc.RemoteException: File does not exist: /.../00000027-451d-4b7e-a213-eb71e5aa0979_15-32-31_20220919020324533.parquet (inode 8691986250) Holder DFSClient_NONMAPREDUCE does not have any open files.
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2867)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:521)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:161)
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2747)
   	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:918)
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:532)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
   	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1087)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1126)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1042)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2052)
   	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3059)
   
   	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1590)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1521)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1418)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:251)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:130)
   	at com.sun.proxy.$Proxy27.addBlock(Unknown Source) ~[?:?]
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:472)
   	at sun.reflect.GeneratedMethodAccessor77.invoke(Unknown Source) ~[?:?]
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_252]
   	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_252]
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:449)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:175)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:167)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:105)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:375)
   	at com.sun.proxy.$Proxy28.addBlock(Unknown Source) ~[?:?]
   	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1119)
   	... 3 more
   2022-09-19 07:08:41,707 ERROR org.apache.hudi.sink.compact.CompactFunction                 [] - Executor executes action [Execute compaction for instant 20220919020324533 from task 15] error
   org.apache.hudi.exception.HoodieUpsertException: Failed to close UpdateHandle
   	at org.apache.hudi.io.HoodieMergeHandle.close(HoodieMergeHandle.java:431)
   	at org.apache.hudi.table.action.commit.FlinkMergeHelper.runMerge(FlinkMergeHelper.java:114)
   	at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.handleUpdateInternal(HoodieFlinkCopyOnWriteTable.java:379)
   	at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.handleUpdate(HoodieFlinkCopyOnWriteTable.java:370)
   	at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:227)
   	at org.apache.hudi.sink.compact.CompactFunction.doCompaction(CompactFunction.java:109)
   	at org.apache.hudi.sink.compact.CompactFunction.lambda$processElement$0(CompactFunction.java:94)
   	at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$execute$0(NonThrownExecutor.java:93)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   Caused by: java.io.FileNotFoundException: File does not exist: /.../00000027-451d-4b7e-a213-eb71e5aa0979_15-32-31_20220919020324533.parquet (inode 8691986250) Holder DFSClient_NONMAPREDUCE does not have any open files.
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2867)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:521)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:161)
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2747)
   	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:918)
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:532)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
   	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1087)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1126)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1042)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2052)
   	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3059)
   
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_252]
   	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_252]
   	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_252]
   	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_252]
   	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
   	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
   	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1122)
   	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1944)
   	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1750)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:751)
   Caused by: org.apache.hadoop.ipc.RemoteException: File does not exist: /.../00000027-451d-4b7e-a213-eb71e5aa0979_15-32-31_20220919020324533.parquet (inode 8691986250) Holder DFSClient_NONMAPREDUCE does not have any open files.
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2867)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:521)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:161)
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2747)
   	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:918)
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:532)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
   	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1087)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1126)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1042)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2052)
   	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3059)
   
   	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1590)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1521)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1418)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:251)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:130)
   	at com.sun.proxy.$Proxy27.addBlock(Unknown Source) ~[?:?]
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:472)
   	at sun.reflect.GeneratedMethodAccessor77.invoke(Unknown Source) ~[?:?]
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_252]
   	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_252]
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:449)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:175)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:167)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:105)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:375)
   	at com.sun.proxy.$Proxy28.addBlock(Unknown Source) ~[?:?]
   	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1119)
   	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1944)
   	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1750)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:751)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - java.lang.RuntimeException: Buffer pool is destroyed.
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:109)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:93)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:44)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:50)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.hudi.sink.compact.CompactFunction.lambda$processElement$1(CompactFunction.java:95)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$execute$0(NonThrownExecutor.java:103)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at java.lang.Thread.run(Thread.java:748)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegment(LocalBufferPool.java:337)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBuilder(LocalBufferPool.java:279)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.partition.BufferWritingResultPartition.requestNewBufferBuilderFromPool(BufferWritingResultPartition.java:348)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.partition.BufferWritingResultPartition.requestNewUnicastBufferBuilder(BufferWritingResultPartition.java:331)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.partition.BufferWritingResultPartition.appendUnicastDataForNewRecord(BufferWritingResultPartition.java:263)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.partition.BufferWritingResultPartition.emitRecord(BufferWritingResultPartition.java:142)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:106)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.runtime.io.network.api.writer.ChannelSelectorRecordWriter.emit(ChannelSelectorRecordWriter.java:54)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107)
   2022-09-19 07:08:41,710 ERROR org.apache.flink.core.io.SysoutLogging                       [] - 	... 10 more
   ```
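
   Reading the trace as a whole: the compaction runs on a background `NonThrownExecutor` thread inside `CompactFunction` (see the `NonThrownExecutor.lambda$execute$0` frames), the `FileNotFoundException` shows that thread still writing a parquet file HDFS no longer has open, and the trailing `Buffer pool is destroyed` errors show it still trying to emit after the Flink task was torn down. Below is a minimal, self-contained sketch of that failure mode and its fix; the names are hypothetical and the local filesystem stands in for HDFS, so this is an illustration, not the Hudi code.
   
   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.concurrent.TimeUnit;
   
   // Hypothetical sketch of the leaked-thread failure mode (local filesystem
   // standing in for HDFS; names are illustrative, this is not the Hudi code).
   public class LeakedCompactionSketch {
     // Flip to false to reproduce the leak: the background task outlives the
     // "cancelled" owner and re-creates the file the rollback just deleted.
     static final boolean SHUTDOWN_EXECUTOR_ON_CLOSE = true;
   
     public static void main(String[] args) throws Exception {
       Path dataFile = Files.createTempFile("compaction-", ".parquet");
       ExecutorService executor = Executors.newSingleThreadExecutor();
   
       executor.submit(() -> {
         try {
           Thread.sleep(200);                      // simulate a long-running merge
           Files.write(dataFile, new byte[] {42}); // the leaked write
         } catch (InterruptedException e) {
           Thread.currentThread().interrupt();     // cancelled cleanly instead
         } catch (IOException e) {
           // HDFS analogue: FileNotFoundException on addBlock, as in the trace
         }
       });
   
       Files.deleteIfExists(dataFile); // "rollback" removes the half-written file
   
       if (SHUTDOWN_EXECUTOR_ON_CLOSE) {
         executor.shutdownNow();       // what a close() hook should do
       }
       executor.awaitTermination(1, TimeUnit.SECONDS);
       System.out.println("file left behind? " + Files.exists(dataFile));
       executor.shutdown();            // let the JVM exit either way
     }
   }
   ```
   
   With the flag left `true` it prints `file left behind? false`; flipped to `false`, the leaked task wins the race and the file reappears after the "rollback" deleted it, which matches the kind of leftover file described in the PR title.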


[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1010027681


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   ~~Additionally, there was an `INFLIGHT` instant at the time the compaction failed, but the rollback didn't delete the unfinished data files.~~
   
   I'm investigating this problem from the logs.


[GitHub] [hudi] TengHuo commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
TengHuo commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1010254943


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java:
##########
@@ -129,9 +128,6 @@ private void scheduleCompaction(HoodieFlinkTable<?> table, long checkpointId) th
       List<CompactionOperation> operations = compactionPlan.getOperations().stream()
           .map(CompactionOperation::convertFromAvroRecordInstance).collect(toList());
       LOG.info("Execute compaction plan for instant {} as {} file groups", compactionInstantTime, operations.size());
-      WriteMarkersFactory
-          .get(table.getConfig().getMarkersType(), table, compactionInstantTime)
-          .deleteMarkerDir(table.getContext(), table.getConfig().getMarkersDeleteParallelism());

Review Comment:
   Cool, thanks for reviewing.
   
   I'm testing whether there is a duplicate marker issue when a compaction fails and a rollback is performed.
   
   ~~As I understand it, rollback only deletes data files, not marker files; if a compaction fails and is retried, it should throw the same error as [HUDI-4108](https://issues.apache.org/jira/browse/HUDI-4108) with my current code.~~
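
   On the retry case: one way to let a new attempt tolerate a marker left over from a previous attempt is to treat "marker already exists" as success. A rough sketch using the Hadoop `FileSystem` API; the helper name is hypothetical and this is not the actual Hudi `WriteMarkers` code:
   
   ```java
   import java.io.IOException;
   
   import org.apache.hadoop.fs.FileAlreadyExistsException;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   
   // Hypothetical helper, not the actual Hudi WriteMarkers code: create a
   // marker file, treating an already-existing marker (e.g. left over from a
   // failed attempt that was rolled back) as success instead of an error.
   public final class IdempotentMarker {
     private IdempotentMarker() {
     }
   
     public static void createIfAbsent(FileSystem fs, Path markerPath) throws IOException {
       if (fs.exists(markerPath)) {
         return; // leftover marker from a previous attempt; nothing to do
       }
       try {
         // Zero-byte marker; overwrite=false so concurrent creators fail fast
         // into the catch below instead of clobbering each other.
         fs.create(markerPath, false).close();
       } catch (FileAlreadyExistsException e) {
         // Raced with another attempt creating the same marker; also fine.
       }
     }
   }
   ```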


[GitHub] [hudi] danny0405 commented on a diff in pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left by a leaked thread in CompactFunction

Posted by GitBox <gi...@apache.org>.
danny0405 commented on code in PR #6733:
URL: https://github.com/apache/hudi/pull/6733#discussion_r1011369210


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactFunction.java:
##########
@@ -133,4 +133,14 @@ private HoodieWriteConfig reloadWriteConfig() throws Exception {
   public void setExecutor(NonThrownExecutor executor) {
     this.executor = executor;
   }
+
+  @Override
+  public void close() throws Exception {
+    if (this.asyncCompaction) {
+      this.executor.close();

Review Comment:
   `this.asyncCompaction` -> `this.executor != null`
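
   Applied to the diff above, the override would look roughly like this (a sketch of just the revised method, not a complete class):
   
   ```java
   @Override
   public void close() throws Exception {
     // Guard on the executor itself rather than the asyncCompaction flag,
     // so close() stays safe even if the executor was never initialized.
     if (this.executor != null) {
       this.executor.close();
     }
   }
   ```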