Posted to commits@hudi.apache.org by "yihua (via GitHub)" <gi...@apache.org> on 2023/02/21 04:11:54 UTC
[GitHub] [hudi] yihua opened a new pull request, #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
yihua opened a new pull request, #8001:
URL: https://github.com/apache/hudi/pull/8001
### Change Logs
Even though the metadata table writer used by the async indexer is configured with the `LAZY` failed-writes cleaning policy, `SparkHoodieBackedTableMetadataWriter` is hard-coded to roll back failed writes regardless of the configuration. This rollback should not be triggered for the async indexer. With the current logic, the async indexer can roll back an inflight delta commit that belongs to another regular writer in the metadata table, causing issues. This also makes the following test flaky.
This PR fixes `SparkHoodieBackedTableMetadataWriter` so that the rollback of failed writes is not triggered by the async indexer.
```
2023-02-16T13:46:06.1573775Z [ERROR] Tests run: 113, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 3,518.191 s <<< FAILURE! - in org.apache.hudi.utilities.deltastreamer.TestHoodieDeltaStreamer
2023-02-16T13:46:06.1576031Z [ERROR] testHoodieIndexer{HoodieRecordType}[2] Time elapsed: 79.838 s <<< ERROR!
...
2023-02-16T13:46:06.1705711Z Caused by: java.lang.IllegalArgumentException
2023-02-16T13:46:06.1706251Z at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
2023-02-16T13:46:06.1706995Z at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:633)
2023-02-16T13:46:06.1707847Z at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionRequestedToInflight(HoodieActiveTimeline.java:698)
2023-02-16T13:46:06.1708751Z at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.saveWorkloadProfileMetadataToInflight(BaseCommitActionExecutor.java:147)
2023-02-16T13:46:06.1709792Z at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:172)
2023-02-16T13:46:06.1710733Z at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44)
2023-02-16T13:46:06.1712815Z at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:111)
2023-02-16T13:46:06.1713593Z at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:80)
2023-02-16T13:46:06.1714353Z at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:154)
2023-02-16T13:46:06.1715155Z at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:186)
...
```
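The behavior change described in the change log can be illustrated with a minimal, self-contained sketch. It is not the actual patch: `RollbackPolicySketch`, `CleaningPolicy`, and `instantsToRollBack` are hypothetical names invented for illustration, and the real `SparkHoodieBackedTableMetadataWriter` logic is more involved. The sketch only shows the idea that a writer should consult its configured policy before rolling back inflight instants, so that a writer configured as `LAZY` (such as the async indexer's) leaves another writer's inflight delta commit alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RollbackPolicySketch {

    // Hypothetical stand-in for the failed-writes cleaning policy; only the
    // two values relevant to this discussion are modeled.
    enum CleaningPolicy { EAGER, LAZY }

    /**
     * Returns the inflight instants this writer would roll back before it
     * commits. Only an EAGER writer rolls back; a LAZY writer returns an
     * empty list, so inflight commits owned by other writers are untouched.
     */
    static List<String> instantsToRollBack(CleaningPolicy policy, List<String> inflightInstants) {
        if (policy == CleaningPolicy.EAGER) {
            return new ArrayList<>(inflightInstants);
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        // Hypothetical inflight delta commit written by another regular writer.
        List<String> inflight = Arrays.asList("inflight-deltacommit-from-regular-writer");

        System.out.println("EAGER rolls back: " + instantsToRollBack(CleaningPolicy.EAGER, inflight));
        System.out.println("LAZY rolls back:  " + instantsToRollBack(CleaningPolicy.LAZY, inflight));
    }
}
```

In this simplified model, the bug corresponds to the writer behaving as if the policy were always `EAGER`; the fix corresponds to honoring the configured policy.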
### Impact
Fixes the rollback behavior of the async indexer, which also fixes the flaky test. Adds a new test to guard the behavior (the test fails before this PR).
### Risk level
low
### Documentation Update
N/A
### Contributor's checklist
- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] nsivabalan merged pull request #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan merged PR #8001:
URL: https://github.com/apache/hudi/pull/8001
[GitHub] [hudi] hudi-bot commented on pull request #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8001:
URL: https://github.com/apache/hudi/pull/8001#issuecomment-1438907160
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"status" : "SUCCESS",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305",
"triggerID" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] hudi-bot commented on pull request #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8001:
URL: https://github.com/apache/hudi/pull/8001#issuecomment-1437856388
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8001:
URL: https://github.com/apache/hudi/pull/8001#issuecomment-1438813673
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305",
"triggerID" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305)
[GitHub] [hudi] hudi-bot commented on pull request #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8001:
URL: https://github.com/apache/hudi/pull/8001#issuecomment-1438760993
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8001:
URL: https://github.com/apache/hudi/pull/8001#issuecomment-1437958889
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305",
"triggerID" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305)
[GitHub] [hudi] hudi-bot commented on pull request #8001: [HUDI-5817] Fix async indexer metadata writer to avoid eager rollback and failed write cleaning
Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8001:
URL: https://github.com/apache/hudi/pull/8001#issuecomment-1437861949
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305",
"triggerID" : "3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 3b479ce83b6f3a7d5ca1654d26ab58d3e36b8ec5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15305)