Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2023/02/10 17:25:00 UTC

[jira] [Comment Edited] (HUDI-5733) TestHoodieDeltaStreamer.testHoodieIndexer failure

    [ https://issues.apache.org/jira/browse/HUDI-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687206#comment-17687206 ] 

sivabalan narayanan edited comment on HUDI-5733 at 2/10/23 5:24 PM:
--------------------------------------------------------------------

Two viable options I see:
 # Make LogRecordReader work with multi-writer w.r.t. rollback blocks. Then we can switch the MDT to the lazy clean policy for rollbacks. Neat approach, but it needs good testing since it touches regular log block reads (for any MOR table).
 # A little more involved and not so clean a fix. Apply eager rollbacks only for regular delta commits. Deduce delta commits from HoodieIndexer and employ the lazy clean policy (based on heartbeat) for those.
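
To make option 2 concrete, here is a rough, self-contained sketch of the decision it implies. This is not Hudi code: `WriterKind`, `PendingInstant`, `Action`, and `decide` are hypothetical names made up for illustration, and how a delta commit would actually be attributed to HoodieIndexer is left open.

```java
import java.time.Duration;
import java.time.Instant;

public class RollbackPolicySketch {
    // Hypothetical: which writer produced a pending delta commit on the MDT.
    enum WriterKind { REGULAR, INDEXER }

    // Hypothetical outcome of the rollback decision.
    enum Action { ROLLBACK_NOW, KEEP }

    // Hypothetical view of a pending instant plus the last heartbeat seen for it.
    static final class PendingInstant {
        final String instantTime;
        final WriterKind writer;
        final Instant lastHeartbeat;

        PendingInstant(String instantTime, WriterKind writer, Instant lastHeartbeat) {
            this.instantTime = instantTime;
            this.writer = writer;
            this.lastHeartbeat = lastHeartbeat;
        }
    }

    // Eager for regular delta commits; lazy (heartbeat-based) for indexer delta commits.
    static Action decide(PendingInstant pending, Instant now, Duration heartbeatTimeout) {
        if (pending.writer == WriterKind.REGULAR) {
            return Action.ROLLBACK_NOW;
        }
        Duration sinceHeartbeat = Duration.between(pending.lastHeartbeat, now);
        // Only roll back an indexer commit once its heartbeat has expired.
        return sinceHeartbeat.compareTo(heartbeatTimeout) > 0 ? Action.ROLLBACK_NOW : Action.KEEP;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2023-02-10T17:25:00Z");
        Duration timeout = Duration.ofMinutes(2);
        PendingInstant regular = new PendingInstant("001", WriterKind.REGULAR, now);
        PendingInstant liveIndexer = new PendingInstant("002", WriterKind.INDEXER, now.minusSeconds(30));
        PendingInstant deadIndexer = new PendingInstant("003", WriterKind.INDEXER, now.minusSeconds(600));
        System.out.println("regular: " + decide(regular, now, timeout));
        System.out.println("live indexer: " + decide(liveIndexer, now, timeout));
        System.out.println("dead indexer: " + decide(deadIndexer, now, timeout));
    }
}
```

The point of the split is that an in-progress indexer commit with a live heartbeat is never rolled back by a regular writer, while a failed indexer commit still gets cleaned up once its heartbeat expires.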


was (Author: shivnarayan):
Two viable options I see:
 # Make LogRecordReader work with multi-writer w.r.t. rollback blocks. Then we can switch the MDT to the lazy clean policy for rollbacks.
 # A little more involved and not so clean a fix. Apply eager rollbacks only for regular delta commits. Deduce delta commits from HoodieIndexer and employ the lazy clean policy (based on heartbeat) for those.

> TestHoodieDeltaStreamer.testHoodieIndexer failure
> -------------------------------------------------
>
>                 Key: HUDI-5733
>                 URL: https://issues.apache.org/jira/browse/HUDI-5733
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: deltastreamer, index, tests-ci
>            Reporter: Jonathan Vexler
>            Assignee: Sagar Sumit
>            Priority: Critical
>             Fix For: 0.13.1
>
>
> Sometimes the test fails because a rollback in the metadata table rolls back a commit while the deltastreamer is trying to transition that instant from requested to inflight. The transition fails because the requested file has already been removed from the timeline.
>  
> Here is an example of a failing [test stack trace|https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=15021&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=746585d8-b50a-55c3-26c5-517d93af9934&l=30526]
> {code:java}
> Caused by: java.lang.IllegalArgumentException
> 	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
> 	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:633)
> 	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionRequestedToInflight(HoodieActiveTimeline.java:698)
> 	at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.saveWorkloadProfileMetadataToInflight(BaseCommitActionExecutor.java:147)
> 	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:172)
> 	at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44)
> 	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:111)
> 	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:80)
> 	at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:154)
> 	at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:172)
> 	at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.processAndCommit(HoodieBackedTableMetadataWriter.java:823)
> 	at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.update(HoodieBackedTableMetadataWriter.java:890)
> 	at org.apache.hudi.client.BaseHoodieWriteClient.lambda$writeTableMetadata$1(BaseHoodieWriteClient.java:355)
> 	at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
> 	at org.apache.hudi.client.BaseHoodieWriteClient.writeTableMetadata(BaseHoodieWriteClient.java:355)
> 	at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:282)
> 	at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:233)
> 	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:102)
> 	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:61)
> 	at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:199)
> 	at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:713)
> 	at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:395)
> 	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$1(HoodieDeltaStreamer.java:716)
>  {code}
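
The failure mode in the description can be illustrated with a toy model. This is only a sketch, not Hudi's timeline implementation: `ToyTimeline` and its methods are hypothetical stand-ins, and the exception message is invented (the real trace shows an `IllegalArgumentException` raised from `ValidationUtils.checkArgument` with no message).

```java
import java.util.HashSet;
import java.util.Set;

public class TimelineRaceSketch {
    // Hypothetical in-memory stand-in for the requested/inflight files on a timeline.
    static final class ToyTimeline {
        private final Set<String> requested = new HashSet<>();
        private final Set<String> inflight = new HashSet<>();

        void createRequested(String instantTime) {
            requested.add(instantTime);
        }

        // A rollback removes every trace of the instant, including the requested file.
        void rollback(String instantTime) {
            requested.remove(instantTime);
            inflight.remove(instantTime);
        }

        // The transition is only valid if the requested file still exists.
        void transitionRequestedToInflight(String instantTime) {
            if (!requested.remove(instantTime)) {
                throw new IllegalArgumentException("requested instant missing: " + instantTime);
            }
            inflight.add(instantTime);
        }
    }

    // Replays the interleaving: writer A requests, writer B rolls back, writer A transitions.
    static boolean transitionFailsAfterRollback() {
        ToyTimeline timeline = new ToyTimeline();
        timeline.createRequested("001");   // deltastreamer's metadata table commit
        timeline.rollback("001");          // other writer's eager rollback
        try {
            timeline.transitionRequestedToInflight("001");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("transition failed after rollback: " + transitionFailsAfterRollback());
    }
}
```

The sketch runs the two writers sequentially to make the interleaving deterministic; in the actual test the ordering depends on timing, which would explain why the failure is intermittent.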



--
This message was sent by Atlassian Jira
(v8.20.10#820010)