You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2021/01/05 17:23:00 UTC

[jira] [Commented] (HUDI-1459) Support for handling of REPLACE instants

    [ https://issues.apache.org/jira/browse/HUDI-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259074#comment-17259074 ] 

Vinoth Chandar commented on HUDI-1459:
--------------------------------------

[~pwason] [~satishkotha] 

several users reporting this when trying use SaveMode.Overwrite on spark data source and turning on metadata table

{code}
org.apache.hudi.exception.HoodieException: Unknown type of action replacecommit
  at org.apache.hudi.metadata.HoodieTableMetadataUtil.convertInstantToMetaRecords(HoodieTableMetadataUtil.java:96)
  at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.syncFromInstants(HoodieBackedTableMetadataWriter.java:372)
  at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.<init>(HoodieBackedTableMetadataWriter.java:120)
  at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.<init>(SparkHoodieBackedTableMetadataWriter.java:62)
  at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.create(SparkHoodieBackedTableMetadataWriter.java:58)
  at org.apache.hudi.client.SparkRDDWriteClient.syncTableMetadata(SparkRDDWriteClient.java:406)
  at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:417)
  at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:189)
  at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:108)
  at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:439)
  at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:224)
  at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:129)
  at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:173)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:169)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:197)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:194)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:169)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:114)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:112)
  at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
  at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
  at org.apache.spark.sql.execution.SQLExecution$.org$apache$spark$sql$execution$SQLExecution$$executeQuery$1(SQLExecution.scala:83)
  at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1$$anonfun$apply$1.apply(SQLExecution.scala:94)
  at org.apache.spark.sql.execution.QueryExecutionMetrics$.withMetrics(QueryExecutionMetrics.scala:141)
  at org.apache.spark.sql.execution.SQLExecution$.org$apache$spark$sql$execution$SQLExecution$$withMetrics(SQLExecution.scala:178)
  at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:93)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:200)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:92)
  at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
  at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:305)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:291)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:249)
  ... 57 elided
{code}

> Support for handling of REPLACE instants
> ----------------------------------------
>
>                 Key: HUDI-1459
>                 URL: https://issues.apache.org/jira/browse/HUDI-1459
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Writer Core
>            Reporter: Vinoth Chandar
>            Assignee: Prashant Wason
>            Priority: Blocker
>             Fix For: 0.7.0
>
>
> Once we rebase to master, we need to handle replace instants as well, as they show up on the timeline. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)