Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/08/28 00:25:22 UTC

[GitHub] [hudi] kazdy opened a new issue, #6519: [SUPPORT] SQL Update does not work for MoR _rt table

kazdy opened a new issue, #6519:
URL: https://github.com/apache/hudi/issues/6519

   I created a partitioned MoR table and then wanted to use Spark SQL to update one row.
   An error was thrown; the same update works when I use the _ro table.
   I would expect updates to work against both the _ro and _rt tables registered in the metastore/catalog.
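   
   For contrast, the same statement succeeds against the _ro table. A minimal sketch of what was run, assuming the table name produced by the Hive sync settings below:
   
   ```
   spark.sql("update processed.spectrum_mor_ro set url = 'updated', ts = 2 where uuid = 2 and partitionpath = 1").show()
   ```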
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. create a MoR table and sync it to Hive so that both the _ro and _rt tables exist
   2. write data to the table
   3. run a Spark SQL UPDATE statement on the _rt table
   
   ```
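   # Write options: partitioned MERGE_ON_READ table, upsert operation,
   # Hive-style partitioning, Hive sync in HMS mode (Glue Data Catalog).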
   hudi_options = {
       'hoodie.datasource.write.table.type': 'MERGE_ON_READ',
       'hoodie.table.name': "spectrum_mor",
       'hoodie.datasource.write.recordkey.field': 'uuid',
       'hoodie.datasource.write.partitionpath.field': 'partitionpath',
       'hoodie.datasource.write.table.name': "spectrum_mor",
       'hoodie.datasource.write.operation': 'upsert',
       'hoodie.datasource.write.precombine.field': 'ts',
       'hoodie.upsert.shuffle.parallelism': 2,
       'hoodie.insert.shuffle.parallelism': 2,
       "hoodie.datasource.write.hive_style_partitioning":"true",
       "hoodie.datasource.hive_sync.database":"processed",
       "hoodie.datasource.hive_sync.table":"spectrum_mor",
       "hoodie.datasource.hive_sync.partition_fields":"partitionpath",
       "hoodie.datasource.hive_sync.enable":"true",
       "hoodie.datasource.hive_sync.mode": "hms"}
   
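   # Generate simple records (uuid, ts, partitionpath, url) for one partition.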
   def datagen(id_from, id_to, partition):
       result = []
       for x in range(id_from, id_to):
           result.append({"uuid":x, "ts":1, "partitionpath":partition, "url":"hudi.apache.com"})
       return result
   
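   # Build 999 records (uuid 1..999) for partition 1 and load them as a DataFrame.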
   newJson = datagen(1, 1000, 1)
   df = spark.read.json(sc.parallelize(newJson))
   
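   # Upsert into the Hudi table on S3; Hive sync registers the _ro and _rt tables.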
   df.write.format("hudi"). \
       options(**hudi_options). \
       mode("append"). \
       save("s3://data-lake-123456789123/spectrum_mor")
   
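   # The failing statement: a Spark SQL UPDATE against the _rt table.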
   spark.sql("update processed.spectrum_mor_rt set url = 'updated', ts = 2 where uuid = 2 and partitionpath=1;").show()
   
   ```
   
   **Expected behavior**
   
   UPDATE works for MoR _rt table
   
   **Environment Description**
   
   * Hudi version : 0.12
   
   * Spark version : 3.2.1-amzn (EMR 6.7.0)
   
   * Hive version :
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   The Glue Data Catalog was used as the metastore.
   
   **Stacktrace**
   
   An error was encountered:
   An error occurred while calling o88.sql.
   : java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.PartitionedFile.<init>(Lorg/apache/spark/sql/catalyst/InternalRow;Ljava/lang/String;JJ[Ljava/lang/String;)V
   	at org.apache.hudi.MergeOnReadSnapshotRelation.$anonfun$buildSplits$2(MergeOnReadSnapshotRelation.scala:213)
   	at scala.Option.map(Option.scala:230)
   	at org.apache.hudi.MergeOnReadSnapshotRelation.$anonfun$buildSplits$1(MergeOnReadSnapshotRelation.scala:211)
   	at scala.collection.immutable.List.map(List.scala:293)
   	at org.apache.hudi.MergeOnReadSnapshotRelation.buildSplits(MergeOnReadSnapshotRelation.scala:207)
   	at org.apache.hudi.MergeOnReadSnapshotRelation.collectFileSplits(MergeOnReadSnapshotRelation.scala:199)
   	at org.apache.hudi.MergeOnReadSnapshotRelation.collectFileSplits(MergeOnReadSnapshotRelation.scala:45)
   	at org.apache.hudi.HoodieBaseRelation.buildScan(HoodieBaseRelation.scala:330)
   	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.$anonfun$apply$4(DataSourceStrategy.scala:359)
   	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.$anonfun$pruneFilterProject$1(DataSourceStrategy.scala:393)
   	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.pruneFilterProjectRaw(DataSourceStrategy.scala:472)
   	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.pruneFilterProject(DataSourceStrategy.scala:392)
   	at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.apply(DataSourceStrategy.scala:359)
   	at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
   	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
   	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
   	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
   	at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
   	at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:71)
   	at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
   	at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:196)
   	at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:194)
   	at scala.collection.Iterator.foreach(Iterator.scala:943)
   	at scala.collection.Iterator.foreach$(Iterator.scala:943)
   	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
   	at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:199)
   	at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:192)
   	at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1431)
   	at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$2(QueryPlanner.scala:75)
   	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
   	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
   	at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
   	at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:71)
   	at org.apache.spark.sql.execution.QueryExecution$.createSparkPlan(QueryExecution.scala:504)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$2(QueryExecution.scala:165)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:224)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:224)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$1(QueryExecution.scala:165)
   	at org.apache.spark.sql.execution.QueryExecution.withCteMap(QueryExecution.scala:75)
   	at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:158)
   	at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:158)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$2(QueryExecution.scala:178)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:224)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:224)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$1(QueryExecution.scala:175)
   	at org.apache.spark.sql.execution.QueryExecution.withCteMap(QueryExecution.scala:75)
   	at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:171)
   	at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:171)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:195)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:194)
   	at org.apache.hudi.HoodieSparkUtils$.createRdd(HoodieSparkUtils.scala:164)
   	at org.apache.hudi.HoodieSparkUtils$.createRdd(HoodieSparkUtils.scala:146)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:285)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:183)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:115)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:112)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:108)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:519)
   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:83)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:519)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:495)
   	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:108)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:95)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:93)
   	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:136)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:848)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:382)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:355)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:247)
   	at org.apache.spark.sql.hudi.command.UpdateHoodieTableCommand.run(UpdateHoodieTableCommand.scala:74)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:115)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:112)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:108)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:519)
   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:83)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:519)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:495)
   	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:108)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:95)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:93)
   	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:221)
   	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:101)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:98)
   	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
   	at sun.reflect.GeneratedMethodAccessor527.invoke(Unknown Source)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
   	at java.lang.Thread.run(Thread.java:750)
   
   Traceback (most recent call last):
     File "/mnt/yarn/usercache/livy/appcache/application_1661638806229_0004/container_1661638806229_0004_01_000001/pyspark.zip/pyspark/sql/session.py", line 723, in sql
       return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
     File "/mnt/yarn/usercache/livy/appcache/application_1661638806229_0004/container_1661638806229_0004_01_000001/py4j-0.10.9.3-src.zip/py4j/java_gateway.py", line 1322, in __call__
       answer, self.gateway_client, self.target_id, self.name)
     File "/mnt/yarn/usercache/livy/appcache/application_1661638806229_0004/container_1661638806229_0004_01_000001/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
       return f(*a, **kw)
     File "/mnt/yarn/usercache/livy/appcache/application_1661638806229_0004/container_1661638806229_0004_01_000001/py4j-0.10.9.3-src.zip/py4j/protocol.py", line 328, in get_return_value
       format(target_id, ".", name), value)
   py4j.protocol.Py4JJavaError: An error occurred while calling o88.sql.
   : java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.PartitionedFile.<init>(Lorg/apache/spark/sql/catalyst/InternalRow;Ljava/lang/String;JJ[Ljava/lang/String;)V




[GitHub] [hudi] xushiyan commented on issue #6519: [SUPPORT] SQL Update does not work for MoR _rt table

Posted by GitBox <gi...@apache.org>.
xushiyan commented on issue #6519:
URL: https://github.com/apache/hudi/issues/6519#issuecomment-1250959686

   @kazdy `_rt` tables are views created for real-time queries and are meant to be read-only, so we're not supposed to run updates against them. Please run updates against the original Hudi table.
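   
   For reference, a minimal sketch of the suggested approach, assuming the base table is registered in the catalog as `processed.spectrum_mor` (per the hive_sync settings above):
   
   ```
   spark.sql("update processed.spectrum_mor set url = 'updated', ts = 2 where uuid = 2 and partitionpath = 1").show()
   ```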




[GitHub] [hudi] xushiyan closed issue #6519: [SUPPORT] SQL Update does not work for MoR _rt table

Posted by GitBox <gi...@apache.org>.
xushiyan closed issue #6519: [SUPPORT] SQL Update does not work for MoR _rt table
URL: https://github.com/apache/hudi/issues/6519


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org