Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/25 21:10:02 UTC

[GitHub] [hudi] navbalaraman opened a new issue, #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

navbalaraman opened a new issue, #7060:
URL: https://github.com/apache/hudi/issues/7060

   We are using Spark 3.1.2 with Hudi 0.9.0 in our application, with AWS S3 and the AWS Glue Catalog to store and expose the ingested data. Following a source data change, some of the new records now come in with null values for a column that already exists in the table schema (the schema was built from earlier records that had values in this column). Based on some of the issues reported (e.g. [HUDI-4276](https://github.com/apache/hudi/pull/6017/commits)), we identified that this could be resolved by upgrading to Hudi 0.12.0.
   When upgrading Hudi we are facing the below error. Can you please provide info on what is causing this issue? (Only pom version changes have been done, no code changes.)
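
   For reference, the write itself is a plain DataFrame write into Hudi along the lines of the sketch below (table name, path and column names are hypothetical simplifications; as noted, this code did not change with the upgrade):

       import org.apache.spark.sql.{SaveMode, SparkSession}

       val spark = SparkSession.builder().appName("hudi-ingest").getOrCreate()
       import spark.implicits._

       // Hypothetical batch: a column that exists in the table schema now arrives as all nulls.
       val sourceDf = Seq(("id-1", "2022-10-12", Option.empty[String]))
         .toDF("_id", "_col_partition", "some_col")

       sourceDf.write
         .format("hudi")
         .option("hoodie.table.name", "my_table")
         .option("hoodie.datasource.write.recordkey.field", "_id")
         .option("hoodie.datasource.write.precombine.field", "_col_partition")
         .option("hoodie.datasource.write.partitionpath.field", "_col_partition")
         .mode(SaveMode.Append)
         .save("s3://my-bucket/hudi/my_table")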
   
   Error:
   org.apache.spark.sql.adapter.Spark3_1Adapter
   java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_1Adapter
   	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
   	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
   	at org.apache.hudi.SparkAdapterSupport.sparkAdapter(SparkAdapterSupport.scala:39)
   	at org.apache.hudi.SparkAdapterSupport.sparkAdapter$(SparkAdapterSupport.scala:29)
   	at org.apache.hudi.HoodieSparkUtils$.sparkAdapter$lzycompute(HoodieSparkUtils.scala:65)
   	at org.apache.hudi.HoodieSparkUtils$.sparkAdapter(HoodieSparkUtils.scala:65)
   	at org.apache.hudi.AvroConversionUtils$.convertStructTypeToAvroSchema(AvroConversionUtils.scala:150)
   	at org.apache.hudi.HoodieSparkSqlWriter$.bulkInsertAsRow(HoodieSparkSqlWriter.scala:540)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:178)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:183)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:132)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:131)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   
   
   **pom.xml**

       <dependency>
         <groupId>org.scala-lang</groupId>
         <artifactId>scala-library</artifactId>
         <version>2.12.12</version>
       </dependency>
       <dependency>
         <groupId>org.apache.spark</groupId>
         <artifactId>spark-core_2.12</artifactId>
         <version>3.1.2</version>
         <scope>provided</scope>
       </dependency>
       <dependency>
         <groupId>org.apache.spark</groupId>
         <artifactId>spark-sql_2.12</artifactId>
         <version>3.1.2</version>
         <scope>provided</scope>
       </dependency>
       <dependency>
         <groupId>org.apache.spark</groupId>
         <artifactId>spark-hive_2.12</artifactId>
         <version>3.1.2</version>
         <scope>provided</scope>
       </dependency>
       <dependency>
         <groupId>org.apache.hudi</groupId>
         <artifactId>hudi-spark3-bundle_2.12</artifactId>
         <version>0.12.0</version>
       </dependency>
       <dependency>
         <groupId>org.mongodb.spark</groupId>
         <artifactId>mongo-spark-connector_2.12</artifactId>
         <version>3.0.1</version>
       </dependency>
     




[GitHub] [hudi] navbalaraman commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
navbalaraman commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1315575024

   @nsivabalan Gentle reminder. Any insights are appreciated.




[GitHub] [hudi] codope commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
codope commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1330198915

   @navbalaraman The issue due to the partition field was fixed recently by https://github.com/apache/hudi/pull/7132.
   Please open a new issue if it still persists.




[GitHub] [hudi] codope closed issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
codope closed issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0
URL: https://github.com/apache/hudi/issues/7060




[GitHub] [hudi] navbalaraman commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
navbalaraman commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1310495311

   @nsivabalan Now I am seeing an issue with the upgraded code running on EMR 6.8.0. The initial pull of the data happened with Hudi 0.9.0 and Spark 3.1.2 on EMR 6.5. Now the incremental pull with Hudi 0.12.0 and Spark 3.3.0 on EMR 6.8.0 is failing with the below error.
   Any thoughts on what is needed to make this incremental pull work without having to wipe out the existing data and do the initial pull again with the new version?
   
   It throws an error saying that "_col_partition", which is the partition field (set as below), is missing. I verified in the Glue catalog and it does exist. In the existing parquet files the column shows up as "_hoodie_partition_path":"_col_partition=2022-10-12".
   DataSourceWriteOptions.PARTITIONPATH_FIELD.key() -> "_col_partition",
   
   Also, on both the initial and incremental pulls we have the below config to avoid a duplicate partition column showing up in the Glue catalog:
   option(DataSourceWriteOptions.DROP_PARTITION_COLUMNS.key(), true)
   Other configs (roughly how they are applied is sketched after this list):
   HoodieWriteConfig.TBL_NAME.key() -> tableName,
   DataSourceWriteOptions.TABLE_TYPE.key() -> "COPY_ON_WRITE",
   DataSourceWriteOptions.RECORDKEY_FIELD.key() -> "_id.oid",
   DataSourceWriteOptions.PARTITIONPATH_FIELD.key() -> "_col_partition",
   DataSourceWriteOptions.PRECOMBINE_FIELD.key() -> "_update_ts",
   DataSourceWriteOptions.HIVE_SYNC_ENABLED.key() -> "true",
   DataSourceWriteOptions.HIVE_URL.key() -> s"jdbc:hive2://${masterDns}:10000",
   DataSourceWriteOptions.HIVE_DATABASE.key() -> dbName,
   DataSourceWriteOptions.HIVE_TABLE.key() -> tableName,
   DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS.key() -> classOf[MultiPartKeysValueExtractor].getName,
   DataSourceWriteOptions.HIVE_PARTITION_FIELDS.key() -> "_col_partition",
   HoodieCleanConfig.CLEANER_FILE_VERSIONS_RETAINED.key() -> "1",
   HoodieCleanConfig.CLEANER_POLICY.key() -> HoodieCleaningPolicy.KEEP_LATEST_FILE_VERSIONS.name(),
   HoodieTableConfig.HIVE_STYLE_PARTITIONING_ENABLE.key() -> "true",
   HoodieSyncConfig.META_SYNC_DATABASE_NAME.key() -> dbName,
   HoodieSyncConfig.META_SYNC_TABLE_NAME.key() -> tableName,
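
   As referenced above, roughly how these options are applied on our side (val names like hudiOptions, incrementalDf and basePath are simplified placeholders):

       import org.apache.hudi.DataSourceWriteOptions
       import org.apache.hudi.config.HoodieWriteConfig
       import org.apache.spark.sql.SaveMode

       // Sketch only: the map holds the key -> value pairs listed above.
       val hudiOptions: Map[String, String] = Map(
         HoodieWriteConfig.TBL_NAME.key() -> tableName,
         DataSourceWriteOptions.PARTITIONPATH_FIELD.key() -> "_col_partition"
         // ... plus the remaining options listed above ...
       )

       incrementalDf.write
         .format("hudi")
         .options(hudiOptions)
         .option(DataSourceWriteOptions.DROP_PARTITION_COLUMNS.key(), "true")
         .mode(SaveMode.Append)
         .save(basePath)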
   
   ERROR STACK TRACE:
   
   diagnostics: User class threw exception: org.sparkproject.guava.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Cannot find columns: '_col_partition' in the schema[StructField(_hoodie_commit_time,StringType,true),StructField(_hoodie_commit_seqno,StringType,true),StructField(_hoodie_record_key,StringType,true),StructField(_hoodie_partition_path,StringType,true),StructField(_hoodie_file_name,StringType,true),StructField(_col_origin_ts,LongType,true),StructField(_col_update_ts,LongType,true),StructField(devicetype,StringType,true),StructField(deviceid,StringType,true),StructField(createdOn,TimestampType,true),StructField(channel,StringType,true),StructField(source,StringType,true),StructField(_class,StringType,true),StructField(model,StringType,true),StructField(commandoperation,StringType,true),StructField(phases,ArrayType(StructType(StructField(dateTime,TimestampType,true),StructField(dcmErrorCode,StringType,true),StructField(message,StringType,true
 ),StructField(payload,StringType,true),StructField(phaseDescription,StringType,true),StructField(phaseId,IntegerType,true),StructField(statusCode,IntegerType,true),StructField(statusDescription,StringType,true)),true),true),StructField(guid,StringType,true),StructField(commanddatetime,TimestampType,true),StructField(_id,StructType(StructField(oid,StringType,true)),true),StructField(isacsettingson,BooleanType,true),StructField(command,StringType,true),StructField(returncode,StringType,true),StructField(osversion,StringType,true),StructField(totaltimetaken,LongType,true),StructField(dcmtype,StringType,true),StructField(status,StringType,true),StructField(vin,StringType,true),StructField(correlationid,StringType,true),StructField(apprequestnumber,StringType,true),StructField(profilename,StringType,true),StructField(make,StringType,true),StructField(commandinitiatedby,StringType,true),StructField(xlocale,StringType,true),StructField(commandrequest,StringType,true),StructField(modifiedOn
 ,TimestampType,true),StructField(appbrand,StringType,true)]
   	at org.sparkproject.guava.cache.LocalCache$Segment.get(LocalCache.java:2263)
   	at org.sparkproject.guava.cache.LocalCache.get(LocalCache.java:4000)
   	at org.sparkproject.guava.cache.LocalCache$LocalManualCache.get(LocalCache.java:4789)
   	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.getCachedPlan(SessionCatalog.scala:177)
   	at org.apache.spark.sql.execution.datasources.FindDataSourceTable.org$apache$spark$sql$execution$datasources$FindDataSourceTable$$readDataSourceTable(DataSourceStrategy.scala:272)
   	at org.apache.spark.sql.execution.datasources.FindDataSourceTable$$anonfun$apply$2.applyOrElse(DataSourceStrategy.scala:320)
   	at org.apache.spark.sql.execution.datasources.FindDataSourceTable$$anonfun$apply$2.applyOrElse(DataSourceStrategy.scala:306)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDownWithPruning$2(AnalysisHelper.scala:170)
   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:177)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDownWithPruning$1(AnalysisHelper.scala:170)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDownWithPruning(AnalysisHelper.scala:168)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDownWithPruning$(AnalysisHelper.scala:164)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDownWithPruning$4(AnalysisHelper.scala:175)
   	at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1249)
   	at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1248)
   	at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:226)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDownWithPruning$1(AnalysisHelper.scala:175)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDownWithPruning(AnalysisHelper.scala:168)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDownWithPruning$(AnalysisHelper.scala:164)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsWithPruning(AnalysisHelper.scala:99)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsWithPruning$(AnalysisHelper.scala:96)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperators(AnalysisHelper.scala:76)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperators$(AnalysisHelper.scala:75)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:30)
   	at org.apache.spark.sql.execution.datasources.FindDataSourceTable.apply(DataSourceStrategy.scala:306)
   	at org.apache.spark.sql.execution.datasources.FindDataSourceTable.apply(DataSourceStrategy.scala:246)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:215)
   	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
   	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
   	at scala.collection.immutable.List.foldLeft(List.scala:91)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeBatch$1(RuleExecutor.scala:212)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$6(RuleExecutor.scala:284)
   	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor$RuleExecutionContext$.withContext(RuleExecutor.scala:327)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$5(RuleExecutor.scala:284)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$5$adapted(RuleExecutor.scala:274)
   	at scala.collection.immutable.List.foreach(List.scala:431)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:274)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:188)
   	at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:227)
   	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:223)
   	at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:172)
   	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:223)
   	at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:187)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
   	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:208)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
   	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:207)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:78)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:213)
   	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:552)
   	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:213)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
   	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:212)
   	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:78)
   	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:76)
   	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:68)
   	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:93)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
   	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:91)
   	at org.apache.spark.sql.SparkSession.table(SparkSession.scala:604)
   	at org.apache.spark.sql.internal.CatalogImpl.refreshTable(CatalogImpl.scala:540)
   	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$3(HoodieSparkSqlWriter.scala:658)
   	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$3$adapted(HoodieSparkSqlWriter.scala:655)
   	at scala.collection.immutable.List.foreach(List.scala:431)
   	at org.apache.hudi.HoodieSparkSqlWriter$.metaSync(HoodieSparkSqlWriter.scala:655)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:734)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:338)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:183)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:103)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:114)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$7(SQLExecution.scala:139)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:224)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:139)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:245)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:138)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:100)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:96)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:615)
   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:177)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:615)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:591)
   	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:96)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:83)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:81)
   	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:124)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:860)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:390)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:363)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)




[GitHub] [hudi] namuny commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
namuny commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1292762381

   The naming scheme of the hudi-spark-bundle jar has changed since 0.9.0: `hudi-spark3-bundle` now points to Spark 3.2, so for Spark 3.1 you'll need to pull `hudi-spark3.1-bundle` instead. [Release notes for reference](https://hudi.apache.org/releases/release-0.11.0/#spark-versions-and-bundles)
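
   One quick sanity check (a generic snippet, nothing specific to this setup) is to print the Spark version the job actually runs with, since the bundle's adapter class has to match that minor version:

       import org.apache.spark.sql.SparkSession

       // Confirm the runtime Spark version so the Hudi bundle can be matched to it,
       // e.g. 3.1.x -> hudi-spark3.1-bundle, 3.2.x -> hudi-spark3-bundle (in 0.12.0), 3.3.x -> hudi-spark3.3-bundle.
       val spark = SparkSession.builder().getOrCreate()
       println(spark.version)
       println(org.apache.spark.SPARK_VERSION)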




[GitHub] [hudi] nsivabalan commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1294200329

   Is this the EMR version of Spark or OSS Spark?




[GitHub] [hudi] navbalaraman commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
navbalaraman commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1292870879

   Thanks @namuny. I updated the pom reference as below but am still getting the same error.
   
       <!-- https://mvnrepository.com/artifact/org.apache.hudi/hudi-spark3.1-bundle -->
       <dependency>
         <groupId>org.apache.hudi</groupId>
         <artifactId>hudi-spark3.1-bundle_2.12</artifactId>
         <version>0.12.0</version>
       </dependency>
   
   org.apache.spark.sql.adapter.Spark3_1Adapter
   java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_1Adapter
   




[GitHub] [hudi] nsivabalan commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1302808470

   @navbalaraman: Gentle ping.




[GitHub] [hudi] navbalaraman commented on issue #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

Posted by GitBox <gi...@apache.org>.
navbalaraman commented on issue #7060:
URL: https://github.com/apache/hudi/issues/7060#issuecomment-1303684365

   @nsivabalan EMR 6.8 supports only Hudi 0.11.1, but we needed to get on 0.12.0 to have the null value issue resolved, so I upgraded Spark to 3.3.0 and used the below Hudi Spark bundle to get it working. Thanks for your inputs.
   
       <dependency>
         <groupId>org.apache.hudi</groupId>
         <artifactId>hudi-spark3.3-bundle_2.12</artifactId>
         <version>0.12.0</version>
       </dependency>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org