Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/24 16:04:57 UTC
[GitHub] [hudi] melin opened a new issue #4903: [SUPPORT] Spark SQL insert into a Hudi table: metadata synchronization failed.
melin opened a new issue #4903:
URL: https://github.com/apache/hudi/issues/4903
```sql
CREATE TABLE bigdata.test_hudi_demo (
`_hoodie_commit_time` STRING,
`_hoodie_commit_seqno` STRING,
`_hoodie_record_key` STRING,
`_hoodie_partition_path` STRING,
`_hoodie_file_name` STRING,
`id` INT COMMENT '',
`name` STRING COMMENT '',
`price` DOUBLE COMMENT '',
`ds` DATE COMMENT '')
USING hudi
OPTIONS(
'hoodie.datasource.hive_sync.mode' = 'HMS',
'hoodie.datasource.write.precombine.field' = 'ds',
'hoodie.metadata.enable' = 'true',
'hoodie.parquet.compression.codec' = 'zstd',
'hoodie.payload.event.time.field' = 'ds',
'hoodie.payload.ordering.field' = 'ds',
'primaryKey' = 'id',
'type' = 'cow')
TBLPROPERTIES(
'path' = '/user/hive/warehouse/bigdata.db/test_hudi_demo')
```
The insert statement that triggers the failure:
```sql
insert into table test_hudi_demo
select 1, 'zhangsan', 20, to_date('20210810', 'yyyyMMdd');
```
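A side note on the `to_date` pattern: Spark uses Java-style datetime patterns, in which `MM` is month-of-year while lowercase `mm` is minute-of-hour, so a pattern like `yyyymmdd` silently mis-parses the month. This is unrelated to the sync failure below, but worth flagging. A minimal sketch with plain JDK `SimpleDateFormat` (not Spark itself) showing the difference:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class DatePatternDemo {
    // Parse `value` with `pattern`, then re-render as ISO yyyy-MM-dd.
    static String reparse(String pattern, String value) {
        try {
            return new SimpleDateFormat("yyyy-MM-dd")
                    .format(new SimpleDateFormat(pattern).parse(value));
        } catch (ParseException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // 'mm' is minute-of-hour: the month field is never set, so it
        // defaults to January and the "08" lands in the minutes field.
        System.out.println(reparse("yyyymmdd", "20210810"));
        // 'MM' is month-of-year: parses as intended.
        System.out.println(reparse("yyyyMMdd", "20210810"));
    }
}
```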
hive-site.xml
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.warehouse.subdir.inherit.perms</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://10.5.20.20:3306/hive23?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.exec.dynamic.partition</name>
<value>true</value>
</property>
<property>
<name>hive.exec.dynamic.partition.mode</name>
<value>nonstrict</value>
</property>
<property>
<name>hive.exec.max.dynamic.partitions</name>
<value>100000</value>
</property>
<property>
<name>hive.exec.max.dynamic.partitions.pernode</name>
<value>100000</value>
</property>
</configuration>
```
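Note that the hive-site.xml above does not define `hive.metastore.uris`. When that property is absent, Hive clients fall back to the default, which matches the repeated `thrift://localhost:9083` connection attempts in the log below. A sketch of the property that would point clients at a remote metastore (the host and port here are placeholders, not values from this report):

```xml
<!-- Hypothetical endpoint: replace with the actual metastore host/port -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value>
</property>
```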
```
57097 [SparkTaskThread-0] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:9083
57132 [SparkTaskThread-0] WARN hive.metastore - Failed to connect to the MetaStore Server...
57133 [SparkTaskThread-0] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
58133 [SparkTaskThread-0] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:9083
58133 [SparkTaskThread-0] WARN hive.metastore - Failed to connect to the MetaStore Server...
58133 [SparkTaskThread-0] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
59133 [SparkTaskThread-0] INFO hive.metastore - Trying to connect to metastore with URI thrift://localhost:9083
59134 [SparkTaskThread-0] WARN hive.metastore - Failed to connect to the MetaStore Server...
59134 [SparkTaskThread-0] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
60140 [SparkTaskThread-0] WARN hive.ql.metadata.Hive - Failed to register all functions.
java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1742)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3607)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3659)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3639)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3901)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:248)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:395)
at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:339)
at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:319)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
at org.apache.hudi.hive.ddl.HMSDDLExecutor.<init>(HMSDDLExecutor.java:68)
at org.apache.hudi.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:76)
at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:79)
at org.apache.hudi.HoodieSparkSqlWriter$.syncHive(HoodieSparkSqlWriter.scala:560)
at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$2(HoodieSparkSqlWriter.scala:618)
at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$metaSync$2$adapted(HoodieSparkSqlWriter.scala:614)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
at org.apache.hudi.HoodieSparkSqlWriter$.metaSync(HoodieSparkSqlWriter.scala:614)
at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:688)
at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:301)
at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:101)
at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand.run(InsertIntoHoodieTableCommand.scala:54)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:110)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:110)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:106)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:106)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:93)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:91)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.SparkSession.sql_aroundBody0(SparkSession.scala:613)
at org.apache.spark.sql.SparkSession$AjcClosure1.run(SparkSession.scala:1)
at com.dataworker.spark.jobserver.driver.aspectj.SparkSessionAspectj.ajc$around$com_dataworker_spark_jobserver_driver_aspectj_SparkSessionAspectj$9$e0361191proceed(SparkSessionAspectj.aj:129)
at com.dataworker.spark.jobserver.driver.aspectj.SparkSessionAspectj.ajc$around$com_dataworker_spark_jobserver_driver_aspectj_SparkSessionAspectj$9$e0361191(SparkSessionAspectj.aj:317)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
at com.dataworker.spark.jobserver.driver.task.SparkSqlTask.$anonfun$runSparkSql$1(SparkSqlTask.scala:100)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at com.dataworker.spark.jobserver.driver.task.SparkSqlTask.com$dataworker$spark$jobserver$driver$task$SparkSqlTask$$executeSql$1(SparkSqlTask.scala:52)
at com.dataworker.spark.jobserver.driver.task.SparkSqlTask$SqlThread$1$$anon$1.run(SparkSqlTask.scala:139)
at com.dataworker.spark.jobserver.driver.task.SparkSqlTask$SqlThread$1$$anon$1.run(SparkSqlTask.scala:136)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at com.dataworker.spark.jobserver.driver.task.SparkSqlTask$SqlThread$1.run(SparkSqlTask.scala:136)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1740)
... 75 more
```
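The stack trace shows the Hudi Hive sync (`HMSDDLExecutor` → `Hive.getMSC`) failing because the metastore client cannot reach `thrift://localhost:9083`. If the metastore runs elsewhere, the URI can also be supplied on the Spark side instead of in hive-site.xml, since `spark.hadoop.*` properties are forwarded into the Hadoop/Hive configuration. A sketch with a placeholder endpoint:

```
# spark-defaults.conf (hypothetical metastore endpoint)
spark.hadoop.hive.metastore.uris    thrift://metastore-host:9083
```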
image: https://user-images.githubusercontent.com/1145830/155561478-7d0d0221-8b71-4b56-af29-a225798ff921.png
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] melin closed issue #4903: [SUPPORT] Spark SQL insert into a Hudi table: metadata sync failed.
Posted by GitBox <gi...@apache.org>.
melin closed issue #4903:
URL: https://github.com/apache/hudi/issues/4903
[GitHub] [hudi] neverdizzy commented on issue #4903: [SUPPORT] Spark SQL insert into a Hudi table: metadata sync failed.
Posted by GitBox <gi...@apache.org>.
neverdizzy commented on issue #4903:
URL: https://github.com/apache/hudi/issues/4903#issuecomment-1072328138
How did you solve it?