Posted to issues@hive.apache.org by "Ayush Saxena (Jira)" <ji...@apache.org> on 2022/11/09 21:47:00 UTC
[jira] [Resolved] (HIVE-26662) FAILED: SemanticException [Error 10072]: Database does not exist: global_temp
[ https://issues.apache.org/jira/browse/HIVE-26662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ayush Saxena resolved HIVE-26662.
---------------------------------
Resolution: Invalid
This is a vendor-specific question; please reach out to the vendor's support.
> FAILED: SemanticException [Error 10072]: Database does not exist: global_temp
> -----------------------------------------------------------------------------
>
> Key: HIVE-26662
> URL: https://issues.apache.org/jira/browse/HIVE-26662
> Project: Hive
> Issue Type: Bug
> Reporter: Mahmood Abu Awwad
> Priority: Blocker
>
> While running our batch jobs with Apache Spark and Hive on an EMR cluster, with AWS Glue as the metastore, we hit the following error:
> {code:java}
> EntityNotFoundException ,Database global_temp not found {code}
> {code:java}
> 2022-10-09T10:36:31,262 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Completed compiling command(queryId=hadoop_20221009103631_214e4b6c-b0f2-496e-b9a8-86831b202736); Time taken: 0.02 seconds
> 2022-10-09T10:36:31,262 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: reexec.ReExecDriver (:()) - Execution #1 of query
> 2022-10-09T10:36:31,262 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Concurrency mode is disabled, not creating a lock manager
> 2022-10-09T10:36:31,262 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Executing command(queryId=hadoop_20221009103631_214e4b6c-b0f2-496e-b9a8-86831b202736): show views
> 2022-10-09T10:36:31,263 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Starting task [Stage-0:DDL] in serial mode
> 2022-10-09T10:36:32,270 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Completed executing command(queryId=hadoop_20221009103631_214e4b6c-b0f2-496e-b9a8-86831b202736); Time taken: 1.008 seconds
> 2022-10-09T10:36:32,270 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - OK
> 2022-10-09T10:36:32,270 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Concurrency mode is disabled, not creating a lock manager
> 2022-10-09T10:36:32,271 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: exec.ListSinkOperator (:()) - RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_LIST_SINK_0:0,
> 2022-10-09T10:36:32,271 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: CliDriver (:()) - Time taken: 1.028 seconds
> 2022-10-09T10:36:32,271 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: conf.HiveConf (HiveConf.java:getLogIdVar(5104)) - Using the default value passed in for log id: 573c4ce0-f73c-439b-829d-1f0b25db45ec
> 2022-10-09T10:36:32,272 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: session.SessionState (SessionState.java:resetThreadName(452)) - Resetting thread name to main
> 2022-10-09T10:36:46,512 INFO [main([])]: conf.HiveConf (HiveConf.java:getLogIdVar(5104)) - Using the default value passed in for log id: 573c4ce0-f73c-439b-829d-1f0b25db45ec
> 2022-10-09T10:36:46,513 INFO [main([])]: session.SessionState (SessionState.java:updateThreadName(441)) - Updating thread name to 573c4ce0-f73c-439b-829d-1f0b25db45ec main
> 2022-10-09T10:36:46,515 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Compiling command(queryId=hadoop_20221009103646_f390a868-07d7-49f1-b620-70d40e5e2cff): use global_temp
> 2022-10-09T10:36:46,530 INFO [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - Concurrency mode is disabled, not creating a lock manager
> 2022-10-09T10:36:46,666 ERROR [573c4ce0-f73c-439b-829d-1f0b25db45ec main([])]: ql.Driver (:()) - FAILED: SemanticException [Error 10072]: Database does not exist: global_temp
> org.apache.hadoop.hive.ql.parse.SemanticException: Database does not exist: global_temp
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getDatabase(BaseSemanticAnalyzer.java:2171)
> at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeSwitchDatabase(DDLSemanticAnalyzer.java:1413)
> at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:516)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:236) {code}
> global_temp is a database reserved by the Spark session to hold global temp views.
> This database does not exist in our AWS Glue catalog, because creating it there makes all our EMR jobs fail with this error:
> {code:java}
> ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: global_temp is a system preserved database, please rename your existing database to resolve the name conflict, or set a different value for spark.sql.globalTempDatabase, and launch your Spark application again. {code}
> We are not creating or using any global temp views in our project, but this lookup appears to be a check that Spark itself performs when initializing the Spark session.
> EMR configuration used:
> {code:java}
> [
>   {
>     "Classification": "hive-site",
>     "Properties": {
>       "hive.msck.path.validation": "ignore",
>       "hive.exec.max.dynamic.partitions": "1000000",
>       "hive.vectorized.execution.enabled": "true",
>       "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory",
>       "hive.exec.dynamic.partition.mode": "nonstrict",
>       "hive.exec.max.dynamic.partitions.pernode": "500000"
>     },
>     "Configurations": []
>   },
>   {
>     "Classification": "yarn-site",
>     "Properties": {
>       "yarn.resourcemanager.scheduler.class": "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler",
>       "yarn.log-aggregation.retain-seconds": "-1",
>       "yarn.scheduler.fair.allow-undeclared-pools": "true",
>       "yarn.log-aggregation-enable": "true",
>       "yarn.scheduler.fair.user-as-default-queue": "true",
>       "yarn.nodemanager.remote-app-log-dir": "LOGS_PATH",
>       "yarn.scheduler.fair.preemption": "true",
>       "yarn.scheduler.fair.preemption.cluster-utilization-threshold": "0.8",
>       "yarn.resourcemanager.am.max-attempts": "10"
>     },
>     "Configurations": []
>   },
>   {
>     "Classification": "mapred-site",
>     "Properties": {
>       "mapred.jobtracker.taskScheduler": "org.apache.hadoop.mapred.FairScheduler"
>     },
>     "Configurations": []
>   },
>   {
>     "Classification": "presto-connector-hive",
>     "Properties": {
>       "hive.recursive-directories": "true",
>       "hive.metastore.glue.datacatalog.enabled": "true"
>     },
>     "Configurations": []
>   },
>   {
>     "Classification": "spark-log4j",
>     "Properties": {
>       "log4j.logger.com.project": "DEBUG",
>       "log4j.appender.rolling.layout": "org.apache.log4j.PatternLayout",
>       "log4j.logger.org.apache.spark": "WARN",
>       "log4j.appender.rolling.encoding": "UTF-8",
>       "log4j.appender.rolling.layout.ConversionPattern": "%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n",
>       "log4j.appender.rolling.maxBackupIndex": "5",
>       "log4j.appender.rolling": "org.apache.log4j.RollingFileAppender",
>       "log4j.rootLogger": "WARN, rolling",
>       "log4j.logger.org.eclipse.jetty": "WARN",
>       "log4j.appender.rolling.maxFileSize": "1000MB",
>       "log4j.appender.rolling.file": "${spark.yarn.app.container.log.dir}/spark.log"
>     },
>     "Configurations": []
>   },
>   {
>     "Classification": "emrfs-site",
>     "Properties": {
>       "fs.s3.maxConnections": "10000"
>     },
>     "Configurations": []
>   },
>   {
>     "Classification": "spark-hive-site",
>     "Properties": {
>       "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
>     },
>     "Configurations": []
>   }
> ] {code}
> The spark-submit command is:
> {code:java}
> spark-submit --deploy-mode cluster --master yarn --conf spark.yarn.appMasterEnv.ENV=DEV --conf spark.executorEnv.ENV=DEV --conf spark.network.timeout=6000s --conf spark.sql.catalogImplementation=hive --conf spark.driver.memory=15g --conf spark.hadoop.hive.metastore.client.factory.class=com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory --class CLASS_NAME JAR_FILE_PATH{code}
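
The Spark error quoted in the report names spark.sql.globalTempDatabase as one way to avoid a name conflict with global_temp. The following is a minimal sketch, not a confirmed fix, of how that option could be appended to a submit command like the one above. The helper with_global_temp_db and the database name global_temp_spark are hypothetical and chosen only for illustration, and only a subset of the reported arguments is reproduced. Whether changing this setting affects the Glue lookup described in the report is an assumption that would need to be verified on the cluster.

```python
import shlex

# A subset of the arguments from the spark-submit command in the report.
BASE_ARGS = [
    "spark-submit",
    "--deploy-mode", "cluster",
    "--master", "yarn",
    "--conf", "spark.sql.catalogImplementation=hive",
    "--conf", "spark.hadoop.hive.metastore.client.factory.class="
              "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory",
]


def with_global_temp_db(args, db_name):
    """Return a copy of args with spark.sql.globalTempDatabase set to db_name.

    Hypothetical helper: it only builds the argument list, it does not run Spark.
    """
    return list(args) + ["--conf", f"spark.sql.globalTempDatabase={db_name}"]


# "global_temp_spark" is a hypothetical database name, not taken from the report.
cmd = with_global_temp_db(BASE_ARGS, "global_temp_spark")
# CLASS_NAME and JAR_FILE_PATH are the placeholders used in the report.
cmd += ["--class", "CLASS_NAME", "JAR_FILE_PATH"]

# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

Building the command as a list and quoting it with shlex.join keeps each option and its value together, so the extra --conf pair can be added or removed without editing a long one-line string by hand.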
--
This message was sent by Atlassian Jira
(v8.20.10#820010)