Posted to issues@spark.apache.org by "Imran Rashid (JIRA)" <ji...@apache.org> on 2017/09/26 19:49:00 UTC

[jira] [Resolved] (SPARK-22121) Spark should fix hive table location for hdfs HA

     [ https://issues.apache.org/jira/browse/SPARK-22121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Imran Rashid resolved SPARK-22121.
----------------------------------
    Resolution: Won't Fix

After discussing with [~smilegator] on the PR, we've decided to mark this as Won't Fix for now, since there isn't a clear home for the auto-adjustment:

bq. Spark SQL might not be deployed on an HDFS system. Conceptually, this HDFS-specific code should not be part of our HiveExternalCatalog. HiveExternalCatalog is just for using the Hive metastore. It does not assume we use HDFS.
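
For context, the auto-adjustment in question would look roughly like the sketch below. This is a hypothetical helper, not code from the PR: {{HaLocationFixer}} and {{fixLocation}} are made-up names, and the only configuration assumed is the standard HDFS HA client keys ({{dfs.nameservices}}, {{dfs.ha.namenodes.<ns>}}, {{dfs.namenode.rpc-address.<ns>.<nn>}}).

{noformat}
import java.net.URI
import org.apache.hadoop.conf.Configuration

// Hypothetical sketch: if a metastore location points at a single namenode
// that belongs to an HA nameservice configured on the client, rewrite the
// location to use the nameservice.  Ports are ignored for brevity.
object HaLocationFixer {
  def fixLocation(location: URI, conf: Configuration): URI = {
    if (location.getScheme != "hdfs" || location.getHost == null) {
      location
    } else {
      conf.getTrimmedStrings("dfs.nameservices").collectFirst {
        case ns if namenodeHosts(ns, conf).contains(location.getHost) =>
          // Keep the path; swap the namenode authority for the nameservice.
          new URI("hdfs", ns, location.getPath, null, null)
      }.getOrElse(location)
    }
  }

  // Hosts of all namenodes configured for the given nameservice.
  private def namenodeHosts(ns: String, conf: Configuration): Seq[String] = {
    conf.getTrimmedStrings(s"dfs.ha.namenodes.$ns").toSeq.flatMap { nn =>
      Option(conf.get(s"dfs.namenode.rpc-address.$ns.$nn")).map(_.split(':').head)
    }
  }
}
{noformat}

For example, this would rewrite {{hdfs://namenode1:8020/user/hive/warehouse/mydb.db}} to {{hdfs://nameservice1/user/hive/warehouse/mydb.db}}. The open question was where such a helper belongs, given that HiveExternalCatalog should stay HDFS-agnostic.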

Hopefully the JIRA description is enough for users to search for this error and find the workaround.
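
For reference, the workaround looks roughly like this (based on the Hive MetaTool docs linked in the description; {{hdfs://nameservice1}} and {{hdfs://namenode1:8020}} are placeholders for your cluster's nameservice and old namenode URIs):

{noformat}
# Show the filesystem root(s) currently recorded in the metastore
hive --service metatool -listFSRoot

# Preview the rewrite first, then apply it
hive --service metatool -updateLocation hdfs://nameservice1 hdfs://namenode1:8020 -dryRun
hive --service metatool -updateLocation hdfs://nameservice1 hdfs://namenode1:8020
{noformat}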

> Spark should fix hive table location for hdfs HA
> ------------------------------------------------
>
>                 Key: SPARK-22121
>                 URL: https://issues.apache.org/jira/browse/SPARK-22121
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Imran Rashid
>            Assignee: Imran Rashid
>            Priority: Minor
>
> When converting an existing HDFS setup to use multiple namenodes, users *should* run the Hive metatool to update the locations stored in the metastore so that they refer to the nameservice instead of a specific namenode.  (See the {{updateLocation}} section of the [metatool docs|https://cwiki.apache.org/confluence/display/Hive/Hive+MetaTool].)
> However, users tend to forget this step.  If HDFS HA is turned on after a Hive database has already been created, the db location may still reference a single namenode instead of the nameservice.  To be a little more user friendly, Spark should detect the misconfiguration and try to auto-adjust for it.  (This matches Hive's own behavior.)
> An example exception is given below.  Users who see it should run the Hive metatool to update the database location.
> {noformat}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: org.apache.hadoop.ipc.RemoteException Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
> 	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1946)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1412)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2986)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1142)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:938)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> );
> 	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:108)
> 	at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:217)
> 	at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:110)
> 	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:316)
> 	at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:127)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
> 	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:182)
> 	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67)
> 	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623)
> 	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)
> 	at com.cloudera.spark.RunHiveQl$$anonfun$run$1.apply(RunHiveQl.scala:50)
> 	at com.cloudera.spark.RunHiveQl$$anonfun$run$1.apply(RunHiveQl.scala:48)
> 	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> 	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> 	at com.cloudera.spark.RunHiveQl.run(RunHiveQl.scala:48)
> 	at com.cloudera.spark.RunHiveQl$.main(RunHiveQl.scala:181)
> 	at com.cloudera.spark.RunHiveQl.main(RunHiveQl.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
> 	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: org.apache.hadoop.ipc.RemoteException Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
> 	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1946)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1412)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2986)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1142)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:938)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> )
> 	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:859)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:864)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:455)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:455)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:455)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:299)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:240)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:239)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:282)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:454)
> 	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:287)
> 	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:217)
> 	at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:217)
> 	at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:99)
> 	... 27 more
> Caused by: MetaException(message:Got exception: org.apache.hadoop.ipc.RemoteException Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
> 	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1946)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1412)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2986)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1142)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:938)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> )
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:41639)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:41607)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result.read(ThriftHiveMetastore.java:41533)
> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:1187)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:1173)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2431)
> 	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:93)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:801)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:787)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
> 	at com.sun.proxy.$Proxy29.createTable(Unknown Source)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2362)
> 	at com.sun.proxy.$Proxy29.createTable(Unknown Source)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:849)
> 	... 40 more
> {noformat}
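> A quick way to check whether an existing database is affected (assuming a {{SparkSession}} named {{spark}}; {{mydb}} is a placeholder):
> {noformat}
> // Shows the location recorded in the metastore.  A value like
> // hdfs://namenode1:8020/... (a single namenode) instead of
> // hdfs://nameservice1/... means the metatool update is needed.
> spark.sql("DESCRIBE DATABASE EXTENDED mydb").show(false)
> {noformat}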



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org