Posted to users@zeppelin.apache.org by Ruslan Dautkhanov <da...@gmail.com> on 2016/11/20 23:59:04 UTC

Zeppelin problem in HA HDFS

Running into issues with Zeppelin on a cluster that runs HA HDFS.
See the complete exception stack in [1]:
"pc1udatahad01.x.y/10.20.32.54:8020...
category READ is not supported in state standby"
Yes, pc1udatahad01 is the current standby, but why doesn't Spark/HMS fail over
to the active namenode?
hdfs-site.xml in Zeppelin's home/conf is a symlink:
hdfs-site.xml -> /etc/hive/conf/hdfs-site.xml
and that HDFS config properly points to an HA HDFS namespace.
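
For reference, here is a minimal sketch of the client-side HA section such an
hdfs-site.xml contains (the nameservice name "mycluster" and the hostnames are
placeholders, not our actual values):

  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>namenode1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>namenode2.example.com:8020</value>
  </property>
  <!-- without this property, clients cannot fail over between namenodes -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

fs.defaultFS in core-site.xml should point at hdfs://mycluster, not at a
specific namenode host:port.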

Thoughts?

An interesting side effect is that HMS switches to a local Derby database (I
sent an email on this last week in a separate thread). See the stack in [1] -
it seems Hive/HMS tries to talk to HDFS, fails, and falls back to a local
Derby database.
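
To rule out a classpath problem, here is a quick sketch of how to check which
Hadoop settings the interpreter actually loaded (run in a %pyspark paragraph;
_jsc is an internal handle, so treat this as a diagnostic hack, not a stable
API):

# Print the Hadoop settings the Spark session actually sees.
hconf = spark.sparkContext._jsc.hadoopConfiguration()
for key in ("fs.defaultFS",
            "dfs.nameservices",
            "dfs.client.failover.proxy.provider.mycluster"):  # placeholder name
    print(key, "=", hconf.get(key))

If fs.defaultFS comes back as a bare host:port instead of hdfs://<nameservice>,
the interpreter never picked up the HA config.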



Zeppelin 0.6.2
Spark 2.0.2
Hive 1.1
RHEL 6.6
Java 7



[1]

 INFO [2016-11-20 16:47:21,044] ({Thread-40} RetryInvocationHandler.java[invoke]:148) - Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over pc1udatahad01.x.y/10.20.32.54:8020. Trying to fail over immediately.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1831)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1449)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4271)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:897)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:528)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:829)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1409)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:762)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2121)
        at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1215)
        at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1412)
        at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:616)
        at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:574)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:518)
        at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:189)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:354)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:258)
        at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
        at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
        at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
        at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
        at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
        at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
        at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
        at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
        at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
        at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:745)


-- 
Ruslan Dautkhanov

Re: Zeppelin problem in HA HDFS

Posted by Ruslan Dautkhanov <da...@gmail.com>.
Well, that wasn't a long-running session. The HDFS namenode states hadn't
changed when that Zeppelin notebook started. The problem is always
reproducible. It might be a Spark 2.0 problem. I am falling back to Spark 1.6.
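
(For anyone following along: pinning the interpreter to the older Spark is just
a SPARK_HOME change in conf/zeppelin-env.sh plus an interpreter restart. The
path below is an example, adjust to your install.)

# conf/zeppelin-env.sh
export SPARK_HOME=/opt/spark-1.6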

Thanks Felix.



-- 
Ruslan Dautkhanov

On Wed, Nov 23, 2016 at 6:58 AM, Felix Cheung <fe...@hotmail.com>
wrote:

> Quite possibly since Spark is talking to HDFS.
>
> Does it work in your environment when HA switches over during a long-running
> spark shell session?
>
>
> ------------------------------
> *From:* Ruslan Dautkhanov <da...@gmail.com>
> *Sent:* Sunday, November 20, 2016 5:27:54 PM
> *To:* users@zeppelin.apache.org
> *Subject:* Re: Zeppelin problem in HA HDFS
>
> When I failed over the HDFS HA nameservice to the other namenode, Zeppelin
> now shows the same error stack *but* for the other namenode, which has now
> become the standby.
>
> Not sure if it has something to do with Spark 2.0.
>
>
>
> --
> Ruslan Dautkhanov

Re: Zeppelin problem in HA HDFS

Posted by Felix Cheung <fe...@hotmail.com>.
Quite possibly, since Spark is talking to HDFS.

Does it work in your environment when HA switches over during a long-running spark shell session?
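
For example, something like this (nn1/nn2 stand for whatever namenode IDs your
dfs.ha.namenodes.* property defines):

# keep a long-running spark-shell session open, then from another terminal:
hdfs haadmin -getServiceState nn1    # confirm which namenode is currently active
hdfs haadmin -failover nn1 nn2       # hand the active role over to nn2
# back in spark-shell, re-run an HDFS read and see if the client follows the failover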


________________________________
From: Ruslan Dautkhanov <da...@gmail.com>
Sent: Sunday, November 20, 2016 5:27:54 PM
To: users@zeppelin.apache.org
Subject: Re: Zeppelin problem in HA HDFS

When I failed over the HDFS HA nameservice to the other namenode, Zeppelin now shows the same
error stack *but* for the other namenode, which has now become the standby.

Not sure if it has something to do with Spark 2.0.



--
Ruslan Dautkhanov



Re: Zeppelin problem in HA HDFS

Posted by Ruslan Dautkhanov <da...@gmail.com>.
When I failed over the HDFS HA nameservice to the other namenode, Zeppelin
now shows the same error stack *but* for the other namenode, which has now
become the standby.

Not sure if it has something to do with Spark 2.0.



-- 
Ruslan Dautkhanov
