Posted to common-user@hadoop.apache.org by Emma Lin <li...@vmware.com> on 2012/01/20 05:52:14 UTC

Issues during setting up hadoop security cluster

Gurus,
I'm setting up a secure Hadoop 0.23 cluster, but communication between the DataNode and the NameNode, and between the NodeManager and the ResourceManager, is failing.
When I start the NodeManager, it reports the following error and then shuts itself down. Have you ever seen this issue? Do you have any idea how to triage it?

2012-01-20 12:03:08,258 INFO  ipc.HadoopYarnRPC (HadoopYarnProtoRPC.java:getProxy(48)) - Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.server.api.ResourceTracker
2012-01-20 12:03:08,291 INFO  nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:registerWithRM(155)) - Connected to ResourceManager at hadoopRM.example.aurora:9003
2012-01-20 12:03:20,399 WARN  ipc.Client (Client.java:run(526)) - Couldn't setup connection for nm/hadoopNM.example.aurora@EXAMPLE.AURORA to rm/hadoopRM.example.aurora@EXAMPLE.AURORA
2012-01-20 12:03:20,405 ERROR service.CompositeService (CompositeService.java:start(72)) - Error starting services org.apache.hadoop.yarn.server.nodemanager.NodeManager
org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:132)
        at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:231)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:161)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:128)
        ... 3 more
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for nm/hadoopNM.example.aurora@EXAMPLE.AURORA to rm/hadoopRM.example.aurora@EXAMPLE.AURORA; Host Details : local host is: "hadoopNM/10.112.127.102"; destination host is: ""hadoopRM.example.aurora":9003;
        at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
        at $Proxy14.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
        ... 5 more
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for nm/hadoopNM.example.aurora@EXAMPLE.AURORA to rm/hadoopRM.example.aurora@EXAMPLE.AURORA; Host Details : local host is: "hadoopNM/10.112.127.102"; destination host is: ""hadoopRM.example.aurora":9003;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:655)
        at org.apache.hadoop.ipc.Client.call(Client.java:1089)
        at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
        ... 7 more
Caused by: java.io.IOException: Couldn't setup connection for nm/hadoopNM.example.aurora@EXAMPLE.AURORA to rm/hadoopRM.example.aurora@EXAMPLE.AURORA
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:527)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:499)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:583)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
        at org.apache.hadoop.ipc.Client.call(Client.java:1065)
        ... 8 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:137)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:407)
        at org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:205)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:576)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:573)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:572)
        ... 11 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)
        at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:663)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:230)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162)
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175)
        ... 20 more
Caused by: KrbException: Server not found in Kerberos database (7) - UNKNOWN_SERVER
        at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:64)
        at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:185)
        at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:294)
        at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:106)
        at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:557)
        at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:594)
        ... 23 more
Caused by: KrbException: Identifier doesn't match expected value (906)
        at sun.security.krb5.internal.KDCRep.init(KDCRep.java:133)
        at sun.security.krb5.internal.TGSRep.init(TGSRep.java:58)
        at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:53)
        at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:46)
        ... 28 more

The error says that no valid credentials were provided, but I have added those credentials on the ResourceManager node. The keytab listing is as follows:
line@hadoopRM:~$ klist -k -e -t /etc/krb5.keytab
Keytab name: WRFILE:/etc/krb5.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (arcfour-hmac)
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (des3-cbc-sha1)
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (des-cbc-crc)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (arcfour-hmac)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (des3-cbc-sha1)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (des-cbc-crc)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (arcfour-hmac)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (des3-cbc-sha1)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (des-cbc-crc)

The whole node manager log is attached.

Any ideas are appreciated.
Thanks
Emma

Re: Issues during setting up hadoop security cluster

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
You are on the right path for sure.

Where are you updating the JCE policy jar? (I know the RM-NM case is
working after this, so just checking)

Maybe the DataNodes are not using the same JRE that you updated with
the new policy jar? Can you check that? jsvc shouldn't cause any more
issues; it should come down to the JAVA_HOME used for the DataNode.
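
One way to check this (a minimal sketch, not something from the thread) is to run a tiny Java program with the same JAVA_HOME the DataNode uses and print the maximum AES key length that JRE allows. Cipher.getMaxAllowedKeyLength is a standard javax.crypto call; the class name JcePolicyCheck and the helper allowsAes256 are made up for this illustration.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

/** Hypothetical helper: reports whether the running JRE permits AES-256. */
final class JcePolicyCheck {

    /** True when the reported key-length limit (in bits) is large enough for AES-256. */
    static boolean allowsAes256(int maxAllowedKeyBits) {
        return maxAllowedKeyBits >= 256;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Limit imposed by the jurisdiction policy files of the JRE running this program.
        int maxBits = Cipher.getMaxAllowedKeyLength("AES");

        // Print java.home so it is obvious which JRE was actually inspected.
        System.out.println("java.home          = " + System.getProperty("java.home"));
        System.out.println("max AES key length = " + maxBits);
        System.out.println("AES-256 allowed    = " + allowsAes256(maxBits));
    }
}
```

Compile and run it with the java binary under the DataNode's JAVA_HOME. A value of 128 means the default limited policy is still in effect for that JRE; Integer.MAX_VALUE (2147483647) means the unlimited policy files were picked up.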

Thanks,
+Vinod

On Fri, Jan 20, 2012 at 2:33 AM, Emma Lin <li...@vmware.com> wrote:
> After removing the upper-case characters from the hostnames, the problem disappeared. The NodeManager now connects to the ResourceManager successfully.
> Thank you Vinod.
>
> But now I have another issue connecting the DataNode to the NameNode. The NameNode log is as follows:
> 2012-01-20 18:17:02,127 WARN  ipc.Server (Server.java:saslReadAndProcess(1070)) - Auth failed for 10.112.127.14:60456:null
> 2012-01-20 18:17:02,128 INFO  ipc.Server (Server.java:doRead(572)) - IPC Server listener on 9000: readAndProcess threw exception javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)] from client 10.112.127.14. Count of bytes read: 0
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]
>        at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:159)
>        at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1054)
>        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1232)
>        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:567)
>        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:366)
>        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:341)
> Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)
>        at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:741)
>        at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:323)
>        at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:267)
>        at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:137)
>        ... 5 more
> Caused by: KrbException: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
>        at sun.security.krb5.EncryptionKey.findKey(EncryptionKey.java:481)
>        at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:260)
>        at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:134)
>        at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:79)
>        at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:724)
>        ... 8 more
>
> From what I found online, this happens because Java supports only AES-128 by default, and supporting AES-256 requires installing the unlimited-strength JCE policy files. After installing them, the NodeManager can connect to the ResourceManager, but the DataNode still cannot connect to the NameNode.
> Since the DataNode is started through jsvc, I don't know whether the Java setting fails to take effect when it is launched that way. Either way, it still complains that AES-256 is not supported.
>
> Any ideas?
> Thanks
> Emma

RE: Issues during setting up hadoop security cluster

Posted by Emma Lin <li...@vmware.com>.
After removing the upper-case characters from the hostnames, the problem disappeared. The NodeManager now connects to the ResourceManager successfully.
Thank you Vinod.

But now I have another issue connecting the DataNode to the NameNode. The NameNode log is as follows:
2012-01-20 18:17:02,127 WARN  ipc.Server (Server.java:saslReadAndProcess(1070)) - Auth failed for 10.112.127.14:60456:null
2012-01-20 18:17:02,128 INFO  ipc.Server (Server.java:doRead(572)) - IPC Server listener on 9000: readAndProcess threw exception javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)] from client 10.112.127.14. Count of bytes read: 0
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]
        at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:159)
        at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1054)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1232)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:567)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:366)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:341)
Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)
        at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:741)
        at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:323)
        at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:267)
        at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:137)
        ... 5 more
Caused by: KrbException: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
        at sun.security.krb5.EncryptionKey.findKey(EncryptionKey.java:481)
        at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:260)
        at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:134)
        at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:79)
        at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:724)
        ... 8 more

From what I found online, this happens because Java supports only AES-128 by default, and supporting AES-256 requires installing the unlimited-strength JCE policy files. After installing them, the NodeManager can connect to the ResourceManager, but the DataNode still cannot connect to the NameNode.
Since the DataNode is started through jsvc, I don't know whether the Java setting fails to take effect when it is launched that way. Either way, it still complains that AES-256 is not supported.

Any ideas?
Thanks
Emma


-----Original Message-----
From: Vinod Kumar Vavilapalli [mailto:vinodkv@hortonworks.com] 
Sent: January 20, 2012 13:23
To: common-user@hadoop.apache.org
Subject: Re: Issues during setting up hadoop security cluster

Hi,

Just this evening, I happened to run into someone who had the same
issue. After some debugging, I traced it to the hostnames having
upper-case characters. Somehow, when the DataNode or NodeManager tries
to get a service ticket for its corresponding service (NameNode and
ResourceManager respectively), the hostname was getting converted
to all lowercase. You can check whether you are in the same situation
by looking at the krb5kdc logs.

If that is the case, changing the hostnames everywhere to be all
lower-case may help. Please try that and let me know.
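
To illustrate the mismatch described above, here is a minimal sketch (not Hadoop's actual code; the class and method names are hypothetical). It takes a principal as registered in the keytab, lower-cases only the host part, and reports whether the result differs from what was registered. A differing name is what the KDC would fail to find.

```java
import java.util.Locale;

/** Hypothetical helper: flags service principals whose host part has upper-case letters. */
final class PrincipalCaseCheck {

    /** Returns the host part of "service/host@REALM", or null if there is no host part. */
    static String hostOf(String principal) {
        int slash = principal.indexOf('/');
        if (slash < 0) {
            return null; // e.g. a plain user principal such as "line@EXAMPLE.AURORA"
        }
        int at = principal.lastIndexOf('@');
        int end = (at > slash) ? at : principal.length();
        return principal.substring(slash + 1, end);
    }

    /** Lower-cases only the host part; the service name and realm are left untouched. */
    static String withLowerCaseHost(String principal) {
        String host = hostOf(principal);
        if (host == null) {
            return principal;
        }
        int hostStart = principal.indexOf('/') + 1;
        return principal.substring(0, hostStart)
                + host.toLowerCase(Locale.ROOT)
                + principal.substring(hostStart + host.length());
    }

    /** True when lower-casing the host would change the principal name. */
    static boolean hasUpperCaseHost(String principal) {
        String host = hostOf(principal);
        return host != null && !host.equals(host.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Principal exactly as it appears in the keytab listing from this thread.
        String registered = "rm/hadoopRM.example.aurora@EXAMPLE.AURORA";
        System.out.println("registered: " + registered);
        System.out.println("lower-case: " + withLowerCaseHost(registered));
        System.out.println("mismatch:   " + hasUpperCaseHost(registered));
    }
}
```

Running the same check over every principal in the keytab is a quick way to find hostnames that need renaming before looking through the krb5kdc logs.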

HTH,
+Vinod


RE: Issues during setting up hadoop security cluster

Posted by Emma Lin <li...@vmware.com>.
Vinod,
Thanks for the pointer. I'm trying it now and will let you know the result soon.
Thanks
Emma

-----Original Message-----
From: Vinod Kumar Vavilapalli [mailto:vinodkv@hortonworks.com] 
Sent: January 20, 2012 13:23
To: common-user@hadoop.apache.org
Subject: Re: Issues during setting up hadoop security cluster

Hi,

Just this evening, I happened to run into someone who had the same
issue. After some debugging, I narrowed it down to the hostnames having
upper-case characters. Somehow, when the DataNode or NodeManager tries to
get a service ticket for its corresponding service (NameNode and
ResourceManager respectively), the hostname was getting converted
to all lowercase. You can check whether you are in the same situation
by looking at the krb5kdc logs.

If that is the case, changing the hostnames everywhere to be all
lowercase may help. Please try that and let me know.

HTH,
+Vinod



Re: Issues during setting up hadoop security cluster

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
Hi,

Just this evening, I happened to run into someone who had the same
issue. After some debugging, I narrowed it down to the hostnames having
upper-case characters. Somehow, when the DataNode or NodeManager tries to
get a service ticket for its corresponding service (NameNode and
ResourceManager respectively), the hostname was getting converted
to all lowercase. You can check whether you are in the same situation
by looking at the krb5kdc logs.

If that is the case, changing the hostnames everywhere to be all
lowercase may help. Please try that and let me know.
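
As a rough illustration of the mismatch described above, here is a small
Python sketch. It is not Hadoop or Kerberos code: the helper names are
hypothetical, and it assumes that only the host part of the principal gets
lowercased on the client side. It lists principals whose lowercased form
would differ from the name registered in the KDC.

```python
def lowercase_host(principal):
    """Return the principal with only its host component lowercased.

    Expects the 'service/host@REALM' form; principals without a host
    part (for example 'user@REALM') are returned unchanged.
    """
    name, at, realm = principal.partition("@")
    service, slash, host = name.partition("/")
    if not slash:
        return principal
    return f"{service}/{host.lower()}{at}{realm}"


def find_case_mismatches(principals):
    """Return (registered, requested) pairs that differ after lowercasing."""
    return [(p, lowercase_host(p)) for p in principals if lowercase_host(p) != p]


if __name__ == "__main__":
    # Principal names taken from the log in the original post.
    registered_principals = [
        "rm/hadoopRM.example.aurora@EXAMPLE.AURORA",
        "nm/hadoopNM.example.aurora@EXAMPLE.AURORA",
    ]
    for registered, requested in find_case_mismatches(registered_principals):
        print(f"registered: {registered}")
        print(f"requested:  {requested}")
```

If the krb5kdc log shows requests for a name such as
rm/hadooprm.example.aurora@EXAMPLE.AURORA while the keytab lists
rm/hadoopRM.example.aurora@EXAMPLE.AURORA, that would be consistent with
the UNKNOWN_SERVER error in the stack trace.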

HTH,
+Vinod

