Posted to user@ambari.apache.org by Loïc Chanel <lo...@telecomnancy.net> on 2015/09/01 15:06:31 UTC

Re: Problem with HBase + Kerberos

As someone advised, I ran kinit with the keytabs on all the hosts where
HBase components are running, then restarted both HDFS and HBase, because
the first HBase restart failed with an error related to the hdfs keytab.

After that, the errors go away for a while, but I would bet my paycheck
that the same errors will be back tomorrow.
So this is most likely related to Kerberos ticket expiration, but why
doesn't HBase try to renew the ticket that seems to have expired?
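
To check the expiration hypothesis, the lifetime and renewability of the
ticket can be read from klist output. Here is a minimal Python sketch, not a
definitive tool: it assumes the MM/DD/YY date format of the klist output
quoted further down in this message, and parse_tgt and summarize are
hypothetical helper names. It also assumes that a "renew until" time equal to
the start time means the ticket cannot be renewed.

```python
from datetime import datetime

# klist output copied from the quoted message further down in this thread.
KLIST_OUTPUT = """\
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hbase/vm-regionserver@REALM.WL

Valid starting     Expires            Service principal
09/01/15 10:02:18  09/02/15 10:02:18  krbtgt/REALM.WL@REALM.WL
        renew until 09/01/15 10:02:18
"""

# Assumption: klist prints US-style dates (MM/DD/YY), as in the output above.
FMT = "%m/%d/%y %H:%M:%S"


def parse_tgt(klist_text):
    """Return (start, expires, renew_until) for the first krbtgt entry, or None."""
    lines = klist_text.splitlines()
    for i, line in enumerate(lines):
        parts = line.split()
        # A ticket line has two date/time pairs followed by the service principal.
        if len(parts) == 5 and parts[4].startswith("krbtgt/"):
            start = datetime.strptime(" ".join(parts[0:2]), FMT)
            expires = datetime.strptime(" ".join(parts[2:4]), FMT)
            renew_until = None
            if i + 1 < len(lines):
                nxt = lines[i + 1].strip()
                if nxt.startswith("renew until"):
                    stamp = nxt[len("renew until"):].strip()
                    renew_until = datetime.strptime(stamp, FMT)
            return start, expires, renew_until
    return None


def summarize(klist_text):
    """Return (lifetime in hours, renewable flag) for the first krbtgt entry."""
    start, expires, renew_until = parse_tgt(klist_text)
    lifetime_hours = (expires - start).total_seconds() / 3600
    # Assumption: "renew until" no later than the start time means not renewable.
    renewable = renew_until is not None and renew_until > start
    return lifetime_hours, renewable


if __name__ == "__main__":
    hours, renewable = summarize(KLIST_OUTPUT)
    print(f"ticket lifetime: {hours:.0f} h, renewable: {renewable}")
```

On the quoted output this reports a 24-hour lifetime and a non-renewable
ticket, which would be consistent with the errors coming back about a day
after each kinit. It only inspects the ticket cache of the user who ran
kinit, which may not be what the HBase process itself uses.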

As this is closely tied to Ambari's deployment of Kerberos, I have added the
corresponding mailing list to the discussion, hoping that someone has a
clear idea of how to solve this problem.
Thanks in advance for your help,


Loïc

Loïc CHANEL
Engineering student at TELECOM Nancy
Trainee at Worldline - Villeurbanne

2015-09-01 10:15 GMT+02:00 Loïc Chanel <lo...@telecomnancy.net>:

> But how could the credentials be invalid, given that they were created and
> managed only by Ambari?
> Also, I tried to authenticate manually with the keytab, and it works:
>
> kinit -k -t /etc/security/keytabs/hbase.service.keytab hbase/vm-regionserver@REALM.WL
> [root@vm-regionserver /]# klist
> Ticket cache: FILE:/tmp/krb5cc_0
> Default principal: hbase/vm-regionserver@REALM.WL
>
> Valid starting     Expires            Service principal
> 09/01/15 10:02:18  09/02/15 10:02:18  krbtgt/REALM.WL@REALM.WL
>         renew until 09/01/15 10:02:18
>
> But I still have the errors in HBase RegionServer logs :
>
> 2015-09-01 10:04:41,616 DEBUG [regionserver60020]
> security.HBaseSaslRpcClient: Creating SASL GSSAPI client. Server's Kerberos
> principal name is hbase/vm-master@REALM.WL
> 2015-09-01 10:04:41,617 WARN  [regionserver60020] ipc.RpcClient: Couldn't
> setup connection for hbase/vm-regionserver@WESTEROS.WL to
> hbase/vm-master@REALM.WL
> 2015-09-01 10:04:41,618 WARN  [regionserver60020]
> regionserver.HRegionServer: error telling master we are up
> com.google.protobuf.ServiceException: java.io.IOException: Couldn't setup
> connection for hbase/vm-regionserver@REALM.WL to hbase/vm-master@REALM.WL
>         at
> org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1739)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1777)
>         at
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:5402)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2114)
>         at
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:877)
>         at java.lang.Thread.run(Unknown Source)
> Caused by: java.io.IOException: Couldn't setup connection for
> hbase/vm-regionserver@REALM.WL to hbase/vm-master@REALM.WL
>
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection$1.run(RpcClient.java:869)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Unknown Source)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.handleSaslConnectionFailure(RpcClient.java:841)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:951)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.writeRequest(RpcClient.java:1094)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.tracedWriteRequest(RpcClient.java:1061)
>         at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1516)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1724)
>         ... 5 more
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused
> by GSSException: No valid credentials provided (Mechanism level: Failed to
> find any Kerberos tgt)]
>         at
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(Unknown
> Source)
>         at
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:177)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupSaslConnection(RpcClient.java:815)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.access$800(RpcClient.java:349)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:943)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:940)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Unknown Source)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:940)
>         ... 9 more
> Caused by: GSSException: No valid credentials provided (Mechanism level:
> Failed to find any Kerberos tgt)
>         at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Unknown
> Source)
>         at
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Unknown Source)
>         at
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Unknown Source)
>         at sun.security.jgss.GSSManagerImpl.getMechanismContext(Unknown
> Source)
>         at sun.security.jgss.GSSContextImpl.initSecContext(Unknown Source)
>         at sun.security.jgss.GSSContextImpl.initSecContext(Unknown Source)
>         ... 19 more
> 2015-09-01 10:04:41,619 WARN  [regionserver60020]
> regionserver.HRegionServer: reportForDuty failed; sleeping and then
> retrying.
>
> So I don't see what I could check or change to make these errors disappear.
> Is there something I'm missing?
>
> Thanks,
>
>
> Loïc
>
>
> Loïc CHANEL
> Engineering student at TELECOM Nancy
> Trainee at Worldline - Villeurbanne
>
> 2015-08-31 19:20 GMT+02:00 Ted Yu <yu...@gmail.com>:
>
>> Hi,
>> The keytab you used seems to be a headless keytab.
>> Here is the sample output from klist when keytab for hbase services is
>> used:
>>
>> klist
>> Ticket cache: FILE:/tmp/krb5cc_1002
>> Default principal: hbase/xxx.novalocal@EXAMPLE.COM
>>
>> Valid starting    Expires           Service principal
>> 31/08/2015 17:19  01/09/2015 17:19  krbtgt/EXAMPLE.COM@EXAMPLE.COM
>> renew until 31/08/2015 17:19
>>
>> FYI
>>
>> On Fri, Aug 21, 2015 at 12:44 AM, Loïc Chanel <
>> loic.chanel@telecomnancy.net>
>> wrote:
>>
>> > Sorry if I didn't mention that, but yeah, I ran kinit before invoking
>> > hbase shell, and the klist command says that my user has a ticket.
>> > [root@host /]# klist
>> > Ticket cache: FILE:/tmp/krb5cc_0
>> > Default principal: testuser@REALM
>> >
>> > Valid starting     Expires            Service principal
>> > 08/21/15 09:39:33  08/22/15 09:39:33  krbtgt/REALM@REALM
>> >         renew until 08/21/15 09:39:33
>> >
>> >
>> > Loïc CHANEL
>> > Engineering student at TELECOM Nancy
>> > Trainee at Worldline - Villeurbanne
>> >
>> > 2015-08-21 6:12 GMT+02:00 anil gupta <an...@gmail.com>:
>> >
>> > > Did you run the kinit command before invoking "hbase shell"? What does
>> > > the klist command say?
>> > >
>> > > On Thu, Aug 20, 2015 at 6:47 AM, Loïc Chanel <
>> > loic.chanel@telecomnancy.net
>> > > >
>> > > wrote:
>> > >
>> > > > By the way, as this may help to find my issue, I just tried typing
>> > > > whoami in the HBase shell; it returned exactly what it should:
>> > > > testuser@REALM (auth:KERBEROS)
>> > > >     groups: nobody, toast
>> > > >
>> > > > Loïc CHANEL
>> > > > Engineering student at TELECOM Nancy
>> > > > Trainee at Worldline - Villeurbanne
>> > > >
>> > > > 2015-08-20 15:17 GMT+02:00 Loïc Chanel <
>> loic.chanel@telecomnancy.net>:
>> > > >
>> > > > > Nothing more with your option :/
>> > > > >
>> > > > > Loïc CHANEL
>> > > > > Engineering student at TELECOM Nancy
>> > > > > Trainee at Worldline - Villeurbanne
>> > > > >
>> > > > > 2015-08-20 15:04 GMT+02:00 Loïc Chanel <
>> loic.chanel@telecomnancy.net
>> > >:
>> > > > >
>> > > > >> I'm using HDP 2.2.4.2, with HBase 0.98.4.2.2.
>> > > > >> I have unlimited strength JCE installed.
>> > > > >>
>> > > > >> I'll try to have more clues with this option.
>> > > > >>
>> > > > >> Loïc CHANEL
>> > > > >> Engineering student at TELECOM Nancy
>> > > > >> Trainee at Worldline - Villeurbanne
>> > > > >>
>> > > > >> 2015-08-20 14:58 GMT+02:00 Ted Yu <yu...@gmail.com>:
>> > > > >>
>> > > > >>> Which hbase / hadoop release are you using ?
>> > > > >>>
>> > > > >>> Running with -Dsun.security.krb5.debug=true will provide more clues.
>> > > > >>>
>> > > > >>> Do you have unlimited strength JCE installed ?
>> > > > >>>
>> > > > >>> Cheers
>> > > > >>>
>> > > > >>> On Thu, Aug 20, 2015 at 5:46 AM, Loïc Chanel <
>> > > > >>> loic.chanel@telecomnancy.net>
>> > > > >>> wrote:
>> > > > >>>
>> > > > >>> > Hi all,
>> > > > >>> >
>> > > > >>> > Since I kerberized my cluster, it seems like I can't use HBase
>> > > > >>> > anymore...
>> > > > >>> > For example, executing create 'toto','titi' in the HBase shell
>> > > > >>> > results in this line being printed endlessly:
>> > > > >>> > WARN  [main] security.UserGroupInformation: Not attempting to
>> > > > >>> > re-login since the last re-login was attempted less than 600
>> > > > >>> > seconds before.
>> > > > >>> >
>> > > > >>> > And nothing else happens.
>> > > > >>> > I tried restarting HDFS and HBase, and regenerating credentials
>> > > > >>> > and keytabs, but nothing changed.
>> > > > >>> > As for the logs, they are not very explicit, as the only thing
>> > > > >>> > they say (and keep saying) is:
>> > > > >>> >
>> > > > >>> > 2015-08-20 13:50:12,697 DEBUG [RpcServer.reader=2,port=60000]
>> > > > >>> > ipc.RpcServer: Created SASL server with mechanism = GSSAPI
>> > > > >>> > 2015-08-20 13:50:12,698 DEBUG [RpcServer.reader=2,port=60000]
>> > > > >>> > ipc.RpcServer: Have read input token of size 650 for processing
>> > > > >>> > by saslServer.evaluateResponse()
>> > > > >>> > 2015-08-20 13:50:12,704 DEBUG [RpcServer.reader=2,port=60000]
>> > > > >>> > ipc.RpcServer: Will send token of size 108 from saslServer.
>> > > > >>> > 2015-08-20 13:50:12,706 DEBUG [RpcServer.reader=2,port=60000]
>> > > > >>> > ipc.RpcServer: Have read input token of size 0 for processing
>> > > > >>> > by saslServer.evaluateResponse()
>> > > > >>> > 2015-08-20 13:50:12,707 DEBUG [RpcServer.reader=2,port=60000]
>> > > > >>> > ipc.RpcServer: Will send token of size 32 from saslServer.
>> > > > >>> > 2015-08-20 13:50:12,708 DEBUG [RpcServer.reader=2,port=60000]
>> > > > >>> > ipc.RpcServer: RpcServer.listener,port=60000: DISCONNECTING
>> > > > >>> > client 192.168.6.148:43014 because read count=-1. Number of
>> > > > >>> > active connections: 3
>> > > > >>> >
>> > > > >>> > Does anyone have an idea about where this might come from, or how
>> > > > >>> > to solve it? I couldn't find much documentation about this.
>> > > > >>> > Thanks in advance for your help !
>> > > > >>> >
>> > > > >>> >
>> > > > >>> > Loïc
>> > > > >>> >
>> > > > >>> > Loïc CHANEL
>> > > > >>> > Engineering student at TELECOM Nancy
>> > > > >>> > Trainee at Worldline - Villeurbanne
>> > > > >>> >
>> > > > >>>
>> > > > >>
>> > > > >>
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Thanks & Regards,
>> > > Anil Gupta
>> > >
>> >
>>
>
>