Posted to user@hbase.apache.org by Buğra Çakır <bc...@beartell.com> on 2023/02/09 14:58:06 UTC

Kerberos HBase Master Halted

Hi all,

We have an HBase 2.4.11 cluster configured with Kerberos authentication against a KDC.
At first, all regions and the HBase master are up. After some time, the HBase
master logs the errors below and halts itself.
The ZooKeeper configuration for Kerberos is shown below.

-------------------------------
ZooKeeper Configuration
-------------------------------
authProvider.1 = org.apache.zookeeper.server.auth.SASLAuthenticationProvider
jaasLoginRenew=3600000
kerberos.removeHostFromPrincipal = true
kerberos.removeRealmFromPrincipal = true

------------------------
hbase-env.sh
------------------------

export HBASE_OPTS="$HBASE_OPTS -Djava.security.auth.login.config=/etc/hbase/conf/hbase-jaas.conf"

----------------------
HBase-Master Log
----------------------
WARN  [Thread-25] zookeeper.Login: TGT renewal thread has been interrupted and will exit.
2023-02-09 17:47:58,523 INFO  [ReadOnlyZKClient-bda01.beartell.com:2181,bda02.beartell.com:2181,bda03.beartell.com:2181@0x5df21317] zookeeper.ZooKeeper: Session: 0x100010fde9f000e closed
2023-02-09 17:51:58,595 ERROR [master/bda01:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
	at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
	at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
	at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1321)
	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1052)
	at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2181)
	at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:511)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
	at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
	at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
	at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
	at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1319)
	... 4 more
2023-02-09 17:51:58,598 ERROR [master/bda01:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []
2023-02-09 17:51:58,598 ERROR [master/bda01:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master bda01.beartell.com,16000,1675954008264: Unhandled exception. Starting shutdown. *****

Br,
Bugra

Re: Kerberos HBase Master Halted

Posted by Sergey Soldatov <se...@gmail.com>.

You need to check the logs for the status of the hbase:namespace table. If
it is stuck in transition, then the HBase master will not become active
until you solve the problem. You might need to use HBCK2 to reassign
the hbase:namespace region.

Thanks,
Sergey
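
For readers following the advice above, here is a minimal Python sketch of the two steps: picking the relevant lines out of a master log, and building an HBCK2 "assigns" command line. It is an illustration under assumptions, not a tested procedure for this cluster. The function names, the marker strings, the jar path, and the encoded region name are all hypothetical placeholders. The real encoded name of the hbase:namespace region has to be looked up first (for example in hbase:meta), and on a Kerberized cluster a valid ticket is needed before running the command.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def find_log_hints(
    log_text: str,
    markers: Sequence[str] = ("hbase:namespace", "TGT renewal"),
) -> List[str]:
    """Return the log lines that contain any of the given marker strings.

    The default markers are taken from the log in this thread; adjust as needed.
    """
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]


def build_assigns_command(hbck2_jar: str, encoded_regions: List[str]) -> List[str]:
    """Build an `hbase hbck -j <jar> assigns <encoded region>...` command line."""
    if not encoded_regions:
        raise ValueError("at least one encoded region name is required")
    return ["hbase", "hbck", "-j", hbck2_jar, "assigns", *encoded_regions]


def run(cmd: List[str], dry_run: bool = True) -> Optional[subprocess.CompletedProcess]:
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if dry_run:
        return None
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Both values below are placeholders (assumptions), not taken from the thread.
    jar = "/path/to/hbase-hbck2.jar"
    namespace_region = "ENCODED_REGION_NAME"
    run(build_assigns_command(jar, [namespace_region]), dry_run=True)
```

The dry-run default only prints the command, so it can be reviewed before anything is sent to the master. HBCK2 talks to a running master, so the assign would have to be issued while the master is still waiting on the namespace table, before the timeout seen in the log.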
