Posted to user@hbase.apache.org by Sergey Soldatov <se...@gmail.com> on 2023/03/20 00:44:36 UTC
Re: Kerberos HBase Master Halted

You need to check the logs for the status of the hbase:namespace table. If
it is stuck in transition, then the HBase master will not become active
until you solve the problem. You might need to use hbck2 to reassign the
hbase:namespace region.
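
As a rough illustration of the log check, here is a minimal Python sketch that looks for the namespace-assignment timeout in master log text. The message wording is copied from the log quoted below and may differ in other HBase versions; the names find_namespace_timeouts and NAMESPACE_TIMEOUT are made up for this example and are not part of HBase.

```python
import re

# Wording taken from the master log quoted in this thread (HBase 2.4.11).
# Assumption: other versions may phrase this message differently.
NAMESPACE_TIMEOUT = re.compile(
    r"Timedout (\d+)ms waiting for namespace table to be assigned and enabled: "
    r"tableName=([^,\s]+), state=(\w+)"
)


def find_namespace_timeouts(log_text):
    """Return (timeout_ms, table_name, table_state) for each timeout found."""
    # Long log lines are often wrapped when pasted, so collapse all
    # whitespace to single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (int(ms), table, state)
        for ms, table, state in NAMESPACE_TIMEOUT.findall(flat)
    ]
```

Pass it the contents of the master log file. A non-empty result means the master gave up waiting for hbase:namespace, so the next step is to find out where that region is stuck before reassigning it.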

Thanks,
Sergey

On Thu, Feb 9, 2023 at 6:58 AM Buğra Çakır <bc...@beartell.com> wrote:

> Hi all,
>
> We have an HBase 2.4.11 cluster configured with Kerberos authentication
> against a KDC. All regions are up, and the HBase master is also up at first.
> After some time the HBase master logs the following errors and halts itself.
> The ZooKeeper configuration for Kerberos is shown below.
>
> -------------------------------
> Zookeeper Configuration
> -------------------------------
> authProvider.1 = org.apache.zookeeper.server.auth.SASLAuthenticationProvider
> jaasLoginRenew=3600000
> kerberos.removeHostFromPrincipal = true
> kerberos.removeRealmFromPrincipal = true
>
> ------------------------
> hbase-env.sh
> ------------------------
>
> export HBASE_OPTS="$HBASE_OPTS -Djava.security.auth.login.config=/etc/hbase/conf/hbase-jaas.conf"
>
> ----------------------
> HBase-Master Log
> ----------------------
> WARN  [Thread-25] zookeeper.Login: TGT renewal thread has been interrupted and will exit.
> 2023-02-09 17:47:58,523 INFO  [ReadOnlyZKClient-bda01.beartell.com:2181,bda02.beartell.com:2181,bda03.beartell.com:2181@0x5df21317] zookeeper.ZooKeeper: Session: 0x100010fde9f000e closed
> 2023-02-09 17:51:58,595 ERROR [master/bda01:16000:becomeActiveMaster] master.HMaster: Failed to become active master
> java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>         at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
>         at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
>         at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1321)
>         at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1052)
>         at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2181)
>         at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:511)
>         at java.lang.Thread.run(Thread.java:750)
> Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
>         at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
>         at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
>         at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
>         at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1319)
>         ... 4 more
> 2023-02-09 17:51:58,598 ERROR [master/bda01:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []
> 2023-02-09 17:51:58,598 ERROR [master/bda01:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master bda01.beartell.com,16000,1675954008264: Unhandled exception. Starting shutdown. *****
>
> Br,
> Bugra