You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/01/09 21:03:00 UTC

[jira] [Comment Edited] (HBASE-19694) The initialization order for a fresh cluster is incorrect

    [ https://issues.apache.org/jira/browse/HBASE-19694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319155#comment-16319155 ] 

stack edited comment on HBASE-19694 at 1/9/18 9:02 PM:
-------------------------------------------------------

.001 simplifies the server startup. Purges the latch thingy from HBASE-16367 and undoes our master initialization happening in a background thread; rather we inline the active master setup and won't call RegionServer run until after we've become active Master.

Tested on cluster and locally.


was (Author: stack):
.001 simplifies the server startup. Purges the latch thingy from HBASE-16367 and undoes our master initialization happening in a background thread; rather we inline the active master setup and won't call RegionServer run until after we've become active Master.

> The initialization order for a fresh cluster is incorrect
> ---------------------------------------------------------
>
>                 Key: HBASE-19694
>                 URL: https://issues.apache.org/jira/browse/HBASE-19694
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Duo Zhang
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-19694.branch-2.001.patch
>
>
> The cluster id will set once we become the active master in finishActiveMasterInitialization, but the blockUntilBecomingActiveMaster and finishActiveMasterInitialization are both called in a thread to make the constructor of HMaster return without blocking. And since HMaster itself is also a HRegionServer, it will create a Connection and then start calling reportForDuty. And when creating the ConnectionImplementation, we will read the cluster id from zk, but the cluster id may have not been set yet since it is set in another thread, we will get an exception and use the default cluster id instead.
> I always get this when running UTs which will start a mini cluster
> {noformat}
> 2018-01-03 15:16:37,916 WARN  [M:0;zhangduo-ubuntu:32848] client.ConnectionImplementation(528): Retrieve cluster id failed
> java.util.concurrent.ExecutionException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/hbaseid
> 	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> 	at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:526)
> 	at org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:286)
> 	at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.<init>(ConnectionUtils.java:141)
> 	at org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.<init>(ConnectionUtils.java:137)
> 	at org.apache.hadoop.hbase.client.ConnectionUtils.createShortCircuitConnection(ConnectionUtils.java:185)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.createClusterConnection(HRegionServer.java:781)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.setupClusterConnection(HRegionServer.java:812)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:827)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:938)
> 	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:550)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/hbaseid
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> 	at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:163)
> 	at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:311)
> 	... 1 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)