You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by Gaojinchao <ga...@huawei.com> on 2011/09/06 13:56:56 UTC

A issue about region server startup.

In my cluster , I found a issue that root was assigned after 30 minutes when the cluster start up, because region server startup slowly.

Region server start up :

Step 1: report to master it has started up.

Step 2: init the service



Hmaster start up:

Step 1:Wait for the region server.

Step 2:Assign the root and meta.



Because the preceding section, if the region server initializes service slowly. Master may throw exception "Server is not running yet".

The root can't be assigned and need wait for 30 minutes.



Is there any good way to modify this bug? Who can give me some suggestions?



The logs:

2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052 so generated a random one; hri=-ROOT-,,0.70236052, src=, dest=linux-kqm6,20020,1314316432465; 1 (online=1, exclude=null) available servers

2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region -ROOT-,,0.70236052 to linux-kqm6,20020,1314316432465

2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.ServerManager: New connection to linux-kqm6,20020,1314316432465

2011-08-26 07:53:33,251 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet

         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)



         at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)

         at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)

         at $Proxy6.getProtocolVersion(Unknown Source)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)

         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969)

         at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:620)

         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:555)

         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1043)

         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:858)

         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:838)

         at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1304)

         at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431)

         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)

         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)

2011-08-26 07:53:33,254 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to serverName=linux-kqm6,20020,1314316432465, load=(requests=0, regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0

org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet

         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)



         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

         at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)

         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:977)

         at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:620)

         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:555)

         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1043)

         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:858)

         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:838)

         at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManager.java:1304)

         at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431)

         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)

         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)

2011-08-26 07:53:33,254 WARN org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable location to assign region -ROOT-,,0.70236052

2011-08-26 07:55:41,226 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=linux-kqm5,20020,1314316540832, regionCount=0, userLoad=false

2011-08-26 07:56:13,037 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=linux-kqm4,20020,1314316645788, regionCount=0, userLoad=false

2011-08-26 08:23:36,528 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  -ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254

2011-08-26 08:23:36,529 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning -ROOT-,,0.70236052 to a random server

2011-08-26 08:23:36,529 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=-ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254



Re: A issue about region server startup.

Posted by Gaojinchao <ga...@huawei.com>.
I know it. It is not that case.

Look this logs:
Region server��
2011-08-26 07:53:52,743 DEBUG org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue: Create PriorityCompactionQueue
2011-08-26 07:53:52,743 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Runs every 10000000ms
2011-08-26 07:53:52,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at linux-kqm5:20000
2011-08-26 07:53:52,902 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at linux-kqm5:20000

//told master that we are up
2011-08-26 07:53:52,931 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at linux-kqm5:20000 that we are up
2011-08-26 07:53:52,949 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us address to use. Was=linux-kqm6:20020, Now=linux-kqm6:20020

//init the service 
2011-08-26 07:53:52,949 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://linux-kqm4:9000
2011-08-26 07:53:52,949 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://linux-kqm4:9000/hbase
2011-08-26 07:53:52,989 INFO com.huawei.isap.ump.ha.client.ProxyRPCConfigurationUtil: Following server values are configured for the property:namenode.servers values:[hdfs://linux-kqm4:9000, hdfs://linux-kqm5:9000]
2011-08-26 07:53:53,061 INFO com.huawei.isap.ump.rpc.AbstractProxyFactory: DFSClient is in HA mode .Server Details:[linux-kqm4/192.200.10.11:9000, linux-kqm5/192.200.10.12:9000]
2011-08-26 07:53:53,069 WARN com.huawei.isap.ump.hdfs.nodestatus.NamenodeStatusListenerProvider: Unable to create NameNodeStatusListener with class name:[null] configured.
2011-08-26 07:53:53,077 INFO org.apache.hadoop.ipc.Client: The value of 'ipc.client.connect.max.socket.retries' is : 5
2011-08-26 07:53:53,077 INFO org.apache.hadoop.ipc.Client: The value of 'max.ping.retries.on.socket.timeout' is :1
2011-08-26 07:53:54,106 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 0 time(s).
2011-08-26 07:53:55,106 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 1 time(s).
2011-08-26 07:53:56,107 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 2 time(s).
2011-08-26 07:53:57,108 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 3 time(s).
2011-08-26 07:53:58,108 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 4 time(s).
2011-08-26 07:53:59,109 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 5 time(s).
2011-08-26 07:54:00,110 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 6 time(s).
2011-08-26 07:54:01,111 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 7 time(s).
2011-08-26 07:54:02,111 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 8 time(s).
2011-08-26 07:54:03,112 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 9 time(s).
2011-08-26 07:54:03,113 ERROR com.huawei.isap.ump.ha.client.HARPCConnector: Exception while getting the proxy for server: linux-kqm4/192.200.10.11:9000. Thread Name: regionserver20020, Reason: java.net.ConnectException: Connection refused
2011-08-26 07:54:03,132 INFO com.huawei.isap.ump.ha.client.HARPCConnector: New connection established with the configured server : 
ActiveServerConnectionInfo:
Metadata:linux-kqm5/192.200.10.12:9000
Version:1678831321169382011-08-26 07:54:03,133 INFO org.apache.hadoop.hdfs.DFSClient: DFSClient has connected to the active namenode: linux-kqm5/192.200.10.12:9000
2011-08-26 07:54:03,137 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: logdir=hdfs://linux-kqm4:9000/hbase/.logs/linux-kqm6,20020,1314316432465
2011-08-26 07:54:03,329 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: HLog configuration: blocksize=256 MB, rollsize=243.2 MB, enabled=true, optionallogflushinternal=1000ms
2011-08-26 07:54:03,388 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs �C HDFS-200
2011-08-26 07:54:03,409 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: New hlog /hbase/.logs/linux-kqm6,20020,1314316432465/linux-kqm6%3A20020.1314316443329
2011-08-26 07:54:03,409 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Using getNumCurrentReplicas--HDFS-826
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: revision
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsUser
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsDate
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsUrl
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: date
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsRevision
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: user
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsVersion
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: url
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: version
2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2011-08-26 07:54:03,419 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2011-08-26 07:54:03,419 INFO org.apache.hadoop.hbase.regionserver.metrics.RegionServerMetrics: Initialized
2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_REGION-linux-kqm6,20020,1314316432465, corePoolSize=32, maxPoolSize=32
2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_ROOT-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1
2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_META-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1
2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_REGION-linux-kqm6,20020,1314316432465, corePoolSize=3, maxPoolSize=3
2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_ROOT-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1
2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_META-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1
2011-08-26 07:54:03,505 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-08-26 07:54:03,563 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 20030
2011-08-26 07:54:03,564 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 20030 webServer.getConnectors()[0].getLocalPort() returned 20030
2011-08-26 07:54:03,564 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 20030
2011-08-26 07:54:03,564 INFO org.mortbay.log: jetty-6.1.26
2011-08-26 07:54:03,856 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:20030

//init rpc finished and started to provide services 
2011-08-26 07:54:03,858 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 20020: starting
2011-08-26 07:54:03,858 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: starting


-----�ʼ�ԭ��-----
������: Ramkrishna S Vasudevan [mailto:ramakrishnas@huawei.com] 
����ʱ��: 2011��9��6�� 21:08
�ռ���: dev@hbase.apache.org
����: RE: A issue about region server startup.

Hi Gao

That is because of the timeout.period that is set to default value of 30
mins.

But why was the initialization of the service slow ? May be any exception
while trying to open the root region?

If that is the case you see HBASE-4287 is one solution.
Without HBASE-4287 surely it will take 30 mins.
Regards
Ram

-----Original Message-----
From: Gaojinchao [mailto:gaojinchao@huawei.com] 
Sent: Tuesday, September 06, 2011 5:27 PM
To: dev@hbase.apache.org
Subject: A issue about region server startup.

In my cluster , I found a issue that root was assigned after 30 minutes when
the cluster start up, because region server startup slowly.

Region server start up :

Step 1: report to master it has started up.

Step 2: init the service



Hmaster start up:

Step 1:Wait for the region server.

Step 2:Assign the root and meta.



Because the preceding section, if the region server initializes service
slowly. Master may throw exception "Server is not running yet".

The root can't be assigned and need wait for 30 minutes.



Is there any good way to modify this bug? Who can give me some suggestions?



The logs:

2011-08-26 07:53:26,065 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: No previous transition
plan was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052
so generated a random one; hri=-ROOT-,,0.70236052, src=,
dest=linux-kqm6,20020,1314316432465; 1 (online=1, exclude=null) available
servers

2011-08-26 07:53:26,065 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
-ROOT-,,0.70236052 to linux-kqm6,20020,1314316432465

2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.ServerManager:
New connection to linux-kqm6,20020,1314316432465

2011-08-26 07:53:33,251 WARN
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
RemoteException connecting to RS

org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running
yet

         at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)



         at
org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)

         at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)

         at $Proxy6.getProtocolVersion(Unknown Source)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)

         at
org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)

         at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.
getHRegionConnection(HConnectionManager.java:969)

         at
org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManag
er.java:620)

         at
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.ja
va:555)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:1043)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:858)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:838)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManage
r.java:1304)

         at
org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431)

         at
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388
)

         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)

2011-08-26 07:53:33,254 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of
-ROOT-,,0.70236052 to serverName=linux-kqm6,20020,1314316432465,
load=(requests=0, regions=0, usedHeap=0, maxHeap=0), trying to assign
elsewhere instead; retry=0

org.apache.hadoop.hbase.ipc.ServerNotRunningException:
org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running
yet

         at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)



         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)

         at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAcces
sorImpl.java:39)

         at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstruc
torAccessorImpl.java:27)

         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

         at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteE
xceptionHandler.java:96)

         at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.
getHRegionConnection(HConnectionManager.java:977)

         at
org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManag
er.java:620)

         at
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.ja
va:555)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:1043)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:858)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:838)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManage
r.java:1304)

         at
org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431)

         at
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388
)

         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)

2011-08-26 07:53:33,254 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable
location to assign region -ROOT-,,0.70236052

2011-08-26 07:55:41,226 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=linux-kqm5,20020,1314316540832, regionCount=0,
userLoad=false

2011-08-26 07:56:13,037 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=linux-kqm4,20020,1314316645788, regionCount=0,
userLoad=false

2011-08-26 08:23:36,528 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition
timed out:  -ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254

2011-08-26 08:23:36,529 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE
for too long, reassigning -ROOT-,,0.70236052 to a random server

2011-08-26 08:23:36,529 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
was=-ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254




RE: A issue about region server startup.

Posted by Ramkrishna S Vasudevan <ra...@huawei.com>.
Hi Gao

That is because of the timeout.period that is set to default value of 30
mins.

But why was the initialization of the service slow ? May be any exception
while trying to open the root region?

If that is the case you see HBASE-4287 is one solution.
Without HBASE-4287 surely it will take 30 mins.
Regards
Ram

-----Original Message-----
From: Gaojinchao [mailto:gaojinchao@huawei.com] 
Sent: Tuesday, September 06, 2011 5:27 PM
To: dev@hbase.apache.org
Subject: A issue about region server startup.

In my cluster , I found a issue that root was assigned after 30 minutes when
the cluster start up, because region server startup slowly.

Region server start up :

Step 1: report to master it has started up.

Step 2: init the service



Hmaster start up:

Step 1:Wait for the region server.

Step 2:Assign the root and meta.



Because the preceding section, if the region server initializes service
slowly. Master may throw exception "Server is not running yet".

The root can't be assigned and need wait for 30 minutes.



Is there any good way to modify this bug? Who can give me some suggestions?



The logs:

2011-08-26 07:53:26,065 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: No previous transition
plan was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052
so generated a random one; hri=-ROOT-,,0.70236052, src=,
dest=linux-kqm6,20020,1314316432465; 1 (online=1, exclude=null) available
servers

2011-08-26 07:53:26,065 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
-ROOT-,,0.70236052 to linux-kqm6,20020,1314316432465

2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.ServerManager:
New connection to linux-kqm6,20020,1314316432465

2011-08-26 07:53:33,251 WARN
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
RemoteException connecting to RS

org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running
yet

         at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)



         at
org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)

         at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)

         at $Proxy6.getProtocolVersion(Unknown Source)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)

         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)

         at
org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)

         at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.
getHRegionConnection(HConnectionManager.java:969)

         at
org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManag
er.java:620)

         at
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.ja
va:555)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:1043)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:858)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:838)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManage
r.java:1304)

         at
org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431)

         at
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388
)

         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)

2011-08-26 07:53:33,254 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of
-ROOT-,,0.70236052 to serverName=linux-kqm6,20020,1314316432465,
load=(requests=0, regions=0, usedHeap=0, maxHeap=0), trying to assign
elsewhere instead; retry=0

org.apache.hadoop.hbase.ipc.ServerNotRunningException:
org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running
yet

         at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)



         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)

         at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAcces
sorImpl.java:39)

         at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstruc
torAccessorImpl.java:27)

         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

         at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteE
xceptionHandler.java:96)

         at
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.
getHRegionConnection(HConnectionManager.java:977)

         at
org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManag
er.java:620)

         at
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.ja
va:555)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:1043)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:858)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja
va:838)

         at
org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManage
r.java:1304)

         at
org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431)

         at
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388
)

         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)

2011-08-26 07:53:33,254 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable
location to assign region -ROOT-,,0.70236052

2011-08-26 07:55:41,226 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=linux-kqm5,20020,1314316540832, regionCount=0,
userLoad=false

2011-08-26 07:56:13,037 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=linux-kqm4,20020,1314316645788, regionCount=0,
userLoad=false

2011-08-26 08:23:36,528 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition
timed out:  -ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254

2011-08-26 08:23:36,529 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE
for too long, reassigning -ROOT-,,0.70236052 to a random server

2011-08-26 08:23:36,529 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
was=-ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254