Posted to dev@hbase.apache.org by "Jerry He (JIRA)" <ji...@apache.org> on 2015/03/23 17:43:11 UTC

[jira] [Created] (HBASE-13317) Region server reportForDuty stuck looping if there is a master change

Jerry He created HBASE-13317:
--------------------------------

             Summary: Region server reportForDuty stuck looping if there is a master change
                 Key: HBASE-13317
                 URL: https://issues.apache.org/jira/browse/HBASE-13317
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.98.12, 1.0.0, 2.0.0
            Reporter: Jerry He
            Assignee: Jerry He
             Fix For: 2.0.0, 1.0.1, 0.98.13


During cluster startup, region server reportForDuty gets stuck looping if there is a master change.

{noformat}
2015-03-22 11:15:16,186 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty to master=bigaperf274,60000,1427045883965 with port=60020, startcode=1427048115174
2015-03-22 11:15:16,272 WARN  [regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1678)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
	at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8277)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2137)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:896)
	at java.lang.Thread.run(Thread.java:745)
2015-03-22 11:15:16,274 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2015-03-22 11:15:19,274 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
2015-03-22 11:15:19,275 WARN  [regionserver60020] regionserver.HRegionServer: error telling master we are up
com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1678)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
	at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8277)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2137)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:896)
	at java.lang.Thread.run(Thread.java:745)
2015-03-22 11:15:19,276 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2015-03-22 11:15:22,276 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
2015-03-22 11:15:22,296 DEBUG [regionserver60020] regionserver.HRegionServer: Master is not running yet
2015-03-22 11:15:22,296 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2015-03-22 11:15:25,296 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
2015-03-22 11:15:25,299 DEBUG [regionserver60020] regionserver.HRegionServer: Master is not running yet
2015-03-22 11:15:25,299 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
2015-03-22 11:15:28,299 INFO  [regionserver60020] regionserver.HRegionServer: reportForDuty to master=bigaperf273,60000,1427048108439 with port=60020, startcode=1427048115174
2015-03-22 11:15:28,302 DEBUG [regionserver60020] regionserver.HRegionServer: Master is not running yet
2015-03-22 11:15:28,302 WARN  [regionserver60020] regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
{noformat}

What happened is that the region server first got master=bigaperf274,60000,1427045883965.  Before it was able to report successfully, the master changed to bigaperf273,60000,1427048108439.
We were supposed to open a new connection to the new master, but we never did. The region server kept looping, trying the old address forever.
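
For illustration only, here is a minimal sketch of the intended behavior: on every retry, compare the currently active master with the one the cached connection points at, and reconnect if they differ. This is a toy model, not the HRegionServer code and not the actual patch. The class name ReportForDutySketch, its fields, and the Supplier/Predicate stand-ins for the master lookup and the regionServerStartup RPC are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Toy model of a report-for-duty retry loop. Hypothetical; not HBase code. */
class ReportForDutySketch {
    // Hypothetical stand-in for looking up the current active master.
    private final Supplier<String> activeMaster;
    // Hypothetical stand-in for the regionServerStartup RPC; true means success.
    private final Predicate<String> startupCall;
    // The master that the cached connection currently points at.
    private String connectedTo;
    // Record of which master each attempt was actually sent to.
    final List<String> attempts = new ArrayList<>();

    ReportForDutySketch(Supplier<String> activeMaster, Predicate<String> startupCall) {
        this.activeMaster = activeMaster;
        this.startupCall = startupCall;
    }

    /** Retries up to maxAttempts times; returns true once the report succeeds. */
    boolean reportForDuty(int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            String current = activeMaster.get();
            if (!Objects.equals(current, connectedTo)) {
                // The master changed (or this is the first attempt):
                // drop the stale connection and point at the new master.
                connectedTo = current;
            }
            attempts.add(connectedTo);
            if (startupCall.test(connectedTo)) {
                return true;
            }
            // A real loop would sleep here before retrying.
        }
        return false;
    }
}
```

In this model the bug described above corresponds to skipping the comparison and reusing connectedTo after the first attempt, so every retry goes to the old master even though the lookup already returns the new one.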


