Posted to issues@hbase.apache.org by "Rushabh Shah (Jira)" <ji...@apache.org> on 2020/05/10 02:09:00 UTC

[jira] [Commented] (HBASE-24243) Unable to start HRegionserver and Master node considers as a dead region

    [ https://issues.apache.org/jira/browse/HBASE-24243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103591#comment-17103591 ] 

Rushabh Shah commented on HBASE-24243:
--------------------------------------

[~dinesh4747] I don't see any exception message in either the master or regionserver logs that you added to the description. Without additional information (such as thread dumps of both services), it will be difficult to help. Thank you!

> Unable to start HRegionserver and Master node considers as a dead region
> ------------------------------------------------------------------------
>
>                 Key: HBASE-24243
>                 URL: https://issues.apache.org/jira/browse/HBASE-24243
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: regionserver
>            Reporter: Dinesh Nithyanandam
>            Priority: Blocker
>         Attachments: site.xml
>
>
> Hi Team,
> I am currently using Apache HBase version 1.3.6. I am trying to run the Master and region server separately and then have the region server join the cluster dynamically, but the region server does not start and hangs at "*The RegionServer is initializing*!"
> Commands used are below (Master and region server are on separate nodes):
> Node A - HBase Master - */opt/hbase/bin/hbase-daemon.sh --config /usr/local/bin/hbase/conf start master*
> Node B - HBase Region Server - */opt/hbase/bin/hbase-daemon.sh --config /usr/local/bin/hbase/conf start regionserver*
> *{color:#ff0000}Please advise if the above commands are the right way to start the HBase master and region server{color}*
> Environment - *Google Compute Engine (GCE) Instance groups/VM's*
> OS Type - *CentOS 7*
> Master running ports *- 16000/tcp 16010/web*
> Region server running ports *- 16020/tcp* *16030/web*
> I am also not sure how to enable reverse DNS across both machines, or whether that is the problem. Please advise on how I can achieve it.
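
On the reverse DNS question, one way to see whether forward and reverse lookups agree on each node is sketched below. It is a minimal sketch, assuming Python 3 is available on both machines; the function name `check_reverse_dns` and its injectable `resolve`/`reverse` parameters are hypothetical and exist only for illustration. It does not establish that DNS is the cause of the hang, only whether name resolution is consistent.

```python
import socket


def check_reverse_dns(hostname,
                      resolve=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, reverse-resolve that IP, and compare.

    Returns (ip, reverse_name, matches). `matches` is True when the
    reverse lookup (primary name or any alias) equals the original
    hostname, ignoring case and a trailing dot.
    The resolve/reverse parameters are hypothetical hooks so the check
    can be exercised without touching a real resolver.
    """
    def normalize(name):
        return name.rstrip(".").lower()

    ip = resolve(hostname)
    # gethostbyaddr returns (primary_name, alias_list, ip_list)
    primary, aliases, _ = reverse(ip)
    candidates = {normalize(primary)} | {normalize(a) for a in aliases}
    return ip, primary, normalize(hostname) in candidates
```

Run on Node A, it could be called with the region server's fully qualified name from the logs (for example `check_reverse_dns("pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal")`), and on Node B with the master's name. A `False` in the third field, or a `socket.gaierror`/`socket.herror` exception, would suggest looking at the DNS or `/etc/hosts` setup before anything else.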
> *Master logs:*
> The master logs below show the master trying to connect to the region server and then eventually disconnecting the region server client:
>  * *{color:#ff0000}"DEBUG [RpcServer.reader=1,bindAddress=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: DISCONNECTING client 10.148.6.13:45732 because read count=-1. Number of active connections: 1"{color}*
> *Complete logs:*
> 2020-04-22 19:38:24,812 DEBUG [RpcServer.listener,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: connection from 10.148.6.13:45732; # active connections: 1
>  2020-04-22 19:38:24,961 DEBUG [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16000] ipc.RpcServer: RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=16000: callId: 0 service: RegionServerStatusService methodName: RegionServerStartup size: 47 connection: 10.148.6.13:45732
>  2020-04-22 19:38:30,591 DEBUG [*pinpoint-master-v000-rh5k:16000*.activeMasterManager] ipc.RpcClientImpl: Connecting to *pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020*
>  2020-04-22 19:38:31,268 *DEBUG [hconnection-0x5f02b9cb-shared--pool3-t1] ipc.RpcClientImpl: Connecting to pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020*
>  2020-04-22 19:38:31,478 DEBUG [ProcedureExecutor-3] ipc.RpcClientImpl: Connecting to pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020
>  2020-04-22 19:39:32,714 *DEBUG [RpcServer.reader=1,bindAddress=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: DISCONNECTING client 10.148.6.13:45732 because read count=-1. Number of active connections: 1*
>  
> *Region server logs:*
> The region server logs below show that the region server discovers the master on its own but is unable to join the cluster:
> ===============================================================
>  
> *{color:#ff0000}2020-04-22 19:38:24,675 INFO [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020] regionserver.HRegionServer: reportForDuty to master=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,16000{color}*,1587584303253 with port=16020, startcode=1587583634667
>  2020-04-22 19:38:24,801 DEBUG [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020] ipc.RpcClientImpl: Connecting to pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal/10.148.6.154:16000
>  2020-04-22 19:38:28,005 INFO [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020] regionserver.HRegionServer: reportForDuty to master=pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal,16000,1587584303253 with port=16020, startcode=1587583634667
>  2020-04-22 19:38:28,033 INFO [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020] regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://10.148.6.68:9000/hbase
>  2020-04-22 19:38:28,033 INFO [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020] regionserver.HRegionServer: Config from master: fs.defaultFS=hdfs://10.148.6.68:9000
>  2020-04-22 19:38:28,033 INFO [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020] regionserver.HRegionServer: Config from master: hbase.master.info.port=16010
> ===============================================================
>  
> 2020-04-22 19:38:24,801 DEBUG [regionserver/pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal/10.148.6.13:16020] ipc.RpcClientImpl: Connecting to pinpoint-master-v000-rh5k.c.gcp-ushi-telemetry-npe.internal/10.148.6.154:16000
>  2020-04-22 19:38:30,592 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: connection from 10.148.6.154:53050; # active connections: 1
>  2020-04-22 19:38:31,269 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: connection from 10.148.6.154:53052; # active connections: 2
>  2020-04-22 19:38:31,479 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: connection from 10.148.6.154:53056; # active connections: 3
>  2020-04-22 19:39:32,413 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 3 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,440 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 4 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,443 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 5 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,445 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 6 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,447 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 7 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,450 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 8 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,452 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 9 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,454 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 10 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,456 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 11 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
>  2020-04-22 19:39:32,458 DEBUG [RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020] ipc.RpcServer: RpcServer.FifoWFPBQ.priority.handler=19,queue=1,port=16020: callId: 12 service: AdminService methodName: OpenRegion size: 81 connection: 10.148.6.154:53050
> ===============================================================
> 2020-04-23 04:40:07,751 DEBUG [RpcServer.reader=3,bindAddress=pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: DISCONNECTING client 10.148.6.13:44272 because read count=-1. Number of active connections: 1
>  2020-04-23 04:40:17,751 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: connection from 10.148.6.13:44280; # active connections: 1
>  2020-04-23 04:40:17,752 DEBUG [RpcServer.reader=4,bindAddress=pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: DISCONNECTING client 10.148.6.13:44280 because read count=-1. Number of active connections: 1
>  2020-04-23 04:40:27,752 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: connection from 10.148.6.13:44282; # active connections: 1
>  2020-04-23 04:40:27,752 DEBUG [RpcServer.reader=5,bindAddress=pinpoint-r-v000-976s.c.gcp-ushi-telemetry-npe.internal,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: DISCONNECTING client 10.148.6.13:44282 because read count=-1. Number of active connections: 1
>  2020-04-23 04:40:37,752 DEBUG [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020: connection from 10.148.6.13:44284; # active connections: 1
>   
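
Since the logs above contain no exception, it can help to summarize the `DISCONNECTING client ... because read count=-1` entries per client address. A read count of -1 generally means the read on that socket hit end of stream, that is, the remote side closed the connection. The following is a minimal sketch, assuming the log line format shown above; the names `DISCONNECT_RE` and `count_disconnects` are hypothetical.

```python
import re
from collections import Counter

# Matches the RpcServer DEBUG lines quoted above, e.g.
# "... DISCONNECTING client 10.148.6.13:45732 because read count=-1. ..."
DISCONNECT_RE = re.compile(
    r"DISCONNECTING client (?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+) "
    r"because read count=(?P<count>-?\d+)"
)


def count_disconnects(lines):
    """Count DISCONNECTING entries per client IP in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = DISCONNECT_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts
```

Feeding it an open log file (for example `count_disconnects(open(path))`, where `path` is the master or region server log) gives a per-address tally. In the region server entries above, the disconnecting client address 10.148.6.13 is the same address the region server itself is bound to, so a tally like this may help narrow down which process is opening and closing those connections every ten seconds.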


