Posted to dev@hbase.apache.org by "kaushik mandal (Jira)" <ji...@apache.org> on 2022/02/14 06:05:00 UTC

[jira] [Created] (HBASE-26754) hbase master crash after running couple of days with error STUCK Region-In-Transition rit=FAILED_OPEN, location=null, table=hbase:meta, region=xxxxxxxxxx

kaushik mandal created HBASE-26754:
--------------------------------------

             Summary: hbase master crash after running couple of days with error STUCK Region-In-Transition rit=FAILED_OPEN, location=null, table=hbase:meta, region=xxxxxxxxxx
                 Key: HBASE-26754
                 URL: https://issues.apache.org/jira/browse/HBASE-26754
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.4.8
            Reporter: kaushik mandal


The HBase master stops responding after running for a couple of days, and the region servers keep restarting.

We are seeing the warnings below in the master and region server logs.

WARN [ProcExecTimeout] assignment.AssignmentManager: STUCK Region-In-Transition rit=FAILED_OPEN, location=null, table=hbase:meta, region=xxxxxxxxxxxxx

[master/xxxx-infra-xxxxx-hbase-master-0:16000.Chore.3] master.HMaster: Not running balancer because processing dead regionserver(s):
2022-02-07 19:54:11,512 INFO [ReadOnlyZKClient-xxxxxx-zookeeper:2181@0x2fcc92d9] zookeeper.ZooKeeper: Initiating client connection, connectString=xxxx-zookeeper:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$158/0x000000010057b440@48d2e00b

WARN [ProcExecTimeout] assignment.AssignmentManager: STUCK Region-In-Transition rit=FAILED_OPEN, location=null, table=hbase:meta, region=1588230740
2022-02-07 19:54:15,643 INFO [hconnection-0x31420403-shared-pool7-t9731] client.RpcRetryingCallerImpl: org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3223)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2947)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3272)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42002)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxx-infra-xxxx-hbase-regionserver-0.xxx-infra-xxxx-hbase-regionserver.default.svc.cluster.local,16020,1644089730940, seqNum=-1
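
For triage, the fields of the STUCK Region-In-Transition warning quoted above can be pulled out of a master log with a small script. This is only a minimal sketch, assuming the message layout is exactly as it appears in this report (in particular, that location contains no commas, as with location=null here); the name parse_stuck_rit is hypothetical and not part of HBase.

```python
import re

# Pattern for the STUCK Region-In-Transition warning as quoted in this report.
# Assumption: fields appear in this order, separated by ", ", and the
# location value contains no commas (true for location=null; a real server
# name such as host,16020,startcode would need a different pattern).
STUCK_RIT = re.compile(
    r"STUCK Region-In-Transition rit=(?P<rit>\w+), "
    r"location=(?P<location>[^,]+), "
    r"table=(?P<table>[^,]+), "
    r"region=(?P<region>\S+)"
)


def parse_stuck_rit(line):
    """Return the warning's fields as a dict, or None if the line does not match."""
    match = STUCK_RIT.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "WARN [ProcExecTimeout] assignment.AssignmentManager: "
        "STUCK Region-In-Transition rit=FAILED_OPEN, location=null, "
        "table=hbase:meta, region=1588230740"
    )
    print(parse_stuck_rit(sample))
```

Running it over the master log line by line would show how often hbase:meta is reported as stuck and in which state.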

 


From the region server logs:

2022-02-05 19:39:16,722 WARN [RpcServer.default.FPBQ.Fifo.handler=109,queue=5,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
2022-02-05 19:39:16,722 WARN [RpcServer.default.FPBQ.Fifo.handler=25,queue=12,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
2022-02-05 19:39:16,721 WARN [RpcServer.default.FPBQ.Fifo.handler=24,queue=11,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
2022-02-05 19:39:16,721 WARN [RpcServer.default.FPBQ.Fifo.handler=112,queue=8,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
2022-02-05 19:39:16,721 WARN [RpcServer.default.FPBQ.Fifo.handler=40,queue=1,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0

==> /opt/hbase-2.0.1/logs/SecurityAuth.audit <==
2022-02-05 19:39:17,882 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Auth successful for hdfs (auth:)
2022-02-05 19:39:17,882 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 10.42.0.124 port: 44876 with unknown version info
2022-02-05 19:40:18,307 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Auth successful for hdfs (auth:)
2022-02-05 19:40:18,307 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 10.42.0.124 port: 51098 with unknown version info

==> /opt/hbase-2.0.1/logs/hbase--regionserver-xxxx-infra-xxxxx-hbase-regionserver-0.log <==
2022-02-05 19:40:32,848 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=300.98 KB, freeSize=399.71 MB, max=400 MB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=29, evicted=0, evictedPerRun=0.0

--
This message was sent by Atlassian Jira
(v8.20.1#820001)