Posted to issues@hbase.apache.org by "Ankit Singhal (JIRA)" <ji...@apache.org> on 2018/07/18 20:04:00 UTC

[jira] [Created] (HBASE-20908) Infinite loop on regionserver if region replica are reduced

Ankit Singhal created HBASE-20908:
-------------------------------------

             Summary: Infinite loop on regionserver if region replica are reduced 
                 Key: HBASE-20908
                 URL: https://issues.apache.org/jira/browse/HBASE-20908
             Project: HBase
          Issue Type: Bug
          Components: read replicas
    Affects Versions: 2.0.0, 1.2.0
            Reporter: Ankit Singhal
            Assignee: Ankit Singhal


Steps to reproduce
{code}
hbase(main):003:0> create 'myTable','cf',{REGION_REPLICATION=>3}


hbase(main):003:0> put 'myTable','r1','cf:col1','1'
0 row(s) in 0.1230 seconds

hbase(main):004:0> disable 'myTable'
0 row(s) in 2.3040 seconds

hbase(main):005:0> alter 'myTable',{REGION_REPLICATION=>1}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 11.9550 seconds

hbase(main):006:0> enable 'myTable'
0 row(s) in 1.2620 seconds

hbase(main):007:0> put 'myTable','r2','cf:col1','1'
0 row(s) in 0.0060 seconds

{code}


The request below is for a replica region that is no longer present in meta but was still in the cache, so the target server responds that it is not serving this region.
{code}
com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region d997d9b47a106216b9b117617ec09015 is not online on 10.22.9.76,16020,1531341039091
	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3124)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3106)
	at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1714)
	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22773)
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
{code}

Eventually, once we refresh our cache from meta, we get into an infinite loop: the edit can never be replicated because the location of the removed replica will never appear again.
{code}
java.net.SocketTimeoutException: callTimeout=1200000, callDuration=2181316: Can't get the location null
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:170)
	at org.apache.hadoop.hbase.replication.regionserver.RegionReplicaReplicationEndpoint$RetryingRpcCallable.call(RegionReplicaReplicationEndpoint.java:606)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
	at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:178)
	at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getLocation(RegionAdminServiceCallable.java:105)
	at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.prepare(RegionAdminServiceCallable.java:89)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
	... 5 more
Caused by: java.io.IOException: HRegionInfo was null in myTable, row=keyvalues={myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:regioninfo/1531262022425/Put/vlen=41/seqid=0, myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:seqnumDuringOpen/1531341209944/Put/vlen=8/seqid=0, myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:server/1531341209944/Put/vlen=16/seqid=0, myTable,,1531262022075.f2b68622cfd5851023be29d5599db6c9./info:serverstartcode/1531341209944/Put/vlen=8/seqid=0}
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1289)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1179)
	at org.apache.hadoop.hbase.client.RegionAdminServiceCallable.getRegionLocations(RegionAdminServiceCallable.java:170)
	... 8 more

{code}
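
To make the loop concrete, here is a minimal, self-contained Java sketch of the situation described above. It does not use any HBase classes: `ReplicaReplaySketch`, `locateReplica`, `failedLookups` and `shouldReplay` are hypothetical names, and the meta lookup is reduced to a comparison against the configured replica count. The guard in `shouldReplay` is only one possible way to break the loop and is not necessarily the fix for this issue.

```java
/**
 * Toy model of the replay loop described above. No HBase classes are used;
 * every name here is hypothetical and exists only for illustration.
 */
final class ReplicaReplaySketch {

    /**
     * Stand-in for a meta lookup: a replica has a location only while its id
     * is below the table's configured REGION_REPLICATION.
     */
    static String locateReplica(int configuredReplicas, int replicaId) {
        if (replicaId < 0 || replicaId >= configuredReplicas) {
            return null; // meta no longer lists this replica
        }
        return "regionserver-" + replicaId;
    }

    /**
     * Retries the lookup the way an unguarded replay would, but capped at
     * maxAttempts so the demo terminates. Returns the number of failed lookups.
     */
    static int failedLookups(int configuredReplicas, int replicaId, int maxAttempts) {
        int failures = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (locateReplica(configuredReplicas, replicaId) != null) {
                break; // location found, the edit could be replayed
            }
            failures++; // nothing changes between attempts, so this repeats
        }
        return failures;
    }

    /** One possible guard: replay only to replicas that are still configured. */
    static boolean shouldReplay(int configuredReplicas, int replicaId) {
        return locateReplica(configuredReplicas, replicaId) != null;
    }

    public static void main(String[] args) {
        int replicaId = 2; // third replica, created with REGION_REPLICATION=>3
        System.out.println("before alter: " + locateReplica(3, replicaId));
        System.out.println("after alter:  " + locateReplica(1, replicaId));
        System.out.println("failed lookups in 5 attempts: " + failedLookups(1, replicaId, 5));
        System.out.println("replay to replica 2? " + shouldReplay(1, replicaId));
    }
}
```

With the replica count reduced to 1, every lookup for replica 2 fails regardless of how many attempts are allowed, which is the behaviour seen in the second stack trace. Checking whether the replica is still configured before retrying lets the caller drop the edit instead of retrying forever.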



