Posted to dev@hbase.apache.org by "Jonathan Leech (JIRA)" <ji...@apache.org> on 2015/09/11 19:40:46 UTC

[jira] [Created] (HBASE-14410) HBase replication hangs

Jonathan Leech created HBASE-14410:
--------------------------------------

             Summary: HBase replication hangs
                 Key: HBASE-14410
                 URL: https://issues.apache.org/jira/browse/HBASE-14410
             Project: HBase
          Issue Type: Bug
          Components: Replication
    Affects Versions: 1.0.0
         Environment: CDH5.4.2
            Reporter: Jonathan Leech


Replication hangs until the target cluster is restarted.
The IPC queue was at its maximum size in bytes on a single region server in the target cluster. The master appeared OK, as did the region server serving hbase:meta. This has occurred several times since the upgrade from 0.98.6 to 1.0.0.

The following thread appeared in a stack dump taken on a single region server in the target cluster:
"hconnection-0x59e10d51-shared--pool8-t97669" daemon prio=10 tid=0x0000000001235000 nid=0xa47 in Object.wait() [0x00007ff5186fb000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1189)
        - locked <0x00000004147a0000> (a org.apache.hadoop.hbase.ipc.Call)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:31865)
        at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1580)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1294)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1126)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.findAllLocationsOrFail(AsyncProcess.java:916)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:833)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1156)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveGlobalFailure(AsyncProcess.java:1123)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1100(AsyncProcess.java:574)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:705)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
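
To check whether other threads in the same dump (or in later dumps) are parked in the same place, a small helper can scan jstack output for threads that are in TIMED_WAITING and have a given frame on their stack. This is only a sketch: the class name StuckRpcThreadFinder and the method findThreads are hypothetical and not part of HBase, and the parsing assumes the usual jstack layout shown above (a quoted thread name on the header line, followed by a java.lang.Thread.State line and "at ..." frames).

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of HBase.
public class StuckRpcThreadFinder {

    /**
     * Returns the names of threads in a jstack-style dump that are in
     * TIMED_WAITING and have at least one frame containing frameMarker.
     */
    public static List<String> findThreads(String dump, String frameMarker) {
        List<String> matches = new ArrayList<>();
        String name = null;           // thread currently being parsed
        boolean timedWaiting = false; // state line said TIMED_WAITING
        boolean hasFrame = false;     // a frame contained frameMarker

        for (String line : dump.split("\\r?\\n")) {
            if (line.startsWith("\"")) {
                // A new thread header starts: record the previous thread if it matched.
                if (name != null && timedWaiting && hasFrame) {
                    matches.add(name);
                }
                int end = line.indexOf('"', 1);
                name = end > 0 ? line.substring(1, end) : line.substring(1);
                timedWaiting = false;
                hasFrame = false;
            } else if (name != null) {
                String t = line.trim();
                if (t.startsWith("java.lang.Thread.State:") && t.contains("TIMED_WAITING")) {
                    timedWaiting = true;
                }
                if (t.startsWith("at ") && t.contains(frameMarker)) {
                    hasFrame = true;
                }
            }
        }
        // Record the last thread in the dump.
        if (name != null && timedWaiting && hasFrame) {
            matches.add(name);
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: StuckRpcThreadFinder <jstack-output-file> [frame-marker]");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        // Default marker is the frame the thread above is waiting in.
        String marker = args.length > 1 ? args[1] : "RpcClientImpl.call";
        for (String thread : findThreads(dump, marker)) {
            System.out.println(thread);
        }
    }
}
```

Running it against several dumps taken a few seconds apart would show whether the same hconnection pool threads stay in RpcClientImpl.call, which is one way to tell a persistent stall from a transient wait.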



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)