Posted to issues@hbase.apache.org by "Heng Chen (JIRA)" <ji...@apache.org> on 2015/08/02 09:59:04 UTC

[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock

    [ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650634#comment-14650634 ] 

Heng Chen commented on HBASE-14178:
-----------------------------------

When the HBase client timeouts began, I found WARN messages like the following in the regionserver's log:
{code}
2015-08-02 11:22:31,091 WARN  [B.DefaultRpcServer.handler=10,queue=1,port=60020] ipc.RpcServer: RpcServer.respondercallId: 2003001 service: ClientService methodName: Multi size: 606 connection: 10.11.11.151:39022: output error
2015-08-02 11:22:31,092 WARN  [B.DefaultRpcServer.handler=10,queue=1,port=60020] ipc.RpcServer: B.DefaultRpcServer.handler=10,queue=1,port=60020: caught a ClosedChannelException, this means that the server was processing a request but the client went away. The error message was: null
2015-08-02 11:22:39,728 DEBUG [LruStats #0] hfile.LruBlockCache: Total=2.98 GB, free=160.88 MB, max=3.14 GB, blocks=3373164288, accesses=72835847, hits=45919402, hitRatio=63.05%, , cachingAccesses=67264519, cachingHits=45352036, cachingHitsRatio=67.42%, evictions=289532, evicted=1080418, evictedPerRun=3.7316012382507324
2015-08-02 11:24:19,922 WARN  [B.DefaultRpcServer.handler=0,queue=0,port=60020] ipc.RpcServer: RpcServer.respondercallId: 2005736 service: ClientService methodName: Multi size: 606 connection: 10.11.11.152:2419: output error
2015-08-02 11:24:19,924 WARN  [B.DefaultRpcServer.handler=0,queue=0,port=60020] ipc.RpcServer: B.DefaultRpcServer.handler=0,queue=0,port=60020: caught: java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
        at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
        at sun.nio.ch.IOUtil.write(IOUtil.java:148)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:503)
        at org.apache.hadoop.hbase.ipc.BufferChain.write(BufferChain.java:106)
        at org.apache.hadoop.hbase.ipc.RpcServer.channelWrite(RpcServer.java:2224)
        at org.apache.hadoop.hbase.ipc.RpcServer$Responder.processResponse(RpcServer.java:1012)
        at org.apache.hadoop.hbase.ipc.RpcServer$Responder.doRespond(RpcServer.java:1089)
        at org.apache.hadoop.hbase.ipc.RpcServer$Call.sendResponseIfReady(RpcServer.java:503)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
        at java.lang.Thread.run(Thread.java:745)

2015-08-02 11:25:09,832 WARN  [RpcServer.reader=2,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: count of bytes read: 0
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at org.apache.hadoop.hbase.ipc.RpcServer.channelRead(RpcServer.java:2244)
        at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1423)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:798)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:589)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:564)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

{code}

> regionserver blocks because of waiting for offsetLock
> -----------------------------------------------------
>
>                 Key: HBASE-14178
>                 URL: https://issues.apache.org/jira/browse/HBASE-14178
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Heng Chen
>            Priority: Critical
>         Attachments: jstack
>
>
> My regionserver blocks, and all client RPCs time out.
> I dumped the regionserver's jstack; it seems a lot of threads were blocked waiting for offsetLock. Details below:
> {code}
> "B.DefaultRpcServer.handler=2,queue=2,port=60020" #82 daemon prio=5 os_prio=0 tid=0x0000000001827000 nid=0x2cdc in Object.wait() [0x00007f3831b72000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79)
>         - locked <0x0000000773af7c18> (a org.apache.hadoop.hbase.util.IdLock$Entry)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352)
>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524)
>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257)
>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173)
>         at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683)
>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533)
>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820)
>         - locked <0x00000005e5c55ad0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>         at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>         - <0x00000005e5c55c08> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> {code}
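
For readers unfamiliar with this kind of lock, here is a minimal sketch of a per-id lock, assuming the pattern suggested by the frames above: the handler is parked in Object.wait() inside IdLock.getLockEntry, called from HFileReaderV2.readBlock. This is not the HBase source. The class OffsetLockSketch and the methods acquire, release and maxConcurrentHolders are hypothetical names made up for illustration. The idea is that the first thread asking for a given block offset becomes the holder, and every later thread asking for the same offset waits on that entry's monitor until the holder releases it. If the holder is slow (for example, stuck on a slow read), all of those waiters show up in jstack as WAITING (on object monitor).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified per-id lock for illustration only; NOT the HBase IdLock source. */
class OffsetLockSketch {
    /** One entry per id (e.g. a block offset) that is currently locked. */
    static final class Entry {
        final long id;
        int waiters;            // guarded by this entry's monitor
        boolean locked = true;  // guarded by this entry's monitor
        Entry(long id) { this.id = id; }
    }

    private final ConcurrentMap<Long, Entry> map = new ConcurrentHashMap<>();

    /** Blocks until the caller holds the lock for the given id. */
    Entry acquire(long id) throws InterruptedException {
        Entry mine = new Entry(id);
        Entry existing;
        while ((existing = map.putIfAbsent(id, mine)) != null) {
            synchronized (existing) {
                if (existing.locked) {
                    existing.waiters++;
                    try {
                        while (existing.locked) {
                            existing.wait(); // waiters for a busy id park here
                        }
                    } finally {
                        existing.waiters--;
                    }
                    existing.locked = true;  // ownership handed over to us
                    return existing;
                }
                // Entry was released and may already be gone from the map: retry.
            }
        }
        return mine; // nobody held this id, our fresh entry is now the lock
    }

    /** Hands the lock to one waiter, or drops the entry if nobody is waiting. */
    void release(Entry e) {
        synchronized (e) {
            e.locked = false;
            if (e.waiters > 0) {
                e.notify();
            } else {
                map.remove(e.id);
            }
        }
    }

    /** Starts several threads contending for one id; returns the peak number of simultaneous holders. */
    static int maxConcurrentHolders(int threads, long offset) {
        OffsetLockSketch lock = new OffsetLockSketch();
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                try {
                    Entry e = lock.acquire(offset);
                    try {
                        max.accumulateAndGet(inside.incrementAndGet(), Math::max);
                        Thread.sleep(5); // stands in for a slow block read
                    } finally {
                        inside.decrementAndGet();
                        lock.release(e);
                    }
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException ie) {
                throw new IllegalStateException(ie);
            }
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent holders: " + maxConcurrentHolders(8, 42L));
    }
}
```

In this sketch, raising the sleep inside maxConcurrentHolders makes every other thread wait longer in acquire. That is the shape of the dump above, although the sketch says nothing about why the holder in this issue is slow.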



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)