Posted to issues@hbase.apache.org by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2016/02/17 01:36:18 UTC

[jira] [Commented] (HBASE-15278) AsyncRPCClient hangs if Connection closes before RPC call response

    [ https://issues.apache.org/jira/browse/HBASE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149592#comment-15149592 ] 

Enis Soztutar commented on HBASE-15278:
---------------------------------------

These are the logs from the test, showing that the server closes the connection:
{code}
2016-02-16 15:55:57,824 DEBUG [RpcServer.reader=1,bindAddress=localhost,port=60232] ipc.RpcServer$Listener(909): RpcServer.listener,port=60232: Caught exception while reading:
org.apache.hadoop.hbase.DoNotRetryIOException: RPC data length of 257 received from 127.0.0.1 is greater than max allowed 100. Set "hbase.ipc.max.request.size" on server to override this limit (not recommended). Closing client connection
	at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1643)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:899)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:681)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:657)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
2016-02-16 15:55:57,825 DEBUG [RpcServer.reader=1,bindAddress=localhost,port=60232] ipc.RpcServer$Listener(915): RpcServer.listener,port=60232: DISCONNECTING client 127.0.0.1:60236 because read count=-1. Number of active connections: 1
2016-02-16 15:55:57,963 WARN  [AsyncRpcChannel-pool2-t1] logging.Slf4JLogger(151): An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Connection reset by peer
	at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
	at sun.nio.ch.IOUtil.read(IOUtil.java:192)
	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
	at io.netty.buffer.UnpooledUnsafeDirectByteBuf.setBytes(UnpooledUnsafeDirectByteBuf.java:446)
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:225)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at java.lang.Thread.run(Thread.java:745)
{code}
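
For context, the test triggers this by configuring a very small maximum request size, roughly along these lines (a sketch only; the exact setup in AbstractTestIPC.testRpcMaxRequestSize may differ):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

Configuration conf = HBaseConfiguration.create();
// Make the server reject any request larger than 100 bytes and close the
// connection, which is what produces the DoNotRetryIOException in the log above.
conf.setInt("hbase.ipc.max.request.size", 100);
{code}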

This is the thread dump: 
{code}
"AsyncRpcChannel-pool9-t1" daemon prio=5 tid=0x00007fe954053800 nid=0xeb0b runnable [0x0000000128884000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
	at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
	at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000007ab965dd8> (a io.netty.channel.nio.SelectedSelectionKeySet)
	- locked <0x00000007ab965d10> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000007ab965bf0> (a sun.nio.ch.KQueueSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at java.lang.Thread.run(Thread.java:745)

"RpcServer.listener,port=59545" daemon prio=5 tid=0x00007fe954053000 nid=0xc813 runnable [0x00000001243c6000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
	at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
	at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000007ab9550e0> (a sun.nio.ch.Util$2)
	- locked <0x00000007ab9550d0> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000007ab954fb0> (a sun.nio.ch.KQueueSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener.run(RpcServer.java:776)

"RpcServer.responder" daemon prio=5 tid=0x00007fe953810800 nid=0x5b1f runnable [0x000000012415a000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
	at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
	at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000007ab9596b8> (a sun.nio.ch.Util$2)
	- locked <0x00000007ab9596a8> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000007ab959588> (a sun.nio.ch.KQueueSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at org.apache.hadoop.hbase.ipc.RpcServer$Responder.doRunLoop(RpcServer.java:1016)
	at org.apache.hadoop.hbase.ipc.RpcServer$Responder.run(RpcServer.java:963)

"RpcServer.reader=2,bindAddress=localhost,port=59545" daemon prio=5 tid=0x00007fe9568f3000 nid=0xea0b runnable [0x0000000124057000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
	at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
	at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000007ab957220> (a sun.nio.ch.Util$2)
	- locked <0x00000007ab957210> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000007ab9570f0> (a sun.nio.ch.KQueueSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:670)
	- locked <0x00000007ab9570d8> (a org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:657)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

"RpcServer.reader=1,bindAddress=localhost,port=59545" daemon prio=5 tid=0x00007fe95607b000 nid=0x391b runnable [0x0000000123f54000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
	at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
	at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000007ab956808> (a sun.nio.ch.Util$2)
	- locked <0x00000007ab9567f8> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000007ab9566d8> (a sun.nio.ch.KQueueSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:670)
	- locked <0x00000007ab9566c0> (a org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:657)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

"RpcServer.reader=0,bindAddress=localhost,port=59545" daemon prio=5 tid=0x00007fe9568e9800 nid=0xca0f runnable [0x000000012397f000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
	at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
	at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000007ab955da0> (a sun.nio.ch.Util$2)
	- locked <0x00000007ab955d90> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000007ab955c70> (a sun.nio.ch.KQueueSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:102)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:670)
	- locked <0x00000007ab955c58> (a org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader)
	at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:657)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

"AsyncRpcChannel-pool5-t1" daemon prio=5 tid=0x00007fe9532dd800 nid=0x620f runnable [0x000000012764e000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
	at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:200)
	at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000007af5980e0> (a io.netty.channel.nio.SelectedSelectionKeySet)
	- locked <0x00000007af598100> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000007af598090> (a sun.nio.ch.KQueueSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at java.lang.Thread.run(Thread.java:745)

"Timer for 'HBase' metrics system" daemon prio=5 tid=0x00007fe9532e7800 nid=0x5d03 in Object.wait() [0x00000001242c3000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000007af61a210> (a java.util.TaskQueue)
	at java.util.TimerThread.mainLoop(Timer.java:552)
	- locked <0x00000007af61a210> (a java.util.TaskQueue)
	at java.util.TimerThread.run(Timer.java:505)

"ReaderThread" prio=5 tid=0x00007fe9559d1800 nid=0x5103 runnable [0x0000000123366000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.read(SocketInputStream.java:152)
	at java.net.SocketInputStream.read(SocketInputStream.java:122)
	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
	- locked <0x00000007af63e098> (a java.io.InputStreamReader)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.fill(BufferedReader.java:154)
	at java.io.BufferedReader.readLine(BufferedReader.java:317)
	- locked <0x00000007af63e098> (a java.io.InputStreamReader)
	at java.io.BufferedReader.readLine(BufferedReader.java:382)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner$ReaderThread.run(RemoteTestRunner.java:135)

"Service Thread" daemon prio=5 tid=0x00007fe95303b800 nid=0x4d03 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=5 tid=0x00007fe95303a800 nid=0x4b03 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=5 tid=0x00007fe95581b800 nid=0x4903 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=5 tid=0x00007fe955818000 nid=0x4813 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=5 tid=0x00007fe95202d000 nid=0x3503 in Object.wait() [0x0000000120b79000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000007af58a630> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
	- locked <0x00000007af58a630> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" daemon prio=5 tid=0x00007fe952813800 nid=0x3303 in Object.wait() [0x0000000120a76000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000007af58a018> (a java.lang.ref.Reference$Lock)
	at java.lang.Object.wait(Object.java:503)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
	- locked <0x00000007af58a018> (a java.lang.ref.Reference$Lock)

"main" prio=5 tid=0x00007fe95200a800 nid=0x1303 in Object.wait() [0x000000011175e000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000007ab991190> (a org.apache.hadoop.hbase.ipc.AsyncCall)
	at java.lang.Object.wait(Object.java:503)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
	- locked <0x00000007ab991190> (a org.apache.hadoop.hbase.ipc.AsyncCall)
	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
	at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:31)
	at org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:248)
	at org.apache.hadoop.hbase.ipc.AbstractTestIPC.testRpcMaxRequestSize(AbstractTestIPC.java:318)
{code}

I think we need to add another handler to the channel pipeline that cancels the outstanding RPC calls when the connection-close event fires, something along the lines of the sketch below. I don't think I'll have time to work on this right now. 
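
A minimal sketch of what such a handler could look like (CallCancellingHandler, the pendingCalls map, and the register/complete methods are assumptions for illustration, not the actual AsyncRpcChannel API):
{code}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.util.concurrent.Promise;

/**
 * Fails every outstanding call promise when the channel goes inactive, so a
 * caller blocked in get() wakes up with an exception instead of hanging.
 */
public class CallCancellingHandler extends ChannelInboundHandlerAdapter {

  // Hypothetical registry of outstanding calls keyed by call id; the real
  // client keeps equivalent bookkeeping inside AsyncRpcChannel.
  private final Map<Integer, Promise<?>> pendingCalls = new ConcurrentHashMap<>();

  public void register(int callId, Promise<?> callPromise) {
    pendingCalls.put(callId, callPromise);
  }

  public void complete(int callId) {
    pendingCalls.remove(callId);
  }

  @Override
  public void channelInactive(ChannelHandlerContext ctx) throws Exception {
    IOException closed = new IOException("Connection closed before RPC call response");
    for (Promise<?> promise : pendingCalls.values()) {
      promise.tryFailure(closed); // unblocks threads waiting on the call future
    }
    pendingCalls.clear();
    ctx.fireChannelInactive();
  }
}
{code}
The handler would be added near the tail of the client pipeline, e.g. pipeline.addLast(new CallCancellingHandler()), and exceptionCaught could fail the pending calls the same way so the "Connection reset by peer" above is surfaced to the waiting caller instead of reaching the end of the pipeline unhandled.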

> AsyncRPCClient hangs if Connection closes before RPC call response 
> -------------------------------------------------------------------
>
>                 Key: HBASE-15278
>                 URL: https://issues.apache.org/jira/browse/HBASE-15278
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Priority: Blocker
>             Fix For: 2.0.0
>
>
> The test for HBASE-15212 discovered an issue with the Async RPC Client. 
> In that test, when an RPC call sends a request larger than the maximum allowed size, the server closes the connection. However, the async client does not seem to handle a connection close while RPC calls are outstanding; the client just hangs. 
> Marking this a blocker against 2.0, since the async client is the default there. 


