Posted to issues@drill.apache.org by "Steven Phillips (JIRA)" <ji...@apache.org> on 2015/04/23 02:36:39 UTC
[jira] [Commented] (DRILL-2847) DrillBufs from the RPC layer are being leaked

    [ https://issues.apache.org/jira/browse/DRILL-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508222#comment-14508222 ] 

Steven Phillips commented on DRILL-2847:
----------------------------------------

My thoughts:

There are basically two different stack traces in Chris' log:

{code}
io.netty.buffer.UnsafeDirectLittleEndian.<init>:91
io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer:52
io.netty.buffer.PooledByteBufAllocatorL.directBuffer:66
io.netty.buffer.PooledByteBufAllocatorL.directBuffer:1
io.netty.buffer.AbstractByteBufAllocator.directBuffer:141
io.netty.buffer.AbstractByteBufAllocator.buffer:75
org.apache.drill.exec.rpc.RpcEncoder.encode:87
org.apache.drill.exec.rpc.RpcEncoder.encode:1
io.netty.handler.codec.MessageToMessageEncoder.write:89
io.netty.channel.AbstractChannelHandlerContext.invokeWrite:658
io.netty.channel.AbstractChannelHandlerContext.access$2000:32
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write:939
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write:991
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run:924
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks:380
io.netty.channel.nio.NioEventLoop.run:357
io.netty.util.concurrent.SingleThreadEventExecutor$2.run:116
{code}

I believe this is a buffer allocated by Netty for reading off the socket. I expect to see it here, because Netty reuses these buffers. However, it is not allocated through TopLevelAllocator, so Drill cannot account for it, which I think is a bit problematic. If the size and number of these buffers are small, that is probably acceptable. We should investigate and confirm that the number of such buffers is small and bounded.

The other one:

{code}
io.netty.buffer.UnsafeDirectLittleEndian.<init>:91
io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer:52
io.netty.buffer.PooledByteBufAllocatorL.directBuffer:66
org.apache.drill.exec.memory.TopLevelAllocator.buffer:94
org.apache.drill.exec.memory.TopLevelAllocator.buffer:102
org.apache.drill.exec.rpc.ProtobufLengthDecoder.decode:83
org.apache.drill.exec.rpc.data.DataProtobufLengthDecoder$Server.decode:52
io.netty.handler.codec.ByteToMessageDecoder.callDecode:247
io.netty.handler.codec.ByteToMessageDecoder.channelRead:147
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead:333
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead:319
io.netty.channel.ChannelInboundHandlerAdapter.channelRead:86
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead:333
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead:319
io.netty.channel.DefaultChannelPipeline.fireChannelRead:787
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read:130
io.netty.channel.nio.NioEventLoop.processSelectedKey:511
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized:468
io.netty.channel.nio.NioEventLoop.processSelectedKeys:382
io.netty.channel.nio.NioEventLoop.run:354
io.netty.util.concurrent.SingleThreadEventExecutor$2.run:116
{code}

Perhaps my thinking is wrong, but I don't think we should see these left over. This is not the socket read buffer, but rather the buffer that the RPC layer copies data into after reading it off the wire.
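
One way to see how many outstanding buffers fall into each group would be to bucket the traces from the log by whether they pass through TopLevelAllocator.buffer. What follows is only a minimal sketch, assuming each trace is available as a list of frame strings. TraceClassifier and its two labels are hypothetical names for illustration and are not part of Drill or Netty.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: buckets leak-log stack traces by where the buffer was allocated. */
final class TraceClassifier {
    private TraceClassifier() {}

    /** Returns a coarse label for one trace, given its frames as strings. */
    static String classify(List<String> frames) {
        for (String frame : frames) {
            if (frame.contains("TopLevelAllocator.buffer")) {
                return "drill-accounted";   // allocated through Drill's allocator
            }
        }
        return "netty-direct";              // allocated without going through TopLevelAllocator
    }

    /** Counts how many traces fall under each label. */
    static Map<String, Integer> countByOrigin(List<List<String>> traces) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> trace : traces) {
            counts.merge(classify(trace), 1, Integer::sum);
        }
        return counts;
    }
}
```

If the "netty-direct" count stays flat across queries while the "drill-accounted" count keeps growing, that would support the reading above.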

Please correct me if my understanding is wrong.

> DrillBufs from the RPC layer are being leaked
> ---------------------------------------------
>
>                 Key: DRILL-2847
>                 URL: https://issues.apache.org/jira/browse/DRILL-2847
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow, Execution - RPC
>    Affects Versions: 0.9.0
>            Reporter: Chris Westin
>            Assignee: Jacques Nadeau
>             Fix For: 1.0.0
>
>         Attachments: DRILL-2847-bug.2.patch.txt, DRILL-2847-bug.patch.txt, drill-mem.log
>
>
> I've created a patch that demonstrates this. In the patch, code is added to UnsafeDirectLittleEndian to track all the instances of that class that are created (which happens when buffers are allocated inside TopLevelAllocator). release() is overridden to remove these from the tracked list when they are released. An @After action is added to TestTpchDistributed which checks the count of outstanding buffers. If the test is run, every case fails, and each failure shows a progressively larger number of outstanding buffers. There are no complaints from the allocator.
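
The tracking approach described in the issue can be illustrated roughly as follows. This is a standalone sketch and not the attached patch: TrackedBuffer is a hypothetical stand-in for UnsafeDirectLittleEndian, and the reference count here is a plain int rather than Netty's reference counting.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a reference-counted buffer; not Drill's UnsafeDirectLittleEndian. */
final class TrackedBuffer {
    // Every live instance stays registered here until its final release.
    private static final Set<TrackedBuffer> OUTSTANDING = ConcurrentHashMap.newKeySet();

    private int refCnt = 1;

    TrackedBuffer() {
        OUTSTANDING.add(this);
    }

    /** Drops one reference; removes the buffer from the tracked set when the count reaches zero. */
    synchronized boolean release() {
        if (refCnt <= 0) {
            throw new IllegalStateException("buffer already released");
        }
        refCnt--;
        if (refCnt == 0) {
            OUTSTANDING.remove(this);
            return true;
        }
        return false;
    }

    /** Number of buffers created but not yet fully released. */
    static int outstandingCount() {
        return OUTSTANDING.size();
    }
}
```

A test teardown (the @After action in the description) could then assert that outstandingCount() is back to zero, or at least that it does not grow from one test case to the next.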



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)