Posted to issues@flink.apache.org by "Nico Kruber (JIRA)" <ji...@apache.org> on 2018/07/16 16:37:00 UTC

[jira] [Comment Edited] (FLINK-9860) Netty resource leak on receiver side

    [ https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545457#comment-16545457 ] 

Nico Kruber edited comment on FLINK-9860 at 7/16/18 4:36 PM:
-------------------------------------------------------------

The e2e test that was running when the leak occurred actually runs with parallelism 1 on 1 taskmanager. Therefore, the leak cannot be in the Flink-internal communication between TMs. Also, looking at the logs in more detail, the leak is reported in the JM log anyway.

The only call that is being executed at this stage (around job submission) is {{flink list -r}}. Unfortunately, I was not able to reproduce the leak either without or with {{env.java.opts: -Dio.netty.leakDetection.level=paranoid}}, a setting that would give more details.
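
For readers unfamiliar with the error in the issue description: the leak detector complains when a reference-counted buffer becomes unreachable while its reference count is still above zero. The following is only a toy sketch of that contract using the JDK alone. {{ToyRefCountedBuffer}} and {{ToyBufferDemo}} are hypothetical names invented for illustration; this is not Netty's {{ByteBuf}} implementation and not the code path involved in this issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy stand-in for a reference-counted buffer (hypothetical, NOT Netty's ByteBuf). */
final class ToyRefCountedBuffer {
    // A freshly allocated buffer starts with one reference, owned by the allocating code.
    private final AtomicInteger refCnt = new AtomicInteger(1);

    int refCnt() {
        return refCnt.get();
    }

    /** Adds one reference; fails if the buffer was already fully released. */
    ToyRefCountedBuffer retain() {
        int current;
        do {
            current = refCnt.get();
            if (current <= 0) {
                throw new IllegalStateException("buffer already released");
            }
        } while (!refCnt.compareAndSet(current, current + 1));
        return this;
    }

    /** Drops one reference; returns true if this call released the last one. */
    boolean release() {
        int current;
        do {
            current = refCnt.get();
            if (current <= 0) {
                throw new IllegalStateException("buffer already released");
            }
        } while (!refCnt.compareAndSet(current, current - 1));
        return current == 1;
    }
}

final class ToyBufferDemo {
    /** Correct handling: the last consumer releases the buffer, even on failure. */
    static void consumeAndRelease(ToyRefCountedBuffer buf) {
        try {
            // ... read from the buffer here ...
        } finally {
            buf.release();
        }
    }

    /** Leaky handling: the buffer is dropped while its count is still 1. */
    static void consumeAndForget(ToyRefCountedBuffer buf) {
        // ... read from the buffer here, but never call release() ...
    }

    public static void main(String[] args) {
        ToyRefCountedBuffer released = new ToyRefCountedBuffer();
        consumeAndRelease(released);
        System.out.println("released refCnt = " + released.refCnt()); // prints 0

        ToyRefCountedBuffer leaked = new ToyRefCountedBuffer();
        consumeAndForget(leaked);
        System.out.println("leaked refCnt = " + leaked.refCnt()); // prints 1
    }
}
```

In this sketch, a buffer that is garbage-collected in the state left by {{consumeAndForget}} corresponds to the situation the {{LEAK: ByteBuf.release() was not called}} message describes.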


was (Author: nicok):
The e2e test that was running when the leak occurred actually runs with parallelism 1 on 1 taskmanager. Therefore, it cannot be in the Flink-internal communication between TMs. Also, looking at the logs in more details, it is reported from the JM log anyway.

The only call that is being executed at this stage (around job submission) is {{flink list -r}} but, unfortunately, I was not able to reproduce this with or without {{env.java.opts: -Dio.netty.leakDetection.level=paranoid}} which would give more details.

> Netty resource leak on receiver side
> ------------------------------------
>
>                 Key: FLINK-9860
>                 URL: https://issues.apache.org/jira/browse/FLINK-9860
>             Project: Flink
>          Issue Type: Bug
>          Components: Network
>    Affects Versions: 1.6.0
>            Reporter: Till Rohrmann
>            Assignee: Nico Kruber
>            Priority: Blocker
>              Labels: test-stability
>             Fix For: 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records: 
> Created at:
> 	org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
> 	org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
> 	org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
> 	org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
> 	org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
> 	org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
> 	org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
> 	org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
> 	org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
> 	org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
> 	org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
> 	org.apache.flink.shaded.netty4.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> {code}
> We might have a resource leak on the receiving side of our network stack.
> https://api.travis-ci.org/v3/job/404225956/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)