Posted to dev@flume.apache.org by "Siddharth Ahuja (JIRA)" <ji...@apache.org> on 2016/05/02 01:36:12 UTC

[jira] [Created] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks

Siddharth Ahuja created FLUME-2905:
--------------------------------------

             Summary: NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
                 Key: FLUME-2905
                 URL: https://issues.apache.org/jira/browse/FLUME-2905
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.6.0
            Reporter: Siddharth Ahuja


During Flume agent start-up, the Flume configuration containing the NetcatSource is parsed and the source's start() is called. start() opens a socket and then binds it to a local address so that it can listen for connections. If the bind fails, the following exception is thrown, but the socket opened just before is not closed.

2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - Exception follows.
org.apache.flume.FlumeException: java.net.BindException: Address already in use
        at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
        at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:444)
        at sun.nio.ch.Net.bind(Net.java:436)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
        at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
        ... 9 more

The source's start() is then called again, which opens another socket that is also never closed, and so on. Each failed attempt leaks one file descriptor (socket).
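
To illustrate the pattern described above, here is a minimal standalone sketch, not the actual NetcatSource code. The class name BindCleanupSketch, the method channelOpenAfterFailedBind and the closeOnFailure flag are hypothetical names introduced for this example. It assumes that binding to an address already held by a listening socket fails with an IOException, as in the stack trace above, and shows that the channel stays open after the failure unless it is closed explicitly.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.channels.ServerSocketChannel;

// Hypothetical standalone demo of the leak pattern; not the NetcatSource code.
public class BindCleanupSketch {

    /**
     * Opens a channel, tries to bind it to an address that is expected to be
     * busy, and reports whether the channel is still open after the failure.
     * With closeOnFailure == false this mimics the reported behaviour; with
     * closeOnFailure == true it shows the cleanup the report asks for.
     */
    public static boolean channelOpenAfterFailedBind(InetSocketAddress busy,
                                                     boolean closeOnFailure) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        try {
            channel.socket().bind(busy);
            // Only reached if the address was not actually in use.
            throw new IllegalStateException("bind unexpectedly succeeded on " + busy);
        } catch (IOException e) {
            if (closeOnFailure) {
                // Release the descriptor before reporting the failure.
                channel.close();
            }
            return channel.isOpen();
        } finally {
            // Demo-only cleanup so the sketch itself never leaks; close() is idempotent.
            channel.close();
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Occupy an ephemeral port so that the second bind fails.
        try (ServerSocket occupied = new ServerSocket(0, 50, loopback)) {
            InetSocketAddress busy = new InetSocketAddress(loopback, occupied.getLocalPort());
            System.out.println("open after failure, no cleanup:   "
                    + channelOpenAfterFailedBind(busy, false));
            System.out.println("open after failure, with cleanup: "
                    + channelOpenAfterFailedBind(busy, true));
        }
    }
}
```

Without the cleanup the channel should still be reported as open after the failed bind; with it, closed. In a long-running agent that retries start(), each open channel left behind this way would account for one leaked descriptor.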

This can be easily reproduced as follows:
1. Set Netcat as the source in the Flume agent configuration.
2. Set the bind port for the Netcat source to a port that is already in use. For example, I used 50010, which is the port for the DataNode's XCeiver protocol used by the HDFS service.
3. Start the Flume agent and run "lsof -p <flume_process_id> | wc -l" repeatedly. Notice that the number of file descriptors keeps growing due to the socket leaks, with lsof showing entries like: "can't identify protocol".
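
For steps 1 and 2, a configuration along the following lines should be enough. This is only a sketch: the agent name a1, the channel c1, the sink k1 and the bind address 0.0.0.0 are assumptions made for illustration, while the source name src-1 and port 50010 come from the log and steps above.

```
# Hypothetical minimal agent; a1, c1 and k1 are illustrative names.
a1.sources = src-1
a1.channels = c1
a1.sinks = k1

# Netcat source bound to a port that another process already holds.
a1.sources.src-1.type = netcat
a1.sources.src-1.bind = 0.0.0.0
a1.sources.src-1.port = 50010
a1.sources.src-1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```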




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)