You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flume.apache.org by "Tobias Heintz (JIRA)" <ji...@apache.org> on 2015/08/17 17:16:46 UTC

[jira] [Updated] (FLUME-2765) ThriftSource spaws too many threads

     [ https://issues.apache.org/jira/browse/FLUME-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tobias Heintz updated FLUME-2765:
---------------------------------
    Attachment: thread-dump-flume-1.6.txt

I've attached a thread dump. We've limited the number of threads to 200, but take a look at all the idling "Flume Thrift IPC" threads. These are the ones that are created until the memory is full.

For a test setup look at the agent configuration above. We aren't doing anything fancy really. We are sending around 100 msg/s to Flume from our PHP frontend code using the generated client code

I've looked at the source for a while now, buy sadly my inside knowledge isn't enough to actually fix this issue. I am starting to think however that the issue possibly lies in the thrift server component that's being used.

> ThriftSource spaws too many threads
> -----------------------------------
>
>                 Key: FLUME-2765
>                 URL: https://issues.apache.org/jira/browse/FLUME-2765
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.6
>            Reporter: Tobias Heintz
>         Attachments: thread-dump-flume-1.6.txt
>
>
> We are in the process of migrating from the old Flume to version 1.6. We are using the ThriftSource with the new KafkaSink. Here's what our config looks like:
> {code}
> agent1.channels = ch1
> agent1.sources = thriftSrc
> agent1.sinks = kafka
> agent1.channels.ch1.type = memory
> agent1.channels.ch1.capacity = 10000
> agent1.channels.ch1.transactionCapacity = 500
> # THRIFT
> agent1.sources.thriftSrc.type = thrift
> agent1.sources.thriftSrc.channels = ch1
> agent1.sources.thriftSrc.bind = 0.0.0.0
> agent1.sources.thriftSrc.port = 4042
> agent1.sources.thriftSrc.threads = 150 # if we don't set this option, the source keeps creating more and more threads until all heap memory is used up and then it crashes
> # KAFKA
> agent1.sinks.kafka.channel = ch1
> agent1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
> agent1.sinks.kafka.batchSize = 50
> agent1.sinks.kafka.brokerList = broker.example.com:9092
> agent1.sinks.kafka.requiredAcks = 1
> agent1.sinks.kafka.topic = topic1
> {code}
> We have been noticing some bad behavior by the Thrift source/Thrift server using the JMX connection. If we don't restrict the number of threads, it spawns thousands of new threads, apparently one for every message it receives. These threads all have the name "Flume Thrift IPC Thread [number]" and according to the jvisualvm console they are always idle. At some point all of the JVM memory is used up through creating new threads and flume crashes with the following exception:
> {code}
> 12 Aug 2015 16:56:11,721 ERROR [Thread-1] (org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run:544)  - run() exiting due to uncaught error
> java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
>         at org.apache.thrift.server.TThreadedSelectorServer.requestInvoke(TThreadedSelectorServer.java:310)
>         at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:209)
>         at org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.select(TThreadedSelectorServer.java:576)
>         at org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run(TThreadedSelectorServer.java:536)
> {code}
> When we set the option to restrict the number of threads, the server sticks to that number and runs smoothly, however it drops messages occasionally (may have a different cause).
> I am wondering whether this is a bug or in some way expected behavior? What are the best practices for using a ThriftSource? Are there further parameters to possibly tune (like channel.capacity)?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)