Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/09 02:55:26 UTC

[jira] [Updated] (STORM-12) Reduce Thread Usage of Netty Transport

     [ https://issues.apache.org/jira/browse/STORM-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-12:
------------------------------
    Component/s: storm-core

> Reduce Thread Usage of Netty Transport
> --------------------------------------
>
>                 Key: STORM-12
>                 URL: https://issues.apache.org/jira/browse/STORM-12
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>             Fix For: 0.9.2-incubating
>
>
> When users start to create large topologies, the Storm Netty messaging layer uses lots of threads.  This has resulted in OOMs because the default ulimit on most Linux distros is around 4000 processes.  It looks like the messaging layer wants one thread per server it is connected to, which means one thread for every other worker in the system.
> In one particular case we saw:
>       1 (Curator delay thread)
>       1 (Curator Event Processor)
>       1 (Finalizer)
>       1 (GC???)
>       1 (Storm messaging recv thread asking netty for messages)
>       1 (Thread pool polling on a Synchronous queue???)
>       1 (ZK Connection)
>       1 (ZK epoll)
>       2 (???)
>       2 (Netty epoll)
>       6 (Timer Thread)
>      15 (Disruptor consume batches)
>     104 (Netty Thread pool taking messages to be sent)
> and this process was dying with OOMs because it could not create any more Netty threads.
> Looking at the code, this appears to come from two different things.  First, the Client code uses its own thread pool for each Client instead of sharing one.  Second, the protocol itself blocks the thread in takeMessages() if there are no messages to send.
> So we need to make the thread pool shared between all of the clients and modify the protocol so that takeMessages() does not block.  But once it no longer blocks, we also need a way to have Client.send write directly to the Channel in some situations so that the messages are still sent.
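
The proposal above could be sketched roughly as follows. This is only an illustration under assumptions, not Storm's actual code: SketchClient and SharedPoolDemo are hypothetical names, messages are plain strings, and the Netty Channel is replaced by a simple callback. It shows one fixed-size pool shared by all clients, a takeMessages() that returns immediately, and a send() that schedules the write itself because no thread is parked waiting for messages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for a per-connection client; not Storm's real Client class.
class SketchClient {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean flushScheduled = new AtomicBoolean(false);
    private final ExecutorService sharedPool;     // one pool shared by every client
    private final Consumer<List<String>> channel; // stand-in for a write to a Netty Channel

    SketchClient(ExecutorService sharedPool, Consumer<List<String>> channel) {
        this.sharedPool = sharedPool;
        this.channel = channel;
    }

    // Non-blocking: returns whatever is queued right now, possibly an empty list.
    List<String> takeMessages() {
        List<String> batch = new ArrayList<>();
        String m;
        while ((m = pending.poll()) != null) {
            batch.add(m);
        }
        return batch;
    }

    // Enqueue the message, then make sure a flush is scheduled on the shared pool.
    void send(String message) {
        pending.add(message);
        if (flushScheduled.compareAndSet(false, true)) {
            sharedPool.execute(this::flush);
        }
    }

    private void flush() {
        // Clear the flag before draining so a concurrent send() can schedule another flush.
        flushScheduled.set(false);
        List<String> batch = takeMessages();
        if (!batch.isEmpty()) {
            channel.accept(batch);
        }
    }
}

class SharedPoolDemo {
    // Sends messages through many clients and returns how many reached the "wire".
    static int run(int clients, int messagesPerClient) {
        // Pool size is fixed and does not grow with the number of clients.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Queue<String> wire = new ConcurrentLinkedQueue<>();
        List<SketchClient> all = new ArrayList<>();
        for (int i = 0; i < clients; i++) {
            all.add(new SketchClient(pool, wire::addAll));
        }
        for (SketchClient c : all) {
            for (int j = 0; j < messagesPerClient; j++) {
                c.send("m" + j);
            }
        }
        pool.shutdown(); // already-submitted flushes still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return wire.size();
    }

    public static void main(String[] args) {
        System.out.println(run(104, 10) + " messages written using 2 pool threads");
    }
}
```

In this sketch two flushes for the same client can overlap on different pool threads, so ordering between batches is not guaranteed; a real implementation would have to decide how to preserve per-connection ordering.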



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)