Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/09 03:40:27 UTC

[jira] [Updated] (STORM-95) Topology hangs with worker processor threads TIMED_WAITING. Edit

     [ https://issues.apache.org/jira/browse/STORM-95?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-95:
------------------------------
    Component/s: storm-core

> Topology hangs with worker processor threads TIMED_WAITING. Edit
> ----------------------------------------------------------------
>
>                 Key: STORM-95
>                 URL: https://issues.apache.org/jira/browse/STORM-95
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/763
> Hi Nathan,
> We are hitting this issue very frequently now while using Storm 0.8.2. There are no errors in any of the worker logs, supervisor.log, or nimbus.log. However, the topology stops processing tuples.
> On collecting a thread dump of the worker processor, we can see that all the threads go into the TIMED_WAITING state and the topology hangs.
> The following is a brief description of our topology.
> We are using BaseIRich Spout and bolts.
> We have one file reader spout and three processing bolts (24, 48, and 24 executors).
> Each tuple contains 100 messages of 10 KB each, totaling about 1 MB.
> We aim to process 30 million such records within 6 hours.
> We are running it on SUSE Linux Enterprise Server 11.
> We are using all the recommended versions (Storm 0.8.2, Java 1.7, ZooKeeper 3.4.5, ZeroMQ 2.1.7, JZMQ-).
> Below is the list of the various combinations of Storm configuration we tried.
> Conf-3
> worker.childopts: "-Xmx3072m"
> topology.acker.executors: 20
> topology.max.spout.pending: 50
> topology.message.timeout.secs: 300
> topology.executor.receive.buffer.size: 16384 #batched
> topology.executor.send.buffer.size: 16384
> Conf-2
> worker.childopts: "-Xmx3072m"
> topology.acker.executors: 20
> topology.max.spout.pending: 300
> topology.message.timeout.secs: 300
> topology.executor.receive.buffer.size: 16384 #batched
> topology.executor.send.buffer.size: 16384
> Conf-1
> worker.childopts: "-Xmx3072m"
> topology.acker.executors: 20
> topology.max.spout.pending: 1000
> topology.message.timeout.secs: 300
> Also attaching the thread dumps for your reference.
> We desperately need your help to resolve this issue as we are looking to go live soon.
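
For reference, the figures quoted in the description can be turned into a rough back-of-envelope estimate. The sketch below is only illustrative: it assumes 1 KB = 1024 bytes, treats topology.max.spout.pending as a per-spout-task limit, and introduces a hypothetical spout_tasks parameter because the report does not state the spout's parallelism. It also reads "record" two ways, since the description does not say whether a record is a whole tuple or a single message.

```python
# Back-of-envelope figures for the topology described in the report.
# Assumptions (not stated in the report): 1 KB = 1024 bytes, and
# spout_tasks is a hypothetical parameter for the spout parallelism.

SECONDS = 6 * 3600                        # 6-hour processing window
RECORDS = 30_000_000                      # target record count
MSGS_PER_TUPLE = 100                      # messages packed into one tuple
MSG_BYTES = 10 * 1024                     # 10 KB per message
TUPLE_BYTES = MSGS_PER_TUPLE * MSG_BYTES  # roughly 1 MB per tuple


def required_rate(records, seconds):
    """Records per second needed to finish within the window."""
    return records / seconds


def pending_payload_mib(max_spout_pending, spout_tasks=1):
    """Upper bound on un-acked tuple payload, in MiB.

    Counts only the raw payload of pending tuples; serialized copies
    held in send/receive buffers are not included.
    """
    return max_spout_pending * spout_tasks * TUPLE_BYTES / (1024 * 1024)


if __name__ == "__main__":
    rate = required_rate(RECORDS, SECONDS)
    print(f"records/s needed: {rate:.0f}")
    print(f"tuples/s if a record is one message: {rate / MSGS_PER_TUPLE:.1f}")
    for name, pending in [("Conf-1", 1000), ("Conf-2", 300), ("Conf-3", 50)]:
        mib = pending_payload_mib(pending)
        print(f"{name}: up to {mib:.0f} MiB of pending payload per spout task")
```

These numbers are a lower bound on memory pressure, not a diagnosis: they only show how the pending-tuple payload scales with topology.max.spout.pending relative to the 3072 MB worker heap set in worker.childopts.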



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)