Posted to dev@storm.apache.org by "Scott Bessler (JIRA)" <ji...@apache.org> on 2015/09/10 22:06:46 UTC

[jira] [Created] (STORM-1041) Topology with kafka spout stops processing

Scott Bessler created STORM-1041:
------------------------------------

             Summary: Topology with kafka spout stops processing
                 Key: STORM-1041
                 URL: https://issues.apache.org/jira/browse/STORM-1041
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 0.9.5
            Reporter: Scott Bessler
            Priority: Critical


Topology:
 KafkaSpout (1 task/executor) -> bolt that does grouping (1 task/executor) -> bolt that does processing (176 tasks/executors)
 8 workers
 Using Netty
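
For reference, the setup above can be written down as a rough sketch in Java. It uses plain maps so it runs without Storm on the classpath. The class and method names are made up for illustration. The key strings are the Storm config keys as I understand them for 0.9.x, and the 30s timeout is the value given later in this report. The MAX_SPOUT_PENDING value is not stated here, so it is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class: records the reported setup as plain data.
class TopologySettingsSketch {

    // Cluster/topology settings, keyed by Storm config key strings (assumed for 0.9.x).
    static Map<String, Object> reportedSettings() {
        Map<String, Object> conf = new LinkedHashMap<>();
        conf.put("topology.workers", 8);
        conf.put("storm.messaging.transport", "backtype.storm.messaging.netty.Context");
        conf.put("topology.message.timeout.secs", 30);
        // "topology.max.spout.pending" is set in our topology, but its value is not given in this report.
        return conf;
    }

    // Tasks/executors per component; the component names are illustrative labels only.
    static Map<String, Integer> reportedParallelism() {
        Map<String, Integer> p = new LinkedHashMap<>();
        p.put("kafka-spout", 1);
        p.put("grouping-bolt", 1);
        p.put("processing-bolt", 176);
        return p;
    }

    public static void main(String[] args) {
        System.out.println(reportedSettings());
        System.out.println(reportedParallelism());
    }
}
```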

Sometimes when a worker dies (we've seen it happen due to an OOM or load from a co-located worker), it tries to restart on the same node, then 20s later shuts down and starts on another node.

While the worker was dead and then killed, netty on the other workers dropped messages. In theory these messages should time out and be replayed. Our message timeout is 30s.

However, these messages never time out, and since MAX_SPOUT_PENDING has been reached, no more tuples are emitted or processed.
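
To illustrate why this stalls the whole topology, here is a toy model in plain Java. It is not Storm code: ToySpout, simulate and the pending limit of 4 are all hypothetical, every tuple is assumed to be dropped (never acked), and a replay is modelled simply as a fresh emit. The only things it is meant to show are that a spout capped by a pending limit keeps making progress if timeouts fire, and emits nothing further if they never do.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a spout capped by a max-pending limit (hypothetical, not Storm code).
class SpoutPendingStall {

    static final class ToySpout {
        private final int maxPending;
        private final Map<Long, Long> pending = new HashMap<>(); // tuple id -> emit time (s)
        private long nextId = 0;
        private long emitted = 0;

        ToySpout(int maxPending) { this.maxPending = maxPending; }

        // Emits one tuple unless the pending limit has been reached.
        boolean tryEmit(long nowSecs) {
            if (pending.size() >= maxPending) return false;
            pending.put(nextId++, nowSecs);
            emitted++;
            return true;
        }

        // Drops pending tuples that are at least timeoutSecs old; returns how many timed out.
        int expire(long nowSecs, long timeoutSecs) {
            int before = pending.size();
            pending.values().removeIf(emitTime -> nowSecs - emitTime >= timeoutSecs);
            return before - pending.size();
        }

        long emitted() { return emitted; }
    }

    // Runs one-second ticks; no tuple is ever acked, mimicking dropped messages.
    static long simulate(int maxPending, long timeoutSecs, long durationSecs, boolean timeoutsFire) {
        ToySpout spout = new ToySpout(maxPending);
        for (long now = 0; now < durationSecs; now++) {
            if (timeoutsFire) spout.expire(now, timeoutSecs);
            while (spout.tryEmit(now)) { /* fill up to the pending limit */ }
        }
        return spout.emitted();
    }

    public static void main(String[] args) {
        // Pending limit of 4 is an arbitrary illustrative value; 30s is our message timeout.
        System.out.println("timeouts fire:  " + simulate(4, 30, 120, true) + " tuples emitted");
        System.out.println("timeouts never: " + simulate(4, 30, 120, false) + " tuples emitted");
    }
}
```

In the second case the emitted count stays at the pending limit however long the run lasts, which matches what we observe.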
