Posted to user@storm.apache.org by Dave Webb <Da...@gmx.de> on 2016/11/01 09:16:28 UTC

Out of Heap Memory despite using TOPOLOGY_MAX_SPOUT_PENDING

Hello,

I have a simple WordCount topology which only uses BaseBasicBolts and BaseRichSpouts.
When running with multiple workers, heap usage keeps increasing until the heap is eventually full,
which causes the workers to restart.
Adjusting the heap size does not permanently solve this problem; it just delays it.

I have read about TOPOLOGY_MAX_SPOUT_PENDING and Guaranteed Message Processing [1], which are basically
mandatory for running a stable multi-worker topology.

However, even though my workers are now "throttled" with this parameter, they still run out of heap at some point.
I am aware that "topology.max.spout.pending" is applied to each Spout instance individually, meaning that the
total number of pending tuples is this value times the number of Spout instances.
But there is no way that a _single_ Spout instance which tolerates only 10 tuples "in flight" at a time almost
instantly fills a 1.5 GB heap.

$ jmap -histo:live 9476

 num     #instances         #bytes  class name
----------------------------------------------
   1:       5648304      989149784  [B
   2:       5647555      135541320  backtype.storm.messaging.TaskMessage
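
As a rough reading of the histogram, dividing each byte total by its instance count gives the average shallow size per object. The sketch below only redoes that arithmetic with the numbers above; the class and method names are made up for illustration.

```java
public class HistogramAverages {
    // Average size in bytes of one instance, taken from a jmap -histo row.
    static double averageBytes(long totalBytes, long instances) {
        return (double) totalBytes / instances;
    }

    public static void main(String[] args) {
        // Values copied from the two histogram rows above.
        double perByteArray = averageBytes(989149784L, 5648304L);
        double perTaskMessage = averageBytes(135541320L, 5647555L);
        System.out.println("[B average bytes: " + perByteArray);
        System.out.println("TaskMessage average bytes: " + perTaskMessage);
    }
}
```

This works out to about 175 bytes per byte array and exactly 24 bytes per TaskMessage, with nearly one byte array per TaskMessage, so the heap appears to hold millions of small messages rather than a few large ones.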


How can 5.6 million TaskMessages be possible? Only _one single_ tuple can be in flight. Each tuple is split into
50 tuples by the first Bolt and each of them is reduced by a final CounterBolt.
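
To put the expectation into numbers, here is a minimal sketch of the upper bound I would expect, assuming each Spout tuple produces exactly one tuple tree of 1 + 50 tuples and that the CounterBolt emits nothing further. The class and method names are hypothetical and this is not a Storm API.

```java
public class InFlightEstimate {
    // Upper bound on tuples in flight: every pending Spout tuple
    // plus the tuples the first Bolt derives from it.
    static long maxTuplesInFlight(int maxSpoutPending, int spoutInstances, int fanOut) {
        return (long) maxSpoutPending * spoutInstances * (1 + fanOut);
    }

    public static void main(String[] args) {
        long observed = 5647555L; // TaskMessage instances from the histogram
        long boundFor10 = maxTuplesInFlight(10, 1, 50); // max.spout.pending = 10
        long boundFor1 = maxTuplesInFlight(1, 1, 50);   // max.spout.pending = 1
        System.out.println("expected bound (pending=10): " + boundFor10);
        System.out.println("expected bound (pending=1):  " + boundFor1);
        System.out.println("observed / bound (pending=10): " + (observed / boundFor10));
    }
}
```

Under these assumptions the bound is 510 tuples for a pending limit of 10 and 51 for a limit of 1, which is several orders of magnitude below the 5.6 million TaskMessages on the heap.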

My questions are the following:

1. How can I query the number of tuples currently "in flight" in my topology?
Storm must track this number itself in one way or another, but how can I access it externally?

2. How can I further troubleshoot my Out of Memory problems?
I need to know which Bolt is the bottleneck. Restarting the topology and observing is incredibly slow and cumbersome.

3. Is it true that I _don't_ need to call "ack()" anywhere in my code if I'm only using BaseBasicBolts and BaseRichSpouts?

Thanks

[1] http://storm.apache.org/releases/0.10.1/Guaranteeing-message-processing.html