Posted to user@storm.apache.org by John Yost <so...@gmail.com> on 2015/06/23 14:41:24 UTC

Noob shuffle question--emitting to passthrough bolt dramatically decreases throughput

Hey Everyone,

I have a topology with 48 workers: 192 executors for the first bolt and 12
for a second, receiving bolt.

When I emit tuples from the first bolt, performance drops dramatically: the
number of tuples acked goes from 3.8 million/minute down to 300,000/minute.
In the receiving bolt I turned off all processing, so it is a passthrough
that simply acks the incoming tuples. I also tested removing the second
bolt, so that the first bolt processes the incoming tuples, emits and acks
them. The tuples-per-minute number went back up to 3.3 million.

Based on what I've seen, I thought the performance issue was due to
shuffling, so I switched to localOrShuffleGrouping, but that did not make a
difference. I then figured that setting the number of receiving bolt
executors equal to the number of workers would help, as this should (I
think?) ensure there is local shuffling only. This appears to improve
performance by 10% or so.
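
To make the reasoning about local shuffling concrete: localOrShuffleGrouping
only keeps a tuple in-process when the emitting worker hosts at least one
task of the receiving bolt; otherwise it behaves like a plain shuffle. Below
is a minimal arithmetic sketch, not Storm code. It assumes the scheduler
spreads executors evenly across workers, and the class and method names are
made up for illustration.

```java
public class LocalReceiverCoverage {

    // Upper bound on the fraction of workers that can host at least one
    // receiving executor. ASSUMPTION: executors are spread evenly, so each
    // receiving executor lands on a different worker whenever possible.
    static double maxLocalCoverage(int workers, int receivingExecutors) {
        return (double) Math.min(workers, receivingExecutors) / workers;
    }

    public static void main(String[] args) {
        int workers = 48; // from the post

        // Original setup: 12 receiving executors.
        System.out.printf("12 receivers: at most %.0f%% of workers have a local target%n",
                100 * maxLocalCoverage(workers, 12));

        // Adjusted setup: one receiving executor per worker.
        System.out.printf("48 receivers: at most %.0f%% of workers have a local target%n",
                100 * maxLocalCoverage(workers, 48));
    }
}
```

Under that assumption, with 12 receiving executors at most a quarter of the
workers can deliver locally, so most emits would still cross worker
boundaries even with localOrShuffleGrouping.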

Any ideas as to why the emit/receive appears to be decreasing throughput by
over a factor of 10?
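
For reference, the slowdown can be computed directly from the figures quoted
above. This is only a back-of-the-envelope sketch using the numbers in this
post; the class and method names are hypothetical.

```java
public class ThroughputDrop {

    // Ratio between two throughput figures measured in the same unit.
    static double slowdownFactor(double before, double after) {
        return before / after;
    }

    // Converts a tuples-per-minute figure to tuples per second.
    static double perSecond(double tuplesPerMinute) {
        return tuplesPerMinute / 60.0;
    }

    public static void main(String[] args) {
        double withoutEmit = 3_800_000; // acked tuples/minute before emitting
        double withEmit = 300_000;      // acked tuples/minute when emitting to the second bolt

        System.out.printf("slowdown factor: %.1fx%n", slowdownFactor(withoutEmit, withEmit));
        System.out.printf("throughput with emit: %.0f tuples/second%n", perSecond(withEmit));
    }
}
```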

Thanks

--John