Posted to user@storm.apache.org by Rodrigo Valladares <ro...@gmail.com> on 2016/03/23 13:02:11 UTC

Shuffle Grouping not distributing shuffle correctly

I am running some experiments with Storm and noticed that when the spouts are
not evenly distributed across all workers, shuffle grouping does not behave
the way it is supposed to. I ran an experiment by setting the spout
parallelism in a two-worker cluster (each worker on a different machine) and
noticed that almost all the processing was done on the spout machine. Isn't
shuffle grouping supposed to distribute the load evenly among them? It
seems to be working like localOrShuffleGrouping, but my topology is just
using vanilla ShuffleGrouping.
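
To make the expectation concrete, here is a minimal simulation of the two
behaviors being compared. It is a simplified model under stated assumptions, not
Storm's actual implementation: the class GroupingSketch, its methods, and the
layout of four bolt tasks (tasks 0 and 1 on the spout's worker, tasks 2 and 3 on
the other worker) are all hypothetical, and uniform random selection stands in
for whatever selection strategy Storm really uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Simplified model only; names and task layout are hypothetical.
class GroupingSketch {

    // Shuffle: every target task is equally likely, local or remote.
    static int[] shuffle(int numTasks, int tuples, Random rnd) {
        int[] counts = new int[numTasks];
        for (int i = 0; i < tuples; i++) {
            counts[rnd.nextInt(numTasks)]++;
        }
        return counts;
    }

    // Local-or-shuffle: if the emitting worker hosts any target tasks,
    // only those receive tuples; otherwise fall back to plain shuffle.
    static int[] localOrShuffle(int numTasks, List<Integer> localTasks,
                                int tuples, Random rnd) {
        if (localTasks.isEmpty()) {
            return shuffle(numTasks, tuples, rnd);
        }
        int[] counts = new int[numTasks];
        for (int i = 0; i < tuples; i++) {
            int target = localTasks.get(rnd.nextInt(localTasks.size()));
            counts[target]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        // Assumed layout: tasks 0 and 1 share a worker with the spout.
        List<Integer> local = Arrays.asList(0, 1);
        System.out.println("shuffle:        "
                + Arrays.toString(shuffle(4, 10_000, rnd)));
        System.out.println("localOrShuffle: "
                + Arrays.toString(localOrShuffle(4, local, 10_000, rnd)));
    }
}
```

In this model, shuffle gives each of the four tasks roughly a quarter of the
tuples, while local-or-shuffle sends nothing to tasks 2 and 3. The per-node
counts in the UI look like the second pattern.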

I am using version 1.0.x.

I am attaching a Storm UI screenshot. Node1 is where the spout is; notice
that almost all the tuples are processed by executors on node1.

Thank you,
Rodrigo Valladares Cotta
Master's Student, Computer Science
University of Nebraska-Lincoln