Posted to user@storm.apache.org by Luke Rohde <ro...@gmail.com> on 2014/11/14 21:57:46 UTC

idle executors

Hi, I have a topology that’s bottlenecked right now by a terminal bolt that’s writing small batches to an endpoint. I’ve increased the number of executors several times so that it’s no longer bottlenecked there, but I still notice when there’s a traffic spike that despite capacity hovering around 1.0, probably half of the executors are idle.
Can anyone give insight as to why this might be? I’ve read the docs on Storm parallelism and can’t understand why this is happening. FWIW, all of the non-fieldsGrouping bolts are using localOrShuffleGrouping - perhaps this has something to do with it? I have a feeling that this is the core of the problem, but it’s not clear to me exactly why you wouldn’t use localOrShuffle over shuffle.

Thanks, Luke

Re: idle executors

Posted by Nathan Marz <na...@nathanmarz.com>.
If you have two workers, worker A has 75% of the stream, worker B has 25%
of the stream, and you use nothing but localOrShuffle groupings, then
worker A will have to handle three times as much work as worker B, because
localOrShuffle keeps tuples inside the worker that emitted them. One way
such an unbalanced partitioning can happen is if you do a fieldsGrouping
and one field value accounts for a disproportionate share of the tuples.
One case where you'd use shuffle grouping instead of localOrShuffle is when
you need to even out the distribution of the stream.
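
To make the arithmetic concrete, here is a rough sketch in plain Java (no Storm dependency). It is only a model, not Storm's actual routing code: the class name GroupingSkew, the 750/250 tuple counts, and the 4 executors per worker are all made-up values, and it assumes every worker hosts the same number of executors of the downstream bolt.

```java
import java.util.Arrays;

// Hypothetical model of per-executor load under the two groupings.
public class GroupingSkew {

    // localOrShuffle (modeled): tuples never leave the worker that emitted
    // them, so each worker's share is split only among its own executors.
    static double[] localOrShuffle(double[] tuplesPerWorker, int executorsPerWorker) {
        double[] load = new double[tuplesPerWorker.length * executorsPerWorker];
        for (int w = 0; w < tuplesPerWorker.length; w++) {
            for (int e = 0; e < executorsPerWorker; e++) {
                load[w * executorsPerWorker + e] = tuplesPerWorker[w] / executorsPerWorker;
            }
        }
        return load;
    }

    // shuffle (modeled): tuples are spread evenly over every downstream
    // executor in the topology, regardless of which worker emitted them.
    static double[] shuffle(double[] tuplesPerWorker, int executorsPerWorker) {
        double total = 0.0;
        for (double t : tuplesPerWorker) {
            total += t;
        }
        double[] load = new double[tuplesPerWorker.length * executorsPerWorker];
        Arrays.fill(load, total / load.length);
        return load;
    }

    public static void main(String[] args) {
        // Assumed split: worker A emits 75% of the stream, worker B 25%.
        double[] upstream = {750.0, 250.0};
        System.out.println("localOrShuffle: " + Arrays.toString(localOrShuffle(upstream, 4)));
        System.out.println("shuffle:        " + Arrays.toString(shuffle(upstream, 4)));
    }
}
```

Under these assumptions the executors on worker A each see 187.5 tuples and those on worker B 62.5, while a plain shuffle gives all eight executors 125 each. In a real topology the switch would be declaring the bolt's input with shuffleGrouping instead of localOrShuffleGrouping, at the cost of more tuples crossing worker boundaries.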





-- 
Twitter: @nathanmarz
http://nathanmarz.com