Posted to common-user@hadoop.apache.org by Nathan Marz <na...@rapleaf.com> on 2009/02/28 02:25:57 UTC

Shuffle speed?

The Hadoop shuffle phase seems painfully slow. For example, I am
running a very large job, and all the reducers report a status such as:

"reduce > copy (14266 of 28243 at 1.30 MB/s)"

This is after all the mappers are finished. Is it supposed to be so  
slow?


Re: Shuffle speed?

Posted by hc busy <hc...@gmail.com>.
A few things have caused this to happen to me in the past.

Make sure to check that the job is actually making progress. Sometimes the
slowness is the result of negative progress: the reduce gets to, say, 10%
complete and then drops back down to 5%. In that case the status line can
report a misleadingly low throughput rate.
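
For what it's worth, here is a rough sketch of how you could watch for that
with the old org.apache.hadoop.mapred client API. The job ID is just a
placeholder, and the polling interval is arbitrary:

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

public class WatchReduceProgress {
  public static void main(String[] args) throws Exception {
    // Poll a running job and flag any drop in reduce progress,
    // which is the "negative progress" described above.
    JobClient client = new JobClient(new JobConf());
    RunningJob job = client.getJob(JobID.forName("job_200902280001_0001")); // placeholder ID
    float last = -1f;
    while (!job.isComplete()) {
      float progress = job.reduceProgress();
      if (progress < last) {
        System.out.println("Reduce progress went backwards: " + last + " -> " + progress);
      }
      last = progress;
      Thread.sleep(30000); // check every 30 seconds
    }
  }
}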

Changing a few of the settings below did improve things, but ultimately what
fixed it for us was buying more hardware.

;-)

On Sun, Mar 1, 2009 at 10:21 PM, Jothi Padmanabhan <jo...@yahoo-inc.com> wrote:

> There are a lot of factors that affect shuffle speed.
>
> Some of them are:
>
> 1. The number of reducers running concurrently on a node
> 2. The number of parallel copier threads pulling in map data
> (mapred.reduce.parallel.copies)
> 3. The size of the individual map outputs. If map outputs are huge, they are
> shuffled to disk, and there can be contention if several files are
> written to disk at the same time
> 4. The size of the buffer reserved for map outputs on the reducer side
> (mapred.job.shuffle.input.buffer.percent).
>
> Jothi
>
>
>
> On 2/28/09 6:55 AM, "Nathan Marz" <na...@rapleaf.com> wrote:
>
> > The Hadoop shuffle phase seems painfully slow. For example, I am
> > running a very large job, and all the reducers report a status such as:
> >
> > "reduce > copy (14266 of 28243 at 1.30 MB/s)"
> >
> > This is after all the mappers are finished. Is it supposed to be so
> > slow?
> >
>
>

Re: Shuffle speed?

Posted by Jothi Padmanabhan <jo...@yahoo-inc.com>.
There are a lot of factors that affect shuffle speed.

Some of them are:

1. The number of reducers running concurrently on a node
2. The number of parallel copier threads pulling in map data
(mapred.reduce.parallel.copies)
3. The size of the individual map outputs. If map outputs are huge, they are
shuffled to disk, and there can be contention if several files are
written to disk at the same time
4. The size of the buffer reserved for map outputs on the reducer side
(mapred.job.shuffle.input.buffer.percent).
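
As an illustration, items 2 and 4 can be overridden per job through JobConf
(old org.apache.hadoop.mapred API). The values below are only examples, not
recommendations; the right numbers depend on your cluster and on the size of
the map outputs:

import org.apache.hadoop.mapred.JobConf;

public class ShuffleTuning {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // More copier threads per reducer (the default in this era is 5).
    conf.setInt("mapred.reduce.parallel.copies", 10);
    // Fraction of the reducer heap reserved for buffering map outputs.
    conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);
    System.out.println("parallel copies = "
        + conf.getInt("mapred.reduce.parallel.copies", 5));
    System.out.println("shuffle buffer fraction = "
        + conf.getFloat("mapred.job.shuffle.input.buffer.percent", 0.70f));
  }
}

The same keys can also be set cluster-wide in the site configuration files.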

Jothi



On 2/28/09 6:55 AM, "Nathan Marz" <na...@rapleaf.com> wrote:

> The Hadoop shuffle phase seems painfully slow. For example, I am
> running a very large job, and all the reducers report a status such as:
> 
> "reduce > copy (14266 of 28243 at 1.30 MB/s)"
> 
> This is after all the mappers are finished. Is it supposed to be so
> slow?
>