Posted to mapreduce-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/05/21 18:58:32 UTC

MapReduce shuffle algorithm

I am very interested in a deep understanding of the MapReduce "Shuffle" phase algorithm and implementation.  Are there whitepapers I could read for an explanation?  Or another mailing list for this question?  Obviously there is the code ;-)
john


RE: MapReduce shuffle algorithm

Posted by John Lilley <jo...@redpoint.net>.
Thanks!  I will read the elephant book more thoroughly.
john

From: Bertrand Dechoux [mailto:dechouxb@gmail.com]
Sent: Tuesday, May 21, 2013 1:22 PM
To: user@hadoop.apache.org
Subject: Re: MapReduce shuffle algorithm

An introduction to the subject can be found in the best-known reference:

Hadoop: The Definitive Guide, 3rd Edition
Storage and Analysis at Internet Scale
By Tom White <http://shop.oreilly.com/product/0636920021773.do#tab_04>
Publisher: O'Reilly Media / Yahoo Press
Released: May 2012
Chapter 6 How MapReduce Works -> Shuffle and Sort -> around page 208
http://shop.oreilly.com/product/0636920021773.do

After reading this, you should have a good understanding of the architecture and know that indeed there is no "shuffle phase replication factor" (cf your question on another thread). For the technical details, the code is probably the next step.
Regards
Bertrand


On Tue, May 21, 2013 at 6:58 PM, John Lilley <jo...@redpoint.net> wrote:
I am very interested in a deep understanding of the MapReduce "Shuffle" phase algorithm and implementation.  Are there whitepapers I could read for an explanation?  Or another mailing list for this question?  Obviously there is the code ;-)
john



Re: MapReduce shuffle algorithm

Posted by Bertrand Dechoux <de...@gmail.com>.
An introduction to the subject can be found in the best-known reference:

Hadoop: The Definitive Guide, 3rd Edition
Storage and Analysis at Internet Scale
By Tom White <http://shop.oreilly.com/product/0636920021773.do#tab_04>
Publisher: O'Reilly Media / Yahoo Press
Released: May 2012

Chapter 6 How MapReduce Works -> Shuffle and Sort -> around page 208
http://shop.oreilly.com/product/0636920021773.do

After reading this, you should have a good understanding of the
architecture and know that indeed there is no "shuffle phase replication
factor" (cf your question on another thread). For the technical details,
the code is probably the next step.

Regards

Bertrand



On Tue, May 21, 2013 at 6:58 PM, John Lilley <jo...@redpoint.net> wrote:

> I am very interested in a deep understanding of the MapReduce “Shuffle”
> phase algorithm and implementation.  Are there whitepapers I could read for
> an explanation?  Or another mailing list for this question?  Obviously
> there is the code ;-)
>
> john
