Posted to common-user@hadoop.apache.org by parth <pa...@yahoo.com> on 2010/02/27 12:06:29 UTC
Availability of values in a key in Reduce stage
Hi,
I am confused about one point regarding the reducer. Can anyone guide me on
this?
Once the mappers start generating key-value pairs, will all of them be available in
the reducer only after all mappers have exited? That is, for a key K, will all of its
values be grouped and available in the reducer? Or will the reducer run on a
single key-value pair as soon as it becomes available?
The second option seems highly unrealistic.
Thanks,
Parth
--
View this message in context: http://old.nabble.com/Availability-of-values-in-a-key-in-Reduce-stage-tp27727136p27727136.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Availability of values in a key in Reduce stage
Posted by Amar Kamat <am...@yahoo-inc.com>.
Parth,
The reducer process has two distinct steps:
1. Shuffle
2. Reduce
In the shuffle phase, reducer 'r' does the following:
1. Copies the data generated by all the mappers for reducer 'r'
2. Sorts it
The reduce phase starts after the shuffle phase. In this phase the reducer invokes the reduce() function once for each [k,<v1,v2...>] pair produced by the shuffle phase, so all the values for a key K are grouped and passed to a single reduce() call.
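
The two phases above can be sketched as a toy in-memory model. This is only an illustration in plain Python, not Hadoop code: the names partition, shuffle, run_reducer and count, the byte-sum partitioner, and the sample data are all assumptions made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def partition(key, num_reducers):
    # Hypothetical partitioner: a deterministic stand-in for hashing the key.
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, r, num_reducers):
    # Step 1: copy every pair destined for reducer r from all mappers.
    copied = [
        (k, v)
        for output in map_outputs
        for (k, v) in output
        if partition(k, num_reducers) == r
    ]
    # Step 2: sort by key so equal keys end up adjacent.
    return sorted(copied, key=itemgetter(0))


def run_reducer(map_outputs, r, num_reducers, reduce_fn):
    # The reduce phase begins only after the shuffle has finished.
    sorted_pairs = shuffle(map_outputs, r, num_reducers)
    results = {}
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        values = [v for _, v in group]
        # reduce_fn is called once per key, with all of that key's values.
        results[key] = reduce_fn(key, values)
    return results


# Output of three imaginary mappers (sample data invented for this sketch).
map_outputs = [
    [("apple", 1), ("pear", 1)],
    [("apple", 1), ("fig", 1)],
    [("pear", 1), ("apple", 1)],
]

calls = []  # records each reduce call so the grouping can be inspected


def count(key, values):
    calls.append((key, list(values)))
    return sum(values)


totals = run_reducer(map_outputs, r=0, num_reducers=1, reduce_fn=count)
print(totals)
print(calls)
```

With a single reducer, calls ends up with exactly one entry per key, each holding every value that any mapper emitted for that key. That is the first option in the question, not the second.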
Amar