Posted to mapreduce-user@hadoop.apache.org by Todd <bi...@163.com> on 2014/12/21 14:59:07 UTC

Question about shuffle/merge/sort phrase

Hi, Hadoopers,
I have a question about the shuffle/sort/merge phases.
My understanding is that the shuffle transfers the mapper output (key/value pairs) from the mapper nodes to the reducer nodes, the merge phase merges the mapper output from all mapper nodes, and the sort phase sorts the key/value pairs by key.
My question is: which component is responsible for bringing each key together with all its values? (The reducer's input is a key and an iterable of values.)

Thanks.

Re: Re: Question about shuffle/merge/sort phrase

Posted by Sandeep Khurana <sk...@gmail.com>.
After the reducer has the data, it does sorting and merging at its end
too. After merging (bringing data of the same key together), it passes each
key and the collection of values for that key to the reducer one by one, as
you said. The sort and merge on the reducer side are what bring records of
the same key together (after sorting).

Have a look at the diagram at
http://hadoop-gyan.blogspot.in/2012/11/map-reduce-shuffle-and-sort.html
(it's not my blog).
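
To make the reducer-side merge and grouping concrete, here is a minimal
sketch in plain Python. It is not Hadoop code: the sample data and the
function name merge_and_group are made up for illustration, and it assumes
each map output arrives already sorted by key.

```python
import heapq
from itertools import groupby
from operator import itemgetter

# Hypothetical sorted outputs fetched from three map tasks for one reducer.
map_outputs = [
    [("apple", 1), ("cat", 1)],
    [("apple", 1), ("bird", 1)],
    [("bird", 1), ("cat", 1), ("cat", 1)],
]

def merge_and_group(sorted_runs):
    """Merge runs that are already sorted by key, then yield (key, [values])."""
    # A k-way merge keeps the combined stream sorted by key.
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    # In a sorted stream, equal keys are adjacent, so grouping is a single pass.
    for key, pairs in groupby(merged, key=itemgetter(0)):
        yield key, [value for _, value in pairs]

grouped = list(merge_and_group(map_outputs))
for key, values in grouped:
    print(key, values)
```

The point of the sketch is that once the merged stream is sorted, all records
for one key sit next to each other, so collecting the values for a key needs
no extra lookup structure.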

On Mon, Dec 22, 2014 at 10:52 AM, bit1129@163.com <bi...@163.com> wrote:

> Then what exactly happens after Reducer pulls all mapper output key/value
> pairs from all the mapper nodes before reducer see the
> <key,value1,value2..>?
>
> ------------------------------
> bit1129@163.com
>
>
> *From:* Susheel Kumar Gadalay <sk...@gmail.com>
> *Date:* 2014-12-22 13:20
> *To:* user <us...@hadoop.apache.org>
> *Subject:* Re: Question about shuffle/merge/sort phrase
> Sorry, typo
>
> It is the reducer which will pull the mapper o/p as soon as it completes.
>
> On 12/22/14, Susheel Kumar Gadalay <sk...@gmail.com> wrote:
> > It is the mapper which will push the o/p to the respective reducer as
> > soon as it completes.
> >
> > The no of reducers are known at the beginning itself.
> > The mapper as it process the input split, generate the o/p of for each
> > reducer (if the mapper o/p key is eligible for the reducer).
> > The reducer will wait till the completion of all map tasks to start it
> > processing.
> >
> >
> > On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
> >> Could some one help me on this question? thanks.
> >>
> >>
> >>
> >> bit1129@163.com
> >>
> >> From: Todd
> >> Sent: 2014-12-21 21:59
> >> To: user@hadoop.apache.org
> >> Subject: Question about shuffle/merge/sort phrase
> >> Hi, Hadoopers,
> >> I got a question about shuffle/sort/merge phrase related..
> >> My understanding is that shuffle is used to transfer the mapper
> >> output(key/value pairs) from mapper node to reducer node, and merge
> >> phrase
> >> is used to merge all the mapper output from all mapper nodes, and sort
> >> phrase is used to sort the key/value pair by key,
> >> Then my question, whose responsibility is it that brings each key with
> >> all
> >> its values together (The reducer's input is a key and an iterative
> >> values).
> >>
> >>
> >> Thanks.
> >>
> >
>
>


-- 
Thanks and regards
Sandeep Khurana

Re: Re: Question about shuffle/merge/sort phrase

Posted by Srivathsala Chary Vangeepuram <sr...@gmail.com>.
Todd:

1. The map task emits key/value pairs in sorted order.
2. Shuffle is actually the copy phase in the reduce task.
3. The reduce task then performs a merge operation on the intermediate
key/value pairs from the map output.
4. The reduce task builds the iterable list of values for each key.
I was trying to understand which method does this in Reduce task. I'll come
back to you. (I think base code is in Task.java)

Regards,
Chary
On Sun, Dec 21, 2014 at 9:40 PM, Susheel Kumar Gadalay <sk...@gmail.com>
wrote:

> What I explained is shuffle phase.
>
> After the reducer pulls the data, it does a sort on the key part only
> and calls the corresponding reduce method.
> On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
> > Then what exactly happens after Reducer pulls all mapper output key/value
> > pairs from all the mapper nodes before reducer see the
> > <key,value1,value2..>?
> >
> >
> >
> > bit1129@163.com
> >
> > From: Susheel Kumar Gadalay
> > Date: 2014-12-22 13:20
> > To: user
> > Subject: Re: Question about shuffle/merge/sort phrase
> > Sorry, typo
> >
> > It is the reducer which will pull the mapper o/p as soon as it completes.
> >
> > On 12/22/14, Susheel Kumar Gadalay <sk...@gmail.com> wrote:
> >> It is the mapper which will push the o/p to the respective reducer as
> >> soon as it completes.
> >>
> >> The no of reducers are known at the beginning itself.
> >> The mapper as it process the input split, generate the o/p of for each
> >> reducer (if the mapper o/p key is eligible for the reducer).
> >> The reducer will wait till the completion of all map tasks to start it
> >> processing.
> >>
> >>
> >> On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
> >>> Could some one help me on this question? thanks.
> >>>
> >>>
> >>>
> >>> bit1129@163.com
> >>>
> >>> From: Todd
> >>> Sent: 2014-12-21 21:59
> >>> To: user@hadoop.apache.org
> >>> Subject: Question about shuffle/merge/sort phrase
> >>> Hi, Hadoopers,
> >>> I got a question about shuffle/sort/merge phrase related..
> >>> My understanding is that shuffle is used to transfer the mapper
> >>> output(key/value pairs) from mapper node to reducer node, and merge
> >>> phrase
> >>> is used to merge all the mapper output from all mapper nodes, and sort
> >>> phrase is used to sort the key/value pair by key,
> >>> Then my question, whose responsibility is it that brings each key with
> >>> all
> >>> its values together (The reducer's input is a key and an iterative
> >>> values).
> >>>
> >>>
> >>> Thanks.
> >>>
> >>
> >
>

Re: Re: Question about shuffle/merge/sort phrase

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
What I explained is the shuffle phase.

After the reducer pulls the data, it does a sort on the key part only
and calls the corresponding reduce method.
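
As a rough illustration of "sort on the key part only, then call reduce",
here is a small plain-Python sketch. It is not taken from the Hadoop source:
reduce_fn and run_reducer are hypothetical names, and an in-memory sort
stands in for whatever the framework actually does with the fetched data.

```python
from itertools import groupby
from operator import itemgetter

def reduce_fn(key, values):
    # Hypothetical user reduce function: sum the counts for one key.
    return key, sum(values)

def run_reducer(fetched_pairs):
    """Sort by key only, then call reduce_fn once per distinct key."""
    # Only the key takes part in the comparison; values are never compared.
    ordered = sorted(fetched_pairs, key=itemgetter(0))
    output = []
    for key, group in groupby(ordered, key=itemgetter(0)):
        # Hand the values over lazily, similar to an iterable of values.
        values = (value for _, value in group)
        output.append(reduce_fn(key, values))
    return output

fetched = [("dog", 2), ("ant", 1), ("dog", 5), ("cat", 3), ("ant", 4)]
totals = run_reducer(fetched)
print(totals)
```

Since only keys are compared, the sketch makes no promise about the order of
values within one key beyond what the input order happens to give.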
On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
> Then what exactly happens after Reducer pulls all mapper output key/value
> pairs from all the mapper nodes before reducer see the
> <key,value1,value2..>?
>
>
>
> bit1129@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-22 13:20
> To: user
> Subject: Re: Question about shuffle/merge/sort phrase
> Sorry, typo
>
> It is the reducer which will pull the mapper o/p as soon as it completes.
>
> On 12/22/14, Susheel Kumar Gadalay <sk...@gmail.com> wrote:
>> It is the mapper which will push the o/p to the respective reducer as
>> soon as it completes.
>>
>> The no of reducers are known at the beginning itself.
>> The mapper as it process the input split, generate the o/p of for each
>> reducer (if the mapper o/p key is eligible for the reducer).
>> The reducer will wait till the completion of all map tasks to start it
>> processing.
>>
>>
>> On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
>>> Could some one help me on this question? thanks.
>>>
>>>
>>>
>>> bit1129@163.com
>>>
>>> From: Todd
>>> Sent: 2014-12-21 21:59
>>> To: user@hadoop.apache.org
>>> Subject: Question about shuffle/merge/sort phrase
>>> Hi, Hadoopers,
>>> I got a question about shuffle/sort/merge phrase related..
>>> My understanding is that shuffle is used to transfer the mapper
>>> output(key/value pairs) from mapper node to reducer node, and merge
>>> phrase
>>> is used to merge all the mapper output from all mapper nodes, and sort
>>> phrase is used to sort the key/value pair by key,
>>> Then my question, whose responsibility is it that brings each key with
>>> all
>>> its values together (The reducer's input is a key and an iterative
>>> values).
>>>
>>>
>>> Thanks.
>>>
>>
>

Re: Re: Question about shuffle/merge/sort phrase

Posted by "bit1129@163.com" <bi...@163.com>.
Then what exactly happens after the reducer pulls the mapper output key/value pairs from all the mapper nodes and before the reducer sees <key, value1, value2, ...>?



bit1129@163.com
 
From: Susheel Kumar Gadalay
Date: 2014-12-22 13:20
To: user
Subject: Re: Question about shuffle/merge/sort phrase
Sorry, typo
 
It is the reducer which will pull the mapper o/p as soon as it completes.
 
On 12/22/14, Susheel Kumar Gadalay <sk...@gmail.com> wrote:
> It is the mapper which will push the o/p to the respective reducer as
> soon as it completes.
>
> The no of reducers are known at the beginning itself.
> The mapper as it process the input split, generate the o/p of for each
> reducer (if the mapper o/p key is eligible for the reducer).
> The reducer will wait till the completion of all map tasks to start it
> processing.
>
>
> On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
>> Could some one help me on this question? thanks.
>>
>>
>>
>> bit1129@163.com
>>
>> From: Todd
>> Sent: 2014-12-21 21:59
>> To: user@hadoop.apache.org
>> Subject: Question about shuffle/merge/sort phrase
>> Hi, Hadoopers,
>> I got a question about shuffle/sort/merge phrase related..
>> My understanding is that shuffle is used to transfer the mapper
>> output(key/value pairs) from mapper node to reducer node, and merge
>> phrase
>> is used to merge all the mapper output from all mapper nodes, and sort
>> phrase is used to sort the key/value pair by key,
>> Then my question, whose responsibility is it that brings each key with
>> all
>> its values together (The reducer's input is a key and an iterative
>> values).
>>
>>
>> Thanks.
>>
>

Re: Re: Question about shuffle/merge/sort phrase

Posted by "bit1129@163.com" <bi...@163.com>.
Then what exactly happens after Reducer pulls all mapper output key/value pairs from all the mapper nodes before reducer see the <key,value1,value2..>?



bit1129@163.com
 
From: Susheel Kumar Gadalay
Date: 2014-12-22 13:20
To: user
Subject: Re: Question about shuffle/merge/sort phrase
Sorry, typo
 
It is the reducer which will pull the mapper o/p as soon as it completes.
 
On 12/22/14, Susheel Kumar Gadalay <sk...@gmail.com> wrote:
> It is the mapper which will push the o/p to the respective reducer as
> soon as it completes.
>
> The no of reducers are known at the beginning itself.
> The mapper as it process the input split, generate the o/p of for each
> reducer (if the mapper o/p key is eligible for the reducer).
> The reducer will wait till the completion of all map tasks to start it
> processing.
>
>
> On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
>> Could someone help me with this question? Thanks.
>>
>>
>>
>> bit1129@163.com
>>
>> From: Todd
>> Sent: 2014-12-21 21:59
>> To: user@hadoop.apache.org
>> Subject: Question about shuffle/merge/sort phrase
>> Hi, Hadoopers,
>> I have a question about the shuffle/sort/merge phases.
>> My understanding is that shuffle transfers the mapper
>> output (key/value pairs) from the mapper nodes to the reducer nodes, the
>> merge phase merges the mapper output from all mapper nodes, and the sort
>> phase sorts the key/value pairs by key.
>> My question: whose responsibility is it to bring each key together with
>> all its values? (The reducer's input is a key and an iterable of
>> values.)
>>
>>
>> Thanks.
>>
>

Re: Question about shuffle/merge/sort phrase

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
Sorry, typo

It is the reducer that pulls the mapper output as soon as each map task completes.

On 12/22/14, Susheel Kumar Gadalay <sk...@gmail.com> wrote:
> It is the mapper that pushes its output to the respective reducer as
> soon as it completes.
>
> The number of reducers is known from the start.
> As the mapper processes its input split, it generates output for each
> reducer (if the mapper output key is eligible for that reducer).
> The reducer waits until all map tasks have completed before it starts
> processing.
>
>
> On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
>> Could someone help me with this question? Thanks.
>>
>>
>>
>> bit1129@163.com
>>
>> From: Todd
>> Sent: 2014-12-21 21:59
>> To: user@hadoop.apache.org
>> Subject: Question about shuffle/merge/sort phrase
>> Hi, Hadoopers,
>> I have a question about the shuffle/sort/merge phases.
>> My understanding is that shuffle transfers the mapper
>> output (key/value pairs) from the mapper nodes to the reducer nodes, the
>> merge phase merges the mapper output from all mapper nodes, and the sort
>> phase sorts the key/value pairs by key.
>> My question: whose responsibility is it to bring each key together with
>> all its values? (The reducer's input is a key and an iterable of
>> values.)
>>
>>
>> Thanks.
>>
>

Re: Question about shuffle/merge/sort phrase

Posted by Susheel Kumar Gadalay <sk...@gmail.com>.
It is the mapper that pushes its output to the respective reducer as
soon as it completes.

The number of reducers is known from the start.
As the mapper processes its input split, it generates output for each
reducer (if the mapper output key is eligible for that reducer).
The reducer waits until all map tasks have completed before it starts
processing.
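
[Editor's sketch] The rule that decides which reducer a mapper output key is "eligible" for can be illustrated as below. This is a minimal Python sketch, assuming the behaviour of a hash partitioner (non-negative key hash modulo the number of reduce tasks). The names java_string_hash, partition_for and partition_map_output are hypothetical, and the hash only mimics Java's String.hashCode() for simple string keys.

```python
def java_string_hash(key: str) -> int:
    """Mimic Java's String.hashCode() as a signed 32-bit int (assumption: BMP characters only)."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value to Java's signed range.
    return h - 0x100000000 if h >= 0x80000000 else h


def partition_for(key: str, num_reducers: int) -> int:
    """Pick a reducer index for a key (assumed hash-partitioning rule)."""
    # Clear the sign bit so the modulo result is never negative.
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


def partition_map_output(pairs, num_reducers):
    """Split one mapper's (key, value) pairs into one bucket per reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition_for(key, num_reducers)].append((key, value))
    return buckets
```

Because the partition depends only on the key and the reducer count, every mapper sends a given key to the same reducer, which is what later allows all values of that key to be brought together in one place.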


On 12/22/14, bit1129@163.com <bi...@163.com> wrote:
> Could someone help me with this question? Thanks.
>
>
>
> bit1129@163.com
>
> From: Todd
> Sent: 2014-12-21 21:59
> To: user@hadoop.apache.org
> Subject: Question about shuffle/merge/sort phrase
> Hi, Hadoopers,
> I have a question about the shuffle/sort/merge phases.
> My understanding is that shuffle transfers the mapper
> output (key/value pairs) from the mapper nodes to the reducer nodes, the merge
> phase merges the mapper output from all mapper nodes, and the sort
> phase sorts the key/value pairs by key.
> My question: whose responsibility is it to bring each key together with all
> its values? (The reducer's input is a key and an iterable of values.)
>
>
> Thanks.
>

Re: Question about shuffle/merge/sort phrase

Posted by "bit1129@163.com" <bi...@163.com>.
Could someone help me with this question? Thanks.



bit1129@163.com
 
From: Todd
Sent: 2014-12-21 21:59
To: user@hadoop.apache.org
Subject: Question about shuffle/merge/sort phrase
Hi, Hadoopers,
I have a question about the shuffle/sort/merge phases.
My understanding is that shuffle transfers the mapper output (key/value pairs) from the mapper nodes to the reducer nodes, the merge phase merges the mapper output from all mapper nodes, and the sort phase sorts the key/value pairs by key.
My question: whose responsibility is it to bring each key together with all its values? (The reducer's input is a key and an iterable of values.)

Thanks.

Re: RE: Question about shuffle/merge/sort phrase

Posted by "bit1129@163.com" <bi...@163.com>.
Thanks Rohith,

My question is about the reducer side; it is not related to the combiner (which runs on the mapper node).

Once all the mappers' output key/value pairs have been shuffled to the reducer nodes, three things should happen:
1. The mappers' output key/value pairs from all the mapper nodes are merged together.
2. The key/value pairs are sorted by key.
3. All the values of the same key form an iterable collection, in a format like <key, value1, value2, value3, ...>.
My question is: who is responsible for forming this iterable collection?
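
[Editor's sketch] The three steps listed above can be illustrated in a few lines of Python. This is only a sketch under stated assumptions, not Hadoop's actual implementation: it assumes each mapper's output arrives already sorted by key, so the reducer side only needs a k-way merge followed by a single pass that groups adjacent equal keys. The names merge_and_group and run_reduce are hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_and_group(sorted_runs):
    """Yield (key, values_iterator) from several key-sorted map outputs."""
    # Steps 1 and 2: every run is already sorted by key, so a k-way merge
    # yields one globally key-sorted stream without a full re-sort.
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    # Step 3: equal keys are now adjacent, so one pass can hand each key
    # to the reduce function together with an iterator over its values.
    for key, group in groupby(merged, key=itemgetter(0)):
        yield key, (value for _, value in group)


def run_reduce(sorted_runs, reduce_fn):
    """Call reduce_fn(key, values) once per distinct key, in key order."""
    return [reduce_fn(key, values) for key, values in merge_and_group(sorted_runs)]
```

For example, run_reduce([[("a", 1), ("c", 1)], [("a", 1), ("b", 1)]], lambda k, vs: (k, sum(vs))) calls the reduce function three times, once each for "a", "b" and "c". In this sketch the grouping is done by the surrounding framework code while it walks the sorted stream, not by the user's reduce function, which matches the reply elsewhere in this thread that the reducer-side sort and merge bring records of the same key together.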

Thanks. 



bit1129@163.com
 
From: Rohith Sharma K S
Date: 2014-12-22 12:23
To: user@hadoop.apache.org
Subject: RE: Question about shuffle/merge/sort phrase
Whose responsibility is it to bring each key together with all its values?
>> You can set a combiner class in your job. For more information, refer to
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
 
Thanks & Regards
Rohith Sharma K S
 
This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!
 
From: Todd [mailto:bit1129@163.com] 
Sent: 21 December 2014 19:29
To: user@hadoop.apache.org
Subject: Question about shuffle/merge/sort phrase
 
Hi, Hadoopers,
I have a question about the shuffle/sort/merge phases.
My understanding is that shuffle transfers the mapper output (key/value pairs) from the mapper nodes to the reducer nodes, the merge phase merges the mapper output from all mapper nodes, and the sort phase sorts the key/value pairs by key.
My question: whose responsibility is it to bring each key together with all its values? (The reducer's input is a key and an iterable of values.)

Thanks.

RE: Question about shuffle/merge/sort phrase

Posted by Rohith Sharma K S <ro...@huawei.com>.
Whose responsibility is it to bring each key together with all its values?
>> You can set a combiner class in your job. For more information, refer to
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
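
[Editor's sketch] For context, a combiner is an optional pre-aggregation step that runs on the output of a single mapper before the shuffle. The following Python sketch illustrates the idea only; it is not Hadoop's API, and the name combine_locally is hypothetical.

```python
from collections import defaultdict


def combine_locally(map_output, combine_fn):
    """Pre-aggregate one mapper's (key, value) pairs before the shuffle."""
    values_by_key = defaultdict(list)
    for key, value in map_output:
        values_by_key[key].append(value)
    # One combined record per key leaves this mapper, in key order.
    return [(key, combine_fn(values)) for key, values in sorted(values_by_key.items())]
```

Note that each mapper still emits its own record for a key it has seen, so two mappers that both produce the key "a" send two separate ("a", n) records. The reducer side therefore still has to merge those records and group them by key; a combiner reduces the data volume but does not replace that step.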

Thanks & Regards
Rohith Sharma K S

From: Todd [mailto:bit1129@163.com]
Sent: 21 December 2014 19:29
To: user@hadoop.apache.org
Subject: Question about shuffle/merge/sort phrase

Hi, Hadoopers,
I have a question about the shuffle/sort/merge phases.
My understanding is that shuffle transfers the mapper output (key/value pairs) from the mapper nodes to the reducer nodes, the merge phase merges the mapper output from all mapper nodes, and the sort phase sorts the key/value pairs by key.
My question: whose responsibility is it to bring each key together with all its values? (The reducer's input is a key and an iterable of values.)

Thanks.
