Posted to mapreduce-dev@hadoop.apache.org by 俞盛朋 <th...@gmail.com> on 2012/08/20 04:41:21 UTC

Re: Re: Significance of file.out.index during Shuffle Phase ?

Oh sorry, I misunderstood your question. Please disregard what I said.

-----Original Message-----
From: Pavan Kulkarni [mailto:pavan.baburao@gmail.com]
Sent: August 20, 2012 9:48
To: mapreduce-dev@hadoop.apache.org
Subject: Re: Re: Significance of file.out.index during Shuffle Phase ?

Hi,

  But I don't see those files during execution. I only see file.out in
the job_ID/attemptID/output/ folder.

On Sun, Aug 19, 2012 at 8:44 PM, 俞盛朋 <th...@gmail.com> wrote:

> The MapReduce program creates one output file for each reducer,
> named "part-xxxxxx" by default.
>
> -----Original Message-----
> From: Pavan Kulkarni [mailto:pavan.baburao@gmail.com]
> Sent: August 19, 2012 23:58
> To: mapreduce-dev@hadoop.apache.org
> Subject: Re: Significance of file.out.index during Shuffle Phase ?
>
> Ohh, thanks a lot Harsh. Exactly what I was looking for.
> I wanted to create different file.out's for different reducers,
> something like file.out.1 for reducer 1, file.out.2 for reducer 2, etc.
> Is it possible to do this in the MapReduce program, or do I need to
> tweak some Hadoop source files for that? Thanks.
>
> On Sun, Aug 19, 2012 at 7:02 AM, Harsh J <ha...@cloudera.com> wrote:
>
> > Hey Pavan,
> >
> > Yes you've got it almost right on how file.out is served to each 
> > reducer. See the code at
> >
> > http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java?view=markup
> > (Method under L502:L565 that sends data for a specific 
> > reduce/partition ID (integer)).
> >
> > On Sun, Aug 19, 2012 at 9:05 AM, Pavan Kulkarni <pa...@gmail.com> wrote:
> > > Hi,
> > >
> > >   I was trying to understand how exactly the reducers find out how
> > > to fetch the data of their own partitions from the Map nodes.
> > > During the execution of MapReduce, I see that *file.out* is
> > > created on the Map nodes, so my question is: how does a reducer
> > > know what part of file.out to fetch? Does *file.out.index* play
> > > any role?
> > > Any help is appreciated. Thanks
> > >
> > >
> > >
> > > --With Regards
> > > Pavan Kulkarni
> >
> >
> >
> > --
> > Harsh J
> >
>
>
>
> --
>
> --With Regards
> Pavan Kulkarni
>
>


-- 

--With Regards
Pavan Kulkarni
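
For anyone reading this thread later, here is a rough sketch of the idea behind Harsh's pointer: the index file lets the map side hand each reducer only its own slice of the single file.out. This is not the ShuffleHandler code. It assumes a simplified index layout of three 8-byte big-endian longs per partition (start offset, raw length, on-disk length) and ignores checksums and the record framing of the real files. The class and method names (IndexLookupSketch, buildIndex, readIndexRecord, slicePartition) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class IndexLookupSketch {
    // Assumed layout: {startOffset, rawLength, partLength}, 8 bytes each.
    static final int RECORD_LENGTH = 3 * Long.BYTES;

    /** Builds an in-memory stand-in for file.out.index from {start, rawLen, partLen} triples. */
    static byte[] buildIndex(long[][] records) {
        ByteBuffer buf = ByteBuffer.allocate(records.length * RECORD_LENGTH);
        for (long[] r : records) {
            buf.putLong(r[0]).putLong(r[1]).putLong(r[2]);
        }
        return buf.array();
    }

    /** Returns {startOffset, rawLength, partLength} for the given partition ID. */
    static long[] readIndexRecord(byte[] index, int partition) {
        ByteBuffer buf = ByteBuffer.wrap(index);
        int pos = partition * RECORD_LENGTH;
        return new long[] { buf.getLong(pos), buf.getLong(pos + 8), buf.getLong(pos + 16) };
    }

    /** Copies only the bytes that belong to one partition out of the map output. */
    static byte[] slicePartition(byte[] mapOutput, byte[] index, int partition) {
        long[] rec = readIndexRecord(index, partition);
        int start = (int) rec[0];
        int length = (int) rec[2];
        return Arrays.copyOfRange(mapOutput, start, start + length);
    }

    public static void main(String[] args) {
        // Stand-in for file.out: three partitions stored back to back.
        byte[] mapOutput = "aaaabbbbbbcc".getBytes(StandardCharsets.US_ASCII);
        byte[] index = buildIndex(new long[][] { {0, 4, 4}, {4, 6, 6}, {10, 2, 2} });

        // A reducer asking for partition 1 gets only the middle slice.
        byte[] slice = slicePartition(mapOutput, index, 1);
        System.out.println(new String(slice, StandardCharsets.US_ASCII)); // prints "bbbbbb"
    }
}
```

Under these assumptions there is no need for a separate file.out.N per reducer: one data file plus one small index is enough, because the partition ID in the fetch request selects the index record and the record selects the byte range.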