Posted to common-dev@hadoop.apache.org by Pavan Kulkarni <pa...@gmail.com> on 2012/07/17 02:20:12 UTC

Where are the Map-output files produced?

Hi,

I am trying to create hard links between the files produced by the Map
phase and the Reducer nodes, all of which sit behind Lustre, so that
the entire copy phase of the shuffle is eliminated.
To create these hard links I need the exact, fully qualified filenames
of the partitioned Map outputs, and also the path of the file on the
Reducer node that each output is copied to.
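
As a minimal sketch of the hard-link step itself (not of anything inside
Hadoop), assuming a Java 7 or later runtime where java.nio.file is
available, and assuming both paths live on the same file system: the
class name HardLinkSketch, the method linkMapOutput and the map-side
name "file.out" are hypothetical placeholders, and the temp directory
stands in for the shared Lustre mount. Only the reduce-side name
output/map_1.out-2 comes from the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: the paths below are hypothetical stand-ins, not Hadoop's real layout.
class HardLinkSketch {

    // Creates a hard link at reducerSide that refers to the same data as mapOutput.
    // Both paths must be on the same file system, otherwise createLink fails.
    static Path linkMapOutput(Path mapOutput, Path reducerSide) throws IOException {
        Files.createDirectories(reducerSide.getParent());
        return Files.createLink(reducerSide, mapOutput);
    }

    public static void main(String[] args) throws IOException {
        // A temp directory plays the role of the shared mount (assumption).
        Path root = Files.createTempDirectory("shuffle-sketch");

        // Hypothetical map-side file; the real name is what the question asks about.
        Path mapOutput = root.resolve("map-side").resolve("file.out");
        Files.createDirectories(mapOutput.getParent());
        Files.write(mapOutput, "partition data".getBytes(StandardCharsets.UTF_8));

        // Reduce-side name as observed in the question.
        Path reducerSide = root.resolve("reduce-side").resolve("output").resolve("map_1.out-2");
        linkMapOutput(mapOutput, reducerSide);

        // Both names now refer to the same underlying file; nothing was copied.
        System.out.println(Files.isSameFile(mapOutput, reducerSide));
    }
}
```

With a link like this no bytes are copied, which is the saving being
sought; the open part is still which source and target paths to pass in.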
I am working with hadoop-1.0.2, where the entire copy happens in the
ReduceTask.java class.
I see that the files on the Reduce node into which the data is read are
named like *output/map_1.out-2*.
Is this correct?
Also, I couldn't find the fully qualified paths of the files on the Map
side, i.e. the names of the partitioned Map-output files.
Does anyone know how to find the fully qualified pathnames of these
files?
Any help is highly appreciated. Thanks.

-- 

--With Regards
Pavan Kulkarni