Posted to mapreduce-dev@hadoop.apache.org by ch huang <ju...@gmail.com> on 2013/12/20 06:44:54 UTC

Problem using MapReduce to generate a big file

Hi list,
     I want to produce a big input file (or a collection of big files) using
an MR job. My output is very simple, like the following:

1,text
2,text
3,text
......

The problem is with the output as a whole: its first column is a running
count, but if I have 16 map tasks, the output will be 16 files, such as
xxx_m1_xxx
......
xxx_m15_xxx
I do not know how to guarantee that the first output file contains (if each
file has 2 records)

1,text
2,text

and the second contains
3,text
4,text

so that I can combine them into
1,text
2,text
3,text
4,text
  My idea is to find out the position of the current map task from inside
the mapper, as if there were an array of map tasks and I could get the index
of the current task through the map task context.
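
A minimal sketch of that idea in plain Java, assuming every map task writes
exactly the same number of records and that a zero-based task index is
available. GlobalRowIds, firstId and linesForTask are hypothetical names used
only for illustration; they are not Hadoop APIs. In a real mapper, the index
could be read in setup() with context.getTaskAttemptID().getTaskID().getId().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: derives a global counter
// from a zero-based map task index and a fixed record count per task.
final class GlobalRowIds {

    // First counter value for the given task, assuming every task
    // writes exactly recordsPerTask records and counting starts at 1.
    static long firstId(int taskIndex, long recordsPerTask) {
        return (long) taskIndex * recordsPerTask + 1;
    }

    // The lines one task would write, in the "count,text" format.
    static List<String> linesForTask(int taskIndex, long recordsPerTask) {
        List<String> lines = new ArrayList<>();
        long id = firstId(taskIndex, recordsPerTask);
        for (long i = 0; i < recordsPerTask; i++) {
            lines.add((id + i) + ",text");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Simulate two map tasks with 2 records each.
        for (int task = 0; task < 2; task++) {
            for (String line : linesForTask(task, 2)) {
                System.out.println(line);
            }
        }
    }
}
```

With 2 records per task, task 0 writes 1,text and 2,text, and task 1 writes
3,text and 4,text, so concatenating the files in task-index order gives one
continuous count. If tasks can write different numbers of records, this fixed
offset no longer works and the per-task counts would have to be known first.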

Can anyone help?