Posted to mapreduce-issues@hadoop.apache.org by Sheetal Gosrani <sg...@barracuda.com> on 2012/10/18 23:52:10 UTC

MultipleOutputs writes to only 1 output file even if configured to write to 2 output files

Hello,

I am trying to read from Cassandra and write the reducer's output to multiple output files using the MultipleOutputs API. The output formats in my case are custom formats extending FileOutputFormat. I have configured my job in a manner similar to the one shown in the MultipleOutputs javadoc: http://hadoop.apache.org/docs/r1.0.3/api/index.html

However, when I run the job, I get only one output file, named part-r-0000, in text output format. (If job.setOutputFormatClass is not set, TextOutputFormat is used by default.) The job completely ignores the output formats I specified in MultipleOutputs.addNamedOutput(job, "format1", MyCustomFileFormat1.class, Text.class, Text.class) and MultipleOutputs.addNamedOutput(job, "format2", MyCustomFileFormat2.class, Text.class, Text.class). Is anyone else facing a similar problem, or am I doing something wrong?

I also tried writing a very simple MR program that reads from a text file and writes its output in two formats, TextOutputFormat and SequenceFileOutputFormat, as shown in the MultipleOutputs API documentation. No luck there either: I still get only one output file, in text output format.
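
For reference, here is a minimal sketch of the kind of two-format test program described above. It is not the original code: the class names (TwoFormatsJob, LineMapper, TwoFormatsReducer) and the named outputs "text" and "seq" are made up for illustration, and it assumes the new-API classes in org.apache.hadoop.mapreduce.lib.output. It needs the Hadoop jars on the classpath to compile.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical test job: all class and output names are illustrative.
public class TwoFormatsJob {

    // Emits each input line as the key and its byte offset as the value.
    public static class LineMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            context.write(line, new Text(offset.toString()));
        }
    }

    public static class TwoFormatsReducer extends Reducer<Text, Text, Text, Text> {
        private MultipleOutputs<Text, Text> mos;

        @Override
        protected void setup(Context context) {
            mos = new MultipleOutputs<Text, Text>(context);
        }

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            for (Text value : values) {
                // Records reach a named output only through mos.write(...);
                // context.write(...) would send them to the default part-r file.
                mos.write("text", key, value);
                mos.write("seq", key, value);
            }
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            // Closing flushes and closes the writers of the named outputs.
            mos.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // new Job(conf, name) matches the 1.x API; it is deprecated in later releases.
        Job job = new Job(new Configuration(), "two-formats");
        job.setJarByClass(TwoFormatsJob.class);
        job.setMapperClass(LineMapper.class);
        job.setReducerClass(TwoFormatsReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Named outputs must be alphanumeric; each one gets its own output format.
        MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class, Text.class, Text.class);
        MultipleOutputs.addNamedOutput(job, "seq", SequenceFileOutputFormat.class, Text.class, Text.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With a setup like this, the thing worth checking is where the reducer writes: as far as I can tell, addNamedOutput only registers the formats, and a reducer that calls context.write instead of mos.write would still send everything to the default part-r file.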

Can someone help me with this?

Thanks,
Sheetal
