Posted to mapreduce-issues@hadoop.apache.org by Sheetal Gosrani <sg...@barracuda.com> on 2012/10/18 23:52:10 UTC
MultipleOutputs writes to only 1 output file even if configured to write to 2 output files
Hello,
I am trying to read from Cassandra and write the reducer's output to multiple output files using the MultipleOutputs API. The output formats in my case are custom formats extending FileOutputFormat. I have configured my job in a manner similar to the one shown in the MultipleOutputs Javadoc: http://hadoop.apache.org/docs/r1.0.3/api/index.html
However, when I run the job, I get only one output file, named part-r-0000, in text output format. (If job.setOutputFormatClass is not called, the job defaults to TextOutputFormat.) The job completely ignores the output formats I specified in MultipleOutputs.addNamedOutput(job, "format1", MyCustomFileFormat1.class, Text.class, Text.class) and MultipleOutputs.addNamedOutput(job, "format2", MyCustomFileFormat2.class, Text.class, Text.class). Is anyone else facing a similar problem, or am I doing something wrong?
I also tried writing a very simple MapReduce program that reads from a text file and writes its output in two formats, TextOutputFormat and SequenceFileOutputFormat, as shown in the MultipleOutputs Javadoc. No luck there either: I still get only one output file, in text output format.
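One thing that may be worth ruling out with a symptom like this is where the reducer actually writes. With the new-API MultipleOutputs, records reach a named output only when they are written through the MultipleOutputs instance (its write(namedOutput, key, value) method); records written with context.write go to the job's default output, part-r-*. The sketch below is a dependency-free Java model of that routing, not Hadoop code: NamedOutputModel and its methods are hypothetical stand-ins, and the key and value are placeholders. It also assumes a single reduce task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for illustration only; this is NOT a Hadoop class.
// It models which output file a record lands in for one reduce task (partition 0).
class NamedOutputModel {
    // File name -> records written to that file.
    private final Map<String, List<String>> files = new TreeMap<>();

    // Models context.write(key, value): the record goes to the default output.
    void contextWrite(String key, String value) {
        append("part-r-00000", key, value);
    }

    // Models MultipleOutputs.write(namedOutput, key, value):
    // the record goes to a file prefixed with the named output.
    void namedWrite(String namedOutput, String key, String value) {
        append(namedOutput + "-r-00000", key, value);
    }

    private void append(String file, String key, String value) {
        files.computeIfAbsent(file, f -> new ArrayList<>()).add(key + "\t" + value);
    }

    Map<String, List<String>> files() {
        return files;
    }

    // Simulates a reducer emitting one placeholder record, either through the
    // named outputs or through the plain context.
    static Map<String, List<String>> run(boolean useNamedOutputs) {
        NamedOutputModel task = new NamedOutputModel();
        if (useNamedOutputs) {
            task.namedWrite("format1", "k", "v");
            task.namedWrite("format2", "k", "v");
        } else {
            task.contextWrite("k", "v");
        }
        return task.files();
    }

    public static void main(String[] args) {
        System.out.println(run(false).keySet()); // [part-r-00000]
        System.out.println(run(true).keySet());  // [format1-r-00000, format2-r-00000]
    }
}
```

The model leaves out two details of a real job: Hadoop still creates an empty part-r-* file for the default output unless LazyOutputFormat is used, and the MultipleOutputs instance has to be closed (typically in the reducer's cleanup method) so that the named-output files are flushed.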
Can someone help me with this?
Thanks,
Sheetal