Posted to common-user@hadoop.apache.org by Sunil Jagadish <su...@gmail.com> on 2008/11/14 06:38:41 UTC
Writing to multiple output channels
Hi,
I have a mapper which needs to write output into two different kinds of
files (output.collect()).
For my purpose, I do not need any reducers.
public void map(IntWritable key, FeatureVectorWritable value,
                OutputCollector<Text, NullWritable> output,
                Reporter reporter)
    throws IOException
{
    // some processing....
    output.collect(new Text(builder.toString()), NullWritable.get());
    // Ideally I want to do another:
    // output.collect(new Text(builder.toString()), NullWritable.get());
    // but it will all land up in the same part-xxxxx file.
}
Any ideas on what is the right way of implementing such a thing?
Thanks in advance.
- Sunil Jagadish
Re: Writing to multiple output channels
Posted by Owen O'Malley <om...@apache.org>.
On Nov 13, 2008, at 9:38 PM, Sunil Jagadish wrote:
> I have a mapper which needs to write output into two different kinds of
> files (output.collect()).
> For my purpose, I do not need any reducers.
Set the number of reduces to 0.
Open a sequence file in the mapper and write the second stream to it.
You'll end up with two files per mapper.
-- Owen
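A minimal sketch of the approach described above, using the old org.apache.hadoop.mapred API that the original snippet uses. It is an illustration under assumptions, not the poster's code: the class name TwoStreamMapper, the "side-" file prefix, and the "primary:"/"secondary:" record contents are hypothetical, Text stands in for the poster's FeatureVectorWritable, and the task attempt ID is assumed to be available under the "mapred.task.id" property. It needs the Hadoop jars on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper: the first stream goes through output.collect(),
// the second stream goes to a sequence file opened by the mapper itself.
public class TwoStreamMapper extends MapReduceBase
    implements Mapper<IntWritable, Text, Text, NullWritable> {

  private SequenceFile.Writer side;

  @Override
  public void configure(JobConf job) {
    try {
      // Task-local work directory; files written here are moved to the
      // job output directory when the task commits.
      Path workDir = FileOutputFormat.getWorkOutputPath(job);
      // Assumption: the attempt ID is set under this (old-API) property.
      // It keeps the side file name unique per task attempt.
      String attempt = job.get("mapred.task.id");
      Path sidePath = new Path(workDir, "side-" + attempt);
      FileSystem fs = sidePath.getFileSystem(job);
      side = SequenceFile.createWriter(fs, job, sidePath,
                                       Text.class, NullWritable.class);
    } catch (IOException e) {
      throw new RuntimeException("Could not open side file", e);
    }
  }

  @Override
  public void map(IntWritable key, Text value,
                  OutputCollector<Text, NullWritable> output,
                  Reporter reporter) throws IOException {
    // First stream: the regular part-xxxxx file.
    output.collect(new Text("primary:" + value), NullWritable.get());
    // Second stream: the side sequence file.
    side.append(new Text("secondary:" + value), NullWritable.get());
  }

  @Override
  public void close() throws IOException {
    if (side != null) {
      side.close();
    }
  }
}
```

In the job driver, calling setNumReduceTasks(0) on the JobConf makes the job map-only, so each map task leaves one part-xxxxx file and one side file in the output directory.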
Re: Writing to multiple output channels
Posted by Sharad Agarwal <sh...@yahoo-inc.com>.
Sunil Jagadish wrote:
> Hi,
>
> I have a mapper which needs to write output into two different kinds of
> files (output.collect()).
>
Check MultipleOutputFormat. That may help.
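A rough sketch of how MultipleOutputFormat could be used here, via its text subclass MultipleTextOutputFormat from org.apache.hadoop.mapred.lib. The convention of tagging each key as "channel<TAB>payload" and the class name ChannelRoutedOutputFormat are assumptions made for illustration; they are not part of Hadoop or of the original question. It needs the Hadoop jars on the classpath.

```java
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Hypothetical output format: the mapper emits keys of the form
// "channel<TAB>payload", and each channel gets its own subdirectory.
public class ChannelRoutedOutputFormat
    extends MultipleTextOutputFormat<Text, NullWritable> {

  @Override
  protected String generateFileNameForKeyValue(Text key, NullWritable value,
                                               String name) {
    // 'name' is the default leaf file name, e.g. part-00000.
    String k = key.toString();
    int tab = k.indexOf('\t');
    String channel = (tab > 0) ? k.substring(0, tab) : "default";
    return channel + "/" + name;
  }

  @Override
  protected Text generateActualKey(Text key, NullWritable value) {
    // Strip the channel tag so only the payload is written to the file.
    String k = key.toString();
    int tab = k.indexOf('\t');
    return (tab > 0) ? new Text(k.substring(tab + 1)) : key;
  }
}
```

With this class set as the job's output format (JobConf.setOutputFormat) and the number of reduces set to 0, a single output.collect() call per record is enough: the tag in the key decides which file the record lands in. The channel tag is used directly as a path component, so it should be a simple name without slashes.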