Posted to common-user@hadoop.apache.org by Sunil Jagadish <su...@gmail.com> on 2008/11/14 06:38:41 UTC

Writing to multiple output channels

Hi,

I have a mapper which needs to write output into two different kinds of
files (output.collect()).
For my purpose, I do not need any reducers.

public void map(IntWritable key, FeatureVectorWritable value,
                OutputCollector<Text, NullWritable> output, Reporter reporter)
        throws IOException
{
   // some processing....
   output.collect(new Text(builder.toString()), NullWritable.get());
   // Ideally I want to do another:
   //   output.collect(new Text(builder.toString()), NullWritable.get());
   // but it will all land up in the same part-xxxxx file.
}


Any ideas on the right way to implement this?

Thanks in advance.


- Sunil Jagadish

Re: Writing to multiple output channels

Posted by Owen O'Malley <om...@apache.org>.
On Nov 13, 2008, at 9:38 PM, Sunil Jagadish wrote:

> I have a mapper which needs to write output into two different kinds of
> files (output.collect()).
> For my purpose, I do not need any reducers.

Set the number of reduces to 0.
Open a sequence file in the mapper and write the second stream to it.
You'll end up with two files per mapper.
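
A minimal sketch of that approach, assuming the old org.apache.hadoop.mapred
API. The class name TwoStreamMapper, the "side-" file prefix, and the use of
Text as the input value type (standing in for FeatureVectorWritable) are
illustrative assumptions, not part of the original question. It also assumes
the job has an output directory set, since getWorkOutputPath() returns null
otherwise.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper: the first stream goes through output.collect(),
// the second stream goes to a sequence file opened by the task itself.
public class TwoStreamMapper extends MapReduceBase
        implements Mapper<IntWritable, Text, Text, NullWritable> {

    private SequenceFile.Writer side;

    // Driver-side settings: a map-only job, so map output is written directly.
    public static void configureJob(JobConf conf) {
        conf.setMapperClass(TwoStreamMapper.class);
        conf.setNumReduceTasks(0);
    }

    @Override
    public void configure(JobConf job) {
        try {
            // Task-attempt work directory; files written here are moved to
            // the job output directory when the task commits.
            Path workDir = FileOutputFormat.getWorkOutputPath(job);
            // One side file per map task, named after the task's partition id.
            String name = "side-" + job.getInt("mapred.task.partition", 0);
            FileSystem fs = workDir.getFileSystem(job);
            side = SequenceFile.createWriter(
                    fs, job, new Path(workDir, name),
                    Text.class, NullWritable.class);
        } catch (IOException e) {
            throw new RuntimeException("Could not open side file", e);
        }
    }

    @Override
    public void map(IntWritable key, Text value,
                    OutputCollector<Text, NullWritable> output,
                    Reporter reporter) throws IOException {
        // First stream: ends up in the usual part-xxxxx file.
        output.collect(new Text(value), NullWritable.get());
        // Second stream: ends up in the side-N sequence file.
        side.append(new Text(value), NullWritable.get());
    }

    @Override
    public void close() throws IOException {
        if (side != null) {
            side.close();
        }
    }
}
```

Writing under the work output path rather than directly into the job output
directory means a failed or speculative task attempt should not leave a
half-written side file behind.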

-- Owen

Re: Writing to multiple output channels

Posted by Sharad Agarwal <sh...@yahoo-inc.com>.
Sunil Jagadish wrote:
> Hi,
>
> I have a mapper which needs to write output into two different kinds of
> files (output.collect()).
>
Check MultipleOutputFormat. That may help.
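
A rough sketch of how that could look with the old mapred API, using the
MultipleTextOutputFormat subclass from org.apache.hadoop.mapred.lib. The class
name ChannelOutputFormat and the convention of prefixing each key with
"<channel>\t" in the mapper are assumptions made for illustration; any way of
deriving the file name from the key or value would do.

```java
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

// Hypothetical output format: routes each record to a file chosen by a
// channel tag that the mapper puts in front of the key ("channel\tpayload").
public class ChannelOutputFormat
        extends MultipleTextOutputFormat<Text, NullWritable> {

    private static final String SEP = "\t";

    // 'name' is the default leaf file name, e.g. part-00000.
    @Override
    protected String generateFileNameForKeyValue(Text key, NullWritable value,
                                                 String name) {
        String s = key.toString();
        int i = s.indexOf(SEP);
        String channel = (i < 0) ? "default" : s.substring(0, i);
        return channel + "-" + name;
    }

    // Strip the channel tag so only the payload is written to the file.
    @Override
    protected Text generateActualKey(Text key, NullWritable value) {
        String s = key.toString();
        int i = s.indexOf(SEP);
        return (i < 0) ? key : new Text(s.substring(i + 1));
    }
}
```

In the driver this would be enabled with
conf.setOutputFormat(ChannelOutputFormat.class) together with
conf.setNumReduceTasks(0), and the mapper would call, for example,
output.collect(new Text("a\t" + line), NullWritable.get()) for one kind of
output and use a different tag for the other.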