Posted to common-user@hadoop.apache.org by rajgopalv <ra...@gmail.com> on 2010/08/16 08:55:12 UTC
MultipleOutputFormat
Hi. I'm new to Hadoop and am trying out the WordCount program.
To try out multiple output files, I am using MultipleOutputs. This link
helped me set it up:
http://hadoop.apache.org/common/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
In my driver class I have:
    MultipleOutputs.addNamedOutput(conf, "even",
            org.apache.hadoop.mapred.TextOutputFormat.class, Text.class,
            IntWritable.class);
    MultipleOutputs.addNamedOutput(conf, "odd",
            org.apache.hadoop.mapred.TextOutputFormat.class, Text.class,
            IntWritable.class);
and my reduce class became this:
    public static class Reduce extends MapReduceBase implements
            Reducer<Text, IntWritable, Text, IntWritable> {
        MultipleOutputs mos = null;

        public void configure(JobConf job) {
            mos = new MultipleOutputs(job);
        }

        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            if (sum % 2 == 0) {
                mos.getCollector("even", reporter).collect(key, new IntWritable(sum));
            } else {
                mos.getCollector("odd", reporter).collect(key, new IntWritable(sum));
            }
            //output.collect(key, new IntWritable(sum));
        }

        @Override
        public void close() throws IOException {
            mos.close();
        }
    }
This works, but I get a lot of files (one odd and one even file for every
reduce task).
Question: how can I get just two output files (odd and even), so that the
odd output of every reducer is written to the single odd file, and likewise
for even?
--
View this message in context: http://old.nabble.com/MultipleOutputFormat-tp29447204p29447204.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: MultipleOutputFormat
Posted by Patrick Angeles <pa...@cloudera.com>.
In this case, don't bother with MultipleOutputs.
Specify 2 reducers and a custom partitioner that sends 'even' records to
partition 0 and 'odd' records to partition 1.
You will have two output files named 'part-00000' and 'part-00001',
corresponding to even and odd respectively.
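
The partitioning idea above can be sketched without any Hadoop dependencies. The class and method names below (ParityPartitionSketch, getPartition, route) are hypothetical and only mirror the shape of org.apache.hadoop.mapred.Partitioner.getPartition(key, value, numPartitions). The sketch assumes each record already carries its final count. In a plain WordCount the partitioner sees the map output (word, 1) before the reducer sums it, so the parity of the sum is not yet known at that point. The approach therefore fits best when the counts come from an earlier job or can otherwise be derived from the map output record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Hadoop-free illustration of parity-based partitioning.
class ParityPartitionSketch {

    // Same shape as Partitioner.getPartition(key, value, numPartitions):
    // even counts go to partition 0, odd counts to partition 1.
    static int getPartition(String word, int count, int numPartitions) {
        if (numPartitions < 2) {
            return 0; // with a single reducer everything lands in partition 0
        }
        return (count % 2 == 0) ? 0 : 1;
    }

    // Distributes (word, count) records over numPartitions buckets,
    // the way the framework would route records to reduce tasks.
    static List<Map<String, Integer>> route(Map<String, Integer> counts,
                                            int numPartitions) {
        List<Map<String, Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new LinkedHashMap<>());
        }
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int p = getPartition(e.getKey(), e.getValue(), numPartitions);
            partitions.get(p).put(e.getKey(), e.getValue());
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Assumed input: words with counts that are already final.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("hadoop", 4);
        counts.put("map", 3);
        counts.put("reduce", 7);

        List<Map<String, Integer>> parts = route(counts, 2);
        System.out.println("part-00000 (even): " + parts.get(0));
        System.out.println("part-00001 (odd):  " + parts.get(1));
    }
}
```

In a real job the same parity test would live in a class implementing Partitioner, and the driver would call conf.setNumReduceTasks(2) and conf.setPartitionerClass(...) on the JobConf.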
Re: MultipleOutputFormat
Posted by Amareshwari Sri Ramadasu <am...@yahoo-inc.com>.
Try setting the number of reducers to 1.
-Amareshwari
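
To see why a single reducer yields exactly one odd and one even file, here is a small Hadoop-free sketch that lists the file names one would expect for a given number of reduce tasks. It assumes named outputs written from a reducer are named name-r-nnnnn (for example even-r-00000), one per reduce task. The class and method names (NamedOutputFilesSketch, expectedFiles) are hypothetical. In the driver, the reducer count is set with conf.setNumReduceTasks(1) on the JobConf.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration: which named-output files to expect per reduce task.
class NamedOutputFilesSketch {

    // Assumes the naming scheme <namedOutput>-r-<5-digit partition number>.
    static List<String> expectedFiles(String[] namedOutputs, int numReduceTasks) {
        List<String> files = new ArrayList<>();
        for (int partition = 0; partition < numReduceTasks; partition++) {
            for (String name : namedOutputs) {
                files.add(String.format(Locale.ROOT, "%s-r-%05d", name, partition));
            }
        }
        return files;
    }

    public static void main(String[] args) {
        String[] outputs = {"even", "odd"};
        // One reducer: one file per named output.
        System.out.println(expectedFiles(outputs, 1));
        // Several reducers: one pair of files per reduce task.
        System.out.println(expectedFiles(outputs, 3));
    }
}
```

The trade-off is that a single reducer processes all keys on one task, so this suits small outputs. The job's default output may also still produce a part-nnnnn file per reducer, which stays empty here because the reducer no longer calls output.collect.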