Posted to common-user@hadoop.apache.org by Tarandeep Singh <ta...@gmail.com> on 2008/06/06 01:36:28 UTC
MapWritable as output value of Reducer
Hi,
Can I use MapWritable as the output value of a Reducer?
If yes, how will the (key, value) pairs in the MapWritable object be
written to the file? What output format should I use in this case?
Further, I want to chain the output of the first MapReduce job into another
MapReduce job. What input format should I specify for the second job?
Can I reconstruct the MapWritable objects in the mapper of the second job?
Thanks,
Taran
Re: MapWritable as output value of Reducer
Posted by Tarandeep Singh <ta...@gmail.com>.
Thanks Nomura Yoshihide,
I will try your suggestion and see how it goes.
Regards,
Taran
On Thu, Jun 5, 2008 at 8:19 PM, NOMURA Yoshihide <y....@jp.fujitsu.com>
wrote:
> Hello Taran,
>
> If you want to use MapWritable as the reducer's output value, as in this class,
>
> public class ReduceA implements Reducer<LongWritable, Text, LongWritable,
> MapWritable>
>
> you can't use TextOutputFormat, because MapWritable doesn't override
> toString(), so the values would not be written in a readable form.
> I think SequenceFileOutputFormat is more suitable.
>
> If you want to chain the jobs, you should use SequenceFileInputFormat and
> SequenceFileOutputFormat, like this:
>
> // Job A: reads text input, writes (LongWritable, MapWritable) pairs to a SequenceFile.
> JobConf confA = new JobConf(A.class);
> confA.setJobName("A");
> confA.setOutputKeyClass(LongWritable.class);
> confA.setOutputValueClass(MapWritable.class);
> confA.setMapperClass(MapA.class);
> confA.setReducerClass(ReduceA.class);
> confA.setInputFormat(TextInputFormat.class);
> confA.setOutputFormat(SequenceFileOutputFormat.class);
> confA.setInputPath(new Path("/inputA"));
> confA.setOutputPath(new Path("/outputA"));
> JobClient.runJob(confA);
>
> // Job B: reads job A's SequenceFile output, so its mapper receives MapWritable values.
> JobConf confB = new JobConf(B.class);
> confB.setJobName("B");
> confB.setOutputKeyClass(LongWritable.class);
> confB.setOutputValueClass(MapWritable.class);
> confB.setMapperClass(MapB.class);
> confB.setReducerClass(ReduceB.class);
> confB.setInputFormat(SequenceFileInputFormat.class);
> confB.setOutputFormat(SequenceFileOutputFormat.class);
> confB.setInputPath(new Path("/outputA"));
> confB.setOutputPath(new Path("/outputB"));
> JobClient.runJob(confB);
>
> Regards,
>
> --
> NOMURA Yoshihide:
> Software Innovation Laboratory, Fujitsu Labs. Ltd., Japan
> Tel: 044-754-2675 (Ext: 7112-6358)
> Fax: 044-754-2570 (Ext: 7112-3834)
> E-Mail: [y.nomura@jp.fujitsu.com]
>
>
Re: MapWritable as output value of Reducer
Posted by NOMURA Yoshihide <y....@jp.fujitsu.com>.
Hello Taran,
If you want to use MapWritable as the reducer's output value, as in this class,

public class ReduceA implements Reducer<LongWritable, Text,
LongWritable, MapWritable>

you can't use TextOutputFormat, because MapWritable doesn't override
toString(), so the values would not be written in a readable form.
I think SequenceFileOutputFormat is more suitable.
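
As a rough illustration of the toString() issue, here is a sketch in plain Java (not Hadoop code). A text-based writer emits each record as the key, a tab, and the value's toString(). The classes OpaqueValue, PrintableValue, and ToStringDemo are hypothetical names made up for this example; OpaqueValue stands in for any value type that does not override toString().

```java
// Hypothetical stand-in for a value type that does NOT override toString().
final class OpaqueValue {
    private final int count;

    OpaqueValue(int count) {
        this.count = count;
    }
}

// Hypothetical value type that DOES override toString().
final class PrintableValue {
    private final int count;

    PrintableValue(int count) {
        this.count = count;
    }

    @Override
    public String toString() {
        return Integer.toString(count);
    }
}

final class ToStringDemo {
    // Mimics a text-based record writer: key, tab, then value.toString().
    static String formatLine(Object key, Object value) {
        return key + "\t" + value;
    }

    public static void main(String[] args) {
        // The value part is the default Object.toString(): class name, '@', hash code.
        System.out.println(formatLine(1L, new OpaqueValue(3)));
        // The value part is the overridden toString(): "3".
        System.out.println(formatLine(1L, new PrintableValue(3)));
    }
}
```

The first line carries no usable data for a later job to parse, which is why a binary format is the safer choice here.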
If you want to chain the jobs, you should use SequenceFileInputFormat
and SequenceFileOutputFormat, like this:
// Job A: reads text input, writes (LongWritable, MapWritable) pairs to a SequenceFile.
JobConf confA = new JobConf(A.class);
confA.setJobName("A");
confA.setOutputKeyClass(LongWritable.class);
confA.setOutputValueClass(MapWritable.class);
confA.setMapperClass(MapA.class);
confA.setReducerClass(ReduceA.class);
confA.setInputFormat(TextInputFormat.class);
confA.setOutputFormat(SequenceFileOutputFormat.class);
confA.setInputPath(new Path("/inputA"));
confA.setOutputPath(new Path("/outputA"));
JobClient.runJob(confA);

// Job B: reads job A's SequenceFile output, so its mapper receives MapWritable values.
JobConf confB = new JobConf(B.class);
confB.setJobName("B");
confB.setOutputKeyClass(LongWritable.class);
confB.setOutputValueClass(MapWritable.class);
confB.setMapperClass(MapB.class);
confB.setReducerClass(ReduceB.class);
confB.setInputFormat(SequenceFileInputFormat.class);
confB.setOutputFormat(SequenceFileOutputFormat.class);
confB.setInputPath(new Path("/outputA"));
confB.setOutputPath(new Path("/outputB"));
JobClient.runJob(confB);
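
On reconstructing the map objects in the second job: a Writable serializes itself through write(DataOutput) and is rebuilt through readFields(DataInput), which is what lets a value written by one job be read back by the next. The sketch below mimics that round trip using only java.io. SimpleMapRecord and RoundTripDemo are hypothetical names for this illustration; this is not Hadoop's MapWritable, and its on-disk layout is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical map-valued record; a stand-in for illustration, NOT Hadoop's MapWritable.
final class SimpleMapRecord {
    final Map<String, Integer> entries = new LinkedHashMap<>();

    // Serialize: the entry count first, then each key and value in order.
    void write(DataOutput out) throws IOException {
        out.writeInt(entries.size());
        for (Map.Entry<String, Integer> e : entries.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeInt(e.getValue());
        }
    }

    // Deserialize: drop any old state, then read back the same layout.
    void readFields(DataInput in) throws IOException {
        entries.clear();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            String key = in.readUTF();
            int value = in.readInt();
            entries.put(key, value);
        }
    }
}

final class RoundTripDemo {
    // Writes the record to an in-memory buffer and rebuilds a fresh copy from the bytes.
    static SimpleMapRecord roundTrip(SimpleMapRecord original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            SimpleMapRecord copy = new SimpleMapRecord();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SimpleMapRecord record = new SimpleMapRecord();
        record.entries.put("apples", 3);
        record.entries.put("pears", 5);
        // Prints whether the rebuilt entries equal the original ones.
        System.out.println(roundTrip(record).entries.equals(record.entries));
    }
}
```

In the real jobs, the SequenceFile formats take care of calling these methods, so the mapper of job B simply receives the rebuilt value.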
Regards,
--
NOMURA Yoshihide:
Software Innovation Laboratory, Fujitsu Labs. Ltd., Japan
Tel: 044-754-2675 (Ext: 7112-6358)
Fax: 044-754-2570 (Ext: 7112-3834)
E-Mail: [y.nomura@jp.fujitsu.com]
Re: MapWritable as output value of Reducer
Posted by Yang Chen <ch...@gmail.com>.
I believe the (key, value) structure is the same in both the input and output
files. In that case, you can chain the jobs as a simple job flow,
like below:
// Job A: reads text input, writes (Text, IntWritable) pairs as text.
JobConf confA = new JobConf(A.class);
confA.setJobName("A");
confA.setOutputKeyClass(Text.class);
confA.setOutputValueClass(IntWritable.class);
confA.setMapperClass(MapA.class);
confA.setCombinerClass(ReduceA.class);
confA.setReducerClass(ReduceA.class);
confA.setInputFormat(TextInputFormat.class);
confA.setOutputFormat(TextOutputFormat.class);
confA.setInputPath(new Path("/inputA"));
confA.setOutputPath(new Path("/outputA"));
JobClient.runJob(confA);

// Job B: reads job A's text output from /outputA.
JobConf confB = new JobConf(B.class);
confB.setJobName("B");
confB.setOutputKeyClass(Text.class);
confB.setOutputValueClass(IntWritable.class);
confB.setMapperClass(MapB.class);
confB.setCombinerClass(ReduceB.class);
confB.setReducerClass(ReduceB.class);
confB.setInputFormat(TextInputFormat.class);
confB.setOutputFormat(TextOutputFormat.class);
confB.setInputPath(new Path("/outputA"));
confB.setOutputPath(new Path("/outputB"));
JobClient.runJob(confB);
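
One thing to keep in mind with this text-based flow: TextOutputFormat writes each record as a line of key, tab, value, and TextInputFormat hands the next job's mapper each line as a Text value (keyed by its byte offset), so the mapper in job B would have to split the line itself. Below is a minimal sketch of that parsing step in plain Java, assuming a text key and an integer value as in the configuration above. TextLineParser is a hypothetical helper name, not part of Hadoop.

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical helper for illustration; not a Hadoop class.
final class TextLineParser {
    // Splits one "key<TAB>value" line into its key and integer value.
    static Map.Entry<String, Integer> parse(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator in: " + line);
        }
        String key = line.substring(0, tab);
        int value = Integer.parseInt(line.substring(tab + 1).trim());
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> entry = parse("hadoop\t42");
        System.out.println(entry.getKey() + " -> " + entry.getValue()); // prints "hadoop -> 42"
    }
}
```

This only works for values with a simple text form, which is why it does not carry over to MapWritable values.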