Posted to common-user@hadoop.apache.org by Tarandeep Singh <ta...@gmail.com> on 2008/06/06 01:36:28 UTC

MapWritable as output value of Reducer

hi,

Can I use MapWritable as the output value of a Reducer?

If yes, how will the (key, value) pairs in the MapWritable object be
written to the file? What output format should I use in this case?

Further, I want to chain the output of the first map reduce job to another
map reduce job, so what input format should I specify in the second job?

Can I reconstruct the MapWritable objects in the mapper of the second job?

Thanks,
Taran

Re: MapWritable as output value of Reducer

Posted by Tarandeep Singh <ta...@gmail.com>.

Thanks Nomura Yoshihide,
I will try your suggestion and see how it goes.

Regards,
Taran

On Thu, Jun 5, 2008 at 8:19 PM, NOMURA Yoshihide <y....@jp.fujitsu.com>
wrote:

> Hello Taran,
>
> If you want to use MapWritable as the reducer's output value, as in this
> class,
>
>     public class ReduceA implements
>         Reducer<LongWritable, Text, LongWritable, MapWritable>
>
> then you can't use TextOutputFormat, because MapWritable doesn't override
> toString(), so the values would not be written in a readable form.
> I think SequenceFileOutputFormat is more suitable.
>
> If you want to chain the jobs, you should use SequenceFileOutputFormat for
> the first job and SequenceFileInputFormat for the second, like this:
>
>     // Job A: reads text, writes (LongWritable, MapWritable) to a SequenceFile.
>     JobConf confA = new JobConf(A.class);
>     confA.setJobName("A");
>     confA.setOutputKeyClass(LongWritable.class);
>     confA.setOutputValueClass(MapWritable.class);
>     confA.setMapperClass(MapA.class);
>     confA.setReducerClass(ReduceA.class);
>     confA.setInputFormat(TextInputFormat.class);
>     confA.setOutputFormat(SequenceFileOutputFormat.class);
>     confA.setInputPath(new Path("/inputA"));
>     confA.setOutputPath(new Path("/outputA"));
>     JobClient.runJob(confA);
>
>     // Job B: reads job A's SequenceFile output as its input.
>     JobConf confB = new JobConf(B.class);
>     confB.setJobName("B");
>     confB.setOutputKeyClass(LongWritable.class);
>     confB.setOutputValueClass(MapWritable.class);
>     confB.setMapperClass(MapB.class);
>     confB.setReducerClass(ReduceB.class);
>     confB.setInputFormat(SequenceFileInputFormat.class);
>     confB.setOutputFormat(SequenceFileOutputFormat.class);
>     confB.setInputPath(new Path("/outputA"));
>     confB.setOutputPath(new Path("/outputB"));
>     JobClient.runJob(confB);
>
> Regards,
>
> --
> NOMURA Yoshihide:
>    Software Innovation Laboratory, Fujitsu Labs. Ltd., Japan
>    Tel: 044-754-2675 (Ext: 7112-6358)
>    Fax: 044-754-2570 (Ext: 7112-3834)
>    E-Mail: [y.nomura@jp.fujitsu.com]
>
>

Re: MapWritable as output value of Reducer

Posted by NOMURA Yoshihide <y....@jp.fujitsu.com>.

Hello Taran,

If you want to use MapWritable as the reducer's output value, as in this
class,

    public class ReduceA implements
        Reducer<LongWritable, Text, LongWritable, MapWritable>

then you can't use TextOutputFormat, because MapWritable doesn't override
toString(), so the values would not be written in a readable form.
I think SequenceFileOutputFormat is more suitable.
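
To illustrate the toString() point: a text-oriented output format writes
each record by converting the key and value to strings, so a value class
that does not override toString() ends up as the default
"ClassName@hashcode" form. The sketch below uses only the Java standard
library. OpaqueMapValue, ToStringDemo and textLine are hypothetical names
made up for this illustration; they are not Hadoop classes, and the
tab-separated line only mimics the general shape of text output.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a map-like value type that does not
// override toString() (an assumption made for this illustration).
final class OpaqueMapValue {
    private final Map<String, Integer> entries = new HashMap<>();

    void put(String key, int value) {
        entries.put(key, value);
    }

    // Explicit rendering of the entries; a text format would need
    // something like this to produce readable output.
    String describe() {
        return entries.toString();
    }
}

final class ToStringDemo {
    // Mimics a text-oriented writer: key, a tab, then value.toString().
    static String textLine(Object key, Object value) {
        return key + "\t" + value;
    }

    public static void main(String[] args) {
        OpaqueMapValue value = new OpaqueMapValue();
        value.put("hits", 3);
        // The value appears as ClassName@hash, which cannot be parsed back.
        System.out.println(textLine(42L, value));
        // The value appears as {hits=3} once it is rendered explicitly.
        System.out.println(textLine(42L, value.describe()));
    }
}
```

The first printed line is the kind of output that makes a text format a
poor fit here; a binary format avoids the problem because it never goes
through toString().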

If you want to chain the jobs, you should use SequenceFileOutputFormat for
the first job and SequenceFileInputFormat for the second, like this:

      // Job A: reads text, writes (LongWritable, MapWritable) to a SequenceFile.
      JobConf confA = new JobConf(A.class);
      confA.setJobName("A");
      confA.setOutputKeyClass(LongWritable.class);
      confA.setOutputValueClass(MapWritable.class);
      confA.setMapperClass(MapA.class);
      confA.setReducerClass(ReduceA.class);
      confA.setInputFormat(TextInputFormat.class);
      confA.setOutputFormat(SequenceFileOutputFormat.class);
      confA.setInputPath(new Path("/inputA"));
      confA.setOutputPath(new Path("/outputA"));
      JobClient.runJob(confA);

      // Job B: reads job A's SequenceFile output as its input.
      JobConf confB = new JobConf(B.class);
      confB.setJobName("B");
      confB.setOutputKeyClass(LongWritable.class);
      confB.setOutputValueClass(MapWritable.class);
      confB.setMapperClass(MapB.class);
      confB.setReducerClass(ReduceB.class);
      confB.setInputFormat(SequenceFileInputFormat.class);
      confB.setOutputFormat(SequenceFileOutputFormat.class);
      confB.setInputPath(new Path("/outputA"));
      confB.setOutputPath(new Path("/outputB"));
      JobClient.runJob(confB);
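
On the question of reconstructing the MapWritable objects in the second
job's mapper: a SequenceFile stores each value in the binary form that the
value class writes for itself, and the reader rebuilds the object from
those bytes, so MapB should receive MapWritable values directly. Below is
a minimal sketch of that write/read round trip using only the Java
standard library. SimpleMapValue and RoundTripDemo are hypothetical
stand-ins, not Hadoop classes, and the byte layout is invented for
illustration; only the write(DataOutput)/readFields(DataInput) method
shapes follow the Writable interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a Writable map value.
final class SimpleMapValue {
    final Map<String, Long> entries = new TreeMap<>();

    // Serialize: entry count, then each key and value in turn.
    void write(DataOutput out) throws IOException {
        out.writeInt(entries.size());
        for (Map.Entry<String, Long> e : entries.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeLong(e.getValue());
        }
    }

    // Deserialize: read the fields in exactly the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        entries.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            entries.put(key, in.readLong());
        }
    }
}

final class RoundTripDemo {
    // Writes the value to a byte buffer and rebuilds a fresh copy from it,
    // which is roughly what happens between job A's output and job B's input.
    static SimpleMapValue roundTrip(SimpleMapValue original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            SimpleMapValue copy = new SimpleMapValue();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SimpleMapValue value = new SimpleMapValue();
        value.entries.put("clicks", 7L);
        value.entries.put("views", 120L);
        System.out.println(roundTrip(value).entries); // prints {clicks=7, views=120}
    }
}
```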

Regards,

-- 
NOMURA Yoshihide:
     Software Innovation Laboratory, Fujitsu Labs. Ltd., Japan
     Tel: 044-754-2675 (Ext: 7112-6358)
     Fax: 044-754-2570 (Ext: 7112-3834)
     E-Mail: [y.nomura@jp.fujitsu.com]

Re: MapWritable as output value of Reducer

Posted by Yang Chen <ch...@gmail.com>.

I believe the (key, value) structure is the same for both the input and the
output files. In that case, you can chain the jobs like this:
      // Job A: writes (Text, IntWritable) records as text.
      JobConf confA = new JobConf(A.class);
      confA.setJobName("A");
      confA.setOutputKeyClass(Text.class);
      confA.setOutputValueClass(IntWritable.class);
      confA.setMapperClass(MapA.class);
      confA.setCombinerClass(ReduceA.class);
      confA.setReducerClass(ReduceA.class);
      confA.setInputFormat(TextInputFormat.class);
      confA.setOutputFormat(TextOutputFormat.class);
      confA.setInputPath(new Path("/inputA"));
      confA.setOutputPath(new Path("/outputA"));
      JobClient.runJob(confA);

      // Job B: reads job A's text output as its input.
      JobConf confB = new JobConf(B.class);
      confB.setJobName("B");
      confB.setOutputKeyClass(Text.class);
      confB.setOutputValueClass(IntWritable.class);
      confB.setMapperClass(MapB.class);
      confB.setCombinerClass(ReduceB.class);
      confB.setReducerClass(ReduceB.class);
      confB.setInputFormat(TextInputFormat.class);
      confB.setOutputFormat(TextOutputFormat.class);
      confB.setInputPath(new Path("/outputA"));
      confB.setOutputPath(new Path("/outputB"));
      JobClient.runJob(confB);
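
One thing to keep in mind with this text-based flow: as far as I know,
TextOutputFormat writes each record as the key, a tab, and the value on
one line, and TextInputFormat hands the next mapper the byte offset as the
key and the whole line as the value. So MapB would have to split each line
itself to get the original key and value back. A minimal sketch of that
parsing step in plain Java follows; TextLineParser and parse are
hypothetical names for illustration, not part of Hadoop.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

final class TextLineParser {
    // Splits one "key<TAB>value" line back into a string key and an int value.
    static Map.Entry<String, Integer> parse(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator in: " + line);
        }
        String key = line.substring(0, tab);
        int value = Integer.parseInt(line.substring(tab + 1).trim());
        return new SimpleEntry<>(key, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> record = parse("hadoop\t3");
        System.out.println(record.getKey() + " -> " + record.getValue()); // prints "hadoop -> 3"
    }
}
```

This works for simple values such as IntWritable, but it is also the reason
a text format is awkward for a structured value like MapWritable.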
