Posted to mapreduce-user@hadoop.apache.org by Clarence Gardner <cl...@gmail.com> on 2010/09/08 06:50:54 UTC

Describing key value pairs

I'm writing my first m/r program, and seem to be having problems describing
the types of my key-value pairs.

I have this mapper
    public static class Map
      extends Mapper<LongWritable, Text, Text, CensusData>
and this reducer
    public static class Reduce
      extends Reducer<Text, CensusData, Text, IntWritable>

and this in my run() method
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

I'm assuming that the types given to setOutput...Class should be the types
output by the reducer. There doesn't seem to be a way to tell the Context
object what the types are. Unfortunately, all the examples I see on the web
output the same types from both the mapper and reducer.

I'm getting this error from my "context.write(county, cd)" statement in my
mapper:
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.IntWritable, recieved CensusData
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:850)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at CountyAgi$Map.map(CountyAgi.java:31)

I tried changing setOutputValueClass to expect LongWritable instead of
IntWritable, and the error changed correspondingly. So, how would I say
that the mapper outputs (Text, CensusData) and the reducer outputs (Text,
IntWritable)?

Re: Describing key value pairs

Posted by Clarence Gardner <cl...@gmail.com>.
Thanks! That got me going, and the program works now :)

On Tue, Sep 7, 2010 at 10:50 PM, Harsh J <qw...@gmail.com> wrote:

> There are setMapOutputKeyClass() and setMapOutputValueClass() methods
> (on both Job and JobConf) that you can use for this.
>
> Harsh J
> http://harshj.com
>

Re: Describing key value pairs

Posted by Harsh J <qw...@gmail.com>.
There are setMapOutputKeyClass() and setMapOutputValueClass() methods
(on both Job and JobConf) that you can use for this.

Harsh J
http://harshj.com
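
For reference, here is a minimal sketch of how the run() setup from the
original post might look with the map output types declared separately.
It assumes that "job" is the new-API org.apache.hadoop.mapreduce.Job used
in the original post, and that CensusData (the poster's own class, not
shown in the thread) implements Writable. When the map output classes are
not set, Hadoop falls back to the job output classes, which would explain
why the mapper was expected to emit IntWritable.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    // Inside run(). Map, Reduce and CensusData are the poster's classes.
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);

    // Types emitted by Mapper<LongWritable, Text, Text, CensusData>
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(CensusData.class);

    // Types emitted by Reducer<Text, CensusData, Text, IntWritable>
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

The setMapOutput...Class calls are only needed when the mapper and reducer
output types differ, which may be why most examples on the web leave them
out.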
