Posted to common-user@hadoop.apache.org by garpinc <ga...@hotmail.com> on 2011/08/02 05:40:47 UTC

maprd vs mapreduce api

I was following this tutorial, which targets version 0.19.1:

http://v-lad.org/Tutorials/Hadoop/23%20-%20create%20the%20project.html

However, I wanted to use the latest version of the API, 0.20.2.

The original code in the tutorial had the following lines:
conf.setMapperClass(org.apache.hadoop.mapred.lib.IdentityMapper.class);
conf.setReducerClass(org.apache.hadoop.mapred.lib.IdentityReducer.class);

Both Identity classes are deprecated, so it seemed the solution was to create
a mapper and reducer as follows:
 public static class NOOPMapper
      extends Mapper<Text, IntWritable, Text, IntWritable> {

   public void map(Text key, IntWritable value, Context context
                   ) throws IOException, InterruptedException {
     context.write(key, value);
   }
 }
 
 public static class NOOPReducer 
      extends Reducer<Text,IntWritable,Text,IntWritable> {
   private IntWritable result = new IntWritable();

   public void reduce(Text key, Iterable<IntWritable> values, 
                      Context context
                      ) throws IOException, InterruptedException {
     context.write(key, result);
   }
 }


And then drive the job with:
		Configuration conf = new Configuration();
		Job job = new Job(conf, "testdriver");

		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);

		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);

		FileInputFormat.addInputPath(job, new Path("In"));
		FileOutputFormat.setOutputPath(job, new Path("Out"));

		job.setMapperClass(NOOPMapper.class);
		job.setReducerClass(NOOPReducer.class);

		job.waitForCompletion(true);


However, I get this error:
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
cast to org.apache.hadoop.io.Text
	at TestDriver$NOOPMapper.map(TestDriver.java:1)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
11/08/01 16:41:01 INFO mapred.JobClient:  map 0% reduce 0%
11/08/01 16:41:01 INFO mapred.JobClient: Job complete: job_local_0001
11/08/01 16:41:01 INFO mapred.JobClient: Counters: 0
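That ClassCastException is a generic-type mismatch surfacing at runtime: TextInputFormat hands the framework LongWritable byte offsets as keys, but the mapper above declares Text keys, and because Java generics are erased the bad cast only fires when map() is actually invoked. The same failure mode can be reproduced in plain Java (hypothetical CastDemo class, not Hadoop code):

```java
import java.util.function.BiConsumer;

// Hypothetical demo (not Hadoop code): the framework invokes map() through a
// raw/erased type, so declaring the wrong key type compiles cleanly and only
// fails with a ClassCastException when the first record arrives.
public class CastDemo {
    static String provoke() {
        // Declared for String keys, like the Mapper<Text, ...> above.
        BiConsumer<String, Integer> typed = (k, v) -> { };
        @SuppressWarnings({"rawtypes", "unchecked"})
        BiConsumer raw = typed;   // how the runtime effectively holds it
        try {
            raw.accept(42L, 1);   // a Long key arrives, like LongWritable
            return "no exception";
        } catch (ClassCastException e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(provoke());
    }
}
```

The raw assignment only produces a compiler warning; the type error is deferred to runtime, which is exactly what the stack trace above shows at Mapper.run.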



Can anyone tell me what I need to change for this to work?

Attached is the full code:
http://old.nabble.com/file/p32174859/TestDriver.java TestDriver.java 
-- 
View this message in context: http://old.nabble.com/maprd-vs-mapreduce-api-tp32174859p32174859.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


Re: maprd vs mapreduce api

Posted by Mohit Anchlia <mo...@gmail.com>.
On Fri, Aug 5, 2011 at 3:42 PM, Stevens, Keith D. <st...@llnl.gov> wrote:
> The Mapper and Reducer classes in org.apache.hadoop.mapreduce implement the identity function, so you should be able to just do
>
> job.setMapperClass(org.apache.hadoop.mapreduce.Mapper.class);
> job.setReducerClass(org.apache.hadoop.mapreduce.Reducer.class);
>
> without having to implement your own no-op classes.
>
> I recommend reading the javadoc for the differences between the old and new APIs; for example, http://hadoop.apache.org/common/docs/r0.20.2/api/index.html describes the new Mapper's functionality and its dual use as the identity mapper.

Sorry for asking on this thread :) Does the Definitive Guide, 2nd edition, cover the new API?
>
> Cheers,
> --Keith
>
> On Aug 5, 2011, at 1:15 PM, garpinc wrote:
>
>> [...]

Re: maprd vs mapreduce api

Posted by "Stevens, Keith D." <st...@llnl.gov>.
The Mapper and Reducer classes in org.apache.hadoop.mapreduce implement the identity function, so you should be able to just do

job.setMapperClass(org.apache.hadoop.mapreduce.Mapper.class);
job.setReducerClass(org.apache.hadoop.mapreduce.Reducer.class);

without having to implement your own no-op classes.
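Building on that, a minimal sketch of a working driver under the new 0.20.2 API might look like the following (the IdentityDriver class name is hypothetical; it assumes TextInputFormat, which supplies LongWritable byte offsets as keys and Text lines as values, so the output key/value classes are set to match):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class IdentityDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "testdriver");

        // Must match what TextInputFormat feeds through the identity
        // Mapper: LongWritable offsets and Text lines.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // The base classes pass pairs through unchanged, so no custom
        // NOOP classes are needed.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);

        FileInputFormat.addInputPath(job, new Path("In"));
        FileOutputFormat.setOutputPath(job, new Path("Out"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

This is a sketch, not a tested program; it needs the Hadoop 0.20.2 jars on the classpath and existing "In" input data to run.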

I recommend reading the javadoc for the differences between the old and new APIs; for example, http://hadoop.apache.org/common/docs/r0.20.2/api/index.html describes the new Mapper's functionality and its dual use as the identity mapper.

Cheers,
--Keith

On Aug 5, 2011, at 1:15 PM, garpinc wrote:

> [...]


Re: maprd vs mapreduce api

Posted by Roger Chen <ro...@ucdavis.edu>.
Your reducer is writing IntWritable, but your output format class is still
Text. Change one of them so they match.

On Mon, Aug 1, 2011 at 8:40 PM, garpinc <ga...@hotmail.com> wrote:

> [...]


-- 
Roger Chen
UC Davis Genome Center