Posted to common-user@hadoop.apache.org by Chandrashekhar Kotekar <sh...@gmail.com> on 2013/04/24 07:53:22 UTC
Fwd: Multiple ways to write Hadoop program driver - Which one to choose?
Hi,
I have observed that there are multiple ways to write the driver method of a
Hadoop program.
The following method is given in the Hadoop Tutorial by
Yahoo <http://developer.yahoo.com/hadoop/tutorial/module4.html>:
public void run(String inputPath, String outputPath) throws Exception {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    // the keys are words (strings)
    conf.setOutputKeyClass(Text.class);
    // the values are counts (ints)
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(MapClass.class);
    conf.setReducerClass(Reduce.class);

    FileInputFormat.addInputPath(conf, new Path(inputPath));
    FileOutputFormat.setOutputPath(conf, new Path(outputPath));

    JobClient.runJob(conf);
}
And this method is given in the 2012 book Hadoop: The Definitive Guide from
O'Reilly:
public static void main(String[] args) throws Exception {
    if (args.length != 2) {
        System.err.println("Usage: MaxTemperature <input path> <output path>");
        System.exit(-1);
    }
    Job job = new Job();
    job.setJarByClass(MaxTemperature.class);
    job.setJobName("Max temperature");
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.setMapperClass(MaxTemperatureMapper.class);
    job.setReducerClass(MaxTemperatureReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
While trying the program given in the O'Reilly book, I found that the
constructors of the Job class are deprecated. As the O'Reilly book is based on
Hadoop 2 (YARN), I was surprised to see that they have used a deprecated
constructor.
I would like to know which method everyone uses.
Regards,
Chandrash3khar K0tekar
Mobile - 8884631122
Re: Multiple ways to write Hadoop program driver - Which one to choose?
Posted by yypvsxf19870706 <yy...@gmail.com>.
Hi sheikhar
The Job constructor is indeed deprecated, according to the Job source code.
There is another constructor which is not deprecated; you can find it in the hint raised by Eclipse.
Re: Multiple ways to write Hadoop program driver - Which one to choose?
Posted by Jens Scheidtmann <je...@gmail.com>.
Dear Chandrash3khar K0tekar,
Using the run() method implies implementing Tool and using ToolRunner. This
has the additional benefit that some "standard" Hadoop command-line
options become available. See here:
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20.2-737/org/apache/hadoop/util/ToolRunner.java
Best regards,
Jens
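
To make the "standard" options a little more concrete: before ToolRunner calls run(), it strips generic options such as -D property=value from the command line and applies them to the job configuration, so run() only sees the remaining arguments. The sketch below imitates just that splitting step in plain Java, without any Hadoop dependency. The class GenericArgsSketch and its parse() method are hypothetical names made up for this illustration, and this is not Hadoop's actual option parser; accepting the attached -Dkey=value form is likewise an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the option handling that happens before Tool.run()
// is called. Hypothetical class, NOT Hadoop's real option parser.
class GenericArgsSketch {

    // Result of the split: configuration overrides plus the arguments
    // that are left over for the application itself.
    static final class Parsed {
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[++i];          // separated form: -D key=value
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);   // attached form: -Dkey=value (sketch assumption)
            }
            if (pair == null) {
                // Not a -D option: leave it for the application.
                parsed.remaining.add(arg);
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                        "Expected key=value after -D, got: " + pair);
            }
            parsed.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return parsed;
    }

    public static void main(String[] args) {
        Parsed p = parse(new String[] {
                "-D", "mapred.reduce.tasks=2", "input", "output"});
        System.out.println(p.properties); // {mapred.reduce.tasks=2}
        System.out.println(p.remaining);  // [input, output]
    }
}
```

With a driver written this way, the number of reducers (or any other property) can be changed per invocation from the command line instead of being hard-coded in the driver, which is the practical benefit of going through ToolRunner rather than a bare main().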