Maven dependency

Posted to common-user@hadoop.apache.org by Kevin Burton <rk...@charter.net> on 2013/04/24 21:13:57 UTC

I am reading "Hadoop in Action" and the author on page 51 puts forth this
code:

 

public class WordCount2 {

    public static void main(String[] args) {
        JobClient client = new JobClient();
        JobConf conf = new JobConf(WordCount2.class);

        // args[0] = input path, args[1] = output path (must not already exist)
        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);

        // predefined helper classes that ship with Hadoop:
        // tokenize the input, then sum the counts per token
        conf.setMapperClass(TokenCountMapper.class);
        conf.setCombinerClass(LongSumReducer.class);
        conf.setReducerClass(LongSumReducer.class);

        client.setConf(conf);
        try {
            JobClient.runJob(conf); // submit the job and block until it completes
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

This is an example of a simple MapReduce job, but being a beginner I am not
sure how to set up a project for this code. If I am using Maven, which Maven
dependencies do I need? There are several MapReduce artifacts and I am not
sure which to pick. Are there other dependencies needed (for classes such as
JobConf)? What imports are needed? And when the configuration is constructed,
what heuristics are used to find the configuration for the Hadoop cluster?

 

Thank you.


Re: Maven dependency

Posted by Jay Vyas <ja...@gmail.com>.
This should be enough to get started (you can pick a 1.x version if you
want the newer APIs and such, but for the elephant book the older APIs
will work fine as well):
<dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>0.20.2</version>
    </dependency>
</dependencies>
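
To answer the imports question too: every class in that listing comes from
the old org.apache.hadoop.mapred API (plus the fs and io packages), and all
of them ship in that single hadoop-core artifact. Assuming the book's
listing as-is, the top of WordCount2.java would need something like:

import org.apache.hadoop.fs.Path;                     // input/output paths
import org.apache.hadoop.io.LongWritable;             // output value type
import org.apache.hadoop.io.Text;                     // output key type
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.LongSumReducer;   // bundled combiner/reducer
import org.apache.hadoop.mapred.lib.TokenCountMapper; // bundled tokenizing mapper

As for how the cluster is found: there are no network heuristics. JobConf
(via Configuration) simply loads core-default.xml, core-site.xml,
mapred-default.xml and mapred-site.xml from the classpath. If no site files
are on the classpath you get the built-in defaults, i.e. the local
filesystem and the local job runner, so the job runs standalone. To target a
real cluster you put the cluster's conf directory on the classpath. A
minimal sketch of a mapred-site.xml (the JobTracker host and port here are
made-up placeholders, not real values):

<configuration>
  <property>
    <!-- hypothetical JobTracker address; fs.default.name goes in core-site.xml -->
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:8021</value>
  </property>
</configuration>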


-- 
Jay Vyas
http://jayunit100.blogspot.com
