Posted to common-user@hadoop.apache.org by Vjeran Marcinko <vj...@email.t-com.hr> on 2013/04/13 22:19:12 UTC

Few noob MR questions

Hello,

 

I am a complete Hadoop and MR newbie, so please help me with the following.

I can see that the primary way to submit a Hadoop MR job is via the following
command (wordcount example):

hadoop jar wordcount.jar org.mycompany.WordCount

1.    All the MR examples out there use this "hadoop jar" command for
submitting MR jobs, but it actually has nothing specific to do with MR job
submission: it just calls a static "main" method, similar to the plain "java"
command, and this main method could just print "Hello world" on the console
and have nothing to do with the MR framework, right?

2.    If the above is true, and I assume it is, what is the difference between
using plain "java -jar" and "hadoop jar"? Both call a static "main" method,
and this method could submit an MR job via the JobClient class, right?

3.    In the Hadoop docs (
http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#Generic+Options ),
I see that the -libjars option is only listed for the "hadoop job" command, not
for "hadoop jar", yet the latter is the one usually used for submitting jobs
(for some unknown reason, because the former also has a -submit option). So my
question is: does that mean that when using the "jar" command I should only
provide third-party libs via a "fat jar"?

 

Regards,

Vjeran 


Re: Few noob MR questions

Posted by Bjorn Jonsson <bj...@gmail.com>.
Correct, you can use java -jar to submit a job, with the "driver" code in
a plain static main method. I do it all the time. You can of course also run
a Job straight from Java code in your IDE. You can check out the RunJar class
(org.apache.hadoop.util.RunJar) in the Hadoop API Javadoc to see, I think,
essentially what the hadoop command does.

Cheers,
Bj


On Sat, Apr 13, 2013 at 3:59 PM, Jens Scheidtmann <
jens.scheidtmann@gmail.com> wrote:

> Dear Vjeran,
>
> your own jobs should implement the Tool interface and be run through
> ToolRunner. This gives you additional standard options on the command line
> (the generic options, which include -libjars).
>
> Also have a look at class ProgramDriver as used here:
> https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/ExampleDriver.java
>
> which further simplifies executing your MR jobs.
>
> Best regards,
>
> Jens
>
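
To make the point above concrete (that "hadoop jar" ends up calling a static
main method, which need not touch MapReduce at all), here is a minimal sketch
in plain Java. It is not Hadoop's RunJar code: the names MiniRunJar and
HelloDriver are made up for illustration, with HelloDriver standing in for a
driver such as org.mycompany.WordCount, and the sketch assumes the class is
already on the classpath. It leaves out the classpath and configuration setup
that the hadoop command adds on top.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

public class MiniRunJar {

    // Hypothetical driver class; a real one might build and submit an MR job here.
    public static class HelloDriver {
        // Remembered only so the effect of the call can be inspected afterwards.
        public static String lastMessage;

        public static void main(String[] args) {
            lastMessage = "Hello world " + Arrays.toString(args);
            System.out.println(lastMessage);
        }
    }

    // Looks up a class by name and invokes its public static main(String[]).
    public static void runMain(String className, String[] args) throws Exception {
        Class<?> mainClass = Class.forName(className, true, MiniRunJar.class.getClassLoader());
        Method main = mainClass.getMethod("main", String[].class);
        if (!Modifier.isStatic(main.getModifiers())) {
            throw new IllegalArgumentException(className + ".main is not static");
        }
        // The cast stops Java from spreading the array into separate varargs.
        main.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        // Roughly what "launcher <mainClass> <args...>" boils down to.
        runMain(HelloDriver.class.getName(), new String[] {"in", "out"});
    }
}
```

Running MiniRunJar prints "Hello world [in, out]". Nothing in the launcher
knows or cares whether the invoked main method submits a job.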

Re: Few noob MR questions

Posted by Jens Scheidtmann <je...@gmail.com>.
Dear Vjeran,

your own jobs should implement the Tool interface and be run through
ToolRunner. This gives you additional standard options on the command line
(the generic options, which include -libjars).

Also have a look at class ProgramDriver as used here:
https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/ExampleDriver.java

which further simplifies executing your MR jobs.

Best regards,

Jens

Re: Few noob MR questions

Posted by maisnam ns <ma...@gmail.com>.
@Vjeran, running the hadoop command rather than plain java adds the Hadoop
libraries and their dependencies to the classpath and picks up the Hadoop
configuration; plain java does neither. There's an excellent book on
Hadoop called Hadoop: The Definitive Guide by Tom White; everything is
explained in that book.

Regards
Niranjan Singh


On Sun, Apr 14, 2013 at 1:49 AM, Vjeran Marcinko <
vjeran.marcinko@email.t-com.hr> wrote:

> Hello,
>
> I am a complete Hadoop and MR newbie, so please help me with the following.
>
> I can see that the primary way to submit a Hadoop MR job is via the following
> command (wordcount example):
>
> hadoop jar wordcount.jar org.mycompany.WordCount
>
> 1.    All the MR examples out there use this "hadoop jar" command for
> submitting MR jobs, but it actually has nothing specific to do with MR job
> submission: it just calls a static "main" method, similar to the plain "java"
> command, and this main method could just print "Hello world" on the console
> and have nothing to do with the MR framework, right?
>
> 2.    If the above is true, and I assume it is, what is the difference between
> using plain "java -jar" and "hadoop jar"? Both call a static "main" method,
> and this method could submit an MR job via the JobClient class, right?
>
> 3.    In the Hadoop docs (
> http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#Generic+Options ),
> I see that the -libjars option is only listed for the "hadoop job" command, not
> for "hadoop jar", yet the latter is the one usually used for submitting jobs
> (for some unknown reason, because the former also has a -submit option). So my
> question is: does that mean that when using the "jar" command I should only
> provide third-party libs via a "fat jar"?
>
> Regards,
>
> Vjeran
>
