Posted to common-user@hadoop.apache.org by Cyril Briquet <cy...@canopeer.org> on 2010/07/27 04:05:28 UTC

Setting jar for embedded Job (Hadoop 0.20.2)

Hi,

I'd like to run a Hadoop (0.20.2) job
from within another application, using ToolRunner.

One class of this other application implements the Tool interface.
The implemented run() method:
* constructs a Job()
* sets the input/output/mapper/reducer
* sets the jar file by calling job.setJarByClass().
* calls job.waitForCompletion()

The question is: where should the jar file be made available?
In the current local directory of the parent application? In the system
directory in HDFS? ...?

I'd like to find documentation and learn how this works.

Thank you,

Cyril

Re: Setting jar for embedded Job (Hadoop 0.20.2)

Posted by Cyril Briquet <cy...@canopeer.org>.
Hi,

Thanks, it works!

So I just tried that: I copied the .jar file containing the mapper and
reducer classes to the current directory from which I'm running the
application that launches the Hadoop job. And it works.

Have a great day,

Cyril


N.B.: for the record, the stack trace from before I put the .jar in the
current directory is copied and pasted at the bottom of this e-mail.




On Tue, Jul 27, 2010 at 12:32 AM, Hemanth Yamijala <yh...@gmail.com> wrote:

> Hi,
>
> > I'd like to run a Hadoop (0.20.2) job
> > from within another application, using ToolRunner.
> >
> > One class of this other application implements the Tool interface.
> > The implemented run() method:
> > * constructs a Job()
> > * sets the input/output/mapper/reducer
> > * sets the jar file by calling job.setJarByClass().
> > * calls job.waitForCompletion()
> >
> > The question is: where should the jar file be made available?
> > In the current local directory of the parent application? In the system
> > directory in HDFS? ...?
> >
> > I'd like to find documentation and learn how this works.
> >
>
> If you are planning to use job.setJarByClass, the files only need to
> be on your classpath locally, where you are running the
> application. You could look at o.a.h.mapred.JobConf.findContainingJar
> which is passed the class name you set in setJarByClass to see how the
> jar file is located.
>
> Thanks
> Hemanth
>




10/07/27 14:56:11 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/07/27 14:56:11 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10/07/27 14:56:11 INFO input.FileInputFormat: Total input paths to process : 1
10/07/27 14:56:11 INFO mapred.JobClient: Running job: job_201007261547_0007
10/07/27 14:56:12 INFO mapred.JobClient:  map 0% reduce 0%
10/07/27 14:56:21 INFO mapred.JobClient: Task Id : attempt_201007261547_0007_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: x.y.z.my.class.name
       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
       at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:157)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:569)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
       at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: x.y.z.my.class.name
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
       at java.lang.Class.forName0(Native Method)
       at java.lang.Class.forName(Class.java:247)
       at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
       ... 4 more

10/07/27 14:56:27 INFO mapred.JobClient: Task Id : attempt_201007261547_0007_m_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: x.y.z.my.class.name
       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
       at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:157)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:569)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
       at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: x.y.z.my.class.name
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
       at java.lang.Class.forName0(Native Method)
       at java.lang.Class.forName(Class.java:247)
       at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
       ... 4 more

10/07/27 14:56:34 INFO mapred.JobClient: Task Id : attempt_201007261547_0007_m_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: x.y.z.my.class.name
       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
       at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:157)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:569)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
       at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: x.y.z.my.class.name
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
       at java.lang.Class.forName0(Native Method)
       at java.lang.Class.forName(Class.java:247)
       at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
       at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
       ... 4 more

10/07/27 14:56:43 INFO mapred.JobClient: Job complete: job_201007261547_0007
10/07/27 14:56:43 INFO mapred.JobClient: Counters: 3
10/07/27 14:56:43 INFO mapred.JobClient:   Job Counters
10/07/27 14:56:43 INFO mapred.JobClient:     Rack-local map tasks=4
10/07/27 14:56:43 INFO mapred.JobClient:     Launched map tasks=4
10/07/27 14:56:43 INFO mapred.JobClient:     Failed map tasks=1

Re: Setting jar for embedded Job (Hadoop 0.20.2)

Posted by Hemanth Yamijala <yh...@gmail.com>.
Hi,

> I'd like to run a Hadoop (0.20.2) job
> from within another application, using ToolRunner.
>
> One class of this other application implements the Tool interface.
> The implemented run() method:
> * constructs a Job()
> * sets the input/output/mapper/reducer
> * sets the jar file by calling job.setJarByClass().
> * calls job.waitForCompletion()
>
> The question is: where should the jar file be made available?
> In the current local directory of the parent application? In the system
> directory in HDFS? ...?
>
> I'd like to find documentation and learn how this works.
>

If you are planning to use job.setJarByClass, the files only need to
be on your classpath locally, where you are running the
application. You could look at o.a.h.mapred.JobConf.findContainingJar
which is passed the class name you set in setJarByClass to see how the
jar file is located.
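
To illustrate the idea without opening the Hadoop source, here is a
minimal sketch in plain Java (no Hadoop dependency) of the kind of
class-loader lookup that can find the jar a class was loaded from. The
class name JarLocator and the method body are illustrative assumptions,
not the code shipped in Hadoop 0.20.2; it only follows the general
approach of asking the class loader for the ".class" resource and
checking whether the resulting URL uses the "jar" protocol.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Enumeration;

public class JarLocator {

    /**
     * Returns the local path of the jar that contains the given class,
     * or null if the class was not loaded from a jar on the classpath
     * (for example, from a plain classes directory).
     */
    public static String findContainingJar(Class<?> clazz) {
        ClassLoader loader = clazz.getClassLoader();
        if (loader == null) {
            // Classes loaded by the bootstrap loader (e.g. java.lang.String)
            // report a null class loader, so there is nothing to search.
            return null;
        }
        // x.y.z.MyMapper -> x/y/z/MyMapper.class
        String classFile = clazz.getName().replace('.', '/') + ".class";
        try {
            Enumeration<URL> urls = loader.getResources(classFile);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                // A class inside a jar has a URL of the form
                // jar:file:/path/to/app.jar!/x/y/z/MyMapper.class
                if ("jar".equals(url.getProtocol())) {
                    String path = url.getPath();
                    if (path.startsWith("file:")) {
                        path = path.substring("file:".length());
                    }
                    path = URLDecoder.decode(path, "UTF-8");
                    // Drop the "!/x/y/z/MyMapper.class" suffix.
                    return path.replaceAll("!.*$", "");
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        String jar = findContainingJar(JarLocator.class);
        System.out.println(jar == null
                ? "JarLocator was not loaded from a jar"
                : "JarLocator was loaded from " + jar);
    }
}
```

If a lookup of this kind returns null for the class passed to
setJarByClass, no jar is attached to the job, which would be consistent
with the "No job jar file set" warning and the ClassNotFoundException
seen on the task side in the log earlier in this thread.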

Thanks
Hemanth