Posted to mapreduce-user@hadoop.apache.org by Utkarsh Gupta <Ut...@infosys.com> on 2012/04/04 08:52:13 UTC

Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using the Apache Commons Math library.
I used NetBeans to build the jar file, and the manifest lists the commons-math jar as lib/commons-math3.jar.
I have placed this jar file in the HADOOP_HOME/lib folder, but I am still getting a ClassNotFoundException.
I tried using the -libjars option with $HADOOP_HOME/bin/hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath>
and with $HADOOP_HOME/bin/hadoop jar myprg.jar -libjars <jarpath> <inputpath> <outputpath>,
but neither is working. Please help.
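
A hedged aside on usage (my.pkg.MyDriver is a hypothetical driver class): the generic option is spelled -libjars, it must appear after the main class but before the job's own arguments, and it only takes effect when the driver parses its arguments through GenericOptionsParser (for example via ToolRunner):

$HADOOP_HOME/bin/hadoop jar myprg.jar my.pkg.MyDriver -libjars /path/to/commons-math3.jar <inputpath> <outputpath>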


Thanks and Regards
Utkarsh Gupta




RE: Including third party jar files in Map Reduce job

Posted by Utkarsh Gupta <Ut...@infosys.com>.
I have tried implementing the Tool interface as described at http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/Tool.html,
but the -libjars option is still not working.
I have copied the jar to the $HADOOP_HOME/lib folder on all the nodes,
but I am still getting the same error. The map tasks complete on the master node (which also acts as a worker and is where the job is run), but they fail on all the other nodes.
The stack trace is:
hadoop$ bin/hadoop jar /home/hduser1/NetbeansProjects/WordCount/dist/WordCount.jar /user/hduser1/input/ /user/hduser1/output
12/04/04 15:21:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/04 15:21:00 INFO input.FileInputFormat: Total input paths to process : 7
12/04/04 15:21:00 INFO mapred.JobClient: Running job: job_201204041107_0015
12/04/04 15:21:01 INFO mapred.JobClient:  map 0% reduce 0%
12/04/04 15:21:17 INFO mapred.JobClient:  map 27% reduce 0%
12/04/04 15:21:18 INFO mapred.JobClient: Task Id : attempt_201204041107_0015_m_000004_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	at wordcount.MyMapper.map(MyMapper.java:22)
	at wordcount.MyMapper.map(MyMapper.java:14)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks and Regards
Utkarsh
-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com] 
Sent: Wednesday, April 04, 2012 2:08 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: Including third party jar files in Map Reduce job

As Bejoy mentioned,

If you have copied the jar to $HADOOP_HOME/lib, then you should copy it to all the nodes in the cluster; or,

if you want to make use of the -libjars option, your application should implement Tool to support generic options. Please check the link below for more details.

http://hadoop.apache.org/common/docs/current/commands_manual.html#jar

Thanks
Devaraj
________________________________________
From: Bejoy Ks [bejoy.hadoop@gmail.com]
Sent: Wednesday, April 04, 2012 1:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Including third party jar files in Map Reduce job

Hi Utkarsh
         You can add third-party jars to your MapReduce job in the following ways:

1) use -libjars:
hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....

2) include the third-party jars in the /lib folder of the jar while packaging your application

3) if you are adding the jar to HADOOP_HOME/lib, you need to add it on all nodes.

Regards
Bejoy KS

On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>> wrote:
Hi Devaraj,

I have already copied the required jar file into the $HADOOP_HOME/lib folder.
Can you tell me where to add the generic option -libjars?

The stack trace is:
hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
       at wordcount.MyMapper.map(MyMapper.java:22)
       at wordcount.MyMapper.map(MyMapper.java:14)
       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
       at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks and Regards
Utkarsh

-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com<ma...@huawei.com>]
Sent: Wednesday, April 04, 2012 12:35 PM
To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: RE: Including third party jar files in Map Reduce job

Hi Utkarsh,

The usage of the jar command is like this,

Usage: hadoop jar <jar> [mainClass] args...

If you want the commons-math3.jar to be available to all the tasks, you can do either of these: 1. copy the jar file into the $HADOOP_HOME/lib dir, or 2. use the generic option -libjars.

Can you give the stack trace of your problem, and say for which class it is giving the ClassNotFoundException (i.e. the main class or the math lib class)?

Thanks
Devaraj
________________________________________
From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com<ma...@infosys.com>]
Sent: Wednesday, April 04, 2012 12:22 PM
To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using apache commons math library.
I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.


Thanks and Regards
Utkarsh Gupta







RE: Including third party jar files in Map Reduce job

Posted by Utkarsh Gupta <Ut...@infosys.com>.
Hi Harsh,

This worked; it was exactly what I was looking for.
The warning is gone, and now I can add third-party jar files using the DistributedCache.addFileToClassPath() method.
There is no longer any need to copy the jar to each node's $HADOOP_HOME/lib folder.
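
A minimal sketch of that approach inside the driver's run() method, assuming the jar has already been copied to HDFS (the path /libs/commons-math3.jar is hypothetical):

    // Uses the old org.apache.hadoop.filecache.DistributedCache API (Hadoop 1.x era).
    Configuration conf = getConf();
    // Put the HDFS-resident jar on the classpath of every map and reduce task.
    DistributedCache.addFileToClassPath(new Path("/libs/commons-math3.jar"), conf);
    Job job = new Job(conf, "wordcount");
    // ... remaining job setup as before ...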

Thanks a lot
Utkarsh

-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com] 
Sent: Wednesday, April 04, 2012 6:32 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Including third party jar files in Map Reduce job

When using Tool, do not use:

Configuration conf = new Configuration();

Instead get config from the class:

Configuration conf = getConf();

This is documented at
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/Tool.html

On Wed, Apr 4, 2012 at 6:25 PM, Utkarsh Gupta <Ut...@infosys.com> wrote:
> Hi Harsh,
> I have implemented Tool like this
>
> public static void main(String[] args) throws Exception {
>        Configuration configuration = new Configuration();
>        int rc = ToolRunner.run(configuration, new WordCount(), args);
>        System.exit(rc);
>    }
>
>    @Override
>    public int run(String[] args) throws Exception {
>        if (args.length < 2) {
>            System.err.println("Usage: WordCount <input path> <output 
> path>");
>            return -1;
>        }
>        Configuration conf = new Configuration();
>        //conf.set("mapred.job.tracker", "local");
>        Job job = new Job(conf, "wordcount");
>
>        job.setJarByClass(WordCount.class);
>        job.setMapperClass(MyMapper.class);
>        job.setReducerClass(MyReducer.class);
>        job.setOutputKeyClass(Text.class);
>        job.setOutputValueClass(IntWritable.class);
>        job.setInputFormatClass(TextInputFormat.class);
>        job.setOutputFormatClass(TextOutputFormat.class);
>        job.setNumReduceTasks(1);
>        FileInputFormat.addInputPath(job, new Path(args[0]));
>        FileOutputFormat.setOutputPath(job, new Path(args[1]));
>        return (job.waitForCompletion(true)) ? 0 : 1;
>    }
>
> This is working but I am unable to figure out why still it is warning 
> -----Original Message-----
> From: Harsh J [mailto:harsh@cloudera.com]
> Sent: Wednesday, April 04, 2012 6:20 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: Including third party jar files in Map Reduce job
>
> Utkarsh,
>
> A log like "12/04/04 15:21:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same." indicates you haven't implemented the Tool approach properly (or aren't calling its run()).
>
> On Wed, Apr 4, 2012 at 5:25 PM, Utkarsh Gupta <Ut...@infosys.com> wrote:
>> Hi Devaraj,
>>
>> The code is running now after copying jar @ each node.
>> I might be doing some mistake previously.
>> Thanks Devaraj and Bejoy :)
>>
>>
>> -----Original Message-----
>> From: Devaraj k [mailto:devaraj.k@huawei.com]
>> Sent: Wednesday, April 04, 2012 2:08 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: RE: Including third party jar files in Map Reduce job
>>
>> As Bejoy mentioned,
>>
>> If you have copied the jar to $HADOOP_HOME, then you should copy it 
>> to all the nodes in the cluster. (or)
>>
>> If you want to make use of -libjar option, your application should implement Tool to support generic options. Please check the below link for more details.
>>
>> http://hadoop.apache.org/common/docs/current/commands_manual.html#jar
>>
>> Thanks
>> Devaraj
>> ________________________________________
>> From: Bejoy Ks [bejoy.hadoop@gmail.com]
>> Sent: Wednesday, April 04, 2012 1:06 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: Re: Including third party jar files in Map Reduce job
>>
>> Hi Utkarsh
>>         You can add third party jars to your map reduce job elegantly 
>> in the following ways
>>
>> 1) use - libjars
>> hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....
>>
>> 2) include the third pary jars in /lib folder while packaging your 
>> application
>>
>> 3) If you are adding the jar in HADOOP_HOME/lib , you need to add this at all nodes.
>>
>> Regards
>> Bejoy KS
>>
>> On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>> wrote:
>> Hi Devaraj,
>>
>> I have already copied the required jar file in $HADOOP_HOME/lib folder.
>> Can you tell me where to add generic option -libjars
>>
>> The stack trace is:
>> hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ 
>> /user/hduser1/output
>> 12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to 
>> process : 1
>> 12/04/04 12:45:51 INFO mapred.JobClient: Running job:
>> job_201204041107_0005
>> 12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/04/04 12:46:07 INFO mapred.JobClient: Task Id :
>> attempt_201204041107_0005_m_000000_0, Status : FAILED
>> Error: java.lang.ClassNotFoundException:
>> org.apache.commons.math3.random.RandomDataImpl
>>       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>       at 
>> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>>       at wordcount.MyMapper.map(MyMapper.java:22)
>>       at wordcount.MyMapper.map(MyMapper.java:14)
>>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>       at
>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>       at org.apache.hadoop.mapred.Child.main(Child.java:253)
>>
>> Thanks and Regards
>> Utkarsh
>>
>> -----Original Message-----
>> From: Devaraj k
>> [mailto:devaraj.k@huawei.com<ma...@huawei.com>]
>> Sent: Wednesday, April 04, 2012 12:35 PM
>> To: mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.org>
>> Subject: RE: Including third party jar files in Map Reduce job
>>
>> Hi Utkarsh,
>>
>> The usage of the jar command is like this,
>>
>> Usage: hadoop jar <jar> [mainClass] args...
>>
>> If you want the commons-math3.jar to be available for all the tasks you can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2. Use the generic option -libjars.
>>
>> Can you give the stack trace of your problem for which class it is giving ClassNotFoundException(i.e for main class or math lib class)?
>>
>> Thanks
>> Devaraj
>> ________________________________________
>> From: Utkarsh Gupta
>> [Utkarsh_Gupta@infosys.com<ma...@infosys.com>]
>> Sent: Wednesday, April 04, 2012 12:22 PM
>> To: mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.org>
>> Subject: Including third party jar files in Map Reduce job
>>
>> Hi All,
>>
>> I am new to Hadoop and was trying to generate random numbers using apache commons math library.
>> I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
>> I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.
>>
>>
>> Thanks and Regards
>> Utkarsh Gupta
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Harsh J



--
Harsh J

Re: Including third party jar files in Map Reduce job

Posted by Ioan Eugen Stan <st...@gmail.com>.
Pe 04.04.2012 16:01, Harsh J a scris:
> When using Tool, do not use:
>
> Configuration conf = new Configuration();
>
> Instead get config from the class:
>
> Configuration conf = getConf();
>
> This is documented at
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/Tool.html
>

I wish I had known that when I needed it, but it's not too late now either.
Thanks Harsh, you're doing a great job.

I will update my post, because I had the same problem and a lot of 
people run into it.

Regards,
-- 
Ioan Eugen Stan
http://ieugen.blogspot.com

Re: Including third party jar files in Map Reduce job

Posted by Harsh J <ha...@cloudera.com>.
When using Tool, do not use:

Configuration conf = new Configuration();

Instead get config from the class:

Configuration conf = getConf();

This is documented at
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/Tool.html
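
For reference, a minimal sketch of the driver with that change applied, following the code quoted below (MyMapper and MyReducer are the poster's own classes; this is an illustration rather than the original source):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCount extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        // ToolRunner applies the generic options (-libjars, -D, ...) to this
        // Configuration before run() is invoked.
        System.exit(ToolRunner.run(new Configuration(), new WordCount(), args));
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: WordCount <input path> <output path>");
            return -1;
        }
        // Use the Configuration that ToolRunner/GenericOptionsParser already
        // populated instead of creating a fresh one, otherwise -libjars is lost.
        Configuration conf = getConf();
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setNumReduceTasks(1);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }
}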

On Wed, Apr 4, 2012 at 6:25 PM, Utkarsh Gupta <Ut...@infosys.com> wrote:
> Hi Harsh,
> I have implemented Tool like this
>
> public static void main(String[] args) throws Exception {
>        Configuration configuration = new Configuration();
>        int rc = ToolRunner.run(configuration, new WordCount(), args);
>        System.exit(rc);
>    }
>
>    @Override
>    public int run(String[] args) throws Exception {
>        if (args.length < 2) {
>            System.err.println("Usage: WordCount <input path> <output path>");
>            return -1;
>        }
>        Configuration conf = new Configuration();
>        //conf.set("mapred.job.tracker", "local");
>        Job job = new Job(conf, "wordcount");
>
>        job.setJarByClass(WordCount.class);
>        job.setMapperClass(MyMapper.class);
>        job.setReducerClass(MyReducer.class);
>        job.setOutputKeyClass(Text.class);
>        job.setOutputValueClass(IntWritable.class);
>        job.setInputFormatClass(TextInputFormat.class);
>        job.setOutputFormatClass(TextOutputFormat.class);
>        job.setNumReduceTasks(1);
>        FileInputFormat.addInputPath(job, new Path(args[0]));
>        FileOutputFormat.setOutputPath(job, new Path(args[1]));
>        return (job.waitForCompletion(true)) ? 0 : 1;
>    }
>
> This is working but I am unable to figure out why still it is warning
> -----Original Message-----
> From: Harsh J [mailto:harsh@cloudera.com]
> Sent: Wednesday, April 04, 2012 6:20 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: Including third party jar files in Map Reduce job
>
> Utkarsh,
>
> A log like "12/04/04 15:21:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same." indicates you haven't implemented the Tool approach properly (or aren't calling its run()).
>
> On Wed, Apr 4, 2012 at 5:25 PM, Utkarsh Gupta <Ut...@infosys.com> wrote:
>> Hi Devaraj,
>>
>> The code is running now after copying jar @ each node.
>> I might be doing some mistake previously.
>> Thanks Devaraj and Bejoy :)
>>
>>
>> -----Original Message-----
>> From: Devaraj k [mailto:devaraj.k@huawei.com]
>> Sent: Wednesday, April 04, 2012 2:08 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: RE: Including third party jar files in Map Reduce job
>>
>> As Bejoy mentioned,
>>
>> If you have copied the jar to $HADOOP_HOME, then you should copy it to
>> all the nodes in the cluster. (or)
>>
>> If you want to make use of -libjar option, your application should implement Tool to support generic options. Please check the below link for more details.
>>
>> http://hadoop.apache.org/common/docs/current/commands_manual.html#jar
>>
>> Thanks
>> Devaraj
>> ________________________________________
>> From: Bejoy Ks [bejoy.hadoop@gmail.com]
>> Sent: Wednesday, April 04, 2012 1:06 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: Re: Including third party jar files in Map Reduce job
>>
>> Hi Utkarsh
>>         You can add third party jars to your map reduce job elegantly
>> in the following ways
>>
>> 1) use - libjars
>> hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....
>>
>> 2) include the third pary jars in /lib folder while packaging your
>> application
>>
>> 3) If you are adding the jar in HADOOP_HOME/lib , you need to add this at all nodes.
>>
>> Regards
>> Bejoy KS
>>
>> On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>> wrote:
>> Hi Devaraj,
>>
>> I have already copied the required jar file in $HADOOP_HOME/lib folder.
>> Can you tell me where to add generic option -libjars
>>
>> The stack trace is:
>> hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/
>> /user/hduser1/output
>> 12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> 12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to
>> process : 1
>> 12/04/04 12:45:51 INFO mapred.JobClient: Running job:
>> job_201204041107_0005
>> 12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/04/04 12:46:07 INFO mapred.JobClient: Task Id :
>> attempt_201204041107_0005_m_000000_0, Status : FAILED
>> Error: java.lang.ClassNotFoundException:
>> org.apache.commons.math3.random.RandomDataImpl
>>       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>>       at wordcount.MyMapper.map(MyMapper.java:22)
>>       at wordcount.MyMapper.map(MyMapper.java:14)
>>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>       at
>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>>       at java.security.AccessController.doPrivileged(Native Method)
>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>       at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
>> ion.java:1059)
>>       at org.apache.hadoop.mapred.Child.main(Child.java:253)
>>
>> Thanks and Regards
>> Utkarsh
>>
>> -----Original Message-----
>> From: Devaraj k
>> [mailto:devaraj.k@huawei.com<ma...@huawei.com>]
>> Sent: Wednesday, April 04, 2012 12:35 PM
>> To:
>> mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.o
>> rg>
>> Subject: RE: Including third party jar files in Map Reduce job
>>
>> Hi Utkarsh,
>>
>> The usage of the jar command is like this,
>>
>> Usage: hadoop jar <jar> [mainClass] args...
>>
>> If you want the commons-math3.jar to be available for all the tasks you can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2. Use the generic option -libjars.
>>
>> Can you give the stack trace of your problem for which class it is giving ClassNotFoundException(i.e for main class or math lib class)?
>>
>> Thanks
>> Devaraj
>> ________________________________________
>> From: Utkarsh Gupta
>> [Utkarsh_Gupta@infosys.com<ma...@infosys.com>]
>> Sent: Wednesday, April 04, 2012 12:22 PM
>> To:
>> mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.o
>> rg>
>> Subject: Including third party jar files in Map Reduce job
>>
>> Hi All,
>>
>> I am new to Hadoop and was trying to generate random numbers using apache commons math library.
>> I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
>> I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.
>>
>>
>> Thanks and Regards
>> Utkarsh Gupta
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Harsh J



-- 
Harsh J

RE: Including third party jar files in Map Reduce job

Posted by Utkarsh Gupta <Ut...@infosys.com>.
Hi Harsh,
I have implemented Tool like this

public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        int rc = ToolRunner.run(configuration, new WordCount(), args);
        System.exit(rc);
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: WordCount <input path> <output path>");
            return -1;
        }
        Configuration conf = new Configuration();
        //conf.set("mapred.job.tracker", "local");
        Job job = new Job(conf, "wordcount");
       
        job.setJarByClass(WordCount.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setNumReduceTasks(1);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return (job.waitForCompletion(true)) ? 0 : 1;
    }

This is working, but I am still unable to figure out why it is warning.
-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com] 
Sent: Wednesday, April 04, 2012 6:20 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Including third party jar files in Map Reduce job

Utkarsh,

A log like "12/04/04 15:21:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same." indicates you haven't implemented the Tool approach properly (or aren't calling its run()).

On Wed, Apr 4, 2012 at 5:25 PM, Utkarsh Gupta <Ut...@infosys.com> wrote:
> Hi Devaraj,
>
> The code is running now after copying jar @ each node.
> I might be doing some mistake previously.
> Thanks Devaraj and Bejoy :)
>
>
> -----Original Message-----
> From: Devaraj k [mailto:devaraj.k@huawei.com]
> Sent: Wednesday, April 04, 2012 2:08 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: RE: Including third party jar files in Map Reduce job
>
> As Bejoy mentioned,
>
> If you have copied the jar to $HADOOP_HOME, then you should copy it to 
> all the nodes in the cluster. (or)
>
> If you want to make use of -libjar option, your application should implement Tool to support generic options. Please check the below link for more details.
>
> http://hadoop.apache.org/common/docs/current/commands_manual.html#jar
>
> Thanks
> Devaraj
> ________________________________________
> From: Bejoy Ks [bejoy.hadoop@gmail.com]
> Sent: Wednesday, April 04, 2012 1:06 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: Including third party jar files in Map Reduce job
>
> Hi Utkarsh
>         You can add third party jars to your map reduce job elegantly 
> in the following ways
>
> 1) use - libjars
> hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....
>
> 2) include the third pary jars in /lib folder while packaging your 
> application
>
> 3) If you are adding the jar in HADOOP_HOME/lib , you need to add this at all nodes.
>
> Regards
> Bejoy KS
>
> On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>> wrote:
> Hi Devaraj,
>
> I have already copied the required jar file in $HADOOP_HOME/lib folder.
> Can you tell me where to add generic option -libjars
>
> The stack trace is:
> hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ 
> /user/hduser1/output
> 12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to 
> process : 1
> 12/04/04 12:45:51 INFO mapred.JobClient: Running job: 
> job_201204041107_0005
> 12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
> 12/04/04 12:46:07 INFO mapred.JobClient: Task Id : 
> attempt_201204041107_0005_m_000000_0, Status : FAILED
> Error: java.lang.ClassNotFoundException: 
> org.apache.commons.math3.random.RandomDataImpl
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>       at wordcount.MyMapper.map(MyMapper.java:22)
>       at wordcount.MyMapper.map(MyMapper.java:14)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at 
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformat
> ion.java:1059)
>       at org.apache.hadoop.mapred.Child.main(Child.java:253)
>
> Thanks and Regards
> Utkarsh
>
> -----Original Message-----
> From: Devaraj k 
> [mailto:devaraj.k@huawei.com<ma...@huawei.com>]
> Sent: Wednesday, April 04, 2012 12:35 PM
> To: 
> mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.o
> rg>
> Subject: RE: Including third party jar files in Map Reduce job
>
> Hi Utkarsh,
>
> The usage of the jar command is like this,
>
> Usage: hadoop jar <jar> [mainClass] args...
>
> If you want the commons-math3.jar to be available for all the tasks you can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2. Use the generic option -libjars.
>
> Can you give the stack trace of your problem for which class it is giving ClassNotFoundException(i.e for main class or math lib class)?
>
> Thanks
> Devaraj
> ________________________________________
> From: Utkarsh Gupta 
> [Utkarsh_Gupta@infosys.com<ma...@infosys.com>]
> Sent: Wednesday, April 04, 2012 12:22 PM
> To: 
> mapreduce-user@hadoop.apache.org<mailto:mapreduce-user@hadoop.apache.o
> rg>
> Subject: Including third party jar files in Map Reduce job
>
> Hi All,
>
> I am new to Hadoop and was trying to generate random numbers using apache commons math library.
> I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
> I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.
>
>
> Thanks and Regards
> Utkarsh Gupta
>
>
>
>
>
>



--
Harsh J

Re: Including third party jar files in Map Reduce job

Posted by Harsh J <ha...@cloudera.com>.
Utkarsh,

A log like "12/04/04 15:21:00 WARN mapred.JobClient: Use
GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same." indicates you haven't implemented the
Tool approach properly (or aren't calling its run()).

On Wed, Apr 4, 2012 at 5:25 PM, Utkarsh Gupta <Ut...@infosys.com> wrote:
> Hi Devaraj,
>
> The code is running now after copying jar @ each node.
> I might be doing some mistake previously.
> Thanks Devaraj and Bejoy :)
>
>
> -----Original Message-----
> From: Devaraj k [mailto:devaraj.k@huawei.com]
> Sent: Wednesday, April 04, 2012 2:08 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: RE: Including third party jar files in Map Reduce job
>
> As Bejoy mentioned,
>
> If you have copied the jar to $HADOOP_HOME, then you should copy it to all the nodes in the cluster. (or)
>
> If you want to make use of -libjar option, your application should implement Tool to support generic options. Please check the below link for more details.
>
> http://hadoop.apache.org/common/docs/current/commands_manual.html#jar
>
> Thanks
> Devaraj
> ________________________________________
> From: Bejoy Ks [bejoy.hadoop@gmail.com]
> Sent: Wednesday, April 04, 2012 1:06 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: Including third party jar files in Map Reduce job
>
> Hi Utkarsh
>         You can add third party jars to your map reduce job elegantly in the following ways
>
> 1) use - libjars
> hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....
>
> 2) include the third pary jars in /lib folder while packaging your application
>
> 3) If you are adding the jar in HADOOP_HOME/lib , you need to add this at all nodes.
>
> Regards
> Bejoy KS
>
> On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>> wrote:
> Hi Devaraj,
>
> I have already copied the required jar file in $HADOOP_HOME/lib folder.
> Can you tell me where to add generic option -libjars
>
> The stack trace is:
> hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
> 12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
> 12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
> 12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
> 12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status : FAILED
> Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
>       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>       at wordcount.MyMapper.map(MyMapper.java:22)
>       at wordcount.MyMapper.map(MyMapper.java:14)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>       at org.apache.hadoop.mapred.Child.main(Child.java:253)
>
> Thanks and Regards
> Utkarsh
>
> -----Original Message-----
> From: Devaraj k [mailto:devaraj.k@huawei.com<ma...@huawei.com>]
> Sent: Wednesday, April 04, 2012 12:35 PM
> To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
> Subject: RE: Including third party jar files in Map Reduce job
>
> Hi Utkarsh,
>
> The usage of the jar command is like this,
>
> Usage: hadoop jar <jar> [mainClass] args...
>
> If you want the commons-math3.jar to be available for all the tasks you can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2. Use the generic option -libjars.
>
> Can you give the stack trace of your problem for which class it is giving ClassNotFoundException(i.e for main class or math lib class)?
>
> Thanks
> Devaraj
> ________________________________________
> From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com<ma...@infosys.com>]
> Sent: Wednesday, April 04, 2012 12:22 PM
> To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
> Subject: Including third party jar files in Map Reduce job
>
> Hi All,
>
> I am new to Hadoop and was trying to generate random numbers using apache commons math library.
> I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
> I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.
>
>
> Thanks and Regards
> Utkarsh Gupta
>
>
>
>
>
>



-- 
Harsh J

RE: Including third party jar files in Map Reduce job

Posted by Utkarsh Gupta <Ut...@infosys.com>.
Hi Devaraj,

The code is running now after copying the jar to each node.
I must have made some mistake previously.
Thanks Devaraj and Bejoy :)


-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com] 
Sent: Wednesday, April 04, 2012 2:08 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: Including third party jar files in Map Reduce job

As Bejoy mentioned,

If you have copied the jar to $HADOOP_HOME, then you should copy it to all the nodes in the cluster. (or)

If you want to make use of -libjar option, your application should implement Tool to support generic options. Please check the below link for more details.

http://hadoop.apache.org/common/docs/current/commands_manual.html#jar

Thanks
Devaraj
________________________________________
From: Bejoy Ks [bejoy.hadoop@gmail.com]
Sent: Wednesday, April 04, 2012 1:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Including third party jar files in Map Reduce job

Hi Utkarsh
         You can add third party jars to your map reduce job elegantly in the following ways

1) use - libjars
hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....

2) include the third pary jars in /lib folder while packaging your application

3) If you are adding the jar in HADOOP_HOME/lib , you need to add this at all nodes.

Regards
Bejoy KS

On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>> wrote:
Hi Devaraj,

I have already copied the required jar file in $HADOOP_HOME/lib folder.
Can you tell me where to add generic option -libjars

The stack trace is:
hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
       at wordcount.MyMapper.map(MyMapper.java:22)
       at wordcount.MyMapper.map(MyMapper.java:14)
       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
       at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks and Regards
Utkarsh

-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com<ma...@huawei.com>]
Sent: Wednesday, April 04, 2012 12:35 PM
To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: RE: Including third party jar files in Map Reduce job

Hi Utkarsh,

The usage of the jar command is like this,

Usage: hadoop jar <jar> [mainClass] args...

If you want the commons-math3.jar to be available for all the tasks you can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2. Use the generic option -libjars.

Can you give the stack trace of your problem for which class it is giving ClassNotFoundException(i.e for main class or math lib class)?

Thanks
Devaraj
________________________________________
From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com<ma...@infosys.com>]
Sent: Wednesday, April 04, 2012 12:22 PM
To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using apache commons math library.
I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.


Thanks and Regards
Utkarsh Gupta







RE: Including third party jar files in Map Reduce job

Posted by Devaraj k <de...@huawei.com>.
As Bejoy mentioned,

If you have copied the jar to $HADOOP_HOME/lib, then you should copy it to all the nodes in the cluster; or,

if you want to make use of the -libjars option, your application should implement Tool to support generic options. Please check the link below for more details.

http://hadoop.apache.org/common/docs/current/commands_manual.html#jar

Thanks
Devaraj
________________________________________
From: Bejoy Ks [bejoy.hadoop@gmail.com]
Sent: Wednesday, April 04, 2012 1:06 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Including third party jar files in Map Reduce job

Hi Utkarsh
         You can add third party jars to your map reduce job elegantly in the following ways

1) use - libjars
hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....

2) include the third pary jars in /lib folder while packaging your application

3) If you are adding the jar in HADOOP_HOME/lib , you need to add this at all nodes.

Regards
Bejoy KS

On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>> wrote:
Hi Devaraj,

I have already copied the required jar file in $HADOOP_HOME/lib folder.
Can you tell me where to add generic option -libjars

The stack trace is:
hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
       at wordcount.MyMapper.map(MyMapper.java:22)
       at wordcount.MyMapper.map(MyMapper.java:14)
       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
       at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks and Regards
Utkarsh

-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com<ma...@huawei.com>]
Sent: Wednesday, April 04, 2012 12:35 PM
To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: RE: Including third party jar files in Map Reduce job

Hi Utkarsh,

The usage of the jar command is like this,

Usage: hadoop jar <jar> [mainClass] args...

If you want the commons-math3.jar to be available for all the tasks you can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2. Use the generic option -libjars.

Can you give the stack trace of your problem for which class it is giving ClassNotFoundException(i.e for main class or math lib class)?

Thanks
Devaraj
________________________________________
From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com<ma...@infosys.com>]
Sent: Wednesday, April 04, 2012 12:22 PM
To: mapreduce-user@hadoop.apache.org<ma...@hadoop.apache.org>
Subject: Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using apache commons math library.
I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.


Thanks and Regards
Utkarsh Gupta







Re: Including third party jar files in Map Reduce job

Posted by Bejoy Ks <be...@gmail.com>.
Hi Utkarsh
         You can add third-party jars to your MapReduce job in the following ways:

1) use -libjars:
hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....

2) include the third-party jars in the /lib folder of the jar while packaging your application (see the layout sketch after this list)

3) if you are adding the jar to HADOOP_HOME/lib, you need to add it on all nodes.
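
As an illustration of option 2, a hedged sketch of what the packaged job jar might contain (the file names are hypothetical); jars placed under lib/ inside the job jar are added to the task classpath when the job runs:

$ jar -tf WordCount.jar
META-INF/MANIFEST.MF
wordcount/WordCount.class
wordcount/MyMapper.class
wordcount/MyReducer.class
lib/commons-math3.jar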

Regards
Bejoy KS

On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <Ut...@infosys.com>wrote:

> Hi Devaraj,
>
> I have already copied the required jar file in $HADOOP_HOME/lib folder.
> Can you tell me where to add generic option -libjars
>
> The stack trace is:
> hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/
> /user/hduser1/output
> 12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same.
> 12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process
> : 1
> 12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
> 12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
> 12/04/04 12:46:07 INFO mapred.JobClient: Task Id :
> attempt_201204041107_0005_m_000000_0, Status : FAILED
> Error: java.lang.ClassNotFoundException:
> org.apache.commons.math3.random.RandomDataImpl
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>        at wordcount.MyMapper.map(MyMapper.java:22)
>        at wordcount.MyMapper.map(MyMapper.java:14)
>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:396)
>        at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>        at org.apache.hadoop.mapred.Child.main(Child.java:253)
>
> Thanks and Regards
> Utkarsh
>
> -----Original Message-----
> From: Devaraj k [mailto:devaraj.k@huawei.com]
> Sent: Wednesday, April 04, 2012 12:35 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: RE: Including third party jar files in Map Reduce job
>
> Hi Utkarsh,
>
> The usage of the jar command is like this,
>
> Usage: hadoop jar <jar> [mainClass] args...
>
> If you want the commons-math3.jar to be available for all the tasks you
> can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2.
> Use the generic option -libjars.
>
> Can you give the stack trace of your problem for which class it is giving
> ClassNotFoundException(i.e for main class or math lib class)?
>
> Thanks
> Devaraj
> ________________________________________
> From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com]
> Sent: Wednesday, April 04, 2012 12:22 PM
> To: mapreduce-user@hadoop.apache.org
> Subject: Including third party jar files in Map Reduce job
>
> Hi All,
>
> I am new to Hadoop and was trying to generate random numbers using apache
> commons math library.
> I used Netbeans to build the jar file and the manifest has path to
> commons-math jar as lib/commons-math3.jar I have placed this jar file in
> HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
> I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar
> <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar
> myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not
> working. Please help.
>
>
> Thanks and Regards
> Utkarsh Gupta
>
>
>
>
>
>

RE: Including third party jar files in Map Reduce job

Posted by Utkarsh Gupta <Ut...@infosys.com>.
Hi Devaraj,

I have already copied the required jar file into the $HADOOP_HOME/lib folder.
Can you tell me where to add the generic option -libjars?

The stack trace is:
hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	at wordcount.MyMapper.map(MyMapper.java:22)
	at wordcount.MyMapper.map(MyMapper.java:14)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks and Regards
Utkarsh

-----Original Message-----
From: Devaraj k [mailto:devaraj.k@huawei.com] 
Sent: Wednesday, April 04, 2012 12:35 PM
To: mapreduce-user@hadoop.apache.org
Subject: RE: Including third party jar files in Map Reduce job

Hi Utkarsh,

The usage of the jar command is like this,

Usage: hadoop jar <jar> [mainClass] args... 

If you want the commons-math3.jar to be available for all the tasks you can do any one of these 1. Copy the jar file in $HADOOP_HOME/lib dir or 2. Use the generic option -libjars.

Can you give the stack trace of your problem for which class it is giving ClassNotFoundException(i.e for main class or math lib class)?

Thanks
Devaraj
________________________________________
From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com]
Sent: Wednesday, April 04, 2012 12:22 PM
To: mapreduce-user@hadoop.apache.org
Subject: Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using apache commons math library.
I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
I tried using -libjars option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath> And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath> But this is not working. Please help.


Thanks and Regards
Utkarsh Gupta






RE: Including third party jar files in Map Reduce job

Posted by Devaraj k <de...@huawei.com>.
Hi Utkarsh,

The usage of the jar command is like this,

Usage: hadoop jar <jar> [mainClass] args... 

If you want commons-math3.jar to be available to all the tasks, you can do either of these: 1. copy the jar file into the $HADOOP_HOME/lib dir, or 2. use the generic option -libjars.
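
For illustration (the commons-math3 path here is a placeholder), the generic options must appear before the job's own arguments, immediately after the jar name or main class:

bin/hadoop jar WordCount.jar -libjars /path/to/commons-math3.jar /user/hduser1/input/ /user/hduser1/output

This only works if the driver runs its arguments through GenericOptionsParser, for example by implementing Tool and launching via ToolRunner, as in the driver sketch after Bejoy's reply above.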

Can you give the stack trace of your problem, showing which class it is giving the ClassNotFoundException for (i.e. the main class or the math lib class)?

Thanks
Devaraj
________________________________________
From: Utkarsh Gupta [Utkarsh_Gupta@infosys.com]
Sent: Wednesday, April 04, 2012 12:22 PM
To: mapreduce-user@hadoop.apache.org
Subject: Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using apache commons math library.
I used Netbeans to build the jar file and the manifest has path to commons-math jar as lib/commons-math3.jar
I have placed this jar file in HADOOP_HOME/lib folder but still I am getting ClassNotFoundException.
I tried using -libjar option with $HADOOP_HOME/bin/Hadoop jar myprg.jar <inputpath> <outputpath> -libjar <jarpath>
And $HADOOP_HOME/bin/Hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath>
But this is not working. Please help.


Thanks and Regards
Utkarsh Gupta


