Posted to hdfs-user@hadoop.apache.org by Koert Kuipers <ko...@tresata.com> on 2012/11/04 00:14:52 UTC

hadoop jar question

i am looking at the code for RunJar.java which is behind "hadoop jar" for
hadoop 0.20.2 (from cdh3u5).

i see
1) jar is unpacked to a temporary directory
2) the file URLs of all the jars found in the lib subdir of the unpacked
jar are gathered into a list called classPath
3) a new ClassLoader loader is created of type URLClassLoader using the
list of URLs classPath from 2)
4) the classloader for the current thread is set to the newly created loader
5) the main method of the user provided class is launched using the new
loader
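
for reference, steps 3) and 4) can be sketched roughly as below. this is a minimal sketch, not the actual RunJar code: the class name DelegationSketch and the helper definedByNewLoader are made up for illustration, and it assumes the loader is built with the single-argument URLClassLoader constructor, so its parent is the system class loader. per the java.lang.ClassLoader documentation, such a loader asks its parent before searching its own URLs, and the helper reports which of the two ended up defining a class.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationSketch {

    // Loads className through a fresh URLClassLoader over classPath and reports
    // whether that new loader (rather than one of its parents) defined the class.
    static boolean definedByNewLoader(URL[] classPath, String className) throws Exception {
        // Assumption: single-argument constructor, so the parent is the
        // system class loader (hypothetical stand-in for what RunJar does).
        try (URLClassLoader loader = new URLClassLoader(classPath)) {
            Class<?> c = Class.forName(className, false, loader);
            return c.getClassLoader() == loader;
        }
    }

    public static void main(String[] args) throws Exception {
        // Offer the new loader the very location this class was loaded from, so it
        // could define the class itself if it were consulted before its parent.
        URL here = DelegationSketch.class.getProtectionDomain().getCodeSource().getLocation();
        System.out.println(definedByNewLoader(new URL[] { here }, DelegationSketch.class.getName()));
    }
}
```

the same check can be pointed at any class name and any list of jar URLs to see whether the parent or the new loader wins for that class.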

given all this, i would say the new loader will be consulted first
(locally, not on the cluster) when my classes need to be loaded, which means
classes found in the jars inside the lib folder of the provided jar should
get preference. yet what i observe is that the classes from the ancient jackson
jar that is included with hadoop get preference over the ones i include (by
putting newer jackson jars inside the lib folder of my jar). why is this?
what am i missing?
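
one way to confirm which jar a class actually came from is to print its code source. the snippet below is only a sketch: WhichJar and locationOf are hypothetical names, and the default class name org.codehaus.jackson.map.ObjectMapper is an assumption about which jackson class is involved (pass a different name as the first argument if needed).

```java
import java.security.CodeSource;

public class WhichJar {

    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM does not record one (e.g. for core JDK classes).
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws Exception {
        // Assumed default: a Jackson 1.x class; override via the first argument.
        String name = args.length > 0 ? args[0] : "org.codehaus.jackson.map.ObjectMapper";
        // Resolve through the context class loader, which is the loader
        // set up in step 4) when this runs under "hadoop jar".
        Class<?> c = Class.forName(name, false, Thread.currentThread().getContextClassLoader());
        System.out.println(name + " loaded from " + locationOf(c));
    }
}
```

calling locationOf from inside the job's own main method, on the class in question, should show whether the copy from hadoop's lib directory or the one from the job jar's lib folder was picked up.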

thanks!

Re: hadoop jar question

Posted by JAX <ja...@gmail.com>.

Ahh yes - i once found that the cause of this was an old version of hadoop which was created using an obsolete but still functional emr script creator.

but anyways - if you want to manually override jars in jobs, you can. You have to override the class path using the specific classpath option ... See this post:

 http://stackoverflow.com/questions/11685949/overriding-default-hadoop-jars-in-class-path...

Jay Vyas 
MMSB
UCHC

On Nov 3, 2012, at 7:15 PM, Koert Kuipers <ko...@tresata.com> wrote:

> i am looking at the code for RunJar.java which is behind "hadoop jar" for hadoop 0.20.2 (from cdh3u5).
> 
> i see
> 1) jar is unpacked to a  temporary directory
> 2) the file URLs of all the jars found in the lib subdir of the unpacked jar are gathered into a list called classPath
> 3) a new ClassLoader loader is created of type URLClassLoader using the list of URLs classPath from 2)
> 4) the classloader for the current thread is set to newly created loader
> 5) the main method of the user provided class is launched using the new loader
> 
> given all this, i would say the new loader will be consulted first (locally, not on the cluster) when my classes need to be loaded, which means classes found in the jars inside the lib folder of the provided jar should get preference. yet what i observe is that the classes from the ancient jackson jar that is included with hadoop get preference over the ones i include (by putting newer jackson jars inside the lib folder of my jar). why is this? what am i missing?
> 
> thanks!
