Posted to common-user@hadoop.apache.org by Saptarshi Guha <sa...@gmail.com> on 2009/03/24 03:12:33 UTC

JNI and calling Hadoop jar files

Hello,
I'm using some JNI interfaces via R. My classpath contains all the
jar files in $HADOOP_HOME and $HADOOP_HOME/lib.
My class is:
    public SeqKeyList() throws Exception {
        // Add the default and site configuration files explicitly, in that order.
        config = new org.apache.hadoop.conf.Configuration();
        config.addResource(new Path(System.getenv("HADOOP_CONF_DIR")
                                    + "/hadoop-default.xml"));
        config.addResource(new Path(System.getenv("HADOOP_CONF_DIR")
                                    + "/hadoop-site.xml"));

        System.out.println("C=" + config);
        // This is the call that needs the class named by fs.hdfs.impl.
        filesystem = FileSystem.get(config);
        System.out.println("C=" + config + "F=" + filesystem);
        System.out.println(filesystem.getUri().getScheme());
    }

I am using a distributed filesystem
(org.apache.hadoop.hdfs.DistributedFileSystem for fs.hdfs.impl).
When this class is created from a program run on the command line,
everything works fine.
When it is called via JNI I get
 java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.DistributedFileSystem

Is this a JNI issue? How can it work from the command line using the
same classpath, yet throw this exception when run via JNI?
Saptarshi Guha

Re: JNI and calling Hadoop jar files

Posted by Steve Loughran <st...@apache.org>.
jason hadoop wrote:
> The exception's reference to *org.apache.hadoop.hdfs.DistributedFileSystem*
> strongly implies that a hadoop-default.xml file, or at least a job.xml file,
> is present.
> Since hadoop-default.xml is bundled into the hadoop-0.X.Y-core.jar, the
> assumption is that the core jar is available.
> The ClassNotFoundException, however, implies that the
> hadoop-0.X.Y-core.jar is not available to JNI.
>
> Given the above constraints, the two likely possibilities are that the -core
> jar is unavailable or damaged, or that the classloader being used somehow
> does not have access to the -core jar.
>
> A possible reason for the jar not being available is that the application is
> running on a different machine, or as a different user, and the jar is not
> actually present, or perhaps not readable, in the expected location.
>
> Which way does your JNI go: a Java application calling into a native shared
> library, or a native application calling into a JVM that it instantiates via
> libjvm calls?
>
> Could you dump the classpath that is in effect before your failing JNI call
> (System.getProperty("java.class.path") and, for that matter,
> "java.library.path", or getenv("CLASSPATH")),
> and provide an ls -l of the core jar from the classpath, run as the user
> that owns the process, on the machine that the process is running on?
> 

Or something is going wrong in a library that the filesystem class depends
on, so the reflection-based load fails and the root cause is lost in the
process. Sometimes putting an explicit reference to the class you are
trying to load into your code is a good way to force the problem to
surface earlier, and to fail with a better error message.
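
A minimal sketch of that idea, under stated assumptions: the class name ForceLoad and the helper describeLoad are hypothetical, and Class.forName stands in for a compile-time reference such as org.apache.hadoop.hdfs.DistributedFileSystem.class so that the example compiles without the Hadoop jars. It loads and initializes the class up front and reports the whole cause chain rather than only the outermost exception.

```java
// Hypothetical helper, not part of Hadoop: eagerly loads a class by name and
// reports every exception in the cause chain if the load fails.
class ForceLoad {

    static String describeLoad(String className, ClassLoader loader) {
        try {
            // 'true' also runs static initializers, so failures inside them surface here.
            Class<?> c = Class.forName(className, true, loader);
            return "loaded " + c.getName() + " via " + c.getClassLoader();
        } catch (Throwable t) {
            // Catch Throwable because NoClassDefFoundError and
            // ExceptionInInitializerError are Errors, not Exceptions.
            StringBuilder sb = new StringBuilder("failed to load " + className);
            for (Throwable cause = t; cause != null; cause = cause.getCause()) {
                sb.append(System.lineSeparator()).append("  caused by: ").append(cause);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hdfs.DistributedFileSystem";
        // Use the same loader that the failing thread would use.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println(describeLoad(name, loader));
    }
}
```

Calling describeLoad for the filesystem class right before FileSystem.get(config) should show whether the class itself is missing or whether one of its dependencies failed to load.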

Re: JNI and calling Hadoop jar files

Posted by Saptarshi Guha <sa...@gmail.com>.
Delayed response. However, I stand corrected. I was using a package
called rJava, which integrates R and Java. Maybe there was a
classloader issue, but once I rewrote my code using C and JNI, the
issues disappeared.
When I create my configuration, I add
$HADOOP/conf/hadoop-{default,site}.xml as resources, in that order.
All problems disappeared.
Sorry I couldn't provide the information requested for the
error-causing approach; I stopped using that package.
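
A sketch of one common workaround for a classloader mismatch of this kind, offered as an assumption rather than a diagnosis of the rJava setup: Hadoop's Configuration by default resolves classes through the calling thread's context classloader (worth verifying against the Hadoop version in use), and in an embedded JVM that loader may not be the one that can see the Hadoop jars. The hypothetical helper withContextLoader below (not part of Hadoop or rJava) swaps the context classloader for the duration of one call and restores it afterwards.

```java
import java.util.function.Supplier;

// Hypothetical helper: runs a piece of code with a chosen context classloader.
class ContextLoaderFix {

    static <T> T withContextLoader(ClassLoader loader, Supplier<T> body) {
        Thread current = Thread.currentThread();
        ClassLoader saved = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return body.get();
        } finally {
            // Always restore the previous loader, even if the body throws.
            current.setContextClassLoader(saved);
        }
    }

    public static void main(String[] args) {
        // The loader that loaded this class can, by definition, see its jars.
        ClassLoader mine = ContextLoaderFix.class.getClassLoader();
        String seen = withContextLoader(mine,
                () -> String.valueOf(Thread.currentThread().getContextClassLoader()));
        System.out.println("context loader inside the call: " + seen);
    }
}
```

In the constructor from the original post, the body would be the code that builds the Configuration and calls FileSystem.get, with SeqKeyList.class.getClassLoader() as the loader. Because those calls throw checked exceptions, they would have to be wrapped inside the Supplier.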

regards
Saptarshi Guha

P.S. As an aside, if I launch my own Java apps that require the
Hadoop configuration etc., I have to manually add the
{default,site}.xml files.




Re: JNI and calling Hadoop jar files

Posted by jason hadoop <ja...@gmail.com>.
The exception's reference to *org.apache.hadoop.hdfs.DistributedFileSystem*
strongly implies that a hadoop-default.xml file, or at least a job.xml file,
is present.
Since hadoop-default.xml is bundled into the hadoop-0.X.Y-core.jar, the
assumption is that the core jar is available.
The ClassNotFoundException, however, implies that the
hadoop-0.X.Y-core.jar is not available to JNI.

Given the above constraints, the two likely possibilities are that the -core
jar is unavailable or damaged, or that the classloader being used somehow
does not have access to the -core jar.

A possible reason for the jar not being available is that the application is
running on a different machine, or as a different user, and the jar is not
actually present, or perhaps not readable, in the expected location.

Which way does your JNI go: a Java application calling into a native shared
library, or a native application calling into a JVM that it instantiates via
libjvm calls?

Could you dump the classpath that is in effect before your failing JNI call
(System.getProperty("java.class.path") and, for that matter,
"java.library.path", or getenv("CLASSPATH")),
and provide an ls -l of the core jar from the classpath, run as the user
that owns the process, on the machine that the process is running on?
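
The checks requested above could be gathered with a small diagnostic like the following sketch. The class ClasspathDump and its methods are hypothetical names; it uses only standard JDK calls, and it would need to run inside the same JVM (and thread) that makes the failing call to be meaningful. In place of ls -l, it reports whether each classpath entry exists and is readable by the current user.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic: prints what the JVM believes its classpath is.
class ClasspathDump {

    // Collects the settings that influence class and native library lookup.
    static Map<String, String> snapshot() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("java.class.path", System.getProperty("java.class.path"));
        m.put("java.library.path", System.getProperty("java.library.path"));
        m.put("CLASSPATH", System.getenv("CLASSPATH"));
        m.put("context classloader",
                String.valueOf(Thread.currentThread().getContextClassLoader()));
        return m;
    }

    // True if the path exists and the current user may read it.
    static boolean isReadable(String path) {
        File f = new File(path);
        return f.exists() && f.canRead();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : snapshot().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        String cp = System.getProperty("java.class.path", "");
        for (String entry : cp.split(File.pathSeparator)) {
            String status = isReadable(entry) ? "readable           " : "missing/unreadable ";
            System.out.println(status + entry);
        }
    }
}
```

If the core jar shows up as readable here but the class still cannot be found, the classloader in use becomes the more likely suspect.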

<!-- from hadoop-default.xml -->
<property>
  <name>fs.hdfs.impl</name>
  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  <description>The FileSystem for hdfs: uris.</description>
</property>





-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422

Re: JNI and calling Hadoop jar files

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
This looks somewhat similar to my Subtle Classloader Issue from 
yesterday. I'll be watching this thread too.

Jeff
