Posted to kato-spec@incubator.apache.org by Stuart Monteith <st...@stoo.me.uk> on 2009/11/24 13:53:45 UTC
Diagnosing JNI problems
Hello,
I've just been looking at JNI and how we go about diagnosing crashes
within it. There are a number of weaknesses in this area, both in the
API and in the implementation of the CJVMTI agent.
In the API we are unable to:
1. Determine the call stack through VM, JNI and native stack frames,
with the correct interleaving. For example:
native pthread_cond_wait()
native Java_java_lang_Object_wait()
JNI Object.wait()
Java TestArticle.halt()
2. Get any information on local and global variables in native code.
This would be very useful for diagnosing native problems. Of course,
if we could do that, we would essentially have gdb in Java.
3. Easily determine, through the JavaClass API, the native function
that implements a native method. While we can retrieve bytecode and
compiled code sections, there is nothing to indicate where native
code is held.
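On item 3, the usual way to guess which native function backs a native method is the JNI naming convention, which is where a symbol like Java_java_lang_Object_wait in the example above comes from. A minimal sketch of the short-name mangling follows. The class JniNames and its methods are made up for illustration and are not part of the Kato API. It covers only the short name (no overload signature suffix), and a method bound through RegisterNatives need not follow the convention at all, so the result is a hint rather than an answer.

```java
// Hypothetical helper, not part of Kato: derives the default JNI short
// symbol name for a native method from its class and method names.
class JniNames {
    private JniNames() {}

    // Escapes one name according to the JNI name-mangling rules.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == '/') {
                sb.append('_');            // package separator
            } else if (c == '_') {
                sb.append("_1");           // literal underscore
            } else if (c == ';') {
                sb.append("_2");           // only occurs in signatures
            } else if (c == '[') {
                sb.append("_3");           // only occurs in signatures
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')) {
                sb.append(c);
            } else {
                // Anything else becomes _0xxxx with lower-case hex digits.
                sb.append(String.format("_0%04x", (int) c));
            }
        }
        return sb.toString();
    }

    // className uses '.' or '/' separators, e.g. "java.lang.Object".
    static String shortName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }
}
```

For example, JniNames.shortName("java.lang.Object", "wait") gives Java_java_lang_Object_wait, which could then be looked up among the symbols of the loaded native libraries.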
The implementation has issues too:
1. There is little to no native information. Native threads are
entirely missing, for instance.
2. The CJVMTI agent is not invoked during a JVM crash, which means
that crashes cannot be diagnosed using our RI as it stands just now.
There are the core file readers, but we'd need to decide how they
would operate.
In practice, unless we can address the implementation issues, the API
issues are moot.
Given that, I'd propose that we concentrate on the Java API and remove
the parts of the Image API that cannot be implemented. That will be
unavoidable if we are to have a credible RI for the TCK.
Regards,
Stuart
--
Stuart Monteith
http://blog.stoo.me.uk/
Re: Diagnosing JNI problems
Posted by Steve Poole <sp...@googlemail.com>.
On Wed, Nov 25, 2009 at 2:27 PM, Bobrovsky, Konstantin S <
konstantin.s.bobrovsky@intel.com> wrote:
> Hi Stuart, all,
>
> >Given that, I'd propose that we concentrate on the Java API, and remove
> >the Image API parts that cannot
> >be implemented, which will be unavoidable if we are to have a credible
> >RI for the TCK.
>
> I did not actually see any API parts that are unimplementable in
> principle - maybe I overlooked something - could you please list them?
>
> Can the RI just throw some "DataUnavailable" exceptions where it is hard to
> implement something quickly for the "non-coredump" mode (which is the
> primary focus now, as far as I can see)?
>
>
Yes, it's possible that an implementation can do that. My general concern,
though, is about having too many methods, interfaces, classes etc. that do
nothing in the RI. If we have a specification in which a few methods are
not implemented in the RI I'd be comfortable, but not if the number becomes
excessive. Historically for JSRs the RI is almost always the only
implementation. The RI then has to be a credible, useful implementation
running against the Sun JVM.
Building an implementation that uses core files requires an understanding of
the data structures inside the JVM. That information is available in
OpenJDK, but the licence is wrong for Apache and the amount of work is
substantial.
> BTW, I tried a simple JNI-involving application on hotspot/ia32, and I can
> see that it reports interleaving frames of native JNI methods correctly up
> to the last Java frame:
>
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j ClassToLoad.nativeMethodWhichCrashes()V+0
> j ClassToLoad.<clinit>()V+8
> v ~StubRoutines::call_stub
> j JNITest.nativeMethod1()V+0
> j JNITest.JAVA_intermediateCall0()V+8
> j JNITest.JAVAMethod()V+8
> v ~StubRoutines::call_stub
> j JNITest.nativeMethod2()V+0
> v ~StubRoutines::call_stub
> j JNITest.nativeMethod0()V+0
> j JNITest.main([Ljava/lang/String;)V+0
> v ~StubRoutines::call_stub
>
> (I can send you the test sources if you want). I did not try it on JRockit
> or any other JVM though.
>
> > 3. The JavaClass API does not easily allow us to determine the
> >native function that implements
> > a native method. While we can retrieve bytecode and compiled
> >code sections, there is nothing
> > to indicate where native code is held.
>
> Again, for HotSpot this is quite feasible: a pointer to the native
> method's code is kept within the structure corresponding to that method.
>
> What I can see is that CJVMTI-agent-produced dumps cannot provide all the
> information necessary to implement every part of the current Kato API. But
> with core files much more data is available, so we should not drop API
> parts just because they can't be implemented on top of one of the artifact
> producers.
>
> BTW, at the beginning of this JSR we spoke of the possibility of using Sun's
> Serviceability Agent implementation as another basis for the implementation.
> Does anyone know the status of this?
>
Nicholas Sterling worked hard for us with Sun to get the SA interface
amended so that it was covered under the classpath exception
(http://openjdk.java.net/legal/gplv2+ce.html). Unfortunately this was not
possible. To be honest, though, we did think that the runtime restrictions
imposed by SA (i.e. the need to run with the same build of the JVM) were
going to cause us problems and reduce adoption.
> Thanks,
> Konst
>
> Intel Novosibirsk
> Closed Joint Stock Company Intel A/O
> Registered legal address: Krylatsky Hills Business Park,
> 17 Krylatskaya Str., Bldg 4, Moscow 121614,
> Russian Federation
>
>
--
Steve
RE: Diagnosing JNI problems
Posted by "Bobrovsky, Konstantin S" <ko...@intel.com>.
Hi Stuart, all,
>Given that, I'd propose that we concentrate on the Java API, and remove
>the Image API parts that cannot
>be implemented, which will be unavoidable if we are to have a credible
>RI for the TCK.
I did not actually see any API parts that are unimplementable in principle - maybe I overlooked something - could you please list them?
Can the RI just throw some "DataUnavailable" exceptions where it is hard to implement something quickly for the "non-coredump" mode (which is the primary focus now, as far as I can see)?
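To make the "DataUnavailable" idea concrete, here is a rough sketch of what it could look like from the caller's side. Everything in it is an assumption made for illustration: the DataUnavailable class is a locally defined stand-in rather than the Kato one, and ThreadView, AgentDumpThreadView and FrameReport are invented names. It only shows the pattern of a dump-backed implementation declining to answer while the caller carries on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local stand-in for illustration only; not the Kato API class.
class DataUnavailable extends Exception {
    DataUnavailable(String message) {
        super(message);
    }
}

// Hypothetical view of one thread in a dump.
interface ThreadView {
    List<String> javaFrames();

    // May not be answerable from every kind of artifact.
    List<String> nativeFrames() throws DataUnavailable;
}

// Hypothetical implementation backed by an agent-produced dump,
// which is assumed here to carry Java frames only.
class AgentDumpThreadView implements ThreadView {
    private final List<String> javaFrames;

    AgentDumpThreadView(List<String> javaFrames) {
        this.javaFrames =
                Collections.unmodifiableList(new ArrayList<>(javaFrames));
    }

    @Override
    public List<String> javaFrames() {
        return javaFrames;
    }

    @Override
    public List<String> nativeFrames() throws DataUnavailable {
        throw new DataUnavailable("native frames are not recorded in agent dumps");
    }
}

// Caller that degrades gracefully when data is missing.
class FrameReport {
    static String describe(ThreadView thread) {
        String nativePart;
        try {
            nativePart = thread.nativeFrames().size() + " native frames";
        } catch (DataUnavailable e) {
            nativePart = "native frames unavailable (" + e.getMessage() + ")";
        }
        return thread.javaFrames().size() + " Java frames, " + nativePart;
    }
}
```

A core-file-backed implementation of the same interface could return real native frames without any change to the caller, which is the point of keeping the method in the API.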
BTW, I tried a simple JNI-involving application on HotSpot/ia32, and I can see that it reports the interleaved frames of native JNI methods correctly up to the last Java frame:
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j ClassToLoad.nativeMethodWhichCrashes()V+0
j ClassToLoad.<clinit>()V+8
v ~StubRoutines::call_stub
j JNITest.nativeMethod1()V+0
j JNITest.JAVA_intermediateCall0()V+8
j JNITest.JAVAMethod()V+8
v ~StubRoutines::call_stub
j JNITest.nativeMethod2()V+0
v ~StubRoutines::call_stub
j JNITest.nativeMethod0()V+0
j JNITest.main([Ljava/lang/String;)V+0
v ~StubRoutines::call_stub
(I can send you the test sources if you want.) I did not try it on JRockit or any other JVM, though.
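Reading a listing like the one above programmatically mostly comes down to the one-letter prefix on each line. A rough sketch follows, assuming the line layout shown in this message (prefix letter, a space, then the frame text) and using the legend from the listing header. HsFrame is a hypothetical helper, not something HotSpot or Kato provides.

```java
// Hypothetical parser for one line of the "Java frames" listing above.
class HsFrame {
    enum Kind { COMPILED_JAVA, INTERPRETED_JAVA, VM, UNKNOWN }

    final Kind kind;
    final String text;

    HsFrame(Kind kind, String text) {
        this.kind = kind;
        this.text = text;
    }

    // Classifies a line by its prefix: J=compiled, j=interpreted, V/v=VM.
    static HsFrame parse(String line) {
        String trimmed = line.trim();
        if (trimmed.length() < 2) {
            return new HsFrame(Kind.UNKNOWN, trimmed);
        }
        Kind kind;
        switch (trimmed.charAt(0)) {
            case 'J':
                kind = Kind.COMPILED_JAVA;
                break;
            case 'j':
                kind = Kind.INTERPRETED_JAVA;
                break;
            case 'V':
            case 'v':
                kind = Kind.VM;
                break;
            default:
                kind = Kind.UNKNOWN;
                break;
        }
        // Drop the prefix letter and the whitespace that follows it.
        return new HsFrame(kind, trimmed.substring(1).trim());
    }
}
```

In the listing above, each "v ~StubRoutines::call_stub" line marks a point where the VM called into Java, so a consumer could use the VM-kind frames to split the stack into the runs of Java frames that lie between native calls.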
> 3. The JavaClass API does not easily allow us to determine the
>native function that implements
> a native method. While we can retrieve bytecode and compiled
>code sections, there is nothing
> to indicate where native code is held.
Again, for HotSpot this is quite feasible: a pointer to the native method's code is kept within the structure corresponding to that method.
What I can see is that CJVMTI-agent-produced dumps cannot provide all the information necessary to implement every part of the current Kato API. But with core files much more data is available, so we should not drop API parts just because they can't be implemented on top of one of the artifact producers.
BTW, at the beginning of this JSR we spoke of the possibility of using Sun's Serviceability Agent implementation as another basis for the implementation. Does anyone know the status of this?
Thanks,
Konst
Intel Novosibirsk
Closed Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park,
17 Krylatskaya Str., Bldg 4, Moscow 121614,
Russian Federation