Posted to kato-spec@incubator.apache.org by Nicholas Sterling <Ni...@Sun.COM> on 2009/07/01 08:30:30 UTC

Re: JavaStackFrame/JavaLocation local variable support

Joining this conversation late, so forgive me if this isn't as relevant 
as I think it should be.  :^)

Even at HotSpot safepoints, the code being executed often looks nothing 
like the source.  In particular, compiled code may be heavily inlined.  
Many instructions are just gone, and those that remain are moved up and 
down, smearing methods together so that, variables aside, in general we 
couldn't possibly tell you what method -- or even what class -- you're 
in because you are in several at once.  Of course the debugger doesn't 
have this problem because as soon as you point the debugger at a method 
HotSpot abandons the compiled version and uses the interpreted version, 
or at least something optimized less aggressively.

At least that's my understanding; I'm happy to be corrected.  Not sure 
how stack backtraces for exceptions work -- perhaps that suppresses some 
optimizations?

Nicholas



Stuart Monteith wrote:
>
>
> Steve Poole wrote:
>> On Fri, Jun 26, 2009 at 11:41 AM, Stuart Monteith 
>> <st...@stoo.me.uk>wrote:
>>
>>  
>>> Hi,
>>>   I was wondering what people's thoughts were regarding program 
>>> counters,
>>> line number table and variable tables.
>>> There is a tension between most users of the Kato API and the JDI 
>>> connector
>>> and its obligations towards supplying the information JDWP requires.
>>> JDWP, for the most part, would like to know the location of a stack 
>>> frame,
>>> i.e. a program counter normally, and using that to look up the variable
>>> tables and line number tables.
>>>
>>>     
>>
>> I do see JDWP as a major use case  for us so we must make sure that 
>> our JDI
>> connector is first class.
>>
>>   
> Agreed. I know that the katoView tomcat commands would benefit too - 
> the FFDC scenario.
>>> Some changes have been made to the API to supply the local variables
>>> (through JavaStackFrame.getVariable(int)) and their locations/types 
>>> through
>>> JavaMethod.getVariables().
>>> However, we haven't resolved the issue of the program counter or the 
>>> line
>>> numbers.
>>>
>>> Just now the line numbers are available through
>>> JavaLocation.getLineNumber(), where available. However, JDWP never 
>>> asks for
>>> a stack frame's line number, it maps from a stack frame's
>>> location to a line number using the line number table from a stack 
>>> frame's
>>> method.
>>>
>>>     
>>
>>  
>>> So, should we forgo having the simple JavaLocation.getLineNumber() 
>>> and only
>>> supply the line number table (where appropriate)?
>>>     
>>
>>
>> I was thinking "So what's the use of the getLineNumber method?" but
>> outside the JDWP scenario it does enable simple access to the line
>> numbers (i.e. via the xpath approach). The question is how much use
>> that is and what we'd be encumbering the implementations with. Since
>> what the JDI does is "standard" in its mapping, the RI could provide
>> that code for implementors to use.
>>
>>   
> I think the concern I have is that not all implementations would be 
> able to supply a program counter or a line number table.
> For instance, the hprof file stores only the line numbers. However, we 
> shouldn't get too hung up on this as our implementation
> for hprof was always going to be limited.
> It is important that we supply all of the necessary information, and 
> supply either helper methods on top or within API to make
> it more digestible for the majority of implementations.
>
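To make the helper idea above concrete, here is a minimal sketch of the lookup a JDI-style client performs: given a line number table and a bytecode program counter, pick the entry with the greatest start offset that does not exceed the PC. LineNumberLookup, Entry and lineFor are hypothetical names used for illustration only; nothing here is part of the Kato API.

```java
import java.util.List;

final class LineNumberLookup {
    /** One row of a line number table: the bytecode offset at which a source line begins. */
    static final class Entry {
        final int startPc;
        final int line;

        Entry(int startPc, int line) {
            this.startPc = startPc;
            this.line = line;
        }
    }

    /**
     * Returns the source line covering the given bytecode pc, or -1 when no
     * entry starts at or before it. Entries are not assumed to be sorted.
     */
    static int lineFor(List<Entry> table, int pc) {
        int bestStart = -1;
        int bestLine = -1;
        for (Entry e : table) {
            // Keep the entry with the greatest startPc that is still <= pc.
            if (e.startPc <= pc && e.startPc > bestStart) {
                bestStart = e.startPc;
                bestLine = e.line;
            }
        }
        return bestLine;
    }
}
```

A helper of this shape needs both a program counter and a table, so an implementation that only has per-frame line numbers (the hprof case mentioned above) could not use it.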
>>  
>>> Of course, having a "getProgramCounter()" method would be useful, 
>>> but what
>>> should we do for compiled methods? There is a strong requirement for 
>>> us to
>>> return the contents
>>> of local variables in compiled methods as well as interpreted methods.
>>> However, that requires synthesizing a bytecode program counter to 
>>> retrieve
>>> the correct variables, which implies
>>> that line numbers could be generated too. However, as with C, etc, the
>>> debugging information derived from optimized code is usually 
>>> inaccurate.
>>>     
>>
>>
>>  
>>> For line numbers, I imagine we'd either have the line numbers or not if
>>> they are inaccurate. But for local variables, it would be sensible 
>>> to alter
>>> the variable table information to suit the
>>> optimized code, to give a consistent picture.
>>>
>>>     
>>
>> I think we need to examine this in more detail -  got an example?
>>
>>
>>   
> My experience of the JIT is somewhat limited, but certainly when 
> debugging C programs with optimization,
> it is usual that variables are optimized out, loops unrolled, code 
> reordered, such that  the variable contents and
> line numbers don't match the source. Having said that, I'm sure there 
> are others who could make more authoritative
> comments on this area.
>
> Take:
>    for(int a=0, b=0; a<10; a++) {
>       b = a*2;
>       array[a][b] = array2[a];
>    }
>
> if the compiler did this:
>
>    for(int a=0; a<10;a++) {
>       array[a][a*2] = array2[a];
>    }
>
> Then the local variable "b" would no longer exist in any meaningful 
> sense. My suggestion would be to remove "b" from the
> variable table. Of course, we could have two stack frames in the same 
> method with different levels of optimization, but I believe that's
> probably still an issue anyhow.
>
>
>>  
>>> Regards,
>>>   Stuart
>>>
>>>
>>> Stuart Monteith wrote:
>>>
>>>    
>>>> Hi,
>>>>   I've been looking at local variables in relation to the JDI 
>>>> connector.
>>>> For the BOF at JavaOne we'd like for there to be a prototype of  local
>>>> variable support in the API. I've been looking at what JDWP 
>>>> requires as we
>>>> would have to be able to satisfy its queries using the Kato API. 
>>>> This has
>>>> made me lean towards exposing the variable table and have us 
>>>> retrieve the
>>>> local variables from the stack frames by slot number.
>>>>
>>>> So my suggestion for the API is this:
>>>>
>>>> ---------------------------------
>>>>
>>>> JavaMethod
>>>> -------------
>>>>
>>>> // returns all local variables
>>>> // empty if there are no variables.
>>>> Iterator<JavaVariable> getVariables() throws DataUnavailable;
>>>>
>>>> JavaVariable
>>>> -------------
>>>>
>>>> // Local variable's name
>>>> // throws DataUnavailable if the variable was derived from bytecode 
>>>> and so
>>>> the name is unknown. Caller is free to make a name up.
>>>> String getName() throws DataUnavailable;
>>>>
>>>> // The local variable's signature in JNI format.
>>>> String getSignature();
>>>>
>>>> // The start of the local variable's scope within the bytecode.
>>>> int getStart();
>>>>
>>>> // The number of bytes this variable's scope covers over the bytecode.
>>>> int getLength();
>>>>
>>>> // The slot this variable occupies. Passed to 
>>>> JavaStackFrame.getVariable()
>>>> to retrieve the contents.
>>>> int getSlot();
>>>>
>>>>
>>>> JavaStackFrame
>>>> ------------------
>>>>
>>>> // Gets the value of a variable from a stack frame.
>>>> // Returns a JavaObject for an object reference, null for a null 
>>>> object
>>>> reference. Primitives are returned as boxed primitives.
>>>> // throws CorruptDataException if object reference is incorrect, or 
>>>> if the
>>>> float or double are set to invalid values.
>>>> // throws DataUnavailable if this method is not supported or if 
>>>> stack not
>>>> in correct state to return variables.
>>>> // throws IndexOutOfBoundsException if an invalid slot number is 
>>>> passed.
>>>> Object getVariable(int slot) throws CorruptDataException, 
>>>> DataUnavailable,
>>>> IndexOutOfBoundsException;
>>>>
>>>>
>>>> ---------------------------------
>>>>
>>>> The bytecode offset can be calculated with:
>>>>   JavaLocation.getAddress() - (
>>>> JavaMethod.getBytecodeSections().next().getBase().getAddress())
>>>> but I think that might be a little too tedious, and doesn't allow
>>>> cleverness with JITted frames. So we will probably have to add:
>>>>
>>>> // Return program counter in bytecode.
>>>> int JavaLocation.getBytecodePC();
>>>>
>>>> alternatively the JavaVariable.getStart() would use absolute 
>>>> addresses,
>>>> which could conceivably work with JITed frames, if the tables are 
>>>> maintained
>>>> during compilation.
>>>>
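The subtraction described above can be sketched as follows, assuming the location address and the bytecode section's base and length are already available as plain numbers. BytecodePc and offsetOf are hypothetical names, and returning -1 for an address outside the section (as one would expect for a JITted frame) is an assumption of this sketch, not something the thread has settled.

```java
final class BytecodePc {
    /**
     * Converts an absolute location address into an offset within a bytecode
     * section. Returns -1 when the address lies outside the section, for
     * example when the location points into JIT-compiled native code.
     */
    static int offsetOf(long locationAddress, long sectionBase, long sectionLength) {
        long offset = locationAddress - sectionBase;
        if (offset < 0 || offset >= sectionLength) {
            return -1;
        }
        // A method's bytecode is small, so the offset is assumed to fit in an int.
        return (int) offset;
    }
}
```

The -1 case is where a dedicated getBytecodePC() could do something cleverer than plain arithmetic.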
>>>> We should also expose the line number table too as that will aid class
>>>> file reproduction and queries for line numbers based on bytecode 
>>>> program
>>>> counters.
>>>>
>>>> A slightly different scheme would have the 
>>>> JavaStackFrame.getVariable(int
>>>> slot) method look like:
>>>>   Object getVariable(JavaVariable var);
>>>> but I don't think it gains us much.
>>>>
>>>> Retrieving all of the variables would therefore look something like 
>>>> this:
>>>>
>>>> void dumpVariables(JavaThread thread) throws Exception {
>>>>   Iterator frames = thread.getStackFrames();
>>>>   while (frames.hasNext()) {
>>>>      JavaStackFrame frame = (JavaStackFrame) frames.next();
>>>>      JavaLocation location = frame.getLocation();
>>>>      JavaMethod method = location.getMethod();
>>>>      int pc = location.getBytecodePC();
>>>>      System.out.println(location.toString()+":");
>>>>
>>>>      Iterator variables = method.getVariables();
>>>>      while (variables.hasNext()) {
>>>>         JavaVariable variable = (JavaVariable) variables.next();
>>>>         int start = variable.getStart();
>>>>
>>>>         // A variable is in scope for start <= pc < start + length.
>>>>         if (pc >= start && pc < start + variable.getLength()) {
>>>>            // getVariable() returns null for a null reference, so
>>>>            // rely on string concatenation rather than toString().
>>>>            Object value = frame.getVariable(variable.getSlot());
>>>>            System.out.println("\t" + variable.getSignature() + " "
>>>>                  + variable.getName() + " = " + value);
>>>>         }
>>>>      }
>>>>   }
>>>> }
>>>>
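The scope test in the loop above can be tried in isolation. This is a minimal sketch using a stand-in Var class (hypothetical, not the proposed JavaVariable interface) and the half-open interval the JVM specification uses for LocalVariableTable entries: a variable has a value for start <= pc < start + length.

```java
import java.util.ArrayList;
import java.util.List;

final class VariableScopes {
    /** Stand-in for one variable table entry; field names mirror the proposed getters. */
    static final class Var {
        final String name;
        final int start;
        final int length;
        final int slot;

        Var(String name, int start, int length, int slot) {
            this.name = name;
            this.start = start;
            this.length = length;
            this.slot = slot;
        }
    }

    /** Returns the entries whose scope [start, start + length) contains pc. */
    static List<Var> inScope(List<Var> table, int pc) {
        List<Var> result = new ArrayList<Var>();
        for (Var v : table) {
            if (pc >= v.start && pc < v.start + v.length) {
                result.add(v);
            }
        }
        return result;
    }
}
```

With an inclusive upper bound, a variable would still be reported at the first offset past the end of its scope, where its slot may already have been reused.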
>>>> Let me know what you think,
>>>>   Stuart
>>>>
>>>> Stuart Monteith wrote:
>>>>
>>>>      
>>>>> Hello,
>>>>>   With Steve's work on JVMTI/python coming along, the issue of 
>>>>> what to do
>>>>> about local variables is coming up. Currently there is no means to 
>>>>> determine
>>>>> the names and values of local variables through the current API.
>>>>>
>>>>> The most obvious way of implementing this is to have the API do 
>>>>> all of
>>>>> the processing by exposing the variables as name and value pairs.
>>>>>
>>>>> For example:
>>>>> interface JavaStackFrame {
>>>>>   List<LocalVariable> getLocalVariables();
>>>>> }
>>>>>
>>>>> where:
>>>>>
>>>>> interface LocalVariable {
>>>>>   String getName();
>>>>>   Object getValue();
>>>>> }
>>>>>
>>>>> Where the value is a JavaObject, or a boxed primitive.
>>>>>
>>>>> The other extreme is for the necessary information to be made 
>>>>> available
>>>>> for the callers of the API to generate this information themselves.
>>>>> This would mean properly exposing:
>>>>>   Program Counter - currently we have JavaLocation.getAddress(), 
>>>>> which is
>>>>> an address in memory, rather than a bytecode program counter. For 
>>>>> JITted
>>>>> frames we'd still need the bytecode program counter.
>>>>>   Local variable table - this is to determine which variables 
>>>>> there are,
>>>>> their types and their indexes into the local variable array
>>>>>   Local variable array - the contents of the local variables need 
>>>>> to be
>>>>> exposed, and their proper types should be returnable (JavaObject, 
>>>>> int, etc).
>>>>>
>>>>> Doing it that way might be beneficial for more user stories: there is
>>>>> more information available to reconstruct the class file, for 
>>>>> instance.
>>>>> There is also the small matter of what to do when the local variable
>>>>> table is not available. When the API exposes all that it knows the 
>>>>> values
>>>>> might still be retrievable, although I have my doubts as to how 
>>>>> useful that
>>>>> would be if you don't know the types.
>>>>>
>>>>> Thoughts?
>>>>>   Stuart
>>>>>
>>>>>
>>>>>         
>>
>>   


RE: JavaStackFrame/JavaLocation local variable support

Posted by "Bobrovsky, Konstantin S" <ko...@intel.com>.
>> For instance, can the "this" variable (where present) 
>> and the arguments to the method be set as a minimum requirement?

> I don't think it is always possible, even in Hotspot VM.

To put it more simply, we can only hope to get the values of the arguments at a given point in the code if the arguments are still 'live' at that point, so this seems far too strong a requirement. 'this' is probably different: javac might keep it intact at the top of the virtual stack until the end of the method, i.e. it is always 'live'.

Thanks,
Konst
 
Intel Novosibirsk
Closed Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park, 
17 Krylatskaya Str., Bldg 4, Moscow 121614, 
Russian Federation
 


RE: JavaStackFrame/JavaLocation local variable support

Posted by "Bobrovsky, Konstantin S" <ko...@intel.com>.
Stuart, all,

A few comments from my side:

> For instance, can the "this" variable (where present) 
> and the arguments to the method be set as a minimum requirement?

I don't think it is always possible, even in the Hotspot VM. According to the JVM spec, 'this' and the arguments are passed via the virtual stack, and the stack slots they occupy can be overwritten by bytecode executed between the method entry and the point where we query their values. So Hotspot's keeping of VM state maps does not help.

On a native level, we need to know the calling convention for Java methods in each VM, in particular where arguments are stored (registers or stack) and whether their values are preserved by the callee (extremely unlikely). If we are lucky, we can find a VM for which we can deduce the location of arguments at any particular point in a method's code. Hotspot (C2) is probably not one of them.

> If a tool operates against a live JVM, it should probably not affect the
> target Java application's
...
> For instance, some JVMs
> might never support local variables, while others might just support them 
> in interpreted frames.

This is an important point, IMO. Support for certain features might require a live VM to 'deoptimize' certain compiled methods (local variable access is a good example, a breakpoint hit is another). Do we allow that in the spec, or do we specify implementation behavior in this case at all? For debuggers, changing the optimization level or otherwise affecting execution (preserving only semantics) is justifiable, but for the purposes Kato is meant for, probably not; I agree with Stuart.

> With local variable support as an example, as well as assessing what is 
> practicable for the RI should we be investigating the capabilities of 
> the JVMs that would be retrieving local variables to determine the 
> optionality?

IMO, it would be very useful. A VM/feature table covering the major VMs out there and all the important features that impact API design (safepoints + deoptimization/on-stack replacement, recompilation, compressed pointers, ...) would help us see the whole picture and judge whether support for a particular feature at the API level is worthwhile or should be left to custom VM-specific tools. If we ever decide to do that, I could help with populating the Hotspot VM part.

One obvious suggestion: the API could allow capabilities to be queried from the host VM or the loaded org.apache.kato.image.Image, so that tools could partially handle the optionality of certain features at run time. This way the same tool can be 'smarter' on a VM with extended capabilities, similar to the capabilities mechanism in JVMTI.
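
As a rough illustration of the capability query suggested above, here is a minimal sketch. Everything in it is hypothetical: the Capability enum, the CapabilityProvider interface and the two example images are invented for this sketch and are not part of org.apache.kato.image.Image or of JVMTI.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical optional features a dump or VM might support. */
enum Capability { LOCAL_VARIABLES, LINE_NUMBER_TABLE, BYTECODE_PC }

/** Hypothetical query interface an image implementation could expose. */
interface CapabilityProvider {
    Set<Capability> capabilities();
}

/** Example image with no optional features, e.g. a very limited dump format. */
final class LimitedImage implements CapabilityProvider {
    public Set<Capability> capabilities() {
        return EnumSet.noneOf(Capability.class);
    }
}

/** Example image that supports every optional feature. */
final class FullImage implements CapabilityProvider {
    public Set<Capability> capabilities() {
        return EnumSet.allOf(Capability.class);
    }
}

final class VariableTool {
    /** Decides at run time whether to attempt a local variable dump. */
    static String plan(CapabilityProvider image) {
        if (image.capabilities().contains(Capability.LOCAL_VARIABLES)) {
            return "dump locals";
        }
        return "skip locals";
    }
}
```

A tool written this way degrades gracefully instead of catching a DataUnavailable-style exception on every call.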

Thanks,
Konst
 


Re: JavaStackFrame/JavaLocation local variable support

Posted by Stuart Monteith <st...@stoo.me.uk>.
Thanks Konstantin, I had you in mind when I brought this up :-)

The local variable support is another case of deciding on what "model" 
of the JVM we will present - balancing between implementation
details and presenting a consistent API that can have the same Java 
program introspected in the same way by the same analyzing program (JDI,
FFDC, etc) but on different JVMs.

I think that local variables will be an optional feature of the API. I 
do wonder if it will be possible to set a minimum level of
expected support. For instance, can the "this" variable (where present) 
and the arguments to the method be set as a minimum requirement?

We want to be able to bring these issues to a conclusion. The RI is 
important - Steve has been working on getting a JVMTI agent that will
produce the local variables. What I'm not quite sure on is how we decide 
upon the optionality of such features. For instance, some JVMs
might never support local variables, while others might just support them 
in interpreted frames. In addition, the different implementations of
Kato will determine the amount and kind of information generated. For instance, 
a core file with jsadebugd (or something much like it)
could have another set of features compared to the JVMTI agent.

With local variable support as an example, as well as assessing what is 
practicable for the RI should we be investigating the capabilities of 
the JVMs
that would be retrieving local variables to determine the optionality? 
By that I mean IBM, Oracle and Sun's JVMs and their possible Kato 
implementations
as well as the RI.


Regards,
    Stuart

-- 
Stuart Monteith
http://blog.stoo.me.uk/


RE: JavaStackFrame/JavaLocation local variable support

Posted by "Bobrovsky, Konstantin S" <ko...@intel.com>.
> we know that they were in a -> b -> c somewhere, but we wouldn't know 
> whether they were still in d or e.  Is that right?

Sorry, I don't quite understand the question. What is the sense of 'they' and 'were' you implied here?

Thanks,
Konst
 



Re: JavaStackFrame/JavaLocation local variable support

Posted by Nicholas Sterling <Ni...@Sun.COM>.
Bobrovsky, Konstantin S wrote:
> Hi Nicholas,
>
> C2 compiler annotates each safepoint with so-called DebugInfo (serialized together with method's executable image), which records an entire in-lining hierarchy for this particular safepoint, 

Ah, I'm with you now -- thanks!  :^)  You're right, something like this 
is needed for de-optimization.

So in general we would be between safepoints, and most call sites are 
safepoints.  If the first safepoint has backtrace
a -> b -> c -> d -> e
and the second has
a -> b -> c -> g -> h -> i
we know that they were in a -> b -> c somewhere, but we wouldn't know 
whether they were still in d or e.  Is that right?
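
One way to read the question: the part of the call chain that is certain between the two safepoints is whatever their backtraces have in common. A tiny sketch of that reading, with method names as plain strings (InlineChains and commonPrefix are hypothetical names, and this says nothing about how HotSpot itself represents the chains):

```java
import java.util.ArrayList;
import java.util.List;

final class InlineChains {
    /** Returns the leading frames shared by both backtraces, outermost caller first. */
    static List<String> commonPrefix(List<String> first, List<String> second) {
        List<String> prefix = new ArrayList<String>();
        int n = Math.min(first.size(), second.size());
        for (int i = 0; i < n; i++) {
            if (!first.get(i).equals(second.get(i))) {
                break;
            }
            prefix.add(first.get(i));
        }
        return prefix;
    }
}
```

For the two backtraces above this gives a, b, c; everything after that differs between the two safepoints.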

Nicholas



RE: JavaStackFrame/JavaLocation local variable support

Posted by "Bobrovsky, Konstantin S" <ko...@intel.com>.
Hi Nicholas,

> Even at HotSpot safepoints, the code being executed often looks nothing 
> like the source.  In particular, compiled code may be heavily inlined.  
> Many instructions are just gone, and those that remain are moved up and 
> down, smearing methods together so that, variables aside, in general we 
> couldn't possibly tell you what method -- or even what class -- you're 
> in because you are in several at once.

NOTE: all my speculations below are for the Hotspot server compiler (C2) only. I don't know how safepoints are supported by the client (C1) Hotspot compiler.

Inlining does not harm safepoints. The C2 compiler annotates each safepoint with so-called DebugInfo (serialized together with the method's executable image), which records the entire in-lining hierarchy for that particular safepoint, as well as a mapping from the JVM state of each in-lined method at this safepoint to memory locations/registers. (Prior to that, each JVM state element of every method in the in-lined hierarchy is kept as an IR node by the JIT, with all these nodes being inputs to the SafepointNode.) Thus, at every safepoint the runtime knows how to reconstruct the actual JVM state for the method the safepoint is in, as well as for all the methods up the in-lining hierarchy. This is used in particular by de-optimization, where even a heavily optimized method may be replaced by its interpreted version (with the necessary chain of callers in case of in-lining) on the fly at a safepoint. De-optimization, being a critical and outstanding feature of Hotspot, is actually the only reason why the JVM state mapping is maintained and saved together with compiled code.
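
To illustrate the idea (not the actual HotSpot data structures), here is a toy model in which each safepoint record carries the chain of in-lined scopes, and one physical compiled frame is expanded into several "virtual" Java frames. Scope, SafepointInfo and virtualFrames are hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InlineExpansion {
    /** One method activation: method name plus bytecode index within it. */
    static final class Scope {
        final String method;
        final int bci;

        Scope(String method, int bci) {
            this.method = method;
            this.bci = bci;
        }

        @Override
        public String toString() {
            return method + "@" + bci;
        }
    }

    /** Toy debug info for one safepoint: scopes ordered outermost caller first. */
    static final class SafepointInfo {
        final long nativePc;
        final List<Scope> scopes;

        SafepointInfo(long nativePc, List<Scope> scopes) {
            this.nativePc = nativePc;
            this.scopes = scopes;
        }
    }

    /**
     * Expands the compiled frame stopped at nativePc into virtual frames,
     * innermost first (stack trace order). Returns an empty list when the
     * pc is not a recorded safepoint.
     */
    static List<Scope> virtualFrames(List<SafepointInfo> infos, long nativePc) {
        for (SafepointInfo info : infos) {
            if (info.nativePc == nativePc) {
                List<Scope> frames = new ArrayList<Scope>(info.scopes);
                Collections.reverse(frames);
                return frames;
            }
        }
        return new ArrayList<Scope>();
    }
}
```

In this model nothing can be said about a pc that falls between two records, which is why the state is only reconstructible at safepoints.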

> Not sure how stack backtraces for exceptions work -- perhaps that
> suppresses some optimizations?

For each 'athrow', the C2 server compiler creates a SafepointNode during the parse stage. So, as the explanation above implies, at each 'athrow' site the runtime has full information about what was inlined there. For exceptions not triggered by an 'athrow' (e.g. an implicit null pointer exception), things are slightly more complicated, but there is always a 'serialized' safepoint at the bottom, which can provide the inlining details.

Thanks,
Konst

Intel Novosibirsk
Closed Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park, 
17 Krylatskaya Str., Bldg 4, Moscow 121614, 
Russian Federation
 

-----Original Message-----
From: Nicholas.Sterling@Sun.COM [mailto:Nicholas.Sterling@Sun.COM] 
Sent: Wednesday, July 01, 2009 3:31 PM
To: kato-spec@incubator.apache.org
Subject: Re: JavaStackFrame/JavaLocation local variable support

Joining this conversation late, so forgive me if this isn't as relevant 
as I think it should be.  :^)

Even at HotSpot safepoints, the code being executed often looks nothing 
like the source.  In particular, compiled code may be heavily inlined.  
Many instructions are just gone, and those that remain are moved up and 
down, smearing methods together so that, variables aside, in general we 
couldn't possibly tell you what method -- or even what class -- you're 
in because you are in several at once.  Of course the debugger doesn't 
have this problem because as soon as you point the debugger at a method 
HotSpot abandons the compiled version and uses the interpreted version, 
or at least something optimized less aggressively.

At least that's my understanding; I'm happy to be corrected.  Not sure 
how stack backtraces for exceptions work -- perhaps that suppresses some 
optimizations?

Nicholas



Stuart Monteith wrote:
>
>
> Steve Poole wrote:
>> On Fri, Jun 26, 2009 at 11:41 AM, Stuart Monteith 
>> <st...@stoo.me.uk>wrote:
>>
>>  
>>> Hi,
>>>   I was wondering what people's thoughts were regarding program 
>>> counters,
>>> line number table and variable tables.
>>> There is a tension between most users of the Kato API and the JDI 
>>> connector
>>> and its obligations towards supplying the information JDWP requires.
>>> JDWP, for the most part, would like to know the location of a stack 
>>> frame,
>>> i.e. a program counter normally, and using that to look up the variable
>>> tables and line number tables.
>>>
>>>     
>>
>> I do see JDWP as a major use case  for us so we must make sure that 
>> our JDI
>> connector is first class.
>>
>>   
> Agreed. I know that the katoView tomcat commands would benefit too - 
> the FFDC scenario.
>>> Some changes have been made to the API to supply the local variables
>>> (through JavaStackFrame.getVariable(int)) and their locations/types 
>>> through
>>> JavaMethod.getVariables().
>>> However, we haven't resolved the issue of the program counter or the 
>>> line
>>> numbers.
>>>
>>> Just now the line numbers are available through
>>> JavaLocation.getLineNumber(), where available. However, JDWP never 
>>> asks for
>>> a stack frame's line number, it maps from a stack frame's
>>> location to a line number using the line number table from a stack 
>>> frame's
>>> method.
>>>
>>>     
>>
>>  
>>> So, should we forgo having the simple JavaLocation.getLineNumber() 
>>> and only
>>> supply the line number table (where appropriate)?
>>>     
>>
>>
>> I was thinking "So what's the use of the getLineNumber method?" but
>> outside the JDWP scenario it does enable simple access to the 
>> line numbers
>> (i.e. via the xpath approach). The question is how much use that is 
>> and what
>> we'd be encumbering the implementations with. Since what the 
>> JDI
>> does is "standard" in its mapping, the RI could provide that code 
>> for
>> implementors to use.
>>
>>   
> I think the concern I have is that not all implementations would be 
> able to supply a program counter or a line number table.
> For instance, the hprof file stores only the line numbers. However, we 
> shouldn't get too hung up on this as our implementation
> for hprof was always going to be limited.
> It is important that we supply all of the necessary information, and 
> supply either helper methods on top or within API to make
> it more digestible for the majority of implementations.
>
>>  
>>> Of course, having a "getProgramCounter()" method would be useful, 
>>> but what
>>> should we do for compiled methods? There is a strong requirement for 
>>> us to
>>> return the contents
>>> of local variables in compiled methods as well as interpreted methods.
>>> However, that requires synthesizing a bytecode program counter to 
>>> retrieve
>>> the correct variables, which implies
>>> that line numbers could be generated too. However, as with C, etc, the
>>> debugging information derived from optimized code is usually 
>>> inaccurate.
>>>     
>>
>>
>>  
>>> For line numbers, I imagine we'd either have the line numbers or not if
>>> they are inaccurate. But for local variables, it would be sensible 
>>> to alter
>>> the variable table information to suit the
>>> optimized code, to give a consistent picture.
>>>
>>>     
>>
>> I think we need to examine this in more detail -  got an example?
>>
>>
>>   
> My experience of the JIT is somewhat limited, but certainly when 
> debugging C programs with optimization,
> it is usual that variables are optimized out, loops unrolled, code 
> reordered, such that  the variable contents and
> line numbers don't match the source. Having said that, I'm sure there 
> are others who could make more authoritative
> comments on this area.
>
> Take:
>    for(int a=0, b=0; a<10; a++) {
>       b = a*2;
>       array[a][b] = array2[a];
>    }
>
> if the compiler did this:
>
>    for(int a=0; a<10;a++) {
>       array[a][a*2] = array2[a];
>    }
>
> Then the local variable "b" would no longer exist in any meaningful 
> sense. My suggestion would be to remove "b" from the
> variable table. Of course, we could have two stack frames in the same 
> method with different levels of optimization, but I believe that's
> probably still an issue anyhow.
>
>
>>  
>>> Regards,
>>>   Stuart
>>>
>>>
>>> Stuart Monteith wrote:
>>>
>>>    
>>>> Hi,
>>>>   I've been looking at local variables in relation to the JDI 
>>>> connector.
>>>> For the BOF at JavaOne we'd like for there to be a prototype of  local
>>>> variable support in the API. I've been looking at what JDWP 
>>>> requires as we
>>>> would have to be able to satisfy its queries using the Kato API. 
>>>> This has
>>>> made me lean towards exposing the variable table and have us 
>>>> retrieve the
>>>> local variables from the stack frames by slot number.
>>>>
>>>> So my suggestion for the API is this:
>>>>
>>>> ---------------------------------
>>>>
>>>> JavaMethod
>>>> -------------
>>>>
>>>> // returns all local variables
>>>> // empty if there are no variables.
>>>> Iterator<JavaVariable> getVariables() throws DataUnavailable;
>>>>
>>>> JavaVariable
>>>> -------------
>>>>
>>>> // Local variable's name
>>>> // throws DataUnavailable if the variable was derived from bytecode 
>>>> and so
>>>> the name is unknown. Caller is free to make a name up.
>>>> String getName() throws DataUnavailable;
>>>>
>>>> // The local variable's signature in JNI format.
>>>> String getSignature();
>>>>
>>>> // The start of the local variable's scope within the bytecode.
>>>> int getStart();
>>>>
>>>> // The number of bytes this variable's scope covers over the bytecode.
>>>> int getLength();
>>>>
>>>> // The slot this variable occupies. Passed to 
>>>> JavaStackFrame.getVariable()
>>>> to retrieve the contents.
>>>> int getSlot();
>>>>
>>>>
>>>> JavaStackFrame
>>>> ------------------
>>>>
>>>> // Gets the value of a variable from a stack frame.
>>>> // Returns a JavaObject for an object reference, null for a null 
>>>> object
>>>> reference. Primitives are returned as boxed primitives.
>>>> // throws CorruptDataException if object reference is incorrect, or 
>>>> if the
>>>> float or double are set to invalid values.
>>>> // throws DataUnavailable if this method is not supported or if 
>>>> stack not
>>>> in correct state to return variables.
>>>> // throws IndexOutOfBoundsException if an invalid slot number is 
>>>> passed.
>>>> Object getVariable(int slot) throws CorruptDataException, 
>>>> DataUnavailable,
>>>> IndexOutOfBoundsException;
>>>>
>>>>
>>>> ---------------------------------
>>>>
>>>> The bytecode offset can be calculated with:
>>>>   JavaLocation.getAddress() - (
>>>> JavaMethod.getBytecodeSections().next().getBase().getAddress())
>>>> but I think that might be a little too tedious, and doesn't allow
>>>> cleverness with JITted frames. So we will probably have to add:
>>>>
>>>> // Return program counter in bytecode.
>>>> int JavaLocation.getBytecodePC();
>>>>
>>>> alternatively the JavaVariable.getStart() would use absolute 
>>>> addresses,
>>>> which could conceivably work with JITed frames, if the tables are 
>>>> maintained
>>>> during compilation.
>>>>
>>>> We should also expose the line number table too as that will aid class
>>>> file reproduction and queries for line numbers based on bytecode 
>>>> program
>>>> counters.
>>>>
>>>> A slightly different scheme would have the 
>>>> JavaStackFrame.getVariable(int
>>>> slot) method look like:
>>>>   Object getVariable(JavaVariable var);
>>>> but I don't think it gains us much.
>>>>
>>>> Retrieving all of the variables would therefore look something like 
>>>> this:
>>>>
>>>> void dumpVariables(JavaThread thread) throws Exception {
>>>>   Iterator frames = thread.getStackFrames();
>>>>   while (frames.hasNext()) {
>>>>      JavaStackFrame frame = (JavaStackFrame) frames.next();
>>>>      JavaLocation location = frame.getLocation();
>>>>      JavaMethod method = location.getMethod();
>>>>      int pc = location.getBytecodePC();
>>>>           System.out.println(location.toString()+":");
>>>>
>>>>      Iterator variables = method.getVariables();
>>>>      while (variables.hasNext()) {
>>>>         JavaVariable variable = (JavaVariable) variables.next();
>>>>
>>>>         // scope is [start, start + length), end exclusive
>>>>         if (pc >= variable.getStart() && pc <
>>>> variable.getStart()+variable.getLength()) {
>>>>            Object value = frame.getVariable( variable.getSlot());
>>>>            // String.valueOf copes with a null object reference
>>>>            System.out.println("\t"+ variable.getSignature()+" "
>>>> +variable.getName()+" = "+ String.valueOf(value));
>>>>         }
>>>>      }
>>>>   }
>>>> }
>>>>
>>>> Let me know what you think,
>>>>   Stuart
>>>>
>>>> Stuart Monteith wrote:
>>>>
>>>>      
>>>>> Hello,
>>>>>   With Steve's work on JVMTI/python coming along, the issue of 
>>>>> what to do
>>>>> about local variables is coming up. Currently there is no means to 
>>>>> determine
>>>>> the names and values of local variables through the current API.
>>>>>
>>>>> The most obvious way of implementing this is to have the API do 
>>>>> all of
>>>>> the processing by exposing the variables as name and value pairs.
>>>>>
>>>>> For example:
>>>>> interface JavaStackFrame {
>>>>>   List<LocalVariable> getLocalVariables();
>>>>> }
>>>>>
>>>>> where:
>>>>>
>>>>> interface LocalVariable {
>>>>>   String getName();
>>>>>   Object getValue();
>>>>> }
>>>>>
>>>>> Where the value is a JavaObject, or a boxed primitive.
>>>>>
>>>>> The other extreme is for the necessary information to be made 
>>>>> available
>>>>> for the callers of the API to generate this information themselves.
>>>>> This would mean properly exposing:
>>>>>   Program Counter - currently we have JavaLocation.getAddress(), 
>>>>> which is
>>>>> an address in memory, rather than a bytecode program counter. For 
>>>>> JITted
>>>>> frames we'd still need the bytecode program counter.
>>>>>   Local variable table - this is to determine which variables 
>>>>> there are,
>>>>> their types and their indexes into the local variable array
>>>>>   Local variable array - the contents of the local variables need 
>>>>> to be
>>>>> exposed, and their proper types should be returnable (JavaObject, 
>>>>> int, etc).
>>>>>
>>>>> Doing it that way might be beneficial for more user stories, there is
>>>>> more information available to reconstruct the class file, for 
>>>>> instance.
>>>>> There is also the small matter of what to do when the local variable
>>>>> table is not available. When the API exposes all that it knows the 
>>>>> values
>>>>> might still be retrievable, although I have my doubts as to how 
>>>>> useful that
>>>>> would be if you don't know the types.
>>>>>
>>>>> Thoughts?
>>>>>   Stuart
>>>>>
>>>>>
>>>>>         
>>
>>