
Posted to dev@harmony.apache.org by Pavel Afremov <pa...@gmail.com> on 2006/07/21 13:33:12 UTC

Re: Stack Overflow Error support issues

Since no more elegant solution was proposed during the current discussion, I'd
like to put the patch with the results of my experiments into JIRA as a Stack
Overflow implementation.

You can find it in
*HARMONY-945*<https://issues.apache.org/jira/browse/HARMONY-945>.
You are welcome to try it.

Pavel Afremov.

Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

Will do, thanks

On 7/21/06, Geir Magnusson Jr <ge...@pobox.com> wrote:
>
> I'll commit as is - if you need to modify  the stack space issue, just
> submit a new patch
>
> geir
>
> Rana Dasgupta wrote:
> > Pavel,
> >   I tried the attached overflow test, and then applied the patch and
> > retried it. The patch looks good. A couple of comments:
> >
> >   - I could not get the  unwind failure that you have mentioned (with
> >   the overflow happening in the first two lines) though I played around
> > with
> >   the test, but that may depend on what the specific setup of the stack
> > is at
> >   that point?
> >   - I noticed that on both Linux and Windows you preload the SOE class
> >   and precompile it. This may be OK.
> >   - But I also saw that you fail the JIT if you don't have 256 K of free
> >   stack space. The default Windows stack size is only 1 MB. Do we need
> > to fail
> >   a compile of 10 lines of bytecode if we don't have 1/4 of the stack
> >   available? Maybe this can be less strict, or some heuristic based on
> > method
> >   size? What do you think  about this?
> >
> >
> >   It would be nice if this patch could get committed, it is a good
> solution
> > for exceptions in Java code. For native code frames, we can continue the
> > work needed on top of this fix.
> >
> > Thanks,
> > Rana
> >
> > On 7/21/06, Pavel Afremov <pa...@gmail.com> wrote:
> >
> >> Because more elegant decision wasn't proposed during current
> discussion,
> >> I'd
> >> like to put the patch with results of my experiments into JIRA, as
> Stack
> >> Overflow Implementation.
> >>
> >> You can find it in
> >> *HARMONY-945*<https://issues.apache.org/jira/browse/HARMONY-945>.
> >> Welcome to try it.
> >>
> >> Pavel Afremov.
> >>
> >>
> >
>
> ---------------------------------------------------------------------
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
> For additional commands, e-mail: harmony-dev-help@incubator.apache.org
>
>

Re: Stack Overflow Error support issues

Posted by Geir Magnusson Jr <ge...@pobox.com>.

I'll commit as is - if you need to modify  the stack space issue, just
submit a new patch

geir

Rana Dasgupta wrote:
> Pavel,
>   I tried the attached overflow test, and then applied the patch and
> retried it. The patch looks good. A couple of comments:
> 
>   - I could not get the  unwind failure that you have mentioned (with
>   the overflow happening in the first two lines) though I played around
> with
>   the test, but that may depend on what the specific setup of the stack
> is at
>   that point?
>   - I noticed that on both Linux and Windows you preload the SOE class
>   and precompile it. This may be OK.
>   - But I also saw that you fail the JIT if you don't have 256 K of free
>   stack space. The default Windows stack size is only 1 MB. Do we need
> to fail
>   a compile of 10 lines of bytecode if we don't have 1/4 of the stack
>   available? Maybe this can be less strict, or some heuristic based on
> method
>   size? What do you think  about this?
> 
> 
>   It would be nice if this patch could get committed, it is a good solution
> for exceptions in Java code. For native code frames, we can continue the
> work needed on top of this fix.
> 
> Thanks,
> Rana
> 
> On 7/21/06, Pavel Afremov <pa...@gmail.com> wrote:
> 
>> Because more elegant decision wasn't proposed during current discussion,
>> I'd
>> like to put the patch with results of my experiments into JIRA, as Stack
>> Overflow Implementation.
>>
>> You can find it in
>> *HARMONY-945*<https://issues.apache.org/jira/browse/HARMONY-945>.
>> Welcome to try it.
>>
>> Pavel Afremov.
>>
>>
> 


Re: Stack Overflow Error support issues

Posted by Pavel Afremov <pa...@gmail.com>.

Hello

Sorry for the late reply. It was a very nice weekend for me :)

On 7/21/06, Rana Dasgupta <rd...@gmail.com> wrote:

>   - I could not get the  unwind failure that you have mentioned (with
>   the overflow happening in the first two lines) though I played around
> with
>   the test, but that may depend on what the specific setup of the stack is
> at
>   that point?

I rebased and recompiled my version of DRLVM, and now I can't reproduce it
either.

On my previous version some magic was required to reproduce it: I added
several variables to the main function and to the function "func". This
"magic" was different for each build and had to be retuned after even small
changes in the VM code. Now I can't reproduce it using this magic.

>  ...
>   - But I also saw that you fail the JIT if you don't have 256 K of free
>   stack space. The default Windows stack size is only 1 MB. Do we need to
> fail
>   a compile of 10 lines of bytecode if we don't have 1/4 of the stack
>   available? Maybe this can be less strict, or some heuristic based on
> method
>   size? What do you think  about this?

Yes, it's 1/4 of the stack. But in my test the current implementation allows
recursion to a depth of 704220. Is deeper recursion required anywhere? I think
it's very unlikely.

In any case, if deeper recursion is required in the future, we can tune the
thread stack size for the Windows build.
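
A minimal sketch of how a recursion-depth figure like the one quoted above can
be measured. This is not the test attached to HARMONY-945; the class and
method names are hypothetical, and the depth reached depends on the VM, the
frame size and the thread stack size.

```java
// Hypothetical probe: recurse until the VM delivers StackOverflowError,
// then report how deep the recursion got.
final class RecursionDepthProbe {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses until StackOverflowError is thrown and returns the depth reached. */
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // Reaching this handler means the VM unwound the stack and
            // delivered the error to Java code instead of crashing.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("depth reached: " + measureDepth());
    }
}
```

Calling measureDepth() a second time is a cheap way to check that the VM is
still usable after the first overflow.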

Thanks.
Pavel Afremov

Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

On 7/24/06, Mikhail Fursov <mi...@gmail.com> wrote:
>
> >No, I think that VM can do this check but use lower border: e.g. 1/100 of
> >initial.
> >JIT must do this check more accurate: use knowledge of algorithms it
> uses.

I lowered the defensive VM check to 1/100 of the initial stack size on both
platforms, as discussed. Even with this less strict requirement, the test
recurses more than 700,000 times. This is to be augmented with a simple JIT
heuristic when we pursue this further. It is a simple, self-contained change,
made primarily as self-education in going through the JIRA-based patch
submission procedure.

https://issues.apache.org/jira/browse/HARMONY-1086
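
To make the two thresholds concrete: with a 1 MB stack, 1/4 is 262144 bytes
(256 KB) and 1/100 is about 10 KB. The sketch below only illustrates that
arithmetic. It is not code from either patch; the class, the method and the
idea of passing the divisor as a parameter are assumptions made for the
example.

```java
// Hypothetical illustration of a "fraction of the initial stack" check.
final class StackReserveCheck {
    /** Returns true if the free stack is at least 1/divisor of the initial stack size. */
    static boolean hasReserve(long freeBytes, long initialStackBytes, int divisor) {
        return freeBytes >= initialStackBytes / divisor;
    }

    public static void main(String[] args) {
        long oneMb = 1024 * 1024;   // default Windows stack size mentioned in the thread
        long free = 200 * 1024;     // example value: 200 KB of stack still free
        System.out.println("1/4 reserve:   " + (oneMb / 4) + " bytes, ok=" + hasReserve(free, oneMb, 4));
        System.out.println("1/100 reserve: " + (oneMb / 100) + " bytes, ok=" + hasReserve(free, oneMb, 100));
    }
}
```

With 200 KB free, the 1/4 rule refuses to proceed while the 1/100 rule allows
it, which is the difference in strictness discussed in this thread.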

Thanks,
Rana

Re: Stack Overflow Error support issues

Posted by Pavel Afremov <pa...@gmail.com>.

Hi Weldon.

There is a simple test in the description of
*HARMONY-945*<https://issues.apache.org/jira/browse/HARMONY-945>.
I can add this test to the DRLVM smoke tests. Is that OK?

Pavel Afremov.

Re: Stack Overflow Error support issues

Posted by Weldon Washburn <we...@gmail.com>.

On 7/24/06, Mikhail Fursov <mi...@gmail.com> wrote:
> No, I think that VM can do this check but use lower border: e.g. 1/100 of
> initial.
> JIT must do this check more accurate: use knowledge of algorithms it uses.
>
> The requirement to avoid SOE during a compilation can affect any algorithm
> in JIT that uses recursion. Jitrino.OPT has a lot of such algorithms: node,
> insts, opnd based . So I'm not sure that JIT can construct a heuristic or a
> profile to refuse to compile a method in the beginning of the compilation.
> The another option is to check available stack size before any recursion
> based algorithm and limit the algorithm up to N steps in recursion (N is
> recomputed in runtime) . If N steps is not enough algorithm will fail and
> JIT will not perform the optimization or compilation at all.
> Quite a lot of changes in JIT though. Any other ideas?
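
The "limit the algorithm to N steps of recursion" option quoted above can be
sketched as follows. This is only an illustration in Java, not Jitrino code:
the Node type stands in for a JIT IR node, and all names are hypothetical. In
the proposal N would be recomputed at run time from the available stack; here
it is simply passed in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical sketch: a recursive pass that gives up instead of overflowing.
final class BoundedRecursion {
    /** Minimal tree node standing in for a compiler IR node. */
    static final class Node {
        final List<Node> children = new ArrayList<>();
    }

    /** Signals that the depth budget ran out; carries no stack trace. */
    private static final class BudgetExceeded extends RuntimeException {
        BudgetExceeded() {
            super("recursion budget exceeded", null, false, false);
        }
    }

    private static int count(Node n, int remaining) {
        if (remaining == 0) {
            throw new BudgetExceeded();
        }
        int total = 1;
        for (Node c : n.children) {
            total += count(c, remaining - 1);
        }
        return total;
    }

    /** Counts nodes, or returns empty if the tree is deeper than maxDepth levels. */
    static OptionalInt countNodes(Node root, int maxDepth) {
        try {
            return OptionalInt.of(count(root, maxDepth));
        } catch (BudgetExceeded e) {
            return OptionalInt.empty();   // caller skips this optimization
        }
    }

    /** Builds a degenerate tree: a single chain with the given number of nodes. */
    static Node chain(int length) {
        Node root = new Node();
        Node cur = root;
        for (int i = 1; i < length; i++) {
            Node next = new Node();
            cur.children.add(next);
            cur = next;
        }
        return root;
    }
}
```

An empty result corresponds to the case where the algorithm fails and the
optimization is not performed.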

I would be hesitant to make a bunch of changes to the JIT.  1) It
might cause stability problems.  2) It still does not fix the root
problem.  Specifically, it is quite easy in C code to cause gobs of the
stack to be grabbed.  You can even grab so much stack that you leap
over the guard page and then mysteriously crash.  3) What is wrong with
setting the guard at 256KB or larger for now?  Since we are not
running lots of threads at this point, we can afford to make each Java
stack even 4MB big with 1MB guard pages.  This will allow us to
quickly rule out stack overflow as a cause of a JVM crash.  Perhaps the
max stack and guard page size can be adjustable at the command line.

Also, I looked at the source code contained in HARMONY-945.  I did not
see a regression test or unit test.  Would it be possible to add one?

>
>
>
>
> On 7/24/06, Pavel Afremov <pa...@gmail.com> wrote:
> >
> > Hi
> >
> > On 7/22/06, Mikhail Fursov <mi...@gmail.com> wrote:
> >
> > > I think this must be a JIT heuristics because even a small method can
> > lead
> > > to inlining of whole classlib API :)
> >
> >
> > Are You think this check should be removed from VM and puted into JIT
> > only?
> >
> > BR
> > Pavel Afremov.
> >
> >
>
>
> --
> Mikhail Fursov
>
>

-- 
Weldon Washburn
Intel Middleware Products Division


Re: Stack Overflow Error support issues

Posted by Pavel Afremov <pa...@gmail.com>.

On 7/24/06, Mikhail Fursov <mi...@gmail.com> wrote:
>
> No, I think that VM can do this check but use lower border: e.g. 1/100 of
> initial.
> JIT must do this check more accurate: use knowledge of algorithms it uses.
> ...



I think we can review this issue when the check appears in the JIT.



Pavel Afremov.

Re: Stack Overflow Error support issues

Posted by Gregory Shimansky <gs...@gmail.com>.

On Wednesday 26 July 2006 22:41 Weldon Washburn wrote:
> On 7/26/06, Gregory Shimansky <gs...@gmail.com> wrote:
> > On Wednesday 26 July 2006 20:05 Rana Dasgupta wrote:
> > > In general, I am not too keen to implement a feature to mitigate a bug.
> > > I also think that we need some real usage based test cases for SOE
> > > failure ( not our dev tests for unbounded recursion which force it to
> > > happen ) to understand how serious the problem is in usage scenarios.
> > > Most server environments eg., web servers, recycle host processes, so I
> > > have some doubts on how often this kind of resource scarcity problem
> > > really occurs. Thread stacks are also not shared resources, even on
> > > clients. Anyway, we may want to wait till Harmony identifies some
> > > signature apps or benchmarks and we see failures due to SOE. In the
> > > meantime, my suggestion/vote would be to proactively check for
> > > exception state in unwindable code sections in the jit, or when
> > > returning from suspension state in the VM. That would be my 2 cents.
> >
> > To make SOE more likely to happen in real applications it is easy to
> > use "ulimit -s" command to limit the stack size to some small value. The
> > default on Linux is 8Mb which is quite a lot.
>
> All good points.  Forcing the JVM into an SOE condition, then
> analyzing what happens is a valuable tool.  However, it is more
> important to know when/where the JVM runs into SOE problems with real
> workloads.  And what we need to do to address this particular
> situation.  While it is conceivable that real workloads could cause
> substantial redesign to make SOE work as expected, we need to be
> convinced this is indeed the situation.  In other words, premature
> overengineering is the little brother of premature optimization.

Yes, I agree completely. I just offered a way to analyze real-world
scenarios under stressed stack conditions, because with the default 8 MB stack
I don't remember any real-world Java or C application hitting the stack limit
(except maybe some of those that I worked on :) ).

Making the stack smaller may allow us to compare the stability of Harmony VMs
and the RI on real applications, and once some investigation is done we can
decide what actually needs improvement.

-- 
Gregory Shimansky, Intel Middleware Products Division


Re: Stack Overflow Error support issues

Posted by Weldon Washburn <we...@gmail.com>.

On 7/26/06, Gregory Shimansky <gs...@gmail.com> wrote:
> On Wednesday 26 July 2006 20:05 Rana Dasgupta wrote:
> > In general, I am not too keen to implement a feature to mitigate a bug. I
> > also think that we need some real usage based test cases for SOE failure (
> > not our dev tests for unbounded recursion which force it to happen )
> > to understand how serious the problem is in usage scenarios. Most server
> > environments eg., web servers, recycle host processes, so I have some
> > doubts on how often this kind of resource scarcity problem really occurs.
> > Thread stacks are also not shared resources, even on clients. Anyway, we
> > may want to wait till Harmony identifies some signature apps or benchmarks
> > and we see failures due to SOE. In the meantime, my suggestion/vote would
> > be to proactively check for exception state in unwindable code sections in
> > the jit, or when returning from suspension state in the VM. That would be
> > my 2 cents.
>
> To make SOE more likely to happen in real applications it is easy to
> use "ulimit -s" command to limit the stack size to some small value. The
> default on Linux is 8Mb which is quite a lot.

All good points.  Forcing the JVM into an SOE condition, then
analyzing what happens is a valuable tool.  However, it is more
important to know when/where the JVM runs into SOE problems with real
workloads.  And what we need to do to address this particular
situation.  While it is conceivable that real workloads could cause
substantial redesign to make SOE work as expected, we need to be
convinced this is indeed the situation.  In other words, premature
overengineering is the little brother of premature optimization.

>
> Both drlvm and Sun VM honor this stack limit and it is possible to make even
> the very simple applications to fail on stack shortage or even crash the VM
> (Sun crashes too) when stack is very small like 32Kb.
>
> Also playing with ulimit -s it is possible to try to catch situations when SOE
> happens in VM code like in JIT or class loader.
>
> --
> Gregory Shimansky, Intel Middleware Products Division
>
>
>


-- 
Weldon Washburn
Intel Middleware Products Division


Re: Stack Overflow Error support issues

Posted by Gregory Shimansky <gs...@gmail.com>.

On Wednesday 26 July 2006 20:05 Rana Dasgupta wrote:
> In general, I am not too keen to implement a feature to mitigate a bug. I
> also think that we need some real usage based test cases for SOE failure (
> not our dev tests for unbounded recursion which force it to happen )
> to understand how serious the problem is in usage scenarios. Most server
> environments eg., web servers, recycle host processes, so I have some
> doubts on how often this kind of resource scarcity problem really occurs.
> Thread stacks are also not shared resources, even on clients. Anyway, we
> may want to wait till Harmony identifies some signature apps or benchmarks
> and we see failures due to SOE. In the meantime, my suggestion/vote would
> be to proactively check for exception state in unwindable code sections in
> the jit, or when returning from suspension state in the VM. That would be
> my 2 cents.

To make SOE more likely to happen in real applications, it is easy to use the
"ulimit -s" command to limit the stack size to some small value. The default
on Linux is 8 MB, which is quite a lot.

Both DRLVM and the Sun VM honor this stack limit, and it is possible to make
even very simple applications fail on stack shortage or even crash the VM
(Sun crashes too) when the stack is very small, like 32 KB.

Also, by playing with ulimit -s it is possible to try to catch situations
where SOE happens in VM code such as the JIT or the class loader.
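
"ulimit -s" works at the process level. A rough in-process analogue, sketched
here in Java, is to run the workload on a thread that requests a small stack
through the four-argument java.lang.Thread constructor. The Thread
documentation describes that size as a suggestion that some platforms may
ignore, so the sketch assumes a VM that honors it. The class and method names
are hypothetical.

```java
// Hypothetical probe: measure how deep a recursion gets on threads that
// request different stack sizes.
final class SmallStackProbe {
    private static int recurse(int depth, int[] deepest) {
        deepest[0] = depth;                       // remember the last depth reached
        return recurse(depth + 1, deepest) + 1;
    }

    /** Returns the recursion depth reached on a thread requesting stackBytes of stack. */
    static int depthWithStack(long stackBytes) {
        int[] deepest = new int[1];
        Thread t = new Thread(null, () -> {
            try {
                recurse(1, deepest);
            } catch (StackOverflowError expected) {
                // deepest[0] already holds the result
            }
        }, "small-stack-probe", stackBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return deepest[0];
    }

    public static void main(String[] args) {
        System.out.println("256 KB requested: depth " + depthWithStack(256 * 1024));
        System.out.println("4 MB requested:   depth " + depthWithStack(4L * 1024 * 1024));
    }
}
```

If the VM honors the hint, the smaller request should overflow at a noticeably
smaller depth, which makes stack-related failures easier to provoke.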

-- 
Gregory Shimansky, Intel Middleware Products Division


Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

Pavel,
   Thanks for the clarifications. Please see below...

On 7/26/06, Pavel Afremov <pa...@gmail.com> wrote:
>
>
>
> >The main issue  is that new object can't be created in suspend disabled
> >mode. DRL VM is in this state during "GC unsafe" operation, when GC have
> not
> >been started. New SOE can't be created in this mode. But all information
> >about exception can be stored in thread local storage. When Vm return
> >control to managed code, function rethrow_current_thread_exception is
> >called, and this function can decide to create exception (it's possible
> >here) or throw it lazy.
>  This is the same problem as long unwindable stretches in the Jit.
> rethrow_curr_thread_exception_if_pending()  can be invoked when one comes
> out of the GC suspension state. Functionally, this is the same as the Jit
> proactively checking for an exception object on the thread periodically.
> lazy exceptions are not the solution here, it is an optimization to the
> raising mechanism. At best, it allows us not to precreate the exception
> object as we are doing now at start time.
>
> >Now "lazy exception" supported for MANAGED code only, Alexey propose
> extend
> >it for VM code. This technique should fix the most case when exception
> >should be raised in suspend disabled mode.

There is no reason why the lazy exception optimization cannot be implemented
for VM code (it may be easier to do this than in jitted code), but I don't
think that we should do it as a fix for SOE.

In general, I am not too keen to implement a feature to mitigate a bug. I
also think that we need some real usage based test cases for SOE failure (
not our dev tests for unbounded recursion which force it to happen )
to understand how serious the problem is in usage scenarios. Most server
environments eg., web servers, recycle host processes, so I have some doubts
on how often this kind of resource scarcity problem really occurs. Thread
stacks are also not shared resources, even on clients. Anyway, we may want
to wait till Harmony identifies some signature apps or benchmarks and we see
failures due to SOE. In the meantime, my suggestion/vote would be to
proactively check for exception state in unwindable code sections in the
jit, or when returning from suspension state in the VM. That would be my 2
cents.

Thanks,
Rana

Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

On 28 Jul 2006 15:35:42 +0700, Egor Pasko <eg...@gmail.com> wrote:

>
> >It reminds me of one usecase, when a 'maybe-commercial' server running
> >a JVM accepts deployments of JAR files and cannot predict what users
> >deploy. So, not-to-crash on SOE is a matter of security for the
> >server. And this is really important to eliminate this
> >vulnerability. Readjusting stack size does not work here.

Egor,
My impression is that most commercial server design is insulated from
vulnerability from user code and similar security breaches by allowing only
the process hosting the user application to go down, not the entire web
server. Since the user code causing unbounded recursion is an
application bug, it may not be worthwhile to try to prevent this failure.
The failure that would be good to guard against is the one that happens
though the user code is correct, ie., from a genuine lack of stack
resources. It would be nice to see a real use case for this. We don't have
to be perfect. No VM is.

For failures in native code, there are some choices:
1. Tuning by choosing stack and guard sizes from the command line. This
is limited on Windows due to interference from the built-in Windows stack
growth algorithm. But something like this could yield a good average size
setting.
2. Set the exception info on the thread when the exception cannot be raised,
and check for it and raise it later. This may be the simplest. Is there a
reason (other than choice of approach) why we can't do this?
3. Proactively check in the JIT to not allow SOE to happen, by querying the
VM and taking some recovery action, e.g. failing the compile, disabling
optimizations, compiling with a down-level compiler, etc. This is a lot of
work. And only the JIT (e.g., not the verifier) can do this. Without seeing
a real use case or reliability target, I don't know if it is necessary to do
so much. There is a priority question too.

Thanks,
Rana

--
Egor Pasko, Intel Managed Runtime Division


Re: Stack Overflow Error support issues

Posted by Weldon Washburn <we...@gmail.com>.

On 28 Jul 2006 15:35:42 +0700, Egor Pasko <eg...@gmail.com> wrote:
> On the 0x1B2 day of Apache Harmony Weldon Washburn wrote:
> > Stack Overflow Exception handling needs to be good enough for
> > commercial workloads.  This might be less than required for absolute
> > correctness.  The approach I would take at this time is to make stack
> > size and guard page size adjustable at the command line.  Then when we
> > suspect a stack overflow problem, bump up the stack and guard page
> > sizes substantially.  If the problem disappears, we might have an
> > interesting SOE case to study (and to guide what a reasonable JVM
> > should do.)  Otherwise, we are designing for the absolute worst case.
> > While this is ideal, it might not be required and could lead to all
> > sorts of other problems like unmaintainable code...
>
> It reminds me of one usecase, when a 'maybe-commercial' server running
> a JVM accepts deployments of JAR files and cannot predict what users
> deploy. So, not-to-crash on SOE is a matter of security for the
> server. And this is really important to eliminate this
> vulnerability. Readjusting stack size does not work here.
>
> What can we do to simplify things? I have no idea currently :(

All good points.  What all this leads to is a need for a top-level
development plan.  And what can help in the planning is a better
assessment of what needs to happen to fix SOE.  Four days ago Mikhail
Fursov mentioned that absolutely correct JIT SOE handling is "quite a
lot of changes".  Some questions we should be asking:  Will the
Harmony JIT developers be able to handle this work as well as fixing
bugs and performance problems from bringing up new workloads,
implementing new functionality to support Java 1.5 and
incremental/concurrent garbage collectors, etc.?  Also, which projects
can JIT SOE support be done in parallel with?

>
> --
> Egor Pasko, Intel Managed Runtime Division
>
>
>
>


-- 
Weldon Washburn
Intel Middleware Products Division


Re: Stack Overflow Error support issues

Posted by Egor Pasko <eg...@gmail.com>.

On the 0x1B2 day of Apache Harmony Weldon Washburn wrote:
> Stack Overflow Exception handling needs to be good enough for
> commercial workloads.  This might be less than required for absolute
> correctness.  The approach I would take at this time is to make stack
> size and guard page size adjustable at the command line.  Then when we
> suspect a stack overflow problem, bump up the stack and guard page
> sizes substantially.  If the problem disappears, we might have an
> interesting SOE case to study (and to guide what a reasonable JVM
> should do.)  Otherwise, we are designing for the absolute worst case.
> While this is ideal, it might not be required and could lead to all
> sorts of other problems like unmaintainable code...

It reminds me of one usecase, when a 'maybe-commercial' server running
a JVM accepts deployments of JAR files and cannot predict what users
deploy. So, not-to-crash on SOE is a matter of security for the
server. And this is really important to eliminate this
vulnerability. Readjusting stack size does not work here.

What can we do to simplify things? I have no idea currently :(

-- 
Egor Pasko, Intel Managed Runtime Division



Re: Stack Overflow Error support issues

Posted by Weldon Washburn <we...@gmail.com>.

On 7/26/06, Pavel Afremov <pa...@gmail.com> wrote:
> Hi
>
>
>
> The main issue  is that new object can't be created in suspend disabled
> mode. DRL VM is in this state during "GC unsafe" operation, when GC have not
> been started. New SOE can't be created in this mode. But all information
> about exception can be stored in thread local storage. When Vm return
> control to managed code, function rethrow_current_thread_exception is
> called, and this function can decide to create exception (it's possible
> here) or throw it lazy.
>
>
>
> Now "lazy exception" supported for MANAGED code only, Alexey propose extend
> it for VM code. This technique should fix the most case when exception
> should be raised in suspend disabled mode.

Aha!  I think I now understand what you are suggesting.  You propose
to borrow the technique of lazy exception throw.  A better term might
be "deferred exception object creation".  DEOC happens to use the same
mechanism as lazy exception throwing.  The idea is to defer the
creation of the exception object until the JVM is in a state where it
can actually create the desired object.  If this is what you intend, I
like it.  And I can see how it could address situations where the JVM
is in a state such that object creation could lead to a deadlocked GC.
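
The "deferred exception object creation" idea described above can be sketched
in Java as a small per-thread marker that is turned into a real exception
object only at a point where allocation is safe. This is an analogy, not
DRLVM code: DRLVM's function is the rethrow_current_thread_exception mentioned
in the quoted message, while every name in the sketch is hypothetical.

```java
// Hypothetical sketch of deferring exception object creation.
final class DeferredThrow {
    /** What is pending for the current thread; a marker, not the exception itself. */
    enum Pending { NONE, STACK_OVERFLOW }

    private static final ThreadLocal<Pending> PENDING =
            ThreadLocal.withInitial(() -> Pending.NONE);

    /** Called where creating the exception object is not allowed: only record the event. */
    static void recordStackOverflow() {
        PENDING.set(Pending.STACK_OVERFLOW);
    }

    /** Called where object creation is safe again: build and throw the real exception. */
    static void rethrowIfPending() {
        Pending p = PENDING.get();
        PENDING.set(Pending.NONE);      // the pending state is consumed exactly once
        if (p == Pending.STACK_OVERFLOW) {
            throw new StackOverflowError("deferred");
        }
    }
}
```

The caller that returns control to managed code would invoke
rethrowIfPending(); if nothing was recorded, the call is a no-op.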

Looking at the top level for a minute, guaranteeing that the JIT will
gracefully throw a Stack Overflow Exception instead of crashing might
require rewriting a bunch of the JIT with this new requirement in mind,
then testing the recursion code paths for compliance.  While this
could be done, my guess is that this might be very disruptive to the
existing Harmony DRLVM code base.  Before embarking on such a project,
I would like to see evidence that workloads Harmony cares about push
the JIT into situations that expose SOE problems.  I suspect there are
other higher-value things we should be doing, like getting more
commercial workloads up and running.

Stack Overflow Exception handling needs to be good enough for
commercial workloads.  This might be less than required for absolute
correctness.  The approach I would take at this time is to make stack
size and guard page size adjustable at the command line.  Then when we
suspect a stack overflow problem, bump up the stack and guard page
sizes substantially.  If the problem disappears, we might have an
interesting SOE case to study (and to guide what a reasonable JVM
should do.)  Otherwise, we are designing for the absolute worst case.
While this is ideal, it might not be required and could lead to all
sorts of other problems like unmaintainable code...
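
The "bump up the stack and see whether the problem disappears" procedure can
be sketched as a small classifier. It is written in Java for illustration
only: the names are hypothetical, and it relies on the stack-size argument of
java.lang.Thread, which the VM may treat as a mere suggestion.

```java
// Hypothetical diagnostic: does a failure go away when the stack is enlarged?
final class StackSensitivity {
    enum Verdict { PASSES, STACK_SENSITIVE, FAILS_REGARDLESS }

    /** Runs task on a thread requesting stackBytes; returns true if it overflowed. */
    static boolean overflows(Runnable task, long stackBytes) {
        boolean[] overflowed = new boolean[1];
        Thread t = new Thread(null, () -> {
            try {
                task.run();
            } catch (StackOverflowError e) {
                overflowed[0] = true;
            }
        }, "stack-sensitivity-probe", stackBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return overflowed[0];
    }

    /** Tries the normal stack first, then the enlarged one, and classifies the outcome. */
    static Verdict classify(Runnable task, long normalBytes, long enlargedBytes) {
        if (!overflows(task, normalBytes)) {
            return Verdict.PASSES;
        }
        return overflows(task, enlargedBytes)
                ? Verdict.FAILS_REGARDLESS
                : Verdict.STACK_SENSITIVE;
    }
}
```

A STACK_SENSITIVE verdict marks the kind of case worth studying further, while
FAILS_REGARDLESS points at unbounded recursion that no stack size will fix.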

>
>
>
> About "suspended" mode. It's misprint for suspend disabled mode. Sorry for
> confusion. The issue is in check available stack size before entering in
> suspend disabled mode, and raise new SOE if available stack size is not
> enough.
>
>
>
> The third point is not really fix. I think it's workaround for cases when VM
> can't create new exception object by different reasons. I suppose, VM can
> raise pre created SOE  in the case when stack overflow happen in suspend
> disabled mode and stack can't be unwound destructively.
>
>
>
> Pavel Afremov.
>
>

-- 
Weldon Washburn
Intel Middleware Products Division


Re: Stack Overflow Error support issues

Posted by Pavel Afremov <pa...@gmail.com>.

Hi



The main issue is that a new object can't be created in suspend-disabled
mode. DRL VM is in this state during "GC unsafe" operations, when GC has not
been started. A new SOE can't be created in this mode. But all the information
about the exception can be stored in thread-local storage. When the VM returns
control to managed code, the function rethrow_current_thread_exception is
called, and this function can decide to create the exception (it's possible
here) or throw it lazily.

Currently "lazy exceptions" are supported for MANAGED code only; Alexey
proposes extending them to VM code. This technique should fix most cases where
an exception has to be raised in suspend-disabled mode.

About "suspended" mode: it's a misprint for suspend-disabled mode. Sorry for
the confusion. The idea is to check the available stack size before entering
suspend-disabled mode, and to raise a new SOE if the available stack size is
not enough.

The third point is not really a fix. I think it's a workaround for cases where
the VM can't create a new exception object for various reasons. I suppose the
VM can raise a pre-created SOE in the case where stack overflow happens in
suspend-disabled mode and the stack can't be unwound destructively.
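
A minimal Java sketch of the pre-created exception workaround, with
hypothetical names; it is not how DRLVM implements it. The example also shows
one visible consequence of reusing a single instance: its stack trace is
captured when the object is constructed, not when it is thrown.

```java
// Hypothetical sketch: throw an error object that was allocated up front.
final class PreallocatedOverflow {
    // Created once, while allocation is still possible.
    static final StackOverflowError PREALLOCATED = new StackOverflowError("pre-created");

    /** Stands in for a code path that detects overflow but must not allocate. */
    static void failWithoutAllocating() {
        throw PREALLOCATED;
    }

    /** Returns true if any frame of the throwable's stack trace is in the named method. */
    static boolean traceMentions(Throwable t, String methodName) {
        for (StackTraceElement frame : t.getStackTrace()) {
            if (frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        try {
            failWithoutAllocating();
        } catch (StackOverflowError e) {
            System.out.println("same instance: " + (e == PREALLOCATED));
            System.out.println("trace mentions throw site: "
                    + traceMentions(e, "failWithoutAllocating"));
        }
    }
}
```

Because the trace does not point at the throw site, a pre-created object gives
less diagnostic context than a freshly created one, which fits its role as a
fallback rather than a full fix.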



Pavel Afremov.

Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

On 7/25/06, Weldon Washburn <we...@gmail.com> wrote:
>
> On 7/25/06, Pavel Afremov <pa...@gmail.com> wrote:
> > Hello
> >
> > I can't reproduce assertion described in HARMONY-971, but it possible
> > indeed. Alexey is right, lazy exception support for VM code fixes this and
> > other similar bugs.
>
> I think I understand the above.  Lazy exceptions only actually create
> the exception object if an  exception handler can be located that
> actually consumes this object.  If an exception handler is located and
> it does not use this object, the optimization is to never create the
> exception object.
>
> The net result is that an exception can indeed be properly thrown when
> there is zero memory to allocate any new object.  This, in turn,
> improves DRLVM's overall stability and ability to deal with Stack
> Overflow Exceptions correctly.  In other words, it reduces the
> symptoms which is good.  But it does not completely solve SOE
> problems.


Not sure I understand. Lazy exception handling is an efficiency
optimization, not a correctness enhancement. As Weldon says, there *could* be
some indirect benefits and relief of memory pressure in that the
exception is not created unless the exception object is live at entry to a
matching handler. But if it matches, it still needs to be created... and it
does not address the problem that certain prolonged sections of the
jit/verifier code are unwindable.

> > The other possible solution is check available stack size before
> > running the VM functions which should works in suspended mode and
> > which can be source of stack overflow.
>
> I don't know what "suspended mode" is.  Please explain.


  I didn't understand this either!

> > The third solution is using pre created exception like for
> > OutOfMemoryException.

Do you mean all exception objects or just SOE? This has implications on
Java semantics. The exception objects need to be created in the right
context with all associated side-effects.

>
> --
> Weldon Washburn
> Intel Middleware Products Division
>
> ---------------------------------------------------------------------
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
> For additional commands, e-mail: harmony-dev-help@incubator.apache.org
>
>

Re: Stack Overflow Error support issues

Posted by Weldon Washburn <we...@gmail.com>.

On 7/25/06, Pavel Afremov <pa...@gmail.com> wrote:
> Hello
>
>
>
> I can't reproduce assertion described in HARMONY-971, but it possible
> indeed. Alexey is right, lazy exception support for VM code fixes this and
> other similar bugs.

I think I understand the above.  Lazy exceptions only actually create
the exception object if an  exception handler can be located that
actually consumes this object.  If an exception handler is located and
it does not use this object, the optimization is to never create the
exception object.

The net result is that an exception can indeed be properly thrown when
there is zero memory to allocate any new object.  This, in turn,
improves DRLVM's overall stability and ability to deal with Stack
Overflow Exceptions correctly.  In other words, it reduces the
symptoms, which is good.  But it does not completely solve SOE
problems.  The important question, I think, is whether the above is
good enough for the 2006 goals.
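
A minimal C++ sketch of lazy exception creation as described above, not the
DRLVM implementation. Handler, objects_created and throw_lazily are
hypothetical names; incrementing a counter stands in for allocating the
exception object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a catch handler found during unwinding.
struct Handler {
    bool matches;                // handler catches this exception type
    bool uses_exception_object;  // handler body reads the caught object
};

static int objects_created = 0;  // counts simulated exception allocations

// Walks the handlers from innermost to outermost and returns the index of
// the first match, or -1 if none matches. The exception object is created
// only if the matching handler actually uses it.
int throw_lazily(const Handler* handlers, std::size_t count) {
    assert(handlers != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        if (!handlers[i].matches) continue;
        if (handlers[i].uses_exception_object) {
            ++objects_created;  // stand-in for allocating the Java object
        }
        return static_cast<int>(i);
    }
    return -1;
}
```

When the matching handler ignores the object, control reaches it without
any allocation; when it uses the object, the allocation still has to happen.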

>
>
>
> The other possible solution is check available stack size before running the
> VM functions which should works in suspended mode and which can be source of
> stack overflow.

I don't know what "suspended mode" is.  Please explain.

>
>
>
> The third solution is using pre created exception like for
> OutOfMemoryException.
>
>
>
> I think that the first solution is the best.

Lazy object creation also has an added benefit of reducing the amount
of time it takes to throw an exception (which is good).  A question
about the third solution.  Will it be a better solution when an
exception handler does indeed consume the exception object?  In other
words, does the third solution fix more SOE corner cases than the
first solution?

>
> So I'd like start experiments with lazy exceptions support for VM code.

The last time I looked, the lazy exception code was still in DRLVM
code base.  The experiments would be to find out if it causes more SOE
cases to be handled correctly?  Is this right?

>
>
>
> Thanks.
>
> Pavel Afremov.
>
>

-- 
Weldon Washburn
Intel Middleware Products Division


Re: Stack Overflow Error support issues

Posted by Pavel Afremov <pa...@gmail.com>.

Hello



I can't reproduce the assertion described in HARMONY-971, but it is indeed
possible. Alexey is right: lazy exception support for VM code fixes this and
other similar bugs.

The other possible solution is to check the available stack size before
running the VM functions which should work in suspended mode and which can
be a source of stack overflow.

The third solution is to use a pre-created exception, as for
OutOfMemoryError.

I think that the first solution is the best.

So I'd like to start experiments with lazy exception support for VM code.



Thanks.

Pavel Afremov.

Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

Since we are still undecided on whether to handle exceptions in VM code
lazily or not to allow SOE at all in VM code, I have fixed Alexey's bug
HARMONY-971 by precreating a default-constructed SOE exception object (used
in situations like this, i.e. suspend_disable) and some more local
rearranging. I did not want to leave the bug open since the RI passes the
scenario quite easily. But I suspect that we will need to do more work on
SOE in VM code.

Thanks,
Rana

> On 7/25/06, Alexey Varlamov <al...@gmail.com> wrote:
> >
> > [SNIP]
> >
> > > Shouldn't we care (even more) about kernel/classlib natives? I guess
> > > there are enough places lacking guard checks like " if
> > > (ExceptionOccured()) return 0;"
> > >
> > > BTW, the SOE machinery seems to be incomplete for this case, see
> > > HARMONY-971 issue.
> >
> > I think some issues could disappear if lazy exception creation is used
> > for SOE. Besides, it would be more reliable - in case we are really
> > tight on stack...
> > Pavel, is it possible without massive refactoring in exceptions
> > submodule ?
> >
> > [SNIP]
> >
> > --
> > Alexey
> >
> >

Re: Stack Overflow Error support issues

Posted by Alexey Varlamov <al...@gmail.com>.

[SNIP]

> Shouldn't we care (even more) about kernel/classlib natives? I guess
> there are enough places lacking guard checks like " if
> (ExceptionOccured()) return 0;"
>
> BTW, the SOE machinery seems to be incomplete for this case, see
> HARMONY-971 issue.

I think some issues could disappear if lazy exception creation is used
for SOE. Besides, it would be more reliable - in case we are really
tight on stack...
Pavel, is it possible without massive refactoring in exceptions submodule ?

[SNIP]

--
Alexey


Re: Stack Overflow Error support issues

Posted by Alexey Varlamov <al...@gmail.com>.

Guys,

I'm just curious - does anybody know how likely we are to hit a
stack overflow during compilation vs. managed code execution vs.
native code execution?
Shouldn't we care (even more) about kernel/classlib natives? I guess
there are enough places lacking guard checks like "if
(ExceptionOccurred()) return 0;"

BTW, the SOE machinery seems to be incomplete for this case, see
HARMONY-971 issue.

--
Alexey Varlamov

2006/7/25, Rana Dasgupta <rd...@gmail.com>:
> As I understand it, the value added by a broad check like this before the
> compile starts may not be very high. At best, it can try to avoid SOE in
> native code with a clean failure when it is certain that the stack state
> will not permit completion of the compile. So something like 1/100 of the
> stack as Mikhail mentions sounds reasonable, and the VM could do this. This
> is something like a tuning activity and some experiments may also help.
>
> It is also true that to offer real guarantees that overflow will not occur,
> the JIT will need to consider this bound before making optimization
> decisions or to limit recursion depth in algo selection. But this is a lot
> to do proactively in the Jit, just to try to avoid SOE. My suggestion is
> that we can consider doing these defensive checks in the Jit only if we want
> to offer and honor some published reliability guarantees. This is a broader
> topic and would need to include not only SOE, but several other failures
> like ThreadAbort etc. that the user does not usually anticipate and are not
> user code failures. Basically these are guarantees that if the user follows
> some reasonable guidelines, certain portions of his code cannot fail. How
> those guarantees can be exposed etc. ( eg., a method attributed with an
> attribute "Reliable" that the Jit and VM support, or some command line flags
> etc. ) is  a matter of design. We should defer this work unless SOE becomes
> an issue with our apps of choice or this type of "reliable managed code" is
> a design requirement in Harmony.
>
> For now, it may be better to periodically check for the exception state on
> the thread in line with Pavel's original design.
>
> Thanks,
> Rana
>
>
>
>
> On 7/24/06, Mikhail Fursov <mi...@gmail.com> wrote:
> >
> > No, I think that VM can do this check but use lower border: e.g. 1/100 of
> > initial.
> > JIT must do this check more accurate: use knowledge of algorithms it uses.
> >
> > The requirement to avoid SOE during a compilation can affect any algorithm
> > in JIT that uses recursion. Jitrino.OPT has a lot of such algorithms:
> > node,
> > insts, opnd based . So I'm not sure that JIT can construct a heuristic or
> > a
> > profile to refuse to compile a method in the beginning of the compilation.
> > The another option is to check available stack size before any recursion
> > based algorithm and limit the algorithm up to N steps in recursion (N is
> > recomputed in runtime) . If N steps is not enough algorithm will fail and
> > JIT will not not perform the optimization or compilation at all.
> > Quite a lot of changes in JIT though. Any other ideas?
> >
> >
> >
> >
> > On 7/24/06, Pavel Afremov <pa...@gmail.com> wrote:
> > >
> > > Hi
> > >
> > > On 7/22/06, Mikhail Fursov <mi...@gmail.com> wrote:
> > >
> > > > I think this must be a JIT heuristics because even a small method can
> > > lead
> > > > to inlining of whole classlib API :)
> > >
> > >
> > > Are You think this check should be removed from VM and puted into JIT
> > > only?
> > >
> > > BR
> > > Pavel Afremov.
> > >
> > >
> >
> >
> > --
> > Mikhail Fursov
> >
> >
>
>


Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

As I understand it, the value added by a broad check like this before the
compile starts may not be very high. At best, it can try to avoid SOE in
native code with a clean failure when it is certain that the stack state
will not permit completion of the compile. So something like 1/100 of the
stack as Mikhail mentions sounds reasonable, and the VM could do this. This
is something like a tuning activity and some experiments may also help.
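
To make the numbers concrete, here is a small C++ sketch comparing the fixed
256 K threshold mentioned earlier in the thread with the 1/100 fraction
suggested by Mikhail. The function names are hypothetical and the divisor is
only the value floated in this discussion, to be tuned by experiment.

```cpp
#include <cassert>
#include <cstddef>

// Fixed threshold discussed earlier in the thread: 256 K of free stack.
const std::size_t kFixedThreshold = 256 * 1024;

// Alternative: require only a fraction of the initial stack size.
std::size_t fractional_threshold(std::size_t initial_stack_bytes,
                                 std::size_t divisor) {
    assert(divisor > 0);
    return initial_stack_bytes / divisor;
}

// Hypothetical pre-compile check: is enough stack left to start a compile?
bool can_start_compile(std::size_t available_bytes, std::size_t threshold) {
    return available_bytes >= threshold;
}
```

With a 1 MB stack, the fixed threshold is a quarter of the whole stack,
while 1/100 comes to roughly 10 KB, so a thread with 200 K of stack left
would be refused a compile under the first rule and allowed under the second.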

It is also true that to offer real guarantees that overflow will not occur,
the JIT will need to consider this bound before making optimization
decisions or to limit recursion depth in algorithm selection. But this is a
lot to do proactively in the JIT, just to try to avoid SOE. My suggestion is
that we can consider doing these defensive checks in the JIT only if we want
to offer and honor some published reliability guarantees. This is a broader
topic and would need to include not only SOE, but several other failures
like ThreadAbort etc. that the user does not usually anticipate and that are
not user code failures. Basically these are guarantees that if the user
follows some reasonable guidelines, certain portions of his code cannot
fail. How those guarantees can be exposed (e.g., a method marked with an
attribute "Reliable" that the JIT and VM support, or some command line
flags) is a matter of design. We should defer this work unless SOE becomes
an issue with our apps of choice or this type of "reliable managed code" is
a design requirement in Harmony.

For now, it may be better to periodically check for the exception state on
the thread in line with Pavel's original design.

Thanks,
Rana

On 7/24/06, Mikhail Fursov <mi...@gmail.com> wrote:
>
> No, I think that VM can do this check but use lower border: e.g. 1/100 of
> initial.
> JIT must do this check more accurate: use knowledge of algorithms it uses.
>
> The requirement to avoid SOE during a compilation can affect any algorithm
> in JIT that uses recursion. Jitrino.OPT has a lot of such algorithms:
> node,
> insts, opnd based . So I'm not sure that JIT can construct a heuristic or
> a
> profile to refuse to compile a method in the beginning of the compilation.
> The another option is to check available stack size before any recursion
> based algorithm and limit the algorithm up to N steps in recursion (N is
> recomputed in runtime) . If N steps is not enough algorithm will fail and
> JIT will not not perform the optimization or compilation at all.
> Quite a lot of changes in JIT though. Any other ideas?
>
>
>
>
> On 7/24/06, Pavel Afremov <pa...@gmail.com> wrote:
> >
> > Hi
> >
> > On 7/22/06, Mikhail Fursov <mi...@gmail.com> wrote:
> >
> > > I think this must be a JIT heuristics because even a small method can
> > lead
> > > to inlining of whole classlib API :)
> >
> >
> > Are You think this check should be removed from VM and puted into JIT
> > only?
> >
> > BR
> > Pavel Afremov.
> >
> >
>
>
> --
> Mikhail Fursov
>
>

Re: Stack Overflow Error support issues

Posted by Mikhail Fursov <mi...@gmail.com>.

No, I think that the VM can do this check but use a lower threshold: e.g.
1/100 of the initial stack size.
The JIT must do this check more accurately, using knowledge of the
algorithms it uses.

The requirement to avoid SOE during a compilation can affect any algorithm
in the JIT that uses recursion. Jitrino.OPT has a lot of such algorithms:
node, inst, and opnd based. So I'm not sure that the JIT can construct a
heuristic or a profile to refuse to compile a method at the beginning of the
compilation.
Another option is to check the available stack size before any
recursion-based algorithm and limit the algorithm to N steps of recursion
(N is recomputed at runtime). If N steps are not enough, the algorithm will
fail and the JIT will not perform the optimization or compilation at all.
Quite a lot of changes in the JIT though. Any other ideas?
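
A minimal C++ sketch of the "limit recursion to N steps" option, assuming a
known estimate of the stack used per recursive call. Node,
max_recursion_steps and count_nodes are hypothetical; Jitrino's real data
structures are not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical IR node with two children.
struct Node {
    Node* left;
    Node* right;
};

// N: how many recursion steps fit in the stack that is left, given an
// estimated frame size. Recomputed at run time before the algorithm starts.
std::size_t max_recursion_steps(std::size_t available_bytes,
                                std::size_t frame_bytes) {
    assert(frame_bytes > 0);
    return available_bytes / frame_bytes;
}

// Counts nodes recursively. Returns false if the tree is deeper than the
// budget, so the caller can skip the optimization instead of overflowing.
bool count_nodes(const Node* node, std::size_t steps_left,
                 std::size_t& count) {
    if (node == nullptr) return true;
    if (steps_left == 0) return false;  // budget exhausted: give up cleanly
    ++count;
    return count_nodes(node->left, steps_left - 1, count) &&
           count_nodes(node->right, steps_left - 1, count);
}
```

Every recursive algorithm would need the same extra parameter and failure
path, which is why this amounts to many changes across the JIT.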

On 7/24/06, Pavel Afremov <pa...@gmail.com> wrote:
>
> Hi
>
> On 7/22/06, Mikhail Fursov <mi...@gmail.com> wrote:
>
> > I think this must be a JIT heuristics because even a small method can
> lead
> > to inlining of whole classlib API :)
>
>
> Are You think this check should be removed from VM and puted into JIT
> only?
>
> BR
> Pavel Afremov.
>
>

-- 
Mikhail Fursov

Re: Stack Overflow Error support issues

Posted by Pavel Afremov <pa...@gmail.com>.

Hi

On 7/22/06, Mikhail Fursov <mi...@gmail.com> wrote:

> I think this must be a JIT heuristics because even a small method can lead
> to inlining of whole classlib API :)


Do you think this check should be removed from the VM and put into the JIT
only?

BR
Pavel Afremov.

Re: Stack Overflow Error support issues

Posted by Geir Magnusson Jr <ge...@pobox.com>.


Mikhail Fursov wrote:
> On 7/22/06, Rana Dasgupta <rd...@gmail.com> wrote:
>>
>>    - But I also saw that you fail the JIT if you don't have 256 K of free
>>    stack space. The default Windows stack size is only 1 MB. Do we
>>    need to fail a compile of 10 lines of bytecode if we don't have 1/4
>>    of the stack available? Maybe this can be less strict, or some
>>    heuristic based on method size? What do you think  about this?
> 
> 
> I think this must be a JIT heuristics because even a small method can lead
> to inlining of whole classlib API :)

Think of the performance, though!

geir



Re: Stack Overflow Error support issues

Posted by Mikhail Fursov <mi...@gmail.com>.

On 7/22/06, Rana Dasgupta <rd...@gmail.com> wrote:
>
>    - But I also saw that you fail the JIT if you don't have 256 K of free
>    stack space. The default Windows stack size is only 1 MB. Do we need to
>    fail a compile of 10 lines of bytecode if we don't have 1/4 of the stack
>    available? Maybe this can be less strict, or some heuristic based on
>    method size? What do you think  about this?


I think this must be a JIT heuristics because even a small method can lead
to inlining of whole classlib API :)


-- 
Mikhail Fursov

Re: Stack Overflow Error support issues

Posted by Rana Dasgupta <rd...@gmail.com>.

Pavel,
   I tried the attached overflow test, and then applied the patch and
retried it. The patch looks good. A couple of comments:

   - I could not get the  unwind failure that you have mentioned (with
   the overflow happening in the first two lines) though I played around with
   the test, but that may depend on what the specific setup of the stack is at
   that point?
   - I noticed that on both Linux and Windows you preload the SOE class
   and precompile it. This may be OK.
   - But I also saw that you fail the JIT if you don't have 256 K of free
   stack space. The default Windows stack size is only 1 MB. Do we need to fail
   a compile of 10 lines of bytecode if we don't have 1/4 of the stack
   available? Maybe this can be less strict, or some heuristic based on method
   size? What do you think  about this?


   It would be nice if this patch could get committed, it is a good solution
for exceptions in Java code. For native code frames, we can continue the
work needed on top of this fix.

Thanks,
Rana

On 7/21/06, Pavel Afremov <pa...@gmail.com> wrote:

> Because more elegant decision wasn't proposed during current discussion,
> I'd
> like to put the patch with results of my experiments into JIRA, as Stack
> Overflow Implementation.
>
> You can find it in
> *HARMONY-945*<https://issues.apache.org/jira/browse/HARMONY-945>.
> Welcome to try it.
>
> Pavel Afremov.
>
>