Posted to users@groovy.apache.org by David Clark <pl...@gmail.com> on 2016/03/14 22:31:26 UTC

Very Odd/Random JVM Slowdown With Indy

I've been chasing a slowdown in our application for a couple of months now.
I have what I believe is a solution (no slowdown for 4 days now). But I'm
having difficulty understanding why the solution works.

Symptoms:

At random intervals, our web servers will go from serving
responses in the 300 ms range to taking 30 seconds or more. Sometimes the
servers will recover, sometimes they require a restart of the webserver
(spring boot/tomcat). When the applications slow down we always see the
tomcat thread pool hit the maximum size. Every single thread in the thread
pool is in the RUNNABLE state but appears to be making no progress.
Successive thread dumps show that the stacks are changing, but VERY slowly.
The top of the stack is always this method:

at java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native
Method).

The other common condition is that whatever application code is on the
stack is always dynamically compiled. Code that is @CompileStatic is NEVER
on the stack when we see these slowdowns.

The thread dumps showed that the application code is never waiting on
locks, socket reads, db connections, etc.
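[Editorial note: the frame at the top of those stacks is the native step behind MutableCallSite.setTarget, which indy call sites go through whenever they are re-linked. A minimal, hypothetical Java sketch (not the application's code) of a call site whose target keeps flipping, forcing that path on every swap:]

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class CallSiteChurn {
    static int twice(int x)  { return 2 * x; }
    static int thrice(int x) { return 3 * x; }

    // Repeatedly re-link a call site between two targets. Each setTarget call
    // bottoms out in MethodHandleNatives.setCallSiteTargetNormal and
    // invalidates any JIT-compiled code that inlined the previous target.
    static long churn(int iterations) {
        try {
            MethodType type = MethodType.methodType(int.class, int.class);
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle twiceH  = lookup.findStatic(CallSiteChurn.class, "twice", type);
            MethodHandle thriceH = lookup.findStatic(CallSiteChurn.class, "thrice", type);
            MutableCallSite site = new MutableCallSite(twiceH);
            MethodHandle invoker = site.dynamicInvoker();

            long sum = 0;
            for (int i = 0; i < iterations; i++) {
                site.setTarget((i & 1) == 0 ? twiceH : thriceH); // constant re-linking
                sum += (int) invoker.invokeExact(i);
            }
            return sum;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(churn(100_000));
    }
}
```

[Many threads doing this against shared call sites serialize on the JVM's internal locking, which would match threads stuck RUNNABLE in setCallSiteTargetNormal.]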

Solution:

The solution to the problem was to disable Indy compilation and return to
non-Indy compilation. However, I don't think Indy is the problem here. I
noticed that our Spring Boot executable jar contained BOTH
groovy-all-2.4.5.jar AND groovy-all-indy-2.4.5.jar. Someone forgot to
exclude the non-indy jar.
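[Editorial note: in Groovy 2.4 the indy variant is the `indy` classifier of the same `groovy-all` module. A hypothetical sketch, assuming a Gradle build, of declaring only the indy classifier and failing the build if both variants ever end up on the runtime classpath:]

```groovy
// build.gradle (sketch): depend only on the indy classifier of groovy-all.
dependencies {
    compile 'org.codehaus.groovy:groovy-all:2.4.5:indy'
}

// Fail fast if both the plain and the indy jar resolve onto the classpath.
task checkGroovyJars {
    doLast {
        def groovyJars = configurations.runtime.files.findAll { it.name.startsWith('groovy-all') }
        def plain = groovyJars.any { it.name ==~ /groovy-all-[\d.]+\.jar/ }
        def indy  = groovyJars.any { it.name.contains('indy') }
        if (plain && indy) {
            throw new GradleException('Both indy and non-indy Groovy jars on the classpath: ' + groovyJars)
        }
    }
}
```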

My theory:

Having both indy and non-indy jars on the classpath is confusing the JIT
compiler. Code will be continuously JIT-ed as different methods fight over
which class files to JIT: those loaded from the groovy-all jar or those
loaded from the groovy-all-indy jar. If this is true then the compiler
threads will be continuously running and taking native locks that are
invisible to tools like VisualVM. The slowdowns would appear random
because only certain combinations of code paths would trigger the
recompilation. It would also make application code very slow, as the JIT
compiler recompiles the same code over and over again and application
threads spend most of their time waiting for JIT operations to complete
while invalidated code is continuously removed and replaced.

For now I will be leaving Indy disabled until we can do more accurate load
testing in non-production environments.

My Question:

Is this theory possible? Am I going in a direction that is possible or
likely?

Re: Very Odd/Random JVM Slowdown With Indy

Posted by Marc Paquette <ma...@mac.com>.
Also, maybe you have checked this already, but if your code cache (-XX:ReservedCodeCacheSize) is too small, severe slowdowns will occur once the cache is full or nearly full. I would suggest monitoring it with jconsole or VisualVM.
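[Editorial note: the flag names below are standard HotSpot options; the size and the `app.jar` name are illustrative only. On JDK 8 with tiered compilation the default reserved code cache is 240 MB.]

```shell
# Raise the code cache ceiling and have the VM print code cache usage at exit.
java -XX:ReservedCodeCacheSize=512m \
     -XX:+PrintCodeCache \
     -jar app.jar
```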

Marc Paquette

> Le 15 mars 2016 à 08:40, Winnebeck, Jason <Ja...@windstream.com> a écrit :
> 
> Method handles are a relatively new component to the JVM. One thing to consider is whether or not you are running the latest JVM version.
>  
> If I ignore for a moment your finding of that method on the top of the stack, when I’ve seen the JVM get very slow without any apparent deadlocks I’ve also seen GC as an issue. I would just double check to make sure there’s not a leak (in heap or permgen/metaspace) going on, it might be possible that the MethodHandle code is more susceptible to GC pauses and appears more often in your stack dumps. Same thing with your underlying OS as well, I would just double check that you aren’t in a swap condition or (on Linux) in iowait states. Your note about removing indy fixing this seems to suggest that is not the case but it’s also a common condition I see with “JVM starts running order of magnitudes slower until you restart” issues. It may also be possible that Groovy indy or JVM invokedynamic has a leak that shows up as a memory scaling problem as well.
>  
> Jason
>  
> From: David Clark [mailto:plotinussmith@gmail.com] 
> Sent: Monday, March 14, 2016 5:31 PM
> To: users@groovy.apache.org
> Subject: Very Odd/Random JVM Slowdown With Indy
>  
> I've been chasing a slowdown in our application for a couple of months now. I have what I believe is a solution (no slowdown for 4 days now). But I'm having difficulty understanding why the solution works.
>  
> Symptoms:
>  
> At random intervals and a random times our web servers will go from serving responses in the 300 ms range to taking 30 seconds or more. Sometimes the servers will recover, sometimes they require a restart of the webserver (spring boot/tomcat). When the applications slow down we always see the tomcat thread pool hit the maximum size. Every single thread in the thread pool is in the RUNNABLE state but appears to be making no progress. Successive thread dumps show that the stacks are changing, but VERY slowly. The top of the stack is always this method:
>  
> at java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native Method).
>  
> The other common condition is that whatever application code is on the stack is always dynamically compiled. Code that is @CompileStatic is NEVER on the stack when we see these slowdowns.
>  
> The thread dumps showed that the application code is never waiting on locks, socket reads, db connections, etc.
>  
> Solution:
>  
> The solution to the problem was to disable Indy compilation and return to non-Indy compilation. However, I don't think Indy is the problem here. I noticed that our Spring Boot executable jar contained BOTH groovy-all-2.4.5.jar AND groovy-all-indy-2.4.5.jar. Someone forgot to exclude the non-indy jars.
>  
> My theory:
>  
> Having both indy and non-indy jars on the classpath is confusing the JIT compiler. Code will be continuously JIT-ed as different methods fight over which class files to JIT, those loaded from the groovy-all jar or those loaded from the groovy-all-indy jar. If this is true then the compiler threads will be continuously running and applying native locks which are invisible to tools like VisualVM. The result would be random slowdowns because only certain combinations of code paths would result in slowdowns. It would also cause application code to go very slowly as the JIT compiler continuously re-compiles code over and over again. Application code would be stuck mostly waiting for JIT operations to complete as invalidated code is continuously removed and replaced.
>  
> For now I will be leaving Indy disabled until we can do more accurate load testing in non production environments.
>  
> My Question:
>  
> Is this theory possible? Am I going in a direction that is possible or likely?
> This email message and any attachments are for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message and any attachments.


Re: Very Odd/Random JVM Slowdown With Indy

Posted by David Clark <pl...@gmail.com>.
On Tue, Mar 15, 2016 at 7:40 AM, Winnebeck, Jason <
Jason.Winnebeck@windstream.com> wrote:

> Method handles are a relatively new component to the JVM. One thing to
> consider is whether or not you are running the latest JVM version.
>
>
>
> If I ignore for a moment your finding of that method on the top of the
> stack, when I’ve seen the JVM get very slow without any apparent deadlocks
> I’ve also seen GC as an issue. I would just double check to make sure
> there’s not a leak (in heap or permgen/metaspace) going on, it might be
> possible that the MethodHandle code is more susceptible to GC pauses and
> appears more often in your stack dumps. Same thing with your underlying OS
> as well, I would just double check that you aren’t in a swap condition or
> (on Linux) in iowait states. Your note about removing indy fixing this
> seems to suggest that is not the case but it’s also a common condition I
> see with “JVM starts running order of magnitudes slower until you restart”
> issues. It may also be possible that Groovy indy or JVM invokedynamic has a
> leak that shows up as a memory scaling problem as well.
>
>
>
> Jason
>

We have had a problem previously with GC in this application. I was able to
isolate it to the Oracle driver and prevent the problem by forcing
statement caching. Now the GC is textbook "sawtooth" allocation and
deallocation. We go for hours with no major GC at this point. We also never
use the entire heap allocated to us, generally it is 500MB - 1GB under the
max heap size.

I have also checked metaspace and code cache (no permgen since this is Java
8). Both were well under the max limits. Is there some other memory
utilization I can check?
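[Editorial note: the same pool numbers jconsole/VisualVM display can be read programmatically. A minimal sketch using the standard java.lang.management API:]

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MemoryPools {
    // Print used/committed/max for every memory pool the JVM exposes.
    // On Java 8 this includes "Code Cache" and "Metaspace" alongside the heap pools.
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            System.out.printf("%-25s used=%,12d committed=%,12d max=%,12d%n",
                    pool.getName(), u.getUsed(), u.getCommitted(), u.getMax());
        }
    }
}
```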

As for the OS conditions, I was about to start monitoring OS-level events
when disabling indy fixed the issue. However, the fact that we have now run
for 5 days with no incidents, the indy change being the only change,
suggests the problem is not at the OS level.

RE: Very Odd/Random JVM Slowdown With Indy

Posted by "Winnebeck, Jason" <Ja...@windstream.com>.
Method handles are a relatively new component to the JVM. One thing to consider is whether or not you are running the latest JVM version.

If I ignore for a moment your finding of that method on the top of the stack: when I’ve seen the JVM get very slow without any apparent deadlocks, I’ve also seen GC as an issue. I would double check that there’s not a leak (in heap or permgen/metaspace) going on; it may be that the MethodHandle code is more susceptible to GC pauses and so appears more often in your stack dumps. Same with the underlying OS: I would double check that you aren’t in a swap condition or (on Linux) in iowait states. Your note that removing indy fixed this suggests that is not the case, but it’s a common condition I see with “JVM starts running orders of magnitude slower until you restart” issues. It may also be that Groovy indy or JVM invokedynamic has a leak that shows up as a memory scaling problem.

Jason

From: David Clark [mailto:plotinussmith@gmail.com]
Sent: Monday, March 14, 2016 5:31 PM
To: users@groovy.apache.org
Subject: Very Odd/Random JVM Slowdown With Indy

I've been chasing a slowdown in our application for a couple of months now. I have what I believe is a solution (no slowdown for 4 days now). But I'm having difficulty understanding why the solution works.

Symptoms:

At random intervals and a random times our web servers will go from serving responses in the 300 ms range to taking 30 seconds or more. Sometimes the servers will recover, sometimes they require a restart of the webserver (spring boot/tomcat). When the applications slow down we always see the tomcat thread pool hit the maximum size. Every single thread in the thread pool is in the RUNNABLE state but appears to be making no progress. Successive thread dumps show that the stacks are changing, but VERY slowly. The top of the stack is always this method:

at java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native Method).

The other common condition is that whatever application code is on the stack is always dynamically compiled. Code that is @CompileStatic is NEVER on the stack when we see these slowdowns.

The thread dumps showed that the application code is never waiting on locks, socket reads, db connections, etc.

Solution:

The solution to the problem was to disable Indy compilation and return to non-Indy compilation. However, I don't think Indy is the problem here. I noticed that our Spring Boot executable jar contained BOTH groovy-all-2.4.5.jar AND groovy-all-indy-2.4.5.jar. Someone forgot to exclude the non-indy jars.

My theory:

Having both indy and non-indy jars on the classpath is confusing the JIT compiler. Code will be continuously JIT-ed as different methods fight over which class files to JIT, those loaded from the groovy-all jar or those loaded from the groovy-all-indy jar. If this is true then the compiler threads will be continuously running and applying native locks which are invisible to tools like VisualVM. The result would be random slowdowns because only certain combinations of code paths would result in slowdowns. It would also cause application code to go very slowly as the JIT compiler continuously re-compiles code over and over again. Application code would be stuck mostly waiting for JIT operations to complete as invalidated code is continuously removed and replaced.

For now I will be leaving Indy disabled until we can do more accurate load testing in non production environments.

My Question:

Is this theory possible? Am I going in a direction that is possible or likely?


Re: Very Odd/Random JVM Slowdown With Indy

Posted by David Clark <pl...@gmail.com>.
I believe I have a simple application which shows the problem. You can 
find it here:

https://github.com/dwclark/deopt-storm

The README should cover how to use it. The original application involved 
a dynamic state machine. I have tried to preserve the basics of how the 
program would execute while removing the application specific details. 
Please let me know if there is something I can do to improve this or if 
you have any questions.

Thanks,

David

On 03/18/2016 07:26 AM, Jochen Theodorou wrote:
>
>
> On 16.03.2016 20:31, David Clark wrote:
>> If I can think of a way to reproduce the problem with a small script or
>> application I will definitely post it to the list. I really have no idea
>> what would cause this problem. If you have any suggestions about what
>> types of application behavior would tend to lead to this issue I will
>> give them a try.
>>
>> Thank for the link to the JDK bug, that does seem to be what we are 
>> seeing.
>
> If it is not that bug, then it is most probably... well it is the most 
> probable reason anyway... our non-existing ability to handle multiple 
> targets for the method call in the callsite. It is non-existing, 
> because we don't do it for the non-indy case, and while this can 
> increase Groovy's performance quite a bit, it is additional complexity 
> to deal with and so I was waiting for a real life use case.... And it 
> might be, that your case is exactly what I was looking for.
>
> bye Jochen


Re: Very Odd/Random JVM Slowdown With Indy

Posted by Jochen Theodorou <bl...@gmx.org>.

On 16.03.2016 20:31, David Clark wrote:
> If I can think of a way to reproduce the problem with a small script or
> application I will definitely post it to the list. I really have no idea
> what would cause this problem. If you have any suggestions about what
> types of application behavior would tend to lead to this issue I will
> give them a try.
>
> Thank for the link to the JDK bug, that does seem to be what we are seeing.

If it is not that bug, then it is most probably... well, it is the most 
probable reason anyway... our non-existent ability to handle multiple 
targets for the method call in the call site. It is non-existent because 
we don't do it for the non-indy case, and while this could increase 
Groovy's performance quite a bit, it is additional complexity to deal 
with, so I was waiting for a real-life use case... And it might be that 
your case is exactly what I was looking for.

bye Jochen

Re: Very Odd/Random JVM Slowdown With Indy

Posted by David Clark <pl...@gmail.com>.
If I can think of a way to reproduce the problem with a small script or
application I will definitely post it to the list. I really have no idea
what would cause this problem. If you have any suggestions about what types
of application behavior would tend to lead to this issue I will give them a
try.

Thanks for the link to the JDK bug, that does seem to be what we are seeing.

On Wed, Mar 16, 2016 at 7:51 AM, Jochen Theodorou <bl...@gmx.org> wrote:

> Ideal would be to have a small application showing the problem. With an
> actual test we could analyze the problem much better. It is possible the
> issue is related to https://bugs.openjdk.java.net/browse/JDK-8151981, but
> there are also possible bugs in our invokedynamic usage. What I could
> gather so far is that
>
> java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native
> Method)
>
> will lock the compiler and stop all threads. And even if jvisualvm shows
> no locking, there might be still many many threads trying to recompile a
> callsite at the same time and wait for a lock to do so.
>
> bye Jochen
>
> On 14.03.2016 22:31, David Clark wrote:
>
>> I've been chasing a slowdown in our application for a couple of months
>> now. I have what I believe is a solution (no slowdown for 4 days now).
>> But I'm having difficulty understanding why the solution works.
>>
>> Symptoms:
>>
>> At random intervals and a random times our web servers will go from
>> serving responses in the 300 ms range to taking 30 seconds or more.
>> Sometimes the servers will recover, sometimes they require a restart of
>> the webserver (spring boot/tomcat). When the applications slow down we
>> always see the tomcat thread pool hit the maximum size. Every single
>> thread in the thread pool is in the RUNNABLE state but appears to be
>> making no progress. Successive thread dumps show that the stacks are
>> changing, but VERY slowly. The top of the stack is always this method:
>>
>> at java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native
>> Method).
>>
>> The other common condition is that whatever application code is on the
>> stack is always dynamically compiled. Code that is @CompileStatic is
>> NEVER on the stack when we see these slowdowns.
>>
>> The thread dumps showed that the application code is never waiting on
>> locks, socket reads, db connections, etc.
>>
>> Solution:
>>
>> The solution to the problem was to disable Indy compilation and return
>> to non-Indy compilation. However, I don't think Indy is the problem
>> here. I noticed that our Spring Boot executable jar contained BOTH
>> groovy-all-2.4.5.jar AND groovy-all-indy-2.4.5.jar. Someone forgot to
>> exclude the non-indy jars.
>>
>> My theory:
>>
>> Having both indy and non-indy jars on the classpath is confusing the JIT
>> compiler. Code will be continuously JIT-ed as different methods fight
>> over which class files to JIT, those loaded from the groovy-all jar or
>> those loaded from the groovy-all-indy jar. If this is true then the
>> compiler threads will be continuously running and applying native locks
>> which are invisible to tools like VisualVM. The result would be random
>> slowdowns because only certain combinations of code paths would result
>> in slowdowns. It would also cause application code to go very slowly as
>> the JIT compiler continuously re-compiles code over and over again.
>> Application code would be stuck mostly waiting for JIT operations to
>> complete as invalidated code is continuously removed and replaced.
>>
>> For now I will be leaving Indy disabled until we can do more accurate
>> load testing in non production environments.
>>
>> My Question:
>>
>> Is this theory possible? Am I going in a direction that is possible or
>> likely?
>>
>

Re: Very Odd/Random JVM Slowdown With Indy

Posted by Jochen Theodorou <bl...@gmx.org>.
Ideal would be to have a small application showing the problem. With an 
actual test we could analyze the problem much better. It is possible the 
issue is related to https://bugs.openjdk.java.net/browse/JDK-8151981, 
but there are also possible bugs in our invokedynamic usage. What I 
could gather so far is that

java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native Method)

will lock the compiler and stop all threads. And even if jvisualvm shows 
no locking, there might be still many many threads trying to recompile a 
callsite at the same time and wait for a lock to do so.
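[Editorial note: this recompilation churn can be observed directly with standard HotSpot diagnostic flags; the output is voluminous, so this is for test runs only, and `app.jar` is a placeholder. "made not entrant" and "made zombie" lines mark compiled code invalidated by deoptimization, e.g. after a call site target change.]

```shell
java -XX:+PrintCompilation \
     -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining \
     -jar app.jar | grep -E "made not entrant|made zombie"
```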

bye Jochen

On 14.03.2016 22:31, David Clark wrote:
> I've been chasing a slowdown in our application for a couple of months
> now. I have what I believe is a solution (no slowdown for 4 days now).
> But I'm having difficulty understanding why the solution works.
>
> Symptoms:
>
> At random intervals and a random times our web servers will go from
> serving responses in the 300 ms range to taking 30 seconds or more.
> Sometimes the servers will recover, sometimes they require a restart of
> the webserver (spring boot/tomcat). When the applications slow down we
> always see the tomcat thread pool hit the maximum size. Every single
> thread in the thread pool is in the RUNNABLE state but appears to be
> making no progress. Successive thread dumps show that the stacks are
> changing, but VERY slowly. The top of the stack is always this method:
>
> at java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native
> Method).
>
> The other common condition is that whatever application code is on the
> stack is always dynamically compiled. Code that is @CompileStatic is
> NEVER on the stack when we see these slowdowns.
>
> The thread dumps showed that the application code is never waiting on
> locks, socket reads, db connections, etc.
>
> Solution:
>
> The solution to the problem was to disable Indy compilation and return
> to non-Indy compilation. However, I don't think Indy is the problem
> here. I noticed that our Spring Boot executable jar contained BOTH
> groovy-all-2.4.5.jar AND groovy-all-indy-2.4.5.jar. Someone forgot to
> exclude the non-indy jars.
>
> My theory:
>
> Having both indy and non-indy jars on the classpath is confusing the JIT
> compiler. Code will be continuously JIT-ed as different methods fight
> over which class files to JIT, those loaded from the groovy-all jar or
> those loaded from the groovy-all-indy jar. If this is true then the
> compiler threads will be continuously running and applying native locks
> which are invisible to tools like VisualVM. The result would be random
> slowdowns because only certain combinations of code paths would result
> in slowdowns. It would also cause application code to go very slowly as
> the JIT compiler continuously re-compiles code over and over again.
> Application code would be stuck mostly waiting for JIT operations to
> complete as invalidated code is continuously removed and replaced.
>
> For now I will be leaving Indy disabled until we can do more accurate
> load testing in non production environments.
>
> My Question:
>
> Is this theory possible? Am I going in a direction that is possible or
> likely?

Re: Very Odd/Random JVM Slowdown With Indy

Posted by Jochen Theodorou <bl...@gmx.org>.

On 15.03.2016 17:01, David Clark wrote:
>
>     if it is in native code, it might not be showing properly actually
>
>
> We don't have any JNI code in our code base. Is there something else
> native I could/should be looking at?

You are not using JNI, true, but the method you mentioned is implemented 
in native code of the JVM itself, and such internal activity tends not to 
be shown by tools that inspect the application.

bye Jochen

Re: Very Odd/Random JVM Slowdown With Indy

Posted by David Clark <pl...@gmail.com>.
>> if it is in native code, it might not be showing properly actually

We don't have any JNI code in our code base. Is there something else native
I could/should be looking at?


> hmm.... I find that unlikely. Either the classes in the jars are defined
> by different class loaders, then they are different classes, or they are
> loaded by the same class loader, in which case the jit still sees only one
> set of classes... which might be from different jars of course, but that
> should not matter... In other words, the origin should not matter
>

OK, I will keep the theory in mind, but won't pursue it further.


> You know, you can easily test your theory: remove the non-indy jar from
> the classpath and let the application run with indy. If the slowdowns
> then no longer appear, you are right.
>
>
I thought it would be easy to test as well. On my dev machine I have run
three separate combinations: 1) indy compilation + indy jar only, 2) indy
compilation + indy jar + non-indy jar, and 3) non-indy compilation +
non-indy jar. Option #2 is the configuration that was causing the problem
earlier, yet I was not able to replicate the problem. This is not an exact
comparison, however: different OS, slightly different JVM build numbers,
and the inherent difficulty of replicating actual load from real users.

Unfortunately there is no way I will be able to run the indy compiler +
indy jar in our production environments any time soon. Until I can prove
that configuration avoids the problem under substantial load in an
identical setup, indy will be kept out of production.


> I find it unlikely, but you never know unless you verify it ;)
>
> I wonder how we can get to the ground of this.
>

Thanks for the reply. If I think of something I will post back.

Re: Very Odd/Random JVM Slowdown With Indy

Posted by Jochen Theodorou <bl...@gmx.org>.

On 14.03.2016 22:31, David Clark wrote:
> I've been chasing a slowdown in our application for a couple of months
> now. I have what I believe is a solution (no slowdown for 4 days now).
> But I'm having difficulty understanding why the solution works.
>
> Symptoms:
>
> At random intervals and a random times our web servers will go from
> serving responses in the 300 ms range to taking 30 seconds or more.

30s is pretty bad indeed

> Sometimes the servers will recover, sometimes they require a restart of
> the webserver (spring boot/tomcat). When the applications slow down we
> always see the tomcat thread pool hit the maximum size. Every single
> thread in the thread pool is in the RUNNABLE state but appears to be
> making no progress. Successive thread dumps show that the stacks are
> changing, but VERY slowly. The top of the stack is always this method:
>
> at java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal(Native
> Method).

strange...

> The other common condition is that whatever application code is on the
> stack is always dynamically compiled. Code that is @CompileStatic is
> NEVER on the stack when we see these slowdowns.

statically compiled code does not use invokedynamic yet

> The thread dumps showed that the application code is never waiting on
> locks, socket reads, db connections, etc.

if it is in native code, it might not be showing properly actually.

> Solution:
>
> The solution to the problem was to disable Indy compilation and return
> to non-Indy compilation. However, I don't think Indy is the problem
> here. I noticed that our Spring Boot executable jar contained BOTH
> groovy-all-2.4.5.jar AND groovy-all-indy-2.4.5.jar. Someone forgot to
> exclude the non-indy jars.
>
> My theory:
>
> Having both indy and non-indy jars on the classpath is confusing the JIT
> compiler.

Hmm... I find that unlikely. Either the classes in the jars are defined 
by different class loaders, in which case they are different classes, or 
they are loaded by the same class loader, in which case the JIT still 
sees only one set of classes... which might come from different jars, of 
course, but that should not matter. In other words, the origin should 
not matter.

> Code will be continuously JIT-ed as different methods fight
> over which class files to JIT, those loaded from the groovy-all jar or
> those loaded from the groovy-all-indy jar. If this is true then the
> compiler threads will be continuously running and applying native locks
> which are invisible to tools like VisualVM. The result would be random
> slowdowns because only certain combinations of code paths would result
> in slowdowns. It would also cause application code to go very slowly as
> the JIT compiler continuously re-compiles code over and over again.
> Application code would be stuck mostly waiting for JIT operations to
> complete as invalidated code is continuously removed and replaced.
>
> For now I will be leaving Indy disabled until we can do more accurate
> load testing in non production environments.

You know, you can easily test your theory: remove the non-indy jar from 
the classpath and let the application run with indy. If the slowdowns 
then no longer appear, you are right.

> My Question:
>
> Is this theory possible? Am I going in a direction that is possible or
> likely?

I find it unlikely, but you never know unless you verify it ;)

I wonder how we can get to the ground of this.

bye Jochen