You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@tomcat.apache.org by Christopher Schultz <ch...@christopherschultz.net> on 2011/05/03 16:22:27 UTC

100% CPU Usage when stopping Tomcat after OOM condition

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

All,

Moments ago in our development environment, our webapp suffered an OOME
after many re-reployments (we know we have an undeploy-related leak).
When attempting to bounce Tomcat, the shutdown failed and I took a
thread dump which included the one non-trivial thread shown below.

I haven't looked at the code yet to see if there is some kind of loop
around this code, but top reported 100% CPU usage from this process so I
suspect something like that was happening.

We are using TC 6.0.32 on Debian Lenny and Sun/Oracle 32-bit
1.6.0_22-b04 Server VM.

This is less of a "help" request as it is an observation of an
unfortunate situation: the OOME is clearly the real problem and has
nothing to do with Tomcat itself. Unfortunately, after the OOME, Tomcat
was unable to shut down gracefully and that could be a problem in and of
itself.

Thanks,
- -chris

Stack trace hung during shutdown:

"main" prio=10 tid=0x08059400 nid=0x7da1 runnable [0xb7276000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.Class.getDeclaredFields0(Native Method)
        at java.lang.Class.privateGetDeclaredFields(Class.java:2291)
        at java.lang.Class.getDeclaredFields(Class.java:1743)
        at
org.apache.catalina.loader.WebappClassLoader.clearReferencesStaticFinal(WebappClassLoader.java:2034)
        at
org.apache.catalina.loader.WebappClassLoader.clearReferences(WebappClassLoader.java:1929)
        at
org.apache.catalina.loader.WebappClassLoader.stop(WebappClassLoader.java:1833)
        at
org.apache.catalina.loader.WebappLoader.stop(WebappLoader.java:740)
        at
org.apache.catalina.core.StandardContext.stop(StandardContext.java:4913)
        - locked <0xb01a3e68> (a org.apache.catalina.core.StandardContext)
        at
org.apache.catalina.core.ContainerBase.removeChild(ContainerBase.java:932)
        at
org.apache.catalina.startup.HostConfig.undeployApps(HostConfig.java:1357)
        at org.apache.catalina.startup.HostConfig.stop(HostConfig.java:1328)
        at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:326)
        at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
        at
org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1094)
        - locked <0xb01a3cf0> (a org.apache.catalina.core.StandardHost)
        at
org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1106)
        - locked <0xb0220538> (a org.apache.catalina.core.StandardEngine)
        at
org.apache.catalina.core.StandardEngine.stop(StandardEngine.java:468)
        at
org.apache.catalina.core.StandardService.stop(StandardService.java:604)
        - locked <0xb0220538> (a org.apache.catalina.core.StandardEngine)
        at
org.apache.catalina.core.StandardServer.stop(StandardServer.java:788)
        at org.apache.catalina.startup.Catalina.stop(Catalina.java:662)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:629)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk3AD6MACgkQ9CaO5/Lv0PA8wwCfWLN7kwtFEhy/xgvBHIhr7jxR
93MAn3F101tNSYhmeO+yqEUElcGScTE+
=6Xnu
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: 100% CPU Usage when stopping Tomcat after OOM condition

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Felix,

On 5/4/2011 1:07 AM, Felix Schumacher wrote:
> It could've been the garbage collector itself. Just yesterday I had 
> an oom on permgen, which lead to a major-gc loop. I could see the
> loop nicely as I had enabled logging of gc activity with
> -Xloggc:$CATALINA_BASE/logs/gc.log -XX:+PrintGCDetails 
> -XX:PrintGCTimeStamps

I don't recall any of the GC threads actually going anything in my
thread dump, but it's possible I just missed it. This situation is
trivially reproducable, so I'll check again next time it happens.

If the GC is going into a tight loop to try to free-up space, that seems
like a bug to me... the GC ought to detect that it's not accomplishing
anything and ... well, I guess throw another OOME.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk3CpfkACgkQ9CaO5/Lv0PCKpQCeJOibvjER/FbeLvIVVEzcCTMj
EWIAni0egoiA4Q8CxZKzQ3DL0ePo2O5K
=Q5AF
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: 100% CPU Usage when stopping Tomcat after OOM condition

Posted by Felix Schumacher <fe...@internetallee.de>.

"Christopher Schultz" <ch...@christopherschultz.net> schrieb:

>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Mark,
>
>On 5/3/2011 10:52 AM, Mark Thomas wrote:
>> On 03/05/2011 15:46, André Warnier wrote:
>>> Mark Thomas wrote:
>>>> On 03/05/2011 15:22, Christopher Schultz wrote:
>>>>> All,
>>>>>
>>>>> Moments ago in our development environment, our webapp suffered an
>OOME
>>>>> after many re-reployments (we know we have an undeploy-related
>leak).
>>>>> When attempting to bounce Tomcat, the shutdown failed and I took a
>>>>> thread dump which included the one non-trivial thread shown below.
>>>>>
>>>>> I haven't looked at the code yet to see if there is some kind of
>loop
>>>>> around this code, but top reported 100% CPU usage from this
>process so I
>>>>> suspect something like that was happening.
>>>>>
>>>>> We are using TC 6.0.32 on Debian Lenny and Sun/Oracle 32-bit
>>>>> 1.6.0_22-b04 Server VM.
>>>>>
>>>>> This is less of a "help" request as it is an observation of an
>>>>> unfortunate situation: the OOME is clearly the real problem and
>has
>>>>> nothing to do with Tomcat itself. Unfortunately, after the OOME,
>Tomcat
>>>>> was unable to shut down gracefully and that could be a problem in
>and of
>>>>> itself.
>>>>
>>>> After an OOME there are no guarantees as to the state of the JVM.
>That
>>>> Tomcat is unable to shut down cleanly in that case is not
>suprising.
>
>Chuck's assertion is that an OOME isn't fatal to the JVM but almost
>always fatal to the application since it's unlikely equipped to handle
>the bizarre situations that arise from it's occurrence. A bit of a
>hair-split if you ask me, but it's always worth examining any options
>TC
>has to at least be able to stop itself gracefully.
>
>>> Do I not remember seeing somewhere, at some point, a parameter named
>>> "OOMParachute" or similar ?
>> 
>> Yes, in the NIO connector. It may, or may not, provide the JVM with
>> enough memory to exit gracefully.
>
>Since this is a PegmGen issue, it's unlikely that the NIO connector's
>OOM parachute will help, here. Also, I'm not using the NIO connector :)
>
>Any reason the parachute is in the /connector/ and not somewhere more
>universal?
>
>I was mostly interested in why I was getting 100% CPU usage... checking

It could've been the garbage collector itself. Just yesterday I had an oom on permgen, which lead to a major-gc loop.
I could see the loop nicely as I had enabled logging of gc activity with -Xloggc:$CATALINA_BASE/logs/gc.log -XX:+PrintGCDetails -XX:PrintGCTimeStamps

Regards
 Felix
>the code, I *do* see a loop in
>WebappClassLoader.clearReferencesStaticFinal but it's over an iterator
>that will presumably run out of entries. The CPU spike seems to be
>happening within the native code in java.lang.Class.getDeclaredFields,
>so there's nothing TC can do to prevent it. :(
>
>- -chris
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.4.10 (MingW32)
>Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
>iEYEARECAAYFAk3AWvEACgkQ9CaO5/Lv0PAodwCfUxrRVV0TZihjiS1d6QwF1LCo
>8PgAn3jOrCy4IvUTe4diJ68gGrqIyQXZ
>=Ys5e
>-----END PGP SIGNATURE-----
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>For additional commands, e-mail: users-help@tomcat.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: 100% CPU Usage when stopping Tomcat after OOM condition

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mark,

On 5/3/2011 10:52 AM, Mark Thomas wrote:
> On 03/05/2011 15:46, André Warnier wrote:
>> Mark Thomas wrote:
>>> On 03/05/2011 15:22, Christopher Schultz wrote:
>>>> All,
>>>>
>>>> Moments ago in our development environment, our webapp suffered an OOME
>>>> after many re-reployments (we know we have an undeploy-related leak).
>>>> When attempting to bounce Tomcat, the shutdown failed and I took a
>>>> thread dump which included the one non-trivial thread shown below.
>>>>
>>>> I haven't looked at the code yet to see if there is some kind of loop
>>>> around this code, but top reported 100% CPU usage from this process so I
>>>> suspect something like that was happening.
>>>>
>>>> We are using TC 6.0.32 on Debian Lenny and Sun/Oracle 32-bit
>>>> 1.6.0_22-b04 Server VM.
>>>>
>>>> This is less of a "help" request as it is an observation of an
>>>> unfortunate situation: the OOME is clearly the real problem and has
>>>> nothing to do with Tomcat itself. Unfortunately, after the OOME, Tomcat
>>>> was unable to shut down gracefully and that could be a problem in and of
>>>> itself.
>>>
>>> After an OOME there are no guarantees as to the state of the JVM. That
>>> Tomcat is unable to shut down cleanly in that case is not suprising.

Chuck's assertion is that an OOME isn't fatal to the JVM but almost
always fatal to the application since it's unlikely equipped to handle
the bizarre situations that arise from it's occurrence. A bit of a
hair-split if you ask me, but it's always worth examining any options TC
has to at least be able to stop itself gracefully.

>> Do I not remember seeing somewhere, at some point, a parameter named
>> "OOMParachute" or similar ?
> 
> Yes, in the NIO connector. It may, or may not, provide the JVM with
> enough memory to exit gracefully.

Since this is a PegmGen issue, it's unlikely that the NIO connector's
OOM parachute will help, here. Also, I'm not using the NIO connector :)

Any reason the parachute is in the /connector/ and not somewhere more
universal?

I was mostly interested in why I was getting 100% CPU usage... checking
the code, I *do* see a loop in
WebappClassLoader.clearReferencesStaticFinal but it's over an iterator
that will presumably run out of entries. The CPU spike seems to be
happening within the native code in java.lang.Class.getDeclaredFields,
so there's nothing TC can do to prevent it. :(

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk3AWvEACgkQ9CaO5/Lv0PAodwCfUxrRVV0TZihjiS1d6QwF1LCo
8PgAn3jOrCy4IvUTe4diJ68gGrqIyQXZ
=Ys5e
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: 100% CPU Usage when stopping Tomcat after OOM condition

Posted by Mark Thomas <ma...@apache.org>.
On 03/05/2011 15:46, André Warnier wrote:
> Mark Thomas wrote:
>> On 03/05/2011 15:22, Christopher Schultz wrote:
>>> All,
>>>
>>> Moments ago in our development environment, our webapp suffered an OOME
>>> after many re-reployments (we know we have an undeploy-related leak).
>>> When attempting to bounce Tomcat, the shutdown failed and I took a
>>> thread dump which included the one non-trivial thread shown below.
>>>
>>> I haven't looked at the code yet to see if there is some kind of loop
>>> around this code, but top reported 100% CPU usage from this process so I
>>> suspect something like that was happening.
>>>
>>> We are using TC 6.0.32 on Debian Lenny and Sun/Oracle 32-bit
>>> 1.6.0_22-b04 Server VM.
>>>
>>> This is less of a "help" request as it is an observation of an
>>> unfortunate situation: the OOME is clearly the real problem and has
>>> nothing to do with Tomcat itself. Unfortunately, after the OOME, Tomcat
>>> was unable to shut down gracefully and that could be a problem in and of
>>> itself.
>>
>> After an OOME there are no guarantees as to the state of the JVM. That
>> Tomcat is unable to shut down cleanly in that case is not suprising.
>>
> Do I not remember seeing somewhere, at some point, a parameter named
> "OOMParachute" or similar ?

Yes, in the NIO connector. It may, or may not, provide the JVM with
enough memory to exit gracefully.

Mark



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: 100% CPU Usage when stopping Tomcat after OOM condition

Posted by André Warnier <aw...@ice-sa.com>.
Mark Thomas wrote:
> On 03/05/2011 15:22, Christopher Schultz wrote:
>> All,
>>
>> Moments ago in our development environment, our webapp suffered an OOME
>> after many re-reployments (we know we have an undeploy-related leak).
>> When attempting to bounce Tomcat, the shutdown failed and I took a
>> thread dump which included the one non-trivial thread shown below.
>>
>> I haven't looked at the code yet to see if there is some kind of loop
>> around this code, but top reported 100% CPU usage from this process so I
>> suspect something like that was happening.
>>
>> We are using TC 6.0.32 on Debian Lenny and Sun/Oracle 32-bit
>> 1.6.0_22-b04 Server VM.
>>
>> This is less of a "help" request as it is an observation of an
>> unfortunate situation: the OOME is clearly the real problem and has
>> nothing to do with Tomcat itself. Unfortunately, after the OOME, Tomcat
>> was unable to shut down gracefully and that could be a problem in and of
>> itself.
> 
> After an OOME there are no guarantees as to the state of the JVM. That
> Tomcat is unable to shut down cleanly in that case is not suprising.
> 
Do I not remember seeing somewhere, at some point, a parameter named "OOMParachute" or 
similar ?

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: 100% CPU Usage when stopping Tomcat after OOM condition

Posted by Mark Thomas <ma...@apache.org>.
On 03/05/2011 15:22, Christopher Schultz wrote:
> All,
> 
> Moments ago in our development environment, our webapp suffered an OOME
> after many re-reployments (we know we have an undeploy-related leak).
> When attempting to bounce Tomcat, the shutdown failed and I took a
> thread dump which included the one non-trivial thread shown below.
> 
> I haven't looked at the code yet to see if there is some kind of loop
> around this code, but top reported 100% CPU usage from this process so I
> suspect something like that was happening.
> 
> We are using TC 6.0.32 on Debian Lenny and Sun/Oracle 32-bit
> 1.6.0_22-b04 Server VM.
> 
> This is less of a "help" request as it is an observation of an
> unfortunate situation: the OOME is clearly the real problem and has
> nothing to do with Tomcat itself. Unfortunately, after the OOME, Tomcat
> was unable to shut down gracefully and that could be a problem in and of
> itself.

After an OOME there are no guarantees as to the state of the JVM. That
Tomcat is unable to shut down cleanly in that case is not suprising.

Mark

> 
> Thanks,
> -chris
> 
> Stack trace hung during shutdown:
> 
> "main" prio=10 tid=0x08059400 nid=0x7da1 runnable [0xb7276000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.Class.getDeclaredFields0(Native Method)
>         at java.lang.Class.privateGetDeclaredFields(Class.java:2291)
>         at java.lang.Class.getDeclaredFields(Class.java:1743)
>         at
> org.apache.catalina.loader.WebappClassLoader.clearReferencesStaticFinal(WebappClassLoader.java:2034)
>         at
> org.apache.catalina.loader.WebappClassLoader.clearReferences(WebappClassLoader.java:1929)
>         at
> org.apache.catalina.loader.WebappClassLoader.stop(WebappClassLoader.java:1833)
>         at
> org.apache.catalina.loader.WebappLoader.stop(WebappLoader.java:740)
>         at
> org.apache.catalina.core.StandardContext.stop(StandardContext.java:4913)
>         - locked <0xb01a3e68> (a org.apache.catalina.core.StandardContext)
>         at
> org.apache.catalina.core.ContainerBase.removeChild(ContainerBase.java:932)
>         at
> org.apache.catalina.startup.HostConfig.undeployApps(HostConfig.java:1357)
>         at org.apache.catalina.startup.HostConfig.stop(HostConfig.java:1328)
>         at
> org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:326)
>         at
> org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
>         at
> org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1094)
>         - locked <0xb01a3cf0> (a org.apache.catalina.core.StandardHost)
>         at
> org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1106)
>         - locked <0xb0220538> (a org.apache.catalina.core.StandardEngine)
>         at
> org.apache.catalina.core.StandardEngine.stop(StandardEngine.java:468)
>         at
> org.apache.catalina.core.StandardService.stop(StandardService.java:604)
>         - locked <0xb0220538> (a org.apache.catalina.core.StandardEngine)
>         at
> org.apache.catalina.core.StandardServer.stop(StandardServer.java:788)
>         at org.apache.catalina.startup.Catalina.stop(Catalina.java:662)
>         at org.apache.catalina.startup.Catalina.start(Catalina.java:629)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
>         at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org