Posted to users@tomcat.apache.org by Dante Bell <Da...@cocoanet.us> on 2011/08/05 16:34:07 UTC

Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Hi,

I'm running out of ideas on what to try for this customer. Their load
tests show that Tomcat reaches a point where it no longer services
requests. Thread dumps show most threads in a wait state, but some are
runnable. I suspect GC is the issue, but the GC logs show that the
longest pause was 15 seconds, which happened only once, and most of the
other stats look OK to me.

Details can be found on my blog: http://wp.me/plPvN-ai

Here's some info:

*Apache:* Apache HTTP Server Version 2.2 -- prefork MPM
*Tomcat:* 6.0.20
*JK Connector:* Same as whatever is bundled in with Apache 2.2 (from
customer)
*Solaris* Solaris 10 10/09 s10s_u8wos_08a SPARC

Workers.Properties:

# Define 1 real worker using ajp13
worker.list=worker1,worker2,worker3,worker4
worker.maintain=10

# Set properties for worker1 (ajp13)
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8019
worker.worker1.lbfactor=1
worker.worker1.connection_pool_size=1
worker.worker1.connection_pool_timeout=10
worker.worker1.socket_keepalive=1
worker.worker1.socket_timeout=300
worker.worker1.cache_timeout=10

# Set properties for worker2 (ajp13)
worker.worker2.type=ajp13
worker.worker2.host=localhost
worker.worker2.port=8019
worker.worker2.lbfactor=1
worker.worker2.connection_pool_size=1
worker.worker2.connection_pool_timeout=10
worker.worker2.socket_keepalive=1
worker.worker2.socket_timeout=300
worker.worker2.cache_timeout=10

# Set properties for worker3 (ajp13)
worker.worker3.type=ajp13
worker.worker3.host=localhost
worker.worker3.port=8019
worker.worker3.lbfactor=1
worker.worker3.connection_pool_size=1
worker.worker3.connection_pool_timeout=10
worker.worker3.socket_keepalive=1
worker.worker3.socket_timeout=300
worker.worker3.cache_timeout=10

# Set properties for worker4 (ajp13)
worker.worker4.type=ajp13
worker.worker4.host=localhost
worker.worker4.port=8019
worker.worker4.lbfactor=1
worker.worker4.connection_pool_size=1
worker.worker4.connection_pool_timeout=10
worker.worker4.socket_keepalive=1
worker.worker4.socket_timeout=300
worker.worker4.cache_timeout=10


Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Dante Bell <Da...@cocoanet.us>.
Hi Guys,

I hate to pick your brains on this as the customer should know how to do
this, but they tasked me to find out ;(

Is there any API or other method they can code that will give them an
indication of nearing this threshold? I know it's a crappy solution, but
I thought I'd ask anyway :) AFAIK, from the OS side there's really
nothing I can think of that would help.

Thanks for all your help, it's greatly appreciated by myself and the
entire team,
Danté
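
One possible way to get the early warning asked about above is for the application to count its own in-flight requests. The sketch below is only an illustration under assumptions: InFlightGauge, enter(), exit() and nearLimit() are hypothetical names, not a Tomcat or Servlet API, and where the calls are placed (for example at the top of the servlet's service method, with exit() in a finally block) is up to the application.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not a Tomcat API): counts requests currently inside a servlet. */
final class InFlightGauge {
    private final int limit;   // assumed cap, e.g. the 20 STM instances discussed in this thread
    private final int warnAt;  // in-flight count at which nearLimit() starts returning true
    private final AtomicInteger inFlight = new AtomicInteger();

    InFlightGauge(int limit, int warnAt) {
        this.limit = limit;
        this.warnAt = warnAt;
    }

    /** Call when a request enters the servlet; returns the new in-flight count. */
    int enter() {
        return inFlight.incrementAndGet();
    }

    /** Call from a finally block when the request leaves; returns the new count. */
    int exit() {
        return inFlight.decrementAndGet();
    }

    /** How many more concurrent requests fit under the assumed cap. */
    int remaining() {
        return limit - inFlight.get();
    }

    boolean nearLimit() {
        return inFlight.get() >= warnAt;
    }

    public static void main(String[] args) {
        InFlightGauge gauge = new InFlightGauge(20, 16);
        for (int i = 0; i < 16; i++) {
            gauge.enter();
        }
        System.out.println("remaining=" + gauge.remaining() + " nearLimit=" + gauge.nearLimit());
    }
}
```

A gauge like this only reports pressure; it does not remove the cap, so it is at best a stopgap while the servlet is being rewritten.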

On 08/05/2011 12:14 PM, Mark Thomas wrote:
> On 05/08/2011 17:10, Dante Bell wrote:
>> This is probably a really dumb question, but say they implement
>> load-balanced Tomcat on 2 nodes for example. Would that then allow for
>> greater than 20 STMs for the servlets?
> It will allow them up to 20 concurrent requests per STM Servlet per
> Tomcat instance. What that means for actual users is difficult to judge
> since a single request may be routed through multiple servlets
> (including the same servlet several times). It will certainly increase
> capacity. From what to what is hard, if not impossible, to tell without knowing
> the application code.
>
> Mark
>
>> On 08/05/2011 12:00 PM, Mark Thomas wrote:
>>> On 05/08/2011 16:56, Dante Bell wrote:
>>>> Thanks!
>>>>
>>>> Like I said, I'm an OS/HW guy, never looked at java b4!
>>>>
>>>> They are saying that the load test has 20 'connections' so I'm guessing
>>>> that's the 20 STMs.
>>>>
>>>> Now, is this a fixable thing within the Java stack? Or is it an
>>>> application limitation?
>>> It looks to be hard-coded within Tomcat. I don't see a way to change
>>> that limit without building Tomcat from source.
>>>
>>> The other option is re-write the STM Servlet(s) as non-STM.
>>>
>>> Mark
>>>
>>>> Danté
>>>>
>>>> On 08/05/2011 11:12 AM, Mark Thomas wrote:
>>>>> On 05/08/2011 15:34, Dante Bell wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm running out of ideas on what to try for this customer. Their load
>>>>>> tests show that Tomcat is getting to a point where it no longer services
>>>>>> requests.
>>>>> Let me guess. It is fine for low loads but as soon as the load goes
>>>>> above a certain number (maybe around 20 concurrent users?) then
>>>>> everything starts going wrong?
>>>>>
>>>>>> Thread dumps show most threads in a wait state, but some are
>>>>>> runnable. I'm suspecting GC is the issue,
>>>>> On what basis? The thread dump clearly shows a different problem.
>>>>>
>>>>>> but in analyzing the logs it
>>>>>> shows the longest pause was 15 seconds, but that only happened once, and
>>>>>> most of the stats look OK to me.
>>>>> That would suggest that it isn't GC then, wouldn't it?
>>>>>
>>>>>> Details can be found on my blog: http://wp.me/plPvN-ai
>>>>> All the information needed to diagnose this issue is in the thread dump.
>>>>> If you take a look at the first thread that is blocked:
>>>>> "TP-Processor40745" daemon prio=3 tid=0x03831400 nid=0xa888 in
>>>>> Object.wait() [0x2cd2f000..0x2cd2f870]
>>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>> 	at java.lang.Object.wait(Native Method)
>>>>> 	at java.lang.Object.wait(Object.java:485)
>>>>> 	at
>>>>> org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:854)
>>>>> 	- locked <0x63a26708> (a java.util.Stack)
>>>>> 	at
>>>>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:129)
>>>>> 	at
>>>>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>>>>
>>>>> And Bingo! we have found the problem.
>>>>>
>>>>> Now that stack trace might not ring alarm bells with someone unfamiliar
>>>>> with the Tomcat code base but if Tomcat is unresponsive, understanding
>>>>> why *any* thread is blocked would be a good place to start. If you look
>>>>> at line 854 and the surrounding code for StandardWrapper you will see
>>>>> that this is part of the Servlet allocation process. You should then
>>>>> realise that line 854 is part of Servlet allocation for Servlets that
>>>>> implement the SingleThreadModel (and now the alarm bells should be
>>>>> ringing loud and clear).
>>>>>
>>>>> Tomcat allocates a maximum of 20 instances of any STM servlet. Your
>>>>> client has an application that uses STM and requires many more than 20
>>>>> concurrent instances. Hence most requests are sat waiting for a STM
>>>>> instance to be released.
>>>>>
>>>>> Since that means there must be 20 instances of an STM Servlet already
>>>>> allocated, it is simple enough to grep the thread dump to find out which
>>>>> one. The winner is:
>>>>> com.motorola.nsm.common.ui.servlet.ValidateServlet
>>>>>
>>>>> Mark
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>>>>> For additional commands, e-mail: users-help@tomcat.apache.org


Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Mark Thomas <ma...@apache.org>.
On 05/08/2011 17:10, Dante Bell wrote:
> This is probably a really dumb question, but say they implement
> load-balanced Tomcat on 2 nodes for example. Would that then allow for
> greater than 20 STMs for the servlets?

It will allow them up to 20 concurrent requests per STM Servlet per
Tomcat instance. What that means for actual users is difficult to judge
since a single request may be routed through multiple servlets
(including the same servlet several times). It will certainly increase
capacity. From what to what is hard, if not impossible, to tell without knowing
the application code.

Mark
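
The per-instance cap described above can be pictured with a small model. This is a simplified sketch, not Tomcat's StandardWrapper code: CappedPool, allocate(), deallocate() and demoWaiters() are hypothetical names, and the cap of 20 is simply the figure quoted in this thread. It shows that once every instance is in use and none is returned, each further request blocks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Simplified model of a capped instance pool. This is NOT Tomcat's StandardWrapper code. */
final class CappedPool {
    private final int maxInstances;
    private final Deque<Object> idle = new ArrayDeque<Object>();
    private int created = 0;
    private int waiting = 0;

    CappedPool(int maxInstances) {
        this.maxInstances = maxInstances;
    }

    /** Hands out an idle instance, creates one if under the cap, otherwise blocks. */
    synchronized Object allocate() throws InterruptedException {
        while (idle.isEmpty()) {
            if (created < maxInstances) {
                created++;
                return new Object();
            }
            waiting++;
            try {
                wait(); // released by deallocate() or by an interrupt
            } finally {
                waiting--;
            }
        }
        return idle.pop();
    }

    /** Returns an instance to the pool and wakes any blocked callers. */
    synchronized void deallocate(Object instance) {
        idle.push(instance);
        notifyAll();
    }

    synchronized int waitingCount() {
        return waiting;
    }

    /** Starts threads that take an instance and never return it; reports how many end up blocked. */
    static int demoWaiters(int maxInstances, int requests) {
        final CappedPool pool = new CappedPool(maxInstances);
        List<Thread> threads = new ArrayList<Thread>();
        for (int i = 0; i < requests; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        pool.allocate(); // never released, like a request that is stuck
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        int expected = Math.max(0, requests - maxInstances);
        long deadline = System.currentTimeMillis() + 5000;
        try {
            // Poll until the surplus requests are parked in wait(), or give up after 5 s.
            while (pool.waitingCount() < expected && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int observed = pool.waitingCount();
        for (Thread t : threads) {
            t.interrupt(); // unblock the waiters so the demo can finish
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("blocked requests: " + demoWaiters(20, 25));
    }
}
```

With a cap of 20 and 25 requests that never finish, the 5 surplus requests are expected to block. Adding a second Tomcat node doubles the number of pools, but each pool still blocks in the same way once it is full.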



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Dante Bell <Da...@cocoanet.us>.
This is probably a really dumb question, but say they implement
load-balanced Tomcat on 2 nodes for example. Would that then allow for
greater than 20 STMs for the servlets?



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Mark Eggers <it...@yahoo.com>.
----- Original Message -----

> From: "Caldarale, Charles R" <Ch...@unisys.com>
> To: Tomcat Users List <us...@tomcat.apache.org>
> Cc: 
> Sent: Friday, August 5, 2011 9:38 AM
> Subject: RE: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive
> 
> From: Mark Thomas [mailto:markt@apache.org] 
> Subject: Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non 
> responsive
> 
>>  Now, is this a fixable thing within the Java stack? Or is it an
>>  application limitation?
> 
> The other option is re-write the STM Servlet(s) as non-STM.
> 
> What Mark didn't say is that, in this day and age, using the single-thread 
> model is really rather dumb.  It can be used to work around sloppy programming 
> techniques, but shouldn't appear in any proper modern application.  To quote 
> from the servlet spec:
> 
> "It is recommended that a developer take other means to resolve those 
> issues instead of implementing this interface, such as avoiding the usage of an 
> instance variable or synchronizing the block of the code accessing those 
> resources. The SingleThreadModel Interface is deprecated in this version of the 
> specification."
> 
> (The above has been in the spec for many years.)
> 
> - Chuck


Yep, everything I've read states that STM servlets are evil. In fact one of the books I have has (evil) after SingleThreadModel in the index.

A real problem that I mentioned in another mail message is the following scenario.

1. ValidateServlet has a bug
2. This bug prevents ValidateServlet from completing

Now all someone has to do in order to prevent further connections to ValidateServlet is to create the appropriate request that tickles the bug and launch it 20 times against the web site.

If ValidateServlet is a central part of the site (sounds like it's a general validation utility), then the site is essentially down after 20 well-crafted requests.

Load balancing, increasing the number of STM threads allowed in Tomcat, etc. will just delay the onset of this problem.

This servlet and the underlying logic need to be rewritten.

. . . . just my two cents.
/mde/



RE: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by "Caldarale, Charles R" <Ch...@unisys.com>.
From: Mark Thomas [mailto:markt@apache.org] 
Subject: Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

> Now, is this a fixable thing within the Java stack? Or is it an
> application limitation?

The other option is re-write the STM Servlet(s) as non-STM.

What Mark didn't say is that, in this day and age, using the single-thread model is really rather dumb.  It can be used to work around sloppy programming techniques, but shouldn't appear in any proper modern application.  To quote from the servlet spec:

"It is recommended that a developer take other means to resolve those issues instead of implementing this interface, such as avoiding the usage of an instance variable or synchronizing the block of the code accessing those resources. The SingleThreadModel Interface is deprecated in this version of the specification."

(The above has been in the spec for many years.)

 - Chuck
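
The spec's advice about avoiding instance variables can be sketched in plain Java. The example below is an assumption-laden illustration: StatelessValidator and its validate() rule are made up (the real ValidateServlet logic is not shown in this thread), and the class stands in for a servlet's service method. Because all per-request state lives in local variables, one shared instance can serve many threads at once and no instance pool is needed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical request handler; stands in for a servlet's service method. */
final class StatelessValidator {

    // No per-request instance fields: everything a call needs is held in locals,
    // so a single shared instance is safe to use from many threads.
    boolean validate(String input) {
        String trimmed = (input == null) ? "" : input.trim(); // per-call state only
        return !trimmed.isEmpty() && trimmed.length() <= 64;  // made-up rule for the demo
    }

    /** Calls one shared instance from many threads and returns how many inputs were accepted. */
    static int runConcurrently(int threads, final int callsPerThread) {
        final StatelessValidator shared = new StatelessValidator();
        final AtomicInteger accepted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < callsPerThread; i++) {
                        if (shared.validate("  user" + i + "  ")) {
                            accepted.incrementAndGet();
                        }
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accepted.get();
    }

    public static void main(String[] args) {
        System.out.println("accepted: " + runConcurrently(50, 100));
    }
}
```

Where shared mutable state really is required, the spec's other suggestion applies: synchronize only the block that touches it rather than serializing the whole servlet.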




Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Mark,

On 8/5/2011 12:00 PM, Mark Thomas wrote:
> On 05/08/2011 16:56, Dante Bell wrote:
>> Thanks!
>> 
>> Like I said, I'm an OS/HW guy, never looked at java b4!
>> 
>> They are saying that the load test has 20 'connections' so I'm
>> guessing that's the 20 STMs.
>> 
>> Now, is this a fixable thing within the Java stack? Or is it an 
>> application limitation?
> 
> It looks to be hard-coded within Tomcat. I don't see a way to change 
> that limit without building Tomcat from source.

It seems that, in this case, being able to configure the number of STM
instances managed by Tomcat could be a good thing. On the other hand,
there's no reason anyone should be encouraged to write STM servlets, so
maybe having to recompile Tomcat is a reasonable punishment.

> The other option is re-write the STM Servlet(s) as non-STM.

That is, of course, the best solution. ;)

-chris



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Mark Thomas <ma...@apache.org>.
On 05/08/2011 16:56, Dante Bell wrote:
> Thanks!
> 
> Like I said, I'm an OS/HW guy, never looked at java b4!
> 
> They are saying that the load test has 20 'connections' so I'm guessing
> that's the 20 STMs.
> 
> Now, is this a fixable thing within the Java stack? Or is it an
> application limitation?

It looks to be hard-coded within Tomcat. I don't see a way to change
that limit without building Tomcat from source.

The other option is re-write the STM Servlet(s) as non-STM.

Mark



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Dante Bell <Da...@cocoanet.us>.
Thanks!

Like I said, I'm an OS/HW guy, never looked at java b4!

They are saying that the load test has 20 'connections' so I'm guessing
that's the 20 STMs.

Now, is this a fixable thing within the Java stack? Or is it an
application limitation?

Danté



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by André Warnier <aw...@ice-sa.com>.
Pid wrote:
> On 05/08/2011 16:12, Mark Thomas wrote:
>> On 05/08/2011 15:34, Dante Bell wrote:
>>> Hi,
>>>
>>> I'm running out of ideas on what to try for this customer. Their load
>>> tests show that Tomcat is getting to a point where it no longer services
>>> requests.
>> Let me guess. It is fine for low loads but as soon as the load goes
>> above a certain number (maybe around 20 concurrent users?) then
>> everything starts going wrong?
>>
>>> Thread dumps show most threads in a wait state, but some are
>>> runnable. I'm suspecting GC is the issue,
>> On what basis? The thread dump clearly shows a different problem.
>>
>>> but in analyzing the logs it
>>> shows the longest pause was 15 seconds, but that only happened once, and
>>> most of the stats look OK to me.
>> That would suggest that it isn't GC then, wouldn't it?
>>
>>> Details can be found on my blog: http://wp.me/plPvN-ai
>> All the information needed to diagnose this issue is in the thread dump.
>> If you take a look at the first thread that is blocked:
>> "TP-Processor40745" daemon prio=3 tid=0x03831400 nid=0xa888 in
>> Object.wait() [0x2cd2f000..0x2cd2f870]
>>    java.lang.Thread.State: WAITING (on object monitor)
>> 	at java.lang.Object.wait(Native Method)
>> 	at java.lang.Object.wait(Object.java:485)
>> 	at
>> org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:854)
>> 	- locked <0x63a26708> (a java.util.Stack)
>> 	at
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:129)
>> 	at
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>
>> And Bingo! we have found the problem.
>>
>> Now that stack trace might not ring alarm bells with someone unfamiliar
>> with the Tomcat code base but if Tomcat is unresponsive, understanding
>> why *any* thread is blocked would be a good place to start. If you look
>> at line 854 and the surrounding code for StandardWrapper you will see
>> that this is part of the Servlet allocation process. You should then
>> realise that line 854 is part of Servlet allocation for Servlets that
>> implement the SingleThreadModel (and now the alarm bells should be
>> ringing loud and clear).
>>
>> Tomcat allocates a maximum of 20 instances of any STM servlet. Your
>> client has an application that uses STM and requires many more than 20
>> concurrent instances. Hence most requests are sat waiting for a STM
>> instance to be released.
>>
>> Since that means there must be 20 instances of an STM Servlet already
>> allocated, it is simple enough to grep the thread dump to find out which
>> one. The winner is:
>> com.motorola.nsm.common.ui.servlet.ValidateServlet
> 
> "By Jove, Holmes!", exclaimed Dr Watson.
> 

Brilliant use of ze little grey cells !
(s) Hercule Poirot



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Mark Eggers <it...@yahoo.com>.
----- Original Message -----

> From: Pid <pi...@pidster.com>
> To: Tomcat Users List <us...@tomcat.apache.org>
> Cc: 
> Sent: Friday, August 5, 2011 8:35 AM
> Subject: Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive
> 
> On 05/08/2011 16:12, Mark Thomas wrote:
>>  On 05/08/2011 15:34, Dante Bell wrote:
>>>  Hi,
>>> 
>>>  I'm running out of ideas on what to try for this customer. Their load
>>>  tests show that Tomcat is getting to a point where it no longer services
>>>  requests.
>> 
>>  Let me guess. It is fine for low loads but as soon as the load goes
>>  above a certain number (maybe around 20 concurrent users?) then
>>  everything starts going wrong?
>> 
>>>  Thread dumps show most threads in a wait state, but some are
>>>  runnable. I'm suspecting GC is the issue,
>> 
>>  On what basis? The thread dump clearly shows a different problem.
>> 
>>>  but in analyzing the logs it
>>>  shows the longest pause was 15 seconds, but that only happened once, and
>>>  most of the stats look OK to me.
>> 
>>  That would suggest that it isn't GC then wouldn't it.
>> 
>>>  Details can be found on my blog: http://wp.me/plPvN-ai
>> 
>>  All the information needed to diagnose this issue is in the thread dump.
>>  If you take a look at the first thread that is blocked:
>>  "TP-Processor40745" daemon prio=3 tid=0x03831400 nid=0xa888 in
>>  Object.wait() [0x2cd2f000..0x2cd2f870]
>>     java.lang.Thread.State: WAITING (on object monitor)
>>      at java.lang.Object.wait(Native Method)
>>      at java.lang.Object.wait(Object.java:485)
>>      at
>>  org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:854)
>>      - locked <0x63a26708> (a java.util.Stack)
>>      at
>>  org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:129)
>>      at
>>  org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>> 
>>  And Bingo! we have found the problem.
>> 
>>  Now that stack trace might not ring alarm bells with someone unfamiliar
>>  with the Tomcat code base but if Tomcat is unresponsive, understanding
>>  why *any* thread is blocked would be a good place to start. If you look
>>  at line 854 and the surrounding code for StandardWrapper you will see
>>  that this is part of the Servlet allocation process. You should then
>>  realise that line 854 is part of Servlet allocation for Servlets that
>>  implement the SingleThreadModel (and now the alarm bells should be
>>  ringing loud and clear).
>> 
>>  Tomcat allocates a maximum of 20 instances of any STM servlet. Your
>>  client has an application that uses STM and requires many more than 20
>>  concurrent instances. Hence most requests are sat waiting for a STM
>>  instance to be released.
>> 
>>  Since that means there must be 20 instances of an STM Servlet already
>>  allocated, it is simple enough to grep the thread dump to find out which
>>  one. The winner is:
>>  com.motorola.nsm.common.ui.servlet.ValidateServlet
> 
> "By Jove, Holmes!", exclaimed Dr Watson.
> 
> 
> p


Nice explanation. I hauled out the Tomcat 6 code and followed along. I found the STM code and went back to the top of the class and found the limit of 20 instances.

I'm guessing that you grepped for the service methods (doPost, doGet, etc.) and noticed that there were no doGet frames, only doPost.

Two classes in the thread dump were in a doPost method:

com.motorola.nsm.common.ui.servlet.UIJnlpServlet

com.motorola.nsm.common.ui.servlet.ValidateServlet


Doing:

grep com.motorola.nsm.common.ui.servlet.UIJnlpServlet.doPost tomcat_stack_dump.txt | wc -l

results in 28, so that's not the culprit. With more than 20 instances running at once, this cannot be the STM servlet that the blocking on line 854 of StandardWrapper.java points to.

Doing:

grep com.motorola.nsm.common.ui.servlet.ValidateServlet.doPost tomcat_stack_dump.txt | wc -l

results in 20, so bingo! We're at the limit here, so every further request has to wait until one of these instances is finished.

If the ValidateServlet is slow or you get a lot of requests, then the application becomes unresponsive until one or more instances finish. If the ValidateServlet has problems with certain cases, then you can lock up the entire application. This would make for an effective denial-of-service attack: twenty well-crafted requests to the application, and it's essentially locked (or at least this functionality is).
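
For anyone who would rather count every servlet in one pass than grep class by class, here is a rough Java sketch of the same idea. The class name ServletFrameCounter and the frame regex are my own assumptions, not anything from Tomcat. Like the grep above, it counts matching frames rather than distinct threads.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: tallies doGet/doPost stack frames per class in a thread dump.
class ServletFrameCounter {
    // Matches frames such as "at com.example.FooServlet.doPost(FooServlet.java:42)"
    private static final Pattern FRAME =
            Pattern.compile("\\bat\\s+([\\w.$]+)\\.(doGet|doPost)\\(");

    static Map<String, Integer> count(List<String> dumpLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dumpLines) {
            Matcher m = FRAME.matcher(line);
            if (m.find()) {
                // group(1) is the fully qualified class name owning the frame
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the dump file as the first argument, e.g. tomcat_stack_dump.txt
        count(Files.readAllLines(Paths.get(args[0])))
                .forEach((cls, n) -> System.out.println(n + "\t" + cls));
    }
}
```

A class whose count sits exactly at 20 while other threads wait in StandardWrapper.allocate is the one to look at first.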

. . . . just my two cents.

/mde/



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Pid <pi...@pidster.com>.
On 05/08/2011 16:12, Mark Thomas wrote:
> On 05/08/2011 15:34, Dante Bell wrote:
>> Hi,
>>
>> I'm running out of ideas on what to try for this customer. Their load
>> tests show that Tomcat is getting to a point where it no longer services
>> requests.
> 
> Let me guess. It is fine for low loads but as soon as the load goes
> above a certain number (maybe around 20 concurrent users?) then
> everything starts going wrong?
> 
>> Thread dumps show most threads in a wait state, but some are
>> runnable. I'm suspecting GC is the issue,
> 
> On what basis? The thread dump clearly shows a different problem.
> 
>> but in analyzing the logs it
>> shows the longest pause was 15 seconds, but that only happened once, and
>> most of the stats look OK to me.
> 
> That would suggest that it isn't GC then wouldn't it.
> 
>> Details can be found on my blog: http://wp.me/plPvN-ai
> 
> All the information needed to diagnose this issue is in the thread dump.
> If you take a look at the first thread that is blocked:
> "TP-Processor40745" daemon prio=3 tid=0x03831400 nid=0xa888 in
> Object.wait() [0x2cd2f000..0x2cd2f870]
>    java.lang.Thread.State: WAITING (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	at java.lang.Object.wait(Object.java:485)
> 	at
> org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:854)
> 	- locked <0x63a26708> (a java.util.Stack)
> 	at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:129)
> 	at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> 
> And Bingo! we have found the problem.
> 
> Now that stack trace might not ring alarm bells with someone unfamiliar
> with the Tomcat code base but if Tomcat is unresponsive, understanding
> why *any* thread is blocked would be a good place to start. If you look
> at line 854 and the surrounding code for StandardWrapper you will see
> that this is part of the Servlet allocation process. You should then
> realise that line 854 is part of Servlet allocation for Servlets that
> implement the SingleThreadModel (and now the alarm bells should be
> ringing loud and clear).
> 
> Tomcat allocates a maximum of 20 instances of any STM servlet. Your
> client has an application that uses STM and requires many more than 20
> concurrent instances. Hence most requests are sat waiting for a STM
> instance to be released.
> 
> Since that means there must be 20 instances of an STM Servlet already
> allocated, it is simple enough to grep the thread dump to find out which
> one. The winner is:
> com.motorola.nsm.common.ui.servlet.ValidateServlet

"By Jove, Holmes!", exclaimed Dr Watson.


p



Re: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Mark Thomas <ma...@apache.org>.
On 05/08/2011 15:34, Dante Bell wrote:
> Hi,
> 
> I'm running out of ideas on what to try for this customer. Their load
> tests show that Tomcat is getting to a point where it no longer services
> requests.

Let me guess. It is fine for low loads but as soon as the load goes
above a certain number (maybe around 20 concurrent users?) then
everything starts going wrong?

> Thread dumps show most threads in a wait state, but some are
> runnable. I'm suspecting GC is the issue,

On what basis? The thread dump clearly shows a different problem.

> but in analyzing the logs it
> shows the longest pause was 15 seconds, but that only happened once, and
> most of the stats look OK to me.

That would suggest that it isn't GC then, wouldn't it?

> Details can be found on my blog: http://wp.me/plPvN-ai

All the information needed to diagnose this issue is in the thread dump.
If you take a look at the first thread that is blocked:
"TP-Processor40745" daemon prio=3 tid=0x03831400 nid=0xa888 in Object.wait() [0x2cd2f000..0x2cd2f870]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:485)
	at org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:854)
	- locked <0x63a26708> (a java.util.Stack)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:129)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)

And Bingo! we have found the problem.

Now that stack trace might not ring alarm bells with someone unfamiliar
with the Tomcat code base but if Tomcat is unresponsive, understanding
why *any* thread is blocked would be a good place to start. If you look
at line 854 and the surrounding code for StandardWrapper you will see
that this is part of the Servlet allocation process. You should then
realise that line 854 is part of Servlet allocation for Servlets that
implement the SingleThreadModel (and now the alarm bells should be
ringing loud and clear).

Tomcat allocates a maximum of 20 instances of any STM servlet. Your
client has an application that uses STM and requires many more than 20
concurrent instances. Hence most requests are sat waiting for a STM
instance to be released.
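
To make the effect concrete, here is a deliberately simplified model of a capped instance pool. It is not the StandardWrapper source. The class BoundedInstancePool and its method bodies are assumptions for illustration; only the wait-until-released behaviour mirrors what the stack trace shows.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of a capped instance pool. NOT Tomcat's actual code.
class BoundedInstancePool {
    private final int maxInstances;                 // 20 in the case discussed here
    private final Deque<Object> idle = new ArrayDeque<>();
    private int created = 0;

    BoundedInstancePool(int maxInstances) {
        this.maxInstances = maxInstances;
    }

    // Hands out an idle instance, creates one while under the cap, otherwise waits.
    synchronized Object allocate() throws InterruptedException {
        while (idle.isEmpty()) {
            if (created < maxInstances) {
                created++;
                return new Object();                // stands in for a servlet instance
            }
            wait();                                 // request threads pile up here at the cap
        }
        return idle.pop();
    }

    // Returns an instance to the pool and wakes one waiting caller.
    synchronized void deallocate(Object instance) {
        idle.push(instance);
        notify();
    }

    synchronized int created() {
        return created;
    }
}
```

With a cap of 20 and 21 or more concurrent requests for the same servlet, every extra caller parks in wait() until some request finishes, which is the pattern a dump full of threads WAITING in allocate points to.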

Since that means there must be 20 instances of an STM Servlet already
allocated, it is simple enough to grep the thread dump to find out which
one. The winner is:
com.motorola.nsm.common.ui.servlet.ValidateServlet

Mark



RE: Urgent: Tomcat 6.0.20 on Solaris 10 Reaches max threads and is non responsive

Posted by Jeffrey Janner <Je...@PolyDyne.com>.
Dante -
Take a real close look at the application running in Tomcat.
I've had similar issues where Tomcat was suddenly using all configured request threads, but they were all waiting on something else to happen.
I found the problem to be a minor defect in the DB vendor's pooling code that sometimes caused the pool to hang.  The vendor had a fix that solved the problem, for the most part.
It took some serious scouring of the threads using jconsole to see where the problem really was. Following the waiting tree really helped identify it.  It could very well be a defect in the code running under Tomcat, and not Tomcat's code.
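
If clicking through jconsole gets tedious, the same lock information is available programmatically through the standard java.lang.management API. The sketch below is only an illustration (the class name WaitingThreads is made up), and it inspects the JVM it runs in, so it would have to be called from inside the Tomcat process, for example from a diagnostic JSP or servlet.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists waiting or blocked threads with the lock they wait on.
class WaitingThreads {
    static List<String> describeWaiting() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> out = new ArrayList<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;                            // defensive: thread may have exited
            }
            Thread.State state = info.getThreadState();
            if (state == Thread.State.BLOCKED
                    || state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING) {
                String lock = info.getLockName();        // null if not waiting on an object
                String owner = info.getLockOwnerName();  // null if no thread owns the lock
                out.add(info.getThreadName() + " [" + state + "]"
                        + (lock != null ? " on " + lock : "")
                        + (owner != null ? " held by " + owner : ""));
            }
        }
        return out;
    }
}
```

Following the "held by" names from one thread to the next is the programmatic version of walking the waiting tree by hand.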



