Posted to users@tomcat.apache.org by Greg Ward <gw...@python.net> on 2006/05/02 16:42:26 UTC

Limiting effects of badly-behaved webapps

We've been using Tomcat 4.1.30 happily for a couple of years now, but
every so often one badly-behaved webapp can make life unhappy for
everyone living in the container.  (Our Tomcat deployment is part of a
suite of applications that run on a small cluster of Linux servers; all
of the webapps running inside Tomcat are written and controlled by us.
We have around a hundred of these small clusters deployed worldwide, so
several hundred servers all told.)

Here's what typically happens:

  * webapp A tries to open a database connection to another server in
    the cluster, but that server is down and packets to it just
    disappear (alternately, A runs a badly-written and consequently very
    s-l-o-w query: either way, it's a database operation that takes a
    looooong time)

  * meanwhile, the thread running that request for A is holding a
    synchronization lock: yes, we know that you're not supposed to hold
    synchronization locks while doing I/O, but the programmers who wrote
    this stuff 3-5 years ago did not know that.  We fix the bugs as we
    find them, but they aren't easy to find and they aren't easy to fix.

  * thus, all requests to A back up in a queue waiting for the original
    thread to finish its slow I/O and release that synchronization
    lock.  If there are enough incoming requests for A, then Tomcat's
    thread pool is gradually exhausted, eventually allocating all 75
    threads to process requests for A that are blocked by that one
    synchronization lock.

  * now Tomcat is unable to process requests for webapps B, C, D, ....
    and our whole application suite is effectively dead.  Oops!
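The cascade above can be reproduced in miniature. This is a hedged sketch, not our actual code (all names are made up, and it uses modern Java for brevity): one thread holds a shared lock across a simulated slow database call, and a second request for the same webapp cannot even start until the lock is released:

```java
// Sketch of the anti-pattern: a lock held across slow "database I/O".
public class LockDuringIoSketch {
    private static final Object appLock = new Object();

    // Simulates one servlet request that does its I/O under the lock.
    static void handleRequest(String name, long ioMillis) {
        synchronized (appLock) {
            try {
                Thread.sleep(ioMillis); // stand-in for a slow JDBC call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        System.out.println(name + " finished");
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        Thread slow = new Thread(() -> handleRequest("slow", 500));
        Thread fast = new Thread(() -> handleRequest("fast", 0));
        slow.start();
        Thread.sleep(50); // let "slow" grab the lock first
        fast.start();     // "fast" now blocks on appLock, doing no work
        slow.join();
        fast.join();
        long elapsed = System.currentTimeMillis() - start;
        // "fast" could not finish until "slow" released the lock
        System.out.println("elapsed >= 500ms: " + (elapsed >= 500));
    }
}
```

With 75 container threads instead of two, every blocked "fast" request is a thread Tomcat can no longer give to webapps B, C, D.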

Obviously, the right long-term fix is "don't hold synchronization locks
while doing database I/O".  (It would also help if database connections
and queries were always fast, but alas! life just doesn't work that
way.)  But until all those bugs are found and fixed, this cascading
failure is going to happen occasionally.
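For what it's worth, the shape of that long-term fix looks something like this sketch (hypothetical names; real code would be doing JDBC calls): hold the lock only to snapshot and publish shared state, and run the slow query with no lock held at all:

```java
// Sketch of the fix: the lock guards shared state, never the I/O.
public class LockOutsideIoSketch {
    private final Object stateLock = new Object();
    private String cachedResult = "stale";

    String handleRequest() throws InterruptedException {
        String query;
        synchronized (stateLock) {
            query = "SELECT ..."; // snapshot shared state under the lock
        }
        String result = runSlowQuery(query); // I/O with NO lock held
        synchronized (stateLock) {
            cachedResult = result;           // publish under the lock
            return cachedResult;
        }
    }

    private String runSlowQuery(String query) throws InterruptedException {
        Thread.sleep(100); // stand-in for a slow JDBC call
        return "fresh";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new LockOutsideIoSketch().handleRequest());
    }
}
```

A slow query still makes its own request slow, but other threads only contend for the brief snapshot/publish sections, not for the whole database round trip.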

One idea that has occurred to me is to limit the number of threads
Tomcat allocates to any one webapp.  Say we could limit webapp A to 25
threads from Tomcat's pool of 75: users depending on A would still be
shut out (all requests block), but that failure would not cascade out to
affect all other webapps running in the same container.
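One way to sketch that per-webapp cap (hypothetical names, and an assumption: java.util.concurrent.Semaphore needs Java 5, so this would mean a newer JVM than our 4.1-era setup) is a counting semaphore per webapp, using tryAcquire so excess requests fail fast instead of tying up container threads:

```java
import java.util.concurrent.Semaphore;

// Sketch: cap concurrent requests for one webapp at maxConcurrent.
public class PerWebappCap {
    private final Semaphore permits;

    PerWebappCap(int maxConcurrent) {
        permits = new Semaphore(maxConcurrent);
    }

    // Returns true if the request was handled, false if rejected.
    boolean handle(Runnable request) {
        if (!permits.tryAcquire()) {
            return false; // over the cap: reject, don't block a thread
        }
        try {
            request.run();
            return true;
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        PerWebappCap full = new PerWebappCap(0); // zero permits
        System.out.println(full.handle(() -> {})); // rejected: false
        PerWebappCap open = new PerWebappCap(2);
        System.out.println(open.handle(() -> {})); // handled: true
    }
}
```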

So I'm wondering:

  * is there an easy way to implement this with Tomcat 4.1?  how about
    5.5?  (we haven't upgraded because we're pretty happy with 4.1
    ... but if there's a compelling reason to switch to 5.5, we'll do
    it)

  * are there other good techniques for limiting the damage caused
    by badly-behaved webapps?  I'm sure "holding synchronization lock
    while doing database I/O" is only one type of bad behaviour lurking
    in our code ... I'd like to reduce the effect webapps in
    the same container have on each other as much as possible.

Thanks --

        Greg

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: Limiting effects of badly-behaved webapps

Posted by Greg Ward <gw...@python.net>.
On 02 May 2006, Tim Funk said:
> An "easier solution" is to throttle the webapp via a filter. For example:

Good idea, thanks!  I'll try to implement that and report back to the
list.

        Greg



Re: Limiting effects of badly-behaved webapps

Posted by Tim Funk <fu...@joedog.org>.
An "easier solution" is to throttle the webapp via a filter. For example:

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

public class ThrottleFilter implements Filter {
   private final int maxThreadCount = 10;
   private int threadCount = 0;

   public void init(FilterConfig config) {}

   public void destroy() {}

   public void doFilter(ServletRequest request, ServletResponse response,
                        FilterChain chain)
         throws IOException, ServletException {
     synchronized (this) {
       // >= so no more than maxThreadCount requests run at once
       if (threadCount >= maxThreadCount) {
         ((HttpServletResponse) response).sendError(
             HttpServletResponse.SC_SERVICE_UNAVAILABLE);
         return;
       }
       threadCount++;
     }
     try {
       chain.doFilter(request, response);
     } finally {
       synchronized (this) {
         threadCount--;
       }
     }
   }
}

The filter can then be run on the whole webapp, or just the evil servlets.
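For reference, a web.xml mapping along these lines (the class and filter names here are hypothetical) would apply it to every request in the webapp; narrow the url-pattern to target only the problem servlets:

```xml
<filter>
    <filter-name>throttle</filter-name>
    <filter-class>com.example.ThrottleFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>throttle</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
```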

-Tim
