Posted to dev@tomcat.apache.org by Tal Dayan <ta...@zapta.com> on 2001/03/21 00:18:36 UTC

Control Channel and Thread Pool (was bug 1006)

Hi Costin,


> The timeout patch will certainly go into tomcat3.3, and I think Marc will
> add it to 3.2 also - it's a good solution and I don't think it can break
> anything.

Great, we will soon be able to drop our proprietary patches.zip library. ;-)

BTW, our patch library currently contains three patches: the socketRead()
timeout patch, the JSP class encoding patch (to address NT's file name limit
of 256 chars), and a patch that eliminates the full disclosure Tomcat makes
of its version, the OS and its version, and the JVM and its version. We
consider this a security issue; it would be nice to be able to disable this
disclosure, or even set our own string, in an out-of-the-box Tomcat.
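
For readers who have not seen the bug report, here is a minimal sketch of
what a socket read timeout does. It assumes the timeout is set with the
standard java.net.Socket.setSoTimeout() call; the class and method names
(ReadTimeoutDemo, readTimesOut) are made up for illustration and are not
taken from the actual patch. It uses current Java syntax.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical demo class; not part of Tomcat or of the patch. */
public class ReadTimeoutDemo {

    /**
     * Accepts a loopback connection from a client that never sends data
     * and reports whether the blocking read gave up after the timeout.
     */
    public static boolean readTimesOut(int timeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket silentClient = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {

            // Without this call, read() below would block indefinitely.
            accepted.setSoTimeout(timeoutMillis);

            InputStream in = accepted.getInputStream();
            try {
                in.read();      // the client never writes anything
                return false;   // data or end-of-stream arrived instead
            } catch (SocketTimeoutException e) {
                return true;    // the worker thread is free again
            }
        }
    }
}
```

In a real connector the catch branch would close the connection and hand the
worker thread back to the pool.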

> I was thinking about what we can do for a more general solution - and
> looking at the code I think there are few easy things that need to be
> done related with thread pools:
>
> - use a common thread pool for all endpoints ( right now ajp12 is creating
> a full thread pool for itself, etc).

Yes, this will be more efficient. For example, the current standalone version
allocates 10 threads by default for the Ajp connector even though only a few
may be needed for shutdown. The problem with a common pool, however, is that
if you reach the max threads, you will not be able to shut down Tomcat.

> - document the thread pool implementation, remove some old code
>
> - add an admin page to monitor the thread pools

Here is a simple idea that works well for us and will bring great benefit to
the entire Jakarta project (unless it already exists ;-))

We have in our application a control mechanism that allows sending commands
to, and getting information from, various parts of the application. It is
built in a modular way such that every module can add its command handlers
dynamically. This gives you very good access to the internals of the server
and the application. Something similar to SNMP but simpler.

Then you can use this mechanism to get any status information from any
module (e.g. ThreadPool), send commands, etc. Access to this control can be
through a Java API, a servlet that uses HTML or XML, SOAP, XMLRPC, etc., and
you can reach all of the status and control points in a uniform way.

In our case, access is done both through a Java API and through a servlet.
The servlet presents an interactive form in which you can post commands and
see the results. We also have a simple command-line client (written in Java)
that sends the commands and gets the status from a command line (great for
scripting).

We can release the source code for it, but I think it would be better if it
were redesigned and implemented in a more general way. This would be easy to
do and would bring great benefit.
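
To make the idea concrete, here is a rough sketch of how such a registry of
command handlers could look. This is not our actual code and not an existing
Tomcat API: the names ControlRegistry, CommandHandler, register and execute
are all invented for illustration, and it uses current Java syntax.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical control channel: modules register handlers by command name. */
public class ControlRegistry {

    /** A handler receives the command arguments and returns a text reply. */
    public interface CommandHandler {
        String handle(String[] arguments);
    }

    // Thread-safe, so modules can add or remove handlers while the server runs.
    private final Map<String, CommandHandler> handlers = new ConcurrentHashMap<>();

    public void register(String name, CommandHandler handler) {
        handlers.put(name, handler);
    }

    public void unregister(String name) {
        handlers.remove(name);
    }

    /** Splits "command arg1 arg2" on whitespace and dispatches to the handler. */
    public String execute(String commandLine) {
        String[] parts = commandLine.trim().split("\\s+");
        CommandHandler handler = handlers.get(parts[0]);
        if (handler == null) {
            return "ERROR unknown command: " + parts[0];
        }
        return handler.handle(Arrays.copyOfRange(parts, 1, parts.length));
    }
}
```

A thread pool module could then register something like "pool.status", and
the servlet, the Java API and the command-line client would all go through
the same execute() call.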

> - use the thread pool to run session expire and verify the expire bugs
> - add a ThreadPoolListener/Event to allow a (future) module to monitor and
> manage the thread pool
>
> - add a field to store the current "owner" or "user" of each thread.

Yes, this will be very useful. And with the above control mechanism you will
be able to get not just the status of the pool but also things like how long
each worker has been performing its current task (since it was allocated
from the pool).
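
As a rough illustration of the owner field plus the elapsed time, something
like the following per-worker record would be enough. WorkerSlot and its
methods are hypothetical names, not part of the current ThreadPool code; the
clock value is passed in by the caller to keep the sketch simple.

```java
/** Hypothetical per-worker record: who owns the thread and since when. */
public class WorkerSlot {

    private String owner;          // null while the worker is idle
    private long startedAtMillis;  // time the worker was taken from the pool

    /** Called when the worker is allocated from the pool. */
    public synchronized void assign(String newOwner, long nowMillis) {
        owner = newOwner;
        startedAtMillis = nowMillis;
    }

    /** Called when the worker goes back to the pool. */
    public synchronized void release() {
        owner = null;
    }

    /** One-line status suitable for a status page or a control command. */
    public synchronized String describe(long nowMillis) {
        if (owner == null) {
            return "idle";
        }
        return owner + " busy for " + (nowMillis - startedAtMillis) + " ms";
    }
}
```

The pool would call assign() with System.currentTimeMillis() when it hands
out a thread, and a status command would call describe() on every slot.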

BTW, this can be improved even with the current code. Currently when you
perform a thread/stack dump of the JVM, you can see all the threads that are
waiting in the thread pools but you don't know which is which. Setting the
thread names to something like 'PoolName-01' instead of 'Thread-01' will be
very useful.
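
A minimal sketch of the naming scheme, written against the
java.util.concurrent.ThreadFactory interface of current Java rather than
against Tomcat's own ThreadPool class. NamedThreadFactory is a made-up name
and the two-digit counter is only one possible format.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical factory that names threads "PoolName-01", "PoolName-02", ... */
public class NamedThreadFactory implements ThreadFactory {

    private final String poolName;
    private final AtomicInteger counter = new AtomicInteger(0);

    public NamedThreadFactory(String poolName) {
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable task) {
        // The name is what shows up in a JVM thread/stack dump.
        String name = String.format("%s-%02d", poolName, counter.incrementAndGet());
        return new Thread(task, name);
    }
}
```

With the existing code the same effect should be achievable by calling
Thread.setName() wherever the pool creates its worker threads.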

> - add some log messages for the case when the thread pool is at the
> maximum capacity

Yes, and if you can control the verbosity levels in your logs, you can also
report when threads are freed, or have the control thread of the pool report
the pool status periodically. And with the control mechanism, you will be
able to easily play with all kinds of scenarios, for example by changing
pool parameters while the server is running, issuing 'free all unused
threads' or 'abort thread 34' commands, increasing the verbosity level of
the thread pool log, etc.

> - maybe provide a spare "admin" thread that can be used to "un-hang" the
> server without restarting tomcat ( i.e. if the thread pool is at max
> capacity, and if the connector detects a localhost connection allow it to
> create an extra thread - so an admin application can kill threads ).

If you can detect these bad conditions, you could use the control thread of
the thread pool to abort hanging threads. And with the external control
mechanism, you could even have external monitoring programs issue a
'soft reset' command to Tomcat.

> - API change in ThreadPool - allow it to run normal Runnable ( the current
> ThreadPoolRunnable has some nice performance tricks, but it should be
> usable for normal tasks that don't want to take advantage of overhead-free
> thread data )
>
> None of those would resolve the DOS problem, but I think it would be nice
> to have them (and very easy to implement - without affecting the current
> functionality )

Yes, it is hard to have a DOS safe solution (e.g. how do you handle
excessive creation of sessions). The first priority IMHO should be to make
the thing work well in normal conditions without introducing new
vulnerabilities.

>
>
> Costin
>

Tal

>
>
>
>
> On Tue, 20 Mar 2001, Tal Dayan wrote:
>
> > Hi,
> >
> > Our first priority is to make Tomcat work in normal conditions with
> > well-intentioned users. We will worry later about DOS (as long as we
> > don't introduce new vulnerabilities).
> >
> > Yes, we tested the timeout patch all day yesterday with a production
> > system with real users and normal load, and all the hanging threads and
> > connections were cleaned up perfectly (we are using 'netstat' to get the
> > number of HTTP connections and 'ps' to get the number of threads, and
> > all is graphed around the clock by MRTG). We are running with a
> > relatively long timeout of 5 minutes (5*60*1000) just to be on the safe
> > side, but a shorter one can be used.
> >
> > Note that I am in no way an expert in Tomcat, nor do I claim to
> > understand all the implications of the patch, so we need some qualified
> > person to understand the implications and make sure it does not break
> > anything. For example, if another service like Ajp uses this connection
> > pool for long-term connections that need to wait long periods for data
> > from some client (e.g. a front end web server, client side applets,
> > etc), the patch will break it. The patch assumes that connections should
> > never be idle for a long period of time.
> >
> > As for Apache, it supports a request timeout (see
> > http://httpd.apache.org/docs/mod/core.html#timeout) and this will
> > eventually clean up hanging connections. The timeout in this case is
> > longer because it is for the entire request, and the cleanup will be
> > slower.
> >
> > Implementing a similar query timeout in Tomcat may require things like
> > asynchronous thread kill (yuck) or some synchronous termination of the
> > worker threads, for example by closing the sockets they are waiting on
> > (I think this will release the socketRead() but I am not sure). But
> > this is of course a more involved change than simply adding the timeout
> > statement. Having asynchronous I/O may help here (see Bug Parade
> > http://developer.java.sun.com/developer/bugParade/bugs/4075058.html).
> >
> > BTW, does Tomcat 4.X have the same problem ?
> >
> > Tal
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: cmanolache@yahoo.com [mailto:cmanolache@yahoo.com]
> > > Sent: Tuesday, March 20, 2001 7:03 AM
> > > To: tomcat-dev@jakarta.apache.org
> > > Subject: Re: Bug 1006, what's next ?
> > >
> > >
> > > Hi,
> > >
> > > I had a (long) weekend without computers. But I still found one and
> > > read the mail once - and your report is very serious and important
> > > ( and not easy to fix ). You have (at least ) my full attention. The
> > > read timeout will be checked in soon - but the general problem ( with
> > > a servlet hanging a thread ) is very hard to resolve (or I don't know
> > > any good solution ).
> > >
> > > We could stop setting an upper limit on the thread count ( we still
> > > have the OS upper limit ), and we could also use the (dangerous,
> > > deprecated) suspend/terminate on the thread that is taking too much
> > > time.
> > >
> > > Have you tried any fix ? The timeout will not resolve the "bursts"
> > > ( and high-loaded servers ) - unless it is very short.
> > >
> > > BTW, this is not a tomcat-specific problem ( I would guess Apache does
> > > have the same issue - and we need to find how they deal with that ).
> > >
> > > Costin
> > >
> > >
> > > On Tue, 20 Mar 2001, Tal Dayan wrote:
> > >
> > > >
> > > > Two days ago I filed a bug report regarding a server hanging problem
> > > > in PoolTcpEndpoint of Tomcat 3.x. The bug is at
> > > > http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1006 and also
> > > > includes a suggestion for a patch.
> > > >
> > > > Since then I have not noticed any change in the status or resolution
> > > > of the bug report, nor any indication that it got anywhere.
> > > >
> > > > 1. Is this the right place to file the bug ?
> > > >
> > > > 2. Is the bug filed correctly ?
> > > >
> > > > 3. Should I do anything else to make sure the bug gets the attention
> > > > of the relevant maintainers ?
> > > >
> > > > Thanks,
> > > >
> > > > Tal
> > > >
> > > >
> > >
> > >
> >
>
>
>