Posted to dev@tomcat.apache.org by Costin Manolache <Co...@eng.sun.com> on 2000/07/20 20:09:50 UTC

BUG (IMPORTANT): AJP12 hangs in certain conditions

Hi,

Sam, I think we should hold the release until this one is resolved.

I have a long-running Tomcat behind Apache 1.3, and after 2-3 days
it stops working under strange conditions.

The HTTP connector works fine, but AJP12 reports
"Premature end of script headers:" if mod_jserv is used,
and reports nothing (it just hangs) with mod_jk.

I am working on this one, but I have a lot of (job-related) work,
and it's a bit hard to reproduce (you have to wait 2 days).

If anyone has an idea, please let me know. It's very important
to fix this one.

Costin




Re: BUG (IMPORTANT): AJP12 hangs in certain conditions

Posted by Costin Manolache <co...@eng.sun.com>.
Hi,

After more investigation I think I know what's wrong.

I think it's very probably a bug in the VM, and it probably
happens in the connector code / thread pool. If you remember,
just before 3.1 we had similar behavior with a Linux
VM (it was much worse: the hang happened after a few hundred
requests). The old connector had no problems, and that was
one of the reasons we decided not to use the new thread pool
by default.

A short recap:
- On FreeBSD with JDK 1.1, after 3-4 days of normal
operation Tomcat will hang. Telnet to port 8007 will connect, but
nothing will happen (i.e. the problem is after accept: the Java
process is running, but something goes wrong in the connector).
If port 8080 is enabled, it will work fine.

- The machine has 2 processors and a high load (many requests).

- JServ used to work fine, without any problems.

- Since everything else in Tomcat works except the connector
(i.e. the part after accept() and before Ajp12ConnectionHandler; shutdown
doesn't work either, and it is implemented in Ajp12), and since most
of the adapter is common with JServ, I assume the problem is in
the only part that is different: Tomcat uses a thread pooling
mechanism.
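
The symptom in the recap (a connection is accepted, then nothing happens) is
what a client sees whenever the accepting thread is alive but no pool worker
ever picks up the request. The sketch below only illustrates that failure
mode; it is not Tomcat's connector or thread pool code. The class and method
names (StalledPoolDemo, probe, echoOnce, run) are made up for the example,
and it uses java.util.concurrent, which does not exist on JDK 1.1.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StalledPoolDemo {

    /** Connects, sends one byte, and reports whether it was echoed back in time. */
    static boolean probe(int port, int timeoutMillis) throws IOException {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            s.setSoTimeout(timeoutMillis);
            s.getOutputStream().write(42);
            s.getOutputStream().flush();
            try {
                return s.getInputStream().read() == 42;
            } catch (SocketTimeoutException e) {
                return false; // connected fine, but no worker handled the request
            }
        }
    }

    /** Hypothetical request handler: echoes a single byte, then closes. */
    static void echoOnce(Socket c) {
        try (Socket s = c) {
            int b = s.getInputStream().read();
            if (b >= 0) {
                s.getOutputStream().write(b);
                s.getOutputStream().flush();
            }
        } catch (IOException ignored) {
            // the client may already have given up
        }
    }

    /** Returns {answered while the pool was stuck, answered after release}. */
    static boolean[] run() throws Exception {
        CountDownLatch stuck = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(1);
        // Occupy the only worker to simulate pool threads that never return.
        pool.execute(() -> {
            try {
                stuck.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> {
                while (!server.isClosed()) {
                    try {
                        Socket c = server.accept();   // accept() keeps working
                        pool.execute(() -> echoOnce(c)); // but the task only queues up
                    } catch (IOException | RuntimeException e) {
                        return; // server socket closed or pool shut down
                    }
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();

            boolean whileStuck = probe(server.getLocalPort(), 500);
            stuck.countDown(); // free the worker
            boolean afterRelease = probe(server.getLocalPort(), 5000);
            return new boolean[] { whileStuck, afterRelease };
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = run();
        System.out.println("answered while stuck: " + r[0]);
        System.out.println("answered after release: " + r[1]);
    }
}
```

While the single worker is blocked, the probe connects but times out waiting
for a reply; once the worker is released, the same probe gets its byte back.
This reproduces the shape of the symptom only and says nothing about why the
real pool threads stop.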

We have a few choices:

1. Use the old connector. It's not very difficult to "revive" it, but
I have very limited time and a lot of work to do, so we'll need a
volunteer. Note that this will have a significant effect on
performance (the thread pool means a 20..30% improvement, and much
more on old VMs).

2. Spend more time investigating this bug and maybe find a workaround
or fix. This is very, very difficult IMHO (or I don't know any good way
to do it; it may be trivial for someone smarter). The main difficulty
is that the problem is hard to reproduce (it has been 4 days since the
last hang and it still works).

3. Go ahead with the current configuration, and maybe in a future
release (and if more VMs have this problem) add the old connector.
Probably a better VM will be available, and if Apache 2.0 is
released the native connector will also be available.
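
For choice 2, one common way to look at a hung Java process is a thread dump
taken while it is stuck, to see where the pool threads are waiting. A minimal
sketch follows, assuming a much newer JVM than the JDK 1.1 above:
Thread.getAllStackTraces() only exists since Java 5. The class and method
names (ThreadDump, dumpAllThreads) are made up for the example. On many Unix
JVMs, sending SIGQUIT to the process (kill -QUIT <pid>) prints a similar dump
without any code.

```java
import java.util.Map;

public class ThreadDump {

    /** Formats the name, state and stack of every live thread in this JVM. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append("\" state=")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Such a method would have to be triggered from a thread that is still
responsive (for example a timer or the HTTP connector), since the AJP12 path
is the part that hangs.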

It's a hard decision, but IMHO we shouldn't hold 3.2 because of this.
Sam, what do you think ?

Costin