Posted to dev@httpd.apache.org by Graham Leggett <mi...@sharp.fm> on 1999/05/19 09:00:13 UTC

Apache restart bug (1.3.7-dev)

[This is another repost, didn't come through either]

Hi all,

I am experiencing some bizarre behaviour from Apache v1.3.7-dev
(19990510071220) when an attempt is made to restart Apache using
"apachectl restart".

When a restart request is issued, Apache sends a SIGHUP to the process
group, but not all processes die. Apache sends another SIGHUP and logs
the problem. Again, some processes remain (18 logged in my test). Apache
then tries a SIGTERM (terminating another 7), and finally a SIGKILL
(terminating the final 11).
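
For anyone following along, the escalation described above can be
sketched roughly as below. This is only an illustration in Python, not
Apache's actual code: the function name reclaim, the injected send and
is_alive helpers, and the grace period are all assumptions made up for
this sketch.

```python
import signal
import time

# Escalation order as described in the post: SIGHUP, SIGHUP again,
# then SIGTERM, then SIGKILL.
ESCALATION = [signal.SIGHUP, signal.SIGHUP, signal.SIGTERM, signal.SIGKILL]


def reclaim(pids, send, is_alive, wait=time.sleep, grace=1.0):
    """Signal children with increasing force until none remain.

    send(pid, sig) and is_alive(pid) are hypothetical helpers, injected
    so the logic can be exercised without real processes.
    Returns a list of (signal, survivors_after_that_signal) tuples.
    """
    remaining = set(pids)
    history = []
    for sig in ESCALATION:
        if not remaining:
            break
        for pid in sorted(remaining):
            send(pid, sig)
        wait(grace)  # give the children a moment to exit
        remaining = {pid for pid in remaining if is_alive(pid)}
        history.append((sig, len(remaining)))
    return history
```

With real processes, send could be os.kill and is_alive could probe
with os.kill(pid, 0), but that wiring is left out here on purpose.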

The extended server-status output shows that a number of processes in
the list were affected. Just before the restart was issued they were in
various states: some were stuck in a "...reading..." state with code R,
some were in W, and one or two were even idle, though that could be due
to the delay between the server-status output and the server restarting.
Most of them seemed stuck, showing a long time (>60 seconds) since the
start of their last request.
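
As a rough illustration of the check described above, the sketch below
flags entries that sit in R or W for more than 60 seconds. The Slot
record and its field names are hypothetical; they stand in for whatever
has been parsed out of the extended server-status page, and treating
only R and W as candidates is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    pid: int
    status: str               # "R" reading, "W" writing a reply, "_" idle
    seconds_since_start: int  # time since last request start (hypothetical field)


def stuck_slots(slots, threshold=60):
    """Return slots that are busy (R or W) for longer than threshold seconds."""
    return [
        s for s in slots
        if s.status in ("R", "W") and s.seconds_since_start > threshold
    ]


if __name__ == "__main__":
    # Made-up sample data, not taken from the real server.
    sample = [Slot(101, "R", 240), Slot(102, "W", 5), Slot(103, "_", 900)]
    for slot in stuck_slots(sample):
        print(slot.pid, slot.status, slot.seconds_since_start)
```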

Over time, without the restarts caused by our hourly logfile rotation,
these stuck processes cause Apache to spawn more processes to replace
them, eventually reaching the MaxClients limit.

Interestingly enough, this behaviour does *not* seem to happen with
Apache v1.3.6. At first I suspected that Dave Carrigan's auth_ldap module
was a contributing factor, but disabling this module has had no effect
on the problem.

I've looked through the src/CHANGES file for possible explanations,
but nothing there stands out as an obvious cause. Any ideas?

(Apache v1.3.7-dev running under Solaris v2.6)

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight...

Re: Apache restart bug (1.3.7-dev)

Posted by Graham Leggett <mi...@sharp.fm>.
Dean Gaudet wrote:

> truss on the stuck processes would be helpful...
> 
> can you reproduce it at will?

I can, yes.

> If so, truss -f the entire thing, including the parent and all the forks
> and such... reproduce the problem... then stick the output on some website
> and direct us at it.

I captured truss -f output and a backtrace (using Solaris pstack) just
before the stop request was sent to Apache. They can be found at
http://www.ericsson.se/apache-debug.tar.gz.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight...



Re: Apache restart bug (1.3.7-dev)

Posted by Dean Gaudet <dg...@arctic.org>.
truss on the stuck processes would be helpful... 

can you reproduce it at will? 

If so, truss -f the entire thing, including the parent and all the forks
and such... reproduce the problem... then stick the output on some website
and direct us at it.

Dean
