Posted to dev@httpd.apache.org by Nick <t0...@nickhill.co.uk> on 2005/03/31 00:02:05 UTC
Flushing lingering CGI processes
Hello
Firstly, thank you to everyone who has contributed to the great apache
project.
I understand that if a CGI process is launched and then gets stuck while
keeping quiet, the Apache server will not 'know' the HTTP connection has
broken, so will not 'know' to kill the CGI process. The stuck CGI process
and the accompanying Apache process will then needlessly use system
resources indefinitely.
In a worst-case scenario, the stuck CGI process will lock some server
resource, possibly causing many other CGI processes to get stuck and
bringing the server down. (I attribute a recent crash to this.)
A great solution for administrators would be to set a timer. If the CGI
program hasn't given any output for the given time, the CGI process is
killed.
A downside would be that the system would encourage sloppy programming by
being very forgiving. On the other hand, it would mean Apache would be
more competitive.
Setting a timer, and implementing the state a timer needs, would
conflict with Apache being stateless. However, if this is only a way of
swatting wayward processes and is optional, such statefulness shouldn't
conflict with clustering and high-availability environments.
If statefulness is a bureaucratic hurdle, the statefulness may be
implemented as a CGI wrapper.
If counting CGI output bytes is an unacceptable overhead, perhaps a
plain CGI time-out would do as a sub-optimal solution.
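To make the wrapper idea concrete, here is a minimal sketch in Python of a
wrapper that relays a child's output and kills it after a period of silence.
This is only an illustration under my own assumptions, not anything Apache
ships: the function name run_with_idle_timeout and its parameters are
hypothetical, it relies on POSIX select() on pipes, and it watches stdout only.

```python
import os
import select
import subprocess
import sys


def run_with_idle_timeout(argv, idle_timeout, out=None):
    """Run argv, relay its stdout, and kill it after idle_timeout seconds
    without output. Returns (exit_code, timed_out).

    Hypothetical helper for illustration; POSIX only.
    """
    out = out if out is not None else sys.stdout.buffer
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    timed_out = False
    while True:
        # Wait until the child writes something or the idle timer expires.
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            timed_out = True
            proc.kill()            # SIGKILL on POSIX
            break
        chunk = os.read(fd, 65536)
        if not chunk:              # EOF: the child closed stdout or exited
            break
        out.write(chunk)           # every chunk relayed restarts the timer
        out.flush()
    proc.stdout.close()
    return proc.wait(), timed_out
```

Because the timer restarts on every chunk of output, a slow but talkative
script survives while a silent one is killed. A plain overall time-out would
instead compare against the start time rather than the last output.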
Comments?
Re: Flushing lingering CGI processes
Posted by Rici Lake <ri...@ricilake.net>.
On 31-Mar-05, at 6:16 AM, Nick wrote:
> As a refinement;
> The timer firstly sends a signal to the CGI program which may be
> caught by the CGI program to generate some form of output. This output
> will prove whether the http connection is still open. If the CGI
> program doesn't output anything, a TERM is sent. If still no output
> and process is still running, a KILL is sent.
This seems overly complex to me, and furthermore a bit difficult to
actually implement in a CGI. How is a CGI to know whether it is "hung"?
If it could figure that out, it could kill itself easily enough, but I
believe it to be a computationally intractable problem related to the
halting problem; certainly it is not something I'd like to try in a
bash script. :)
It seems to me that a simple timer would be sufficient; however, it
would be nice to be able to configure individual CGIs (or directories
of CGIs) with longer or shorter timers; I have seen instances of
webapps with CGIs which compute for several minutes, or even longer.
(Presumably, the users of such webapps are very patient or really need
the results.)
Rici
Re: Flushing lingering CGI processes
Posted by Nick <t0...@nickhill.co.uk>.
Nick Hill wrote:
> A great solution for administrators would be to set a timer. If the CGI
> program hasn't given any output for the given time, the CGI process is
> killed.
As a refinement:
The timer first sends a signal to the CGI program, which the CGI program
may catch in order to generate some form of output. This output will
prove whether the HTTP connection is still open. If the CGI program
doesn't output anything, a TERM is sent. If there is still no output and
the process is still running, a KILL is sent.
A well-written CGI will catch the first signal and, if doing so doesn't
break the system, output something (e.g. whitespace if it is outputting
HTML). If the CGI program can't output anything without corrupting the
output, a CGI timeout error should be written to a log before it exits
cleanly.
A less well-written or hung CGI program will be closed by TERM or KILL.
If the CGI program outputs something in response to the first signal,
the timer is reset. Apache will then 'know' whether the HTTP connection
is still open. If the connection has closed, Apache will go ahead and
kill the CGI process normally. If it is still open, Apache continues to
wait for the CGI process. The timer will eventually send the signal
again. The sysadmin should be able to set an upper limit on the number
of times the first signal is sent before going to TERM and KILL,
providing an optional hard limit on how long a CGI can live.
The first signal can be characterised as a way for Apache to request that
the CGI program say something, to prove there is still a connection
between the CGI program and the client. There is no requirement for a CGI
programmer to catch the signal; however, he may choose to if he expects
the CGI program to run for long periods waiting for some external event.
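The escalation described above (probe, then TERM, then KILL) can be sketched
as a small supervisor loop. This is a rough Python illustration under stated
assumptions, not how Apache behaves: the name supervise, its parameters, and
the choice of SIGUSR1 as the probe signal are all hypothetical, and it is
POSIX only.

```python
import os
import select
import signal
import subprocess
import sys


def supervise(argv, idle_timeout, grace, max_probes=3,
              probe_signal=signal.SIGUSR1, out=None):
    """Relay a child's stdout; on silence escalate probe -> TERM -> KILL.

    Returns (exit_code, events), where events lists the signals sent.
    Hypothetical helper for illustration; SIGUSR1 is an assumed choice.
    """
    out = out if out is not None else sys.stdout.buffer
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    events = []
    stage = "idle"                 # "idle" -> "probed" -> "termed"
    probes_sent = 0
    while True:
        wait = idle_timeout if stage == "idle" else grace
        ready, _, _ = select.select([fd], [], [], wait)
        if ready:
            chunk = os.read(fd, 65536)
            if not chunk:          # EOF: the child closed stdout or exited
                break
            out.write(chunk)
            out.flush()
            stage = "idle"         # any output resets the timer
            continue
        if stage == "idle" and probes_sent < max_probes:
            proc.send_signal(probe_signal)   # ask the child to say something
            probes_sent += 1
            stage = "probed"
            events.append("probe")
        elif stage != "termed":
            proc.terminate()                 # SIGTERM: polite request to exit
            stage = "termed"
            events.append("term")
        else:
            proc.kill()                      # SIGKILL: cannot be caught
            events.append("kill")
            break
    proc.stdout.close()
    return proc.wait(), events
```

One caveat with this choice of probe: the default action of SIGUSR1 is to
terminate the process, so a script that does not catch it is ended by the
first probe. If an uncaught probe is meant to be harmless, a signal whose
default action is to be ignored would have to be passed as probe_signal.
The max_probes cap plays the role of the optional hard limit on how long a
CGI can live.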
Re: Flushing lingering CGI processes
Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 04:02 PM 3/30/2005, Nick wrote:
>I understand that if a CGI process is launched then gets stuck, if the CGI process keeps quiet, the apache server will not 'know' the http connection has broken so will not 'know' to kill the CGI process. The stuck CGI process and the accompanying apache process will then needlessly use system resources indefinitely.
Nick, don't feel bad about approaching this list, your timing
is actually amusing :)
We just backported a change which will clobber an httpd process
more thoroughly, and I have a request routed directly to me to
look at this on Win32.
One issue is the cleanup of CGI processes, as many have to do
some sort of cleanup (and be killed if they run away).
I agree the control you describe would be ideal. And it solves
my question about our axing stale httpd processes, after ensuring
cgi processes have an opportunity to complete.
Bill
Re: Flushing lingering CGI processes
Posted by Stas Bekman <st...@stason.org>.
Nick wrote:
> Hello
>
> Firstly, thank you to everyone who has contributed to the great apache
> project.
>
> I understand that if a CGI process is launched then gets stuck, if the
> CGI process keeps quiet, the apache server will not 'know' the http
> connection has broken so will not 'know' to kill the CGI process. The
> stuck CGI process and the accompanying apache process will then
> needlessly use system resources indefinitely.
>
> In a worst case scenario, the stuck CGI process will lock some server
> resource possibly causing many other CGI processes to get stuck,
> bringing the server down. (I attribute a recent crash to this).
>
> A great solution for administrators would be to set a timer. If the CGI
> program hasn't given any output for the given time, the CGI process is
> killed.
Nick, take a look at the perl module Apache::Watchdog::RunAway
http://search.cpan.org/dist/Apache-Watchdog-RunAway/
http://search.cpan.org/dist/Apache-Watchdog-RunAway/RunAway.pm
which does exactly what you want and comes with a little daemon.
To operate, it needs to be able to fetch the scoreboard, which can be done
either by Apache::Scoreboard if you run mod_perl, or by a stand-alone C
module, mod_scoreboard_send.c. Both live here:
http://search.cpan.org/dist/Apache-Scoreboard/
--
__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com
Re: Flushing lingering CGI processes
Posted by Nick Hill <ni...@nickhill.co.uk>.
I have a script which takes a snapshot of processes running on my
production server each minute.
I theorise that several posts arriving around the same time possibly
locked a database, causing processes to pile up and bringing the whole
machine down after 20 minutes, as heavy, locked server and database
processes running as children of Apache swamped the memory.
There appear to have been 150 instances of apache plus 86 instances of a
CGI script calling mysqld.
Obviously I would need to gather as much useful information as possible
about why this happened, to be sure my interpretation is accurate. I have
both access logs and outputs of top for the whole event.
If anyone would find the output of top and the applicable log for the
period when the machine went AWOL useful, I will make them available.
I would be pleased to hear any suggestions which will help me identify
with certainty what was really going on.
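For anyone wanting to reproduce this kind of per-minute tally, a snapshot can
be reduced to a count per command name. The following Python sketch is only an
illustration: the function names are hypothetical, it assumes a POSIX ps that
accepts "-A -o comm=", and it is not the script mentioned above.

```python
import subprocess
from collections import Counter


def count_by_command(ps_output):
    """Count processes per command name, one command name per input line."""
    names = (line.strip() for line in ps_output.splitlines())
    return Counter(name for name in names if name)


def snapshot():
    """Take one snapshot of the process table (assumes POSIX ps)."""
    result = subprocess.run(["ps", "-A", "-o", "comm="],
                            capture_output=True, text=True, check=True)
    return count_by_command(result.stdout)
```

Calling snapshot() once a minute (for example from cron) and logging the
largest counts would show when the server and CGI processes start to pile up.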
Joe Orton wrote:
> That does happen already: if any CGI script does not output anything for
> the period specified by the Timeout directive, it will be killed. I
> think it would be useful to make this separately configurable from the
> network I/O timeout.
>
> joe
Re: Flushing lingering CGI processes
Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 30, 2005 at 11:02:05PM +0100, Nick wrote:
> I understand that if a CGI process is launched then gets stuck, if the
> CGI process keeps quiet, the apache server will not 'know' the http
> connection has broken so will not 'know' to kill the CGI process. The
> stuck CGI process and the accompanying apache process will then
> needlessly use system resources indefinitely.
>
> In a worst case scenario, the stuck CGI process will lock some server
> resource possibly causing many other CGI processes to get stuck,
> bringing the server down. (I attribute a recent crash to this).
>
> A great solution for administrators would be to set a timer. If the CGI
> program hasn't given any output for the given time, the CGI process is
> killed.
That does happen already: if any CGI script does not output anything for
the period specified by the Timeout directive, it will be killed. I
think it would be useful to make this separately configurable from the
network I/O timeout.
joe