Posted to dev@httpd.apache.org by Nick <t0...@nickhill.co.uk> on 2005/03/31 00:02:05 UTC

Flushing lingering CGI processes

Hello

Firstly, thank you to everyone who has contributed to the great Apache
project.

I understand that if a CGI process is launched and then gets stuck, and
the CGI process keeps quiet, the Apache server will not 'know' that the
HTTP connection has broken, so it will not 'know' to kill the CGI
process. The stuck CGI process and the accompanying Apache process will
then needlessly use system resources indefinitely.

In a worst-case scenario, the stuck CGI process will lock some server
resource, possibly causing many other CGI processes to get stuck and
bringing the server down. (I attribute a recent crash to this.)

A great solution for administrators would be to set a timer. If the CGI 
program hasn't given any output for the given time, the CGI process is 
killed.
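
As a rough sketch of this idea (not how Apache itself does anything), the
Python below runs a child program, collects its stdout, and kills the
child if it stays silent for longer than a given period. The name
run_with_idle_timeout and the 4096-byte read size are assumptions made
for illustration, and select() on a pipe only works on POSIX systems.

```python
import os
import select
import subprocess
import sys


def run_with_idle_timeout(argv, idle_timeout):
    """Run argv and collect its stdout; kill it if it is silent too long.

    Returns (output_bytes, timed_out). Hypothetical helper, POSIX only.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    chunks = []
    timed_out = False
    while True:
        # Each call waits up to idle_timeout seconds, so any output
        # from the child effectively resets the timer.
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            timed_out = True
            proc.kill()
            break
        data = os.read(fd, 4096)
        if not data:  # EOF: the child closed its stdout
            break
        chunks.append(data)
    proc.stdout.close()
    proc.wait()  # reap the child so it does not linger as a zombie
    return b"".join(chunks), timed_out


if __name__ == "__main__":
    # Demo: the child prints one line and then hangs, so it gets killed.
    child = "import time; print('hello', flush=True); time.sleep(60)"
    out, timed_out = run_with_idle_timeout([sys.executable, "-c", child], 2.0)
    print(out, timed_out)
```

Because the timer restarts on every read, a slow but talkative program is
left alone; only one that goes quiet for the whole period is killed.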

A downside would be that the system would encourage sloppy programming
by being very forgiving. On the other hand, it would make Apache more
competitive.

Setting a timer and implementing the statefulness of a timer would
conflict with Apache being stateless. However, if this is only a way of
swatting wayward processes and is optional, such statefulness shouldn't
conflict with clustering and high-availability environments.

If statefulness is a bureaucratic hurdle, the statefulness could instead
be implemented in a CGI wrapper.

If counting CGI output bytes is an unacceptable overhead, perhaps a
plain CGI time-out would serve as a sub-optimal solution.


Comments?

Re: Flushing lingering CGI processes

Posted by Rici Lake <ri...@ricilake.net>.
On 31-Mar-05, at 6:16 AM, Nick wrote:

> As a refinement:
> The timer first sends a signal to the CGI program which may be 
> caught by the CGI program to generate some form of output. This output 
> will prove whether the http connection is still open. If the CGI 
> program doesn't output anything, a TERM is sent. If still no output 
> and process is still running, a KILL is sent.

This seems overly complex to me, and furthermore a bit difficult to 
actually implement in a CGI. How is a CGI to know whether it is "hung"? 
If it could figure that out, it could kill itself easily enough, but I 
believe it to be a computationally intractable problem related to the 
halting problem; certainly it is not something I'd like to try in a 
bash script. :)

It seems to me that a simple timer would be sufficient; however, it 
would be nice to be able to configure individual CGIs (or directories 
of CGIs) with longer or shorter timers; I have seen instances of 
webapps with CGIs which compute for several minutes, or even longer. 
(Presumably, the users of such webapps are very patient or really need 
the results.)
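
One way to picture per-CGI and per-directory timers is a lookup that
walks from the script path up through its parent directories and takes
the most specific setting it finds. The sketch below is only an
illustration: the paths, the timeout values, the 300-second default and
the name timeout_for are all made up, and nothing here corresponds to an
existing Apache directive.

```python
import posixpath

DEFAULT_TIMEOUT = 300  # hypothetical default, in seconds

# Hypothetical overrides, keyed by script or directory path.
TIMEOUT_OVERRIDES = {
    "/cgi-bin/reports": 1800,            # long-running report generators
    "/cgi-bin/reports/quick.cgi": 30,    # one fast script in that directory
    "/cgi-bin/search": 20,
}


def timeout_for(script_path, overrides=TIMEOUT_OVERRIDES, default=DEFAULT_TIMEOUT):
    """Return the timeout of the most specific matching path."""
    path = posixpath.normpath(script_path)
    while True:
        if path in overrides:
            return overrides[path]
        parent = posixpath.dirname(path)
        if parent == path:  # reached "/" (or "") without a match
            return default
        path = parent


if __name__ == "__main__":
    for p in ("/cgi-bin/reports/monthly.cgi",
              "/cgi-bin/reports/quick.cgi",
              "/cgi-bin/other.cgi"):
        print(p, timeout_for(p))
```

Matching on whole path components, rather than on a string prefix, keeps
a setting for /cgi-bin/search from leaking onto /cgi-bin/searchx.cgi.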

Rici


Re: Flushing lingering CGI processes

Posted by Nick <t0...@nickhill.co.uk>.
Nick Hill wrote:

> A great solution for administrators would be to set a timer. If the CGI 
> program hasn't given any output for the given time, the CGI process is 
> killed.

As a refinement:
The timer first sends a signal to the CGI program, which the CGI program
may catch in order to generate some form of output. Attempting to send
this output to the client will reveal whether the HTTP connection is
still open. If the CGI program doesn't output anything, a TERM is sent.
If there is still no output and the process is still running, a KILL is
sent.

A well-written CGI will catch the first signal and then, if doing so
doesn't break anything, output something (e.g. whitespace if outputting
HTML). If the CGI program can't output anything without corrupting its
output, it should write a CGI timeout error to a log before exiting
cleanly.

A less well-written or hung CGI program will be closed by TERM or KILL.

If the CGI program outputs something in response to the first signal,
the timer is reset. Apache will then 'know' whether the HTTP connection
is still open. If the connection has closed, Apache will go ahead and
kill the CGI process normally. If it is open, Apache continues to wait
for the CGI process, and the timer will eventually send the signal
again. The sysadmin should be able to set an upper limit on the number
of times the first signal is sent before going to TERM and KILL,
providing an optional hard limit on how long a CGI can live.

The first signal can be characterised as a way for Apache to request
that the CGI program say something to prove there is still a connection
between the CGI program and the client. There is no requirement for a
CGI programmer to catch the signal; however, he may choose to if he
expects the CGI program to run for long periods waiting for some
external event.
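
The escalation described in this message could be sketched roughly as
follows. This is an illustration under several assumptions, not a
proposal for Apache's code: the function name supervise, the use of
SIGUSR1 as the "first signal", the grace period, and the outcome strings
are all invented here. Note that the default action for SIGUSR1 is to
terminate a process that does not catch it, so a real design would need
to choose the first signal with care. POSIX only.

```python
import os
import select
import signal
import subprocess
import sys


def _wait_for_output(fd, seconds):
    """Return bytes read from fd, b"" on EOF, or None if nothing arrived."""
    ready, _, _ = select.select([fd], [], [], seconds)
    return os.read(fd, 4096) if ready else None


def supervise(argv, idle_timeout, grace, max_nudges, nudge_signal=signal.SIGUSR1):
    """Run argv; nudge it when silent, then escalate to TERM and KILL.

    Returns (output_bytes, outcome), where outcome is "exited",
    "terminated" or "killed". Hypothetical helper for illustration.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    chunks, nudges, outcome = [], 0, "exited"
    while True:
        data = _wait_for_output(fd, idle_timeout)
        if data is None and nudges < max_nudges:
            # Silent for too long: ask the program to say something.
            proc.send_signal(nudge_signal)
            nudges += 1
            data = _wait_for_output(fd, grace)
        if data is None:
            # No answer, or the nudge budget is spent: TERM, then KILL.
            proc.terminate()
            try:
                proc.wait(timeout=grace)
                outcome = "terminated"
            except subprocess.TimeoutExpired:
                proc.kill()
                outcome = "killed"
            break
        if data == b"":  # EOF: the program closed its stdout
            break
        chunks.append(data)  # any output restarts the idle timer
    proc.stdout.close()
    proc.wait()
    return b"".join(chunks), outcome


if __name__ == "__main__":
    # Demo: a silent child with no nudges allowed goes straight to TERM.
    hung = "import time; time.sleep(60)"
    print(supervise([sys.executable, "-c", hung], 1.0, 1.0, max_nudges=0))
```

The nudge count is never reset, so max_nudges acts as the optional hard
limit on how long a program can stay alive by answering the first signal.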


Re: Flushing lingering CGI processes

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 04:02 PM 3/30/2005, Nick wrote:

>I understand that if a CGI process is launched then gets stuck, if the CGI process keeps quiet, the apache server will not 'know' the http connection has broken so will not 'know' to kill the CGI process. The stuck CGI process and the accompanying apache process will then needlessly use system resources indefinitely.

Nick, don't feel bad about approaching this list, your timing
is actually amusing :)

We just backported a change which will clobber an httpd process
more thoroughly, and I have a request routed directly to me to
look at this on Win32.

One issue is the cleanup of CGI processes, as many have to do
some sort of cleanup (and be killed if they run away).

I agree the control you describe would be ideal.  And it solves
my question about our axing stale httpd processes, after ensuring
cgi processes have an opportunity to complete.

Bill 


Re: Flushing lingering CGI processes

Posted by Stas Bekman <st...@stason.org>.
Nick wrote:
> Hello
> 
> Firstly, thank you to everyone who has contributed to the great apache 
> project.
> 
> I understand that if a CGI process is launched then gets stuck, if the 
> CGI process keeps quiet, the apache server will not 'know' the http 
> connection has broken so will not 'know' to kill the CGI process. The 
> stuck CGI process and the accompanying apache process will then 
> needlessly use system resources indefinitely.
> 
> In a worst case scenario, the stuck CGI process will lock some server 
> resource possibly causing many other CGI processes to get stuck, 
> bringing the server down. (I attribute a recent crash to this).
> 
> A great solution for administrators would be to set a timer. If the CGI 
> program hasn't given any output for the given time, the CGI process is 
> killed.

Nick, take a look at the perl module Apache::Watchdog::RunAway
http://search.cpan.org/dist/Apache-Watchdog-RunAway/
http://search.cpan.org/dist/Apache-Watchdog-RunAway/RunAway.pm
which does exactly what you want and comes with a little daemon.

To operate it needs to be able to fetch the scoreboard, which can be done 
by either Apache::Scoreboard if you run mod_perl, or by a stand-alone C 
module mod_scoreboard_send.c. Both live here:
http://search.cpan.org/dist/Apache-Scoreboard/

-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

Re: Flushing lingering CGI processes

Posted by Nick Hill <ni...@nickhill.co.uk>.
I have a script which takes a snapshot of processes running on my 
production server each minute.

I theorise that several posts arriving around the same time possibly
locked a database, causing processes to pile up. This brought the whole
machine down after 20 minutes, as heavy, locked server and database
processes running as children of Apache swamped the memory.

There appear to be 150 instances of Apache plus 86 instances of a CGI
script calling mysqld.

Obviously I would need to gather as much useful information as possible
about why this happened, and to be sure my interpretation is accurate. I
have both access logs and the output of top for the whole event.

If anyone would find the output of top and the applicable log useful,
showing the period when the machine went AWOL, I will make them
available.

I would be pleased to hear any suggestions which will help me identify
with certainty what was really going on.

Joe Orton wrote:

> That does happen already: if any CGI script does not output anything for
> the period specified by the Timeout directive, it will be killed.  I
> think it would be useful to make this separately configurable from the
> network I/O timeout.
> 
> joe
> 
> 

Re: Flushing lingering CGI processes

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Mar 30, 2005 at 11:02:05PM +0100, Nick wrote:
> I understand that if a CGI process is launched then gets stuck, if the 
> CGI process keeps quiet, the apache server will not 'know' the http 
> connection has broken so will not 'know' to kill the CGI process. The 
> stuck CGI process and the accompanying apache process will then 
> needlessly use system resources indefinitely.
> 
> In a worst case scenario, the stuck CGI process will lock some server 
> resource possibly causing many other CGI processes to get stuck, 
> bringing the server down. (I attribute a recent crash to this).
> 
> A great solution for administrators would be to set a timer. If the CGI 
> program hasn't given any output for the given time, the CGI process is 
> killed.

That does happen already: if any CGI script does not output anything for
the period specified by the Timeout directive, it will be killed.  I
think it would be useful to make this separately configurable from the
network I/O timeout.

joe