Posted to dev@httpd.apache.org by Aaron Bannert <aa...@clove.org> on 2002/05/10 21:27:11 UTC

Deadlocks possible with CGI scripts?

The discussion about filters got me thinking about places where we
could use multiplexed I/O in Apache 2.0's filter stack, which led me to
mod_cgi/cgid. In my research I think I may have found a long-standing
possible deadlock (and denial of service) in how we pass input bodies
to CGI scripts. I was able to reproduce both failure cases on Apache
1.3.25-dev and I suspect the same problem still exists today:

1) If the script doesn't consume the input data, and the data exceeds
   the buffer sizes of the operating system, then the httpd process
   and the CGI child will deadlock. (Note: The CGI spec specifically
   states that scripts are not obliged to read the data. See sect. 6.2
   here: http://cgi-spec.golux.com/draft-coar-cgi-v11-03-clean.html#6.0)

2) If the script writes enough data to fill its output buffers before
   it has consumed enough of the input that the rest of the input body
   fits in the operating system's buffers, then a deadlock will occur
   in the same way as above.
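
To make case 2 concrete, here is a rough reproduction in Python rather
than httpd code. It is only a sketch under assumptions: a POSIX system,
pipes between parent and child, and a made-up "script" that writes its
whole response before reading any input. The names CHILD and
half_duplex_stalls are hypothetical. The parent plays the half-duplex
server: it writes the whole body first and only then reads.

```python
import subprocess
import sys
import threading

# Hypothetical "CGI script": writes 4 MiB of output before reading any input.
CHILD = "import sys; sys.stdout.buffer.write(b'x' * (1 << 22)); sys.stdin.buffer.read()"


def half_duplex_stalls(body_size=1 << 22, timeout=2.0):
    """Return True if a write-everything-then-read parent is still stuck after `timeout`."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def serve():
        try:
            proc.stdin.write(b"a" * body_size)  # blocks once the pipe is full
            proc.stdin.close()
            proc.stdout.read()
        except BrokenPipeError:
            pass  # the child was killed to break the stall

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()
    worker.join(timeout)
    stalled = worker.is_alive()

    # Break the stall so the sketch itself can finish.
    proc.kill()
    worker.join()
    for stream in (proc.stdin, proc.stdout):
        try:
            stream.close()
        except BrokenPipeError:
            pass
    proc.wait()
    return stalled
```

Both sides end up blocked in write(): the child because nobody drains
its output, the parent because nobody drains the request body.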

The simple workaround is to add a LimitRequestBody directive to your
httpd.conf that is smaller than the I/O buffers on your system.
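
For a feel of what "the I/O buffers on your system" means for a pipe,
the following Python sketch fills a fresh pipe until a non-blocking
write would block. It is not httpd code; pipe_capacity is a made-up
helper, and it assumes a POSIX system and an ordinary pipe between the
server and the script. Other transports may be sized differently.

```python
import os


def pipe_capacity(step=4096):
    """Return how many bytes a fresh pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        os.set_blocking(write_fd, False)
        block = b"\0" * step
        total = 0
        while True:
            try:
                total += os.write(write_fd, block)
            except BlockingIOError:
                # The kernel buffer is full; a blocking writer would hang here.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe accepts", pipe_capacity(), "bytes before blocking")
```

On many Linux systems this reports 65536, but treat the number as
system-specific and measure it rather than assume it.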

I'm not as familiar with the Apache 1.3 code as I'd like, but I suspect
that we should be pulling the descriptors out of the buff_struct available
to mod_cgi, and using those in a poll set to perform multiplexed I/O. A
solution in 2.0, however, is not as straightforward.
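
The poll-set idea can be sketched outside of httpd. The Python below is
only an illustration under assumptions (POSIX, pipes to the child, and
a hypothetical helper name run_child_multiplexed); it is not the 1.3
buff_struct code or any Apache API. It puts both descriptors in one
selector, so the parent writes the request body only when the child's
stdin is writable and drains the child's stdout whenever it is
readable.

```python
import os
import selectors
import subprocess
import sys


def run_child_multiplexed(argv, body, chunk=65536):
    """Feed `body` to the child's stdin while draining its stdout."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    os.set_blocking(proc.stdin.fileno(), False)
    os.set_blocking(proc.stdout.fileno(), False)

    sel = selectors.DefaultSelector()
    sel.register(proc.stdin, selectors.EVENT_WRITE)
    sel.register(proc.stdout, selectors.EVENT_READ)

    view = memoryview(body)
    sent = 0
    output = bytearray()

    while sel.get_map():  # loop until both pipes are done
        for key, _events in sel.select():
            if key.fileobj is proc.stdin:
                try:
                    sent += os.write(key.fd, view[sent:sent + chunk])
                except BlockingIOError:
                    continue  # not actually writable yet; try again later
                except BrokenPipeError:
                    sent = len(view)  # child stopped reading; drop the rest
                if sent >= len(view):
                    sel.unregister(proc.stdin)
                    proc.stdin.close()
            else:
                data = os.read(key.fd, chunk)
                if data:
                    output += data
                else:  # EOF on the child's stdout
                    sel.unregister(proc.stdout)
                    proc.stdout.close()

    sel.close()
    proc.wait()
    return bytes(output), proc.returncode


if __name__ == "__main__":
    # A child that copies its input to its output in small pieces.
    echo = "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)"
    out, rc = run_child_multiplexed([sys.executable, "-c", echo], b"a" * (1 << 20))
    print(len(out), rc)
```

Because neither direction is ever waited on exclusively, a child that
ignores its input, or one that answers before reading, no longer wedges
the parent.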

In the long term I'd like to start profiling cases where multiplexed
I/O is either necessary or beneficial. Then I think we can start talking
about how to design this in, either at the filter-stack level, the
ap_get_brigade()-level, or somewhere else.

-aaron

Re: Deadlocks possible with CGI scripts?

Posted by Bill Stoddard <bi...@wstoddard.com>.
Yep, it's possible... I've brought this up a few times. If the CGI script starts responding
before it has consumed the entire POST request body, it can easily deadlock. The same
problem exists in 2.0. If you recall, I issued a challenge a few months back (Feb 12):

<begin>
"I've wanted to look into this for quite a while but never seem to have the time to do it...

Wouldn't it be cool if the interface between the server and mod_cgi (or any other content
generator) could be configured to be full duplex (rather than stuck at half duplex as it
is today)?

With any version of Apache, it is not possible to support an HTTP transaction that
consists of a continuous stream of chunk-encoded bytes from the client and a concurrent
stream of chunk-encoded bytes back to the client, because the interface between the server
and the application (a CGI script, for example) is half-duplex. The server will not attempt
to read any bytes from the application until it has received ALL of the request. The CGI
script must read and buffer ALL of the request before sending any of the response, or else
the application risks deadlocking with the server.

This is an interesting exercise in that we will need to manipulate the blocking behaviour
of the network layer from the top of the filter stack."
</begin>

I'm still interested and I still don't have much time to devote to it.

Bill

