Posted to users@httpd.apache.org by Steven Simpson <ss...@comp.lancs.ac.uk> on 2010/11/15 10:43:33 UTC

[users@httpd] Buffering of huge POSTs to CGI and alternatives

Hi,

I intend to have a CGI program extract a form field and deliver this
data to an external system, but the field in question is likely to be
huge.  The server can't invoke the program until it knows the length of
the request body, in order to set CONTENT_LENGTH in the program's
environment.  If the POST doesn't include a Content-Length field, the
server will have to buffer the entire contents somewhere.  Can it deal
with huge message bodies, such as those exceeding virtual RAM, by saving
to disc (for example)?

How does FastCGI/fcgid compare?  Will it handle huge POSTs any better? 
My cursory reading of the FastCGI spec suggests that it doesn't have to
know the content length to deliver it, because it is sent in chunks.

Thanks,

Steven


---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] Buffering of huge POSTs to CGI and alternatives

Posted by Steven Simpson <ss...@comp.lancs.ac.uk>.
On 15/11/10 19:00, Jeff Trawick wrote:
> On Mon, Nov 15, 2010 at 4:43 AM, Steven Simpson <ss...@comp.lancs.ac.uk> wrote:
>> Can [CGI/Apache] deal
>> with huge message bodies, such as those exceeding virtual RAM, by saving
>> to disc (for example)?
> mod_cgi/mod_cgid: request will fail with 411 if client doesn't send
> content-length;

So presumably all clients, and certainly the major browsers, already
send Content-Length, or there'd be failures all the time.

I've just realised I've muddled up responses (whose lengths can be
implied by connection close) and requests (whose lengths can't).  (I
knew this, just forgot!)  Chunking is still a possibility, but since it
is an HTTP/1.1 feature, and a client on its own can't know that the
server speaks 1.1, in practice it will almost always send Content-Length
rather than risk chunking.
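
To illustrate why a chunked request carries no length up front, here is
a minimal Python sketch of decoding an HTTP/1.1 chunked body.  The name
copy_chunked_body is made up for this example, and this is not how
Apache does it; it only shows that the total is known once the
zero-size chunk has arrived.

```python
import io


def copy_chunked_body(stream, sink):
    """Decode an HTTP/1.1 chunked body from stream into sink; return its length."""
    total = 0
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the last chunk")
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # last chunk: only now is the total length known
        data = stream.read(size)
        sink.write(data)
        total += len(data)
        stream.readline()  # consume the CRLF that follows the chunk data
    # Skip any trailer fields up to the terminating blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return total


if __name__ == "__main__":
    out = io.BytesIO()
    wire = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
    print(copy_chunked_body(wire, out), out.getvalue())
```

A server that wants to set CONTENT_LENGTH for such a request has to run
something like this to the end before it can start the program.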

What's confused me is that we did a quick test a few days ago with
Safari and Firefox on a dummy CGI script.  Using one of the Firefox
extensions, we could see that it indeed sent Content-Length.  In
contrast, Safari's developer tools claimed that no length or chunking
was sent.  I'll have to try them again, as maybe those tools fib a bit.

>  i.e., they don't have the logic to spool the body and
> compute CONTENT_LENGTH

So, apart from small amounts of buffering in Apache and the network
stack, the body will mostly be waiting in the client if it's
particularly big, until the CGI program starts consuming it...?

[snip good summary of alternatives' behaviours]

> So: mod_fcgid seems closest to what you need, and whether it works for
> you today is dependent on whether or not you need CONTENT_LENGTH set
> due to a chunked request body.

Thanks for that!

I'll do some deeper tests of the POSTing behaviour of clients, and
hopefully find that CGI won't be a problem.  If it is, I'll investigate
mod_fcgid.

Thanks to all respondents!

Cheers,

Steven



Re: [users@httpd] Buffering of huge POSTs to CGI and alternatives

Posted by Jeff Trawick <tr...@gmail.com>.
On Mon, Nov 15, 2010 at 4:43 AM, Steven Simpson <ss...@comp.lancs.ac.uk> wrote:
> Hi,
>
> I intend to have a CGI program extract a form field and deliver this
> data to an external system, but the field in question is likely to be
> huge.  The server can't invoke the program until it knows the length of
> the request body, in order to set CONTENT_LENGTH in the program's
> environment.  If the POST doesn't include a Content-Length field, the
> server will have to buffer the entire contents somewhere.  Can it deal
> with huge message bodies, such as those exceeding virtual RAM, by saving
> to disc (for example)?

mod_cgi/mod_cgid: the request will fail with 411 (Length Required) if
the client doesn't send Content-Length; i.e., they don't have the logic
to spool the body and compute CONTENT_LENGTH.

Theoretically you could interject some other module to spool the body
and set a computed Content-Length before the CGI handler runs.
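
The spool-then-count idea can be sketched in a few lines of Python.
This is only an illustration of the technique, not an Apache module
(those are written in C against the httpd API); spool_body and the
1 MiB in-memory threshold are assumptions made for the example.

```python
import io
import tempfile


def spool_body(chunks, max_in_memory=1024 * 1024):
    """Spool an iterable of byte chunks; return (file, computed length).

    Data stays in memory up to max_in_memory bytes and is rolled over to
    a temporary file on disk beyond that.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    length = 0
    for chunk in chunks:
        spool.write(chunk)
        length += len(chunk)
    spool.seek(0)  # rewind so the handler can read the body from the start
    return spool, length


if __name__ == "__main__":
    body, length = spool_body(iter([b"field=", b"x" * 10]))
    # length is what would be handed to the program as CONTENT_LENGTH
    print(length, body.read())
```

The cost is that the program cannot start until the whole body has
arrived, and the server needs disk space for the largest upload.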

>
> How does FastCGI/fcgid compare?  Will it handle huge POSTs any better?
> My cursory reading of the FastCGI spec suggests that it doesn't have to
> know the content length to deliver it, because it is sent in chunks.

FastCGI spec indicates that CONTENT_LENGTH will be provided to the
app, even though the request body will be sent to the application in
chunks.  Furthermore, it suggests that the app could compare
CONTENT_LENGTH with the actual length received to determine if the
client aborted before sending the entire body.
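
A rough Python sketch of that check, assuming the application sees its
request body as a file-like stream with a definite end (as a FastCGI
library would normally present it).  The name body_was_complete is
hypothetical.

```python
import io


def body_was_complete(environ, stream, chunk_size=64 * 1024):
    """Drain stream and compare the byte count with CONTENT_LENGTH.

    Returns True or False, or None when CONTENT_LENGTH is not set and
    the check is therefore impossible.
    """
    received = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of the request body stream
        received += len(chunk)  # a real application would process chunk here
    declared = environ.get("CONTENT_LENGTH")
    if not declared:
        return None
    return received == int(declared)


if __name__ == "__main__":
    # Client declared 10 bytes but only 3 arrived: treated as an abort.
    print(body_was_complete({"CONTENT_LENGTH": "10"}, io.BytesIO(b"abc")))
```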

mod_fastcgi: as with mod_cgi/mod_cgid, the request fails with 411 if
the client doesn't send Content-Length.

mod_fcgid: the request is fine if the client doesn't send
Content-Length, but the application doesn't get CONTENT_LENGTH (bug).

mod_fcgid spools the entire body to memory/disk before connecting to
the app, so it should go ahead and pass over a computed CONTENT_LENGTH
value; that would also fix the case where the Content-Length value from
the client becomes invalid because of a filter.

So: mod_fcgid seems closest to what you need, and whether it works for
you today is dependent on whether or not you need CONTENT_LENGTH set
due to a chunked request body.



Re: [users@httpd] Buffering of huge POSTs to CGI and alternatives

Posted by Jeff Trawick <tr...@gmail.com>.
On Mon, Nov 15, 2010 at 4:25 PM, William A. Rowe Jr.
<wr...@rowe-clan.net> wrote:
> On 11/15/2010 3:43 AM, Steven Simpson wrote:
>>
>> I intend to have a CGI program extract a form field and deliver this
>> data to an external system, but the field in question is likely to be
>> huge.  The server can't invoke the program until it knows the length of
>> the request body, in order to set CONTENT_LENGTH in the program's
>> environment.  If the POST doesn't include a Content-Length field, the
>> server will have to buffer the entire contents somewhere.  Can it deal
>> with huge message bodies, such as those exceeding virtual RAM, by saving
>> to disc (for example)?
>>
>> How does FastCGI/fcgid compare?  Will it handle huge POSTs any better?
>> My cursory reading of the FastCGI spec suggests that it doesn't have to
>> know the content length to deliver it, because it is sent in chunks.
>
> This is really a flaw in your CGI; it should read to end of stream (httpd
> will mark that stream EOF when it's complete under either cgi or fastcgi)
> and if it wants to read it all into memory (hopefully with some limits
> imposed based on realistic expectations) then it's free to buffer.
> httpd will avoid buffering entirely, whenever it is possible.  It isn't
> httpd's job to be buffering as a developer convenience.

It can be very useful for httpd/mod_foo to buffer request/response
bodies to restrict the number of application processes in use at any
one time, when they are more expensive than httpd threads.



Re: [users@httpd] Buffering of huge POSTs to CGI and alternatives

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.
On 11/15/2010 6:30 PM, Nick Kew wrote:
> 
> The CGI spec is clear: CONTENT_LENGTH is guaranteed;
> EOF is NOT guaranteed, so reading to EOF is a bug and
> means the CGI can only ever "work" by coincidence!

Ah yes, pre-HTTP/1.1, silly me.

> CGI pre-dates chunked encoding (as do reports of the 
> death of CGI, from proponents of alternative serverside
> application environments)!

Actually, the RFC doesn't.

But point taken, saw Jeff's bug report, just EIGNORE my earlier comment.



Re: [users@httpd] Buffering of huge POSTs to CGI and alternatives

Posted by Nick Kew <ni...@webthing.com>.
On Mon, 15 Nov 2010 15:25:18 -0600
"William A. Rowe Jr." <wr...@rowe-clan.net> wrote:

> This is really a flaw in your CGI;

No it isn't!

The CGI spec is clear: CONTENT_LENGTH is guaranteed;
EOF is NOT guaranteed, so reading to EOF is a bug and
means the CGI can only ever "work" by coincidence!
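
In practice that means a CGI program should read exactly CONTENT_LENGTH
bytes and stop.  A minimal Python sketch, reading in bounded pieces so a
huge body never has to fit in memory; copy_request_body and the 64 KiB
piece size are assumptions made for the example.

```python
import io


def copy_request_body(environ, stdin, sink, chunk_size=64 * 1024):
    """Copy exactly CONTENT_LENGTH bytes from stdin to sink; return bytes copied."""
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    copied = 0
    while remaining > 0:
        chunk = stdin.read(min(chunk_size, remaining))
        if not chunk:
            break  # stream ended early; the caller can compare the result
        sink.write(chunk)
        copied += len(chunk)
        remaining -= len(chunk)
    return copied  # never reads past CONTENT_LENGTH, never waits for EOF


if __name__ == "__main__":
    source = io.BytesIO(b"0123456789 and bytes that must not be read")
    sink = io.BytesIO()
    print(copy_request_body({"CONTENT_LENGTH": "10"}, source, sink, 4))
```

In a real CGI program the arguments would be os.environ,
sys.stdin.buffer, and whatever forwards the data to the external system.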

CGI pre-dates chunked encoding (as do reports of the 
death of CGI, from proponents of alternative serverside
application environments)!

-- 
Nick Kew



Re: [users@httpd] Buffering of huge POSTs to CGI and alternatives

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.
On 11/15/2010 3:43 AM, Steven Simpson wrote:
> 
> I intend to have a CGI program extract a form field and deliver this
> data to an external system, but the field in question is likely to be
> huge.  The server can't invoke the program until it knows the length of
> the request body, in order to set CONTENT_LENGTH in the program's
> environment.  If the POST doesn't include a Content-Length field, the
> server will have to buffer the entire contents somewhere.  Can it deal
> with huge message bodies, such as those exceeding virtual RAM, by saving
> to disc (for example)?
> 
> How does FastCGI/fcgid compare?  Will it handle huge POSTs any better? 
> My cursory reading of the FastCGI spec suggests that it doesn't have to
> know the content length to deliver it, because it is sent in chunks.

This is really a flaw in your CGI; it should read to end of stream (httpd
will mark that stream EOF when it's complete under either cgi or fastcgi)
and if it wants to read it all into memory (hopefully with some limits
imposed based on realistic expectations) then it's free to buffer.

httpd will avoid buffering entirely, whenever it is possible.  It isn't
httpd's job to be buffering as a developer convenience.
