Posted to dev@httpd.apache.org by Stas Bekman <st...@stason.org> on 2003/12/10 23:53:30 UTC

filtering huge request bodies (like 650MB files)

Chris is trying to filter a 650MB file coming in through a proxy. Obviously he 
sees that httpd-2.0 is allocating > 650MB of memory, since each bucket will 
use the request's pool memory and won't free it until after the request is 
over. Now even if his machine were able to deal with one such request, what if 
there are several of those? What's the solution in this case? How can we 
pipeline the memory allocation and release?

Ideally the core_in filter would allocate the buckets for a single brigade, 
pass them through the filters, core_out would flush them out, and then core_in 
could theoretically reuse that memory for the next brigade. Obviously it's not 
how things work at the moment, as the memory is never freed (which could 
probably be dealt with), but the real problem is that no data will leave the 
server out before it was completely read in. So httpd always requires at least 
as much memory as is needed to store all the incoming data, usually multiplied 
by at least 2 if there is any transformation applied to that incoming data.

I'm not sure what to advise Chris, who as a user rightfully thinks that 
it's a memory leak.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Sander Striker <st...@apache.org>.
On Thu, 2003-12-11 at 09:54, Stas Bekman wrote:
> Sander Striker wrote:
> > On Thu, 2003-12-11 at 07:44, Stas Bekman wrote:
> > 
> > 
> >>I now know what the problem is. It is not a problem in httpd or its filters, 
> >>but in mod_perl, which allocated the filter struct from the pool. With many 
> >>bucket brigades there were many filter invocations during the same request, 
> >>resulting in multiple memory allocations. So I have to move to the good old 
> >>malloc/free to solve this problem.
> >>
> >>Though it looks like I've found a problem in apr_pcalloc.
> >>
> >>modperl_filter_t *filter = apr_pcalloc(p, sizeof(*filter));
> >>
> >>was constantly allocating 40k of memory, whereas sizeof(*filter) == 16464
> >>
> >>replacing apr_pcalloc with apr_palloc reduced the memory allocations to 16k.
> >>
> >>Could it be a bug in APR_ALIGN_DEFAULT? apr_pcalloc calls APR_ALIGN_DEFAULT 
> >>and then it calls apr_palloc which calls APR_ALIGN_DEFAULT again, and probably 
> >>doubling the memory usage.
> > 
> > 
> > Woah!  I'll look into this.
> 
> I think pcalloc shouldn't call the alignment function, but let palloc align 
> the size and return the updated size to pcalloc, so it could memset the right 
> size.

APR_ALIGN_DEFAULT only bumps the size to the next 8 byte boundary.  It
should definitely not induce the allocation of the amount of memory you
report.

> On the other hand, the "right" size is the one requested by the caller, so the 
> caller really expects to zero only the amount it has requested. So maybe it's 
> OK if palloc allocates more memory (because of the alignment), with pcalloc 
> zeroing only the requested amount. In which case all we need to do is nuke the 
> call to APR_ALIGN_DEFAULT in pcalloc.

apr_pcalloc is even a macro when not in debug mode:

#define apr_pcalloc(p, size) memset(apr_palloc(p, size), 0, size)

With that in mind, we should lose the APR_ALIGN_DEFAULT call, but
nevertheless it shouldn't make a difference.
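
A minimal sketch of that change, against the function form of apr_pcalloc in
apr_pools.c (the exact code in the tree may differ slightly):

    APR_DECLARE(void *) apr_pcalloc(apr_pool_t *pool, apr_size_t size)
    {
        void *mem;

        /* No APR_ALIGN_DEFAULT(size) here: apr_palloc already aligns the
         * allocation internally, and the caller only expects the bytes it
         * asked for to be zeroed. */
        mem = apr_palloc(pool, size);
        memset(mem, 0, size);

        return mem;
    }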

Sander

Re: filtering huge request bodies (like 650MB files)

Posted by Stas Bekman <st...@stason.org>.
Sander Striker wrote:
> On Thu, 2003-12-11 at 07:44, Stas Bekman wrote:
> 
> 
>>I now know what the problem is. It is not a problem in httpd or its filters, 
>>but in mod_perl, which allocated the filter struct from the pool. With many 
>>bucket brigades there were many filter invocations during the same request, 
>>resulting in multiple memory allocations. So I have to move to the good old 
>>malloc/free to solve this problem.
>>
>>Though it looks like I've found a problem in apr_pcalloc.
>>
>>modperl_filter_t *filter = apr_pcalloc(p, sizeof(*filter));
>>
>>was constantly allocating 40k of memory, whereas sizeof(*filter) == 16464
>>
>>replacing apr_pcalloc with apr_palloc reduced the memory allocations to 16k.
>>
>>Could it be a bug in APR_ALIGN_DEFAULT? apr_pcalloc calls APR_ALIGN_DEFAULT 
>>and then it calls apr_palloc which calls APR_ALIGN_DEFAULT again, and probably 
>>doubling the memory usage.
> 
> 
> Woah!  I'll look into this.

I think pcalloc shouldn't call the alignment function, but let palloc align 
the size and return the updated size to pcalloc, so it could memset the right 
size.

On the other hand, the "right" size is the one requested by the caller, so the 
caller really expects to zero only the amount it has requested. So maybe it's 
OK if palloc allocates more memory (because of the alignment), with pcalloc 
zeroing only the requested amount. In which case all we need to do is nuke the 
call to APR_ALIGN_DEFAULT in pcalloc.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Sander Striker <st...@apache.org>.
On Thu, 2003-12-11 at 07:44, Stas Bekman wrote:

> I now know what the problem is. It is not a problem in httpd or its filters, 
> but in mod_perl, which allocated the filter struct from the pool. With many 
> bucket brigades there were many filter invocations during the same request, 
> resulting in multiple memory allocations. So I have to move to the good old 
> malloc/free to solve this problem.
> 
> Though it looks like I've found a problem in apr_pcalloc.
> 
> modperl_filter_t *filter = apr_pcalloc(p, sizeof(*filter));
> 
> was constantly allocating 40k of memory, whereas sizeof(*filter) == 16464
> 
> replacing apr_pcalloc with apr_palloc reduced the memory allocations to 16k.
> 
> Could it be a bug in APR_ALIGN_DEFAULT? apr_pcalloc calls APR_ALIGN_DEFAULT 
> and then it calls apr_palloc which calls APR_ALIGN_DEFAULT again, and probably 
> doubling the memory usage.

Woah!  I'll look into this.

Sander

Re: filtering huge request bodies (like 650MB files)

Posted by Stas Bekman <st...@stason.org>.
Stas Bekman wrote:
> I'm debugging the issue. I have a good test case: a response 
> handler sending 1 byte followed by rflush in a loop creates lots of 
> buckets. I can see that each iteration allocates 40k. i.e. each new 
> bucket brigade and its bucket demand 40k which won't be reused till the 
> next request. This happens only if using a custom filter. I'm next going 
> to move in and try to see whether the extra allocation comes from 
> modperl or something else. I'll keep you posted.

I now know what the problem is. It is not a problem in httpd or its filters, 
but in mod_perl, which allocated the filter struct from the pool. With many 
bucket brigades there were many filter invocations during the same request, 
resulting in multiple memory allocations. So I have to move to the good old 
malloc/free to solve this problem.

Though it looks like I've found a problem in apr_pcalloc.

modperl_filter_t *filter = apr_pcalloc(p, sizeof(*filter));

was constantly allocating 40k of memory, whereas sizeof(*filter) == 16464

replacing apr_pcalloc with apr_palloc reduced the memory allocations to 16k.

Could it be a bug in APR_ALIGN_DEFAULT? apr_pcalloc calls APR_ALIGN_DEFAULT 
and then it calls apr_palloc which calls APR_ALIGN_DEFAULT again, and probably 
doubling the memory usage.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Stas Bekman <st...@stason.org>.
I'm debugging the issue. I have a good test case: a response handler 
sending 1 byte followed by rflush in a loop creates lots of buckets. I can see 
that each iteration allocates 40k. i.e. each new bucket brigade and its bucket 
demand 40k which won't be reused till the next request. This happens only if 
using a custom filter. I'm next going to move in and try to see whether the 
extra allocation comes from modperl or something else. I'll keep you posted.

use GTop ();
my $gtop = GTop->new;

sub handler {
     my $r = shift;

     $r->content_type('text/plain');

     my $chunk = "x";

     for (1..70) {
         my $before = $gtop->proc_mem($$)->size;

         $r->print($chunk);
         $r->rflush;

         my $after = $gtop->proc_mem($$)->size;
         warn sprintf "size : %-5s\n", GTop::size_string($after - $before);
     }

     Apache::OK;
}

This handler on its own requires just a few bytes. When fed through a simple 
pass-through filter that doesn't modify anything, it ends up allocating 
70*40k = 2800kb. Obviously with a 650MB file there are going to be many more 
buckets...

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Stas Bekman <st...@stason.org>.
William A. Rowe, Jr. wrote:
> At 04:57 PM 12/10/2003, Cliff Woolley wrote:
> 
>>On Wed, 10 Dec 2003, Stas Bekman wrote:
>>
>>
>>>Obviously it's not how things work at the moment, as the memory is never
>>>freed (which could probably be dealt with), but the real problem is that
>>>no data will leave the server out before it was completely read in.
>>
>>Yes, that would be the real problem.  So somewhere there is a filter (or
>>maybe the proxy itself) buffering the entire data stream before sending
>>it.  That is a bug.
> 
> 
> It's NOT the proxy - I've been through it many times - and AFAICT we have
> a simple leak in that we don't reuse the individual pool buckets, so memory
> creeps up over time.  It isn't even the end of the world, until someone at
> ApacheCon pointed out continuous HTML proxied streams (e.g. video) really
> gobble memory; even at 8kb/min+ this isn't acceptable.
> 
> So it's not the proxy or the core output filter.  The bug lies in the Filter itself.
> Is it Chris's own filter or one of ours?  Whichever it is, it would be nice to
> get this fixed.  This is why we ought not to flip subject headers, Stas; I'm
> really too short on time to go fumbling for the original posts.  We need to know
> which filters are inserted, and therefore possibly suspect.

I wasn't flipping subjects, Bill. The original report was posted to the 
modperl list and there was indeed a leak in modperl, which I plugged 
yesterday. But Chris tested with the fix and the leakage was still there, 
and it wasn't in his filter, since his filter didn't do anything. So the 
original post isn't relevant as it stood.

Chris' filter just moves the bucket brigades through unmodified. In his case 
it was an output filter, so all it did was call ap_pass_brigade and return 
SUCCESS (see the C sketch after the configuration below). His configuration was:

Listen 8081

LoadModule perl_module modules/mod_perl.so
PerlModule Apache2
PerlModule Test::BigFileTest

<VirtualHost *:8081>
	ServerName my_server_name:8081
	ServerAdmin chris.pringle@hp.com
	ProxyRequests On
	ProxyRemote * http://corporate_proxy_server:8080
	<Proxy *>
		Order Allow,Deny
		Allow from all
		PerlOutputFilterHandler Test::BigFileTest
	</Proxy>
</VirtualHost>
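
For reference, roughly the C equivalent of such a pass-through output filter
(a sketch only, not Chris' actual mod_perl code):

    static apr_status_t passthrough_out_filter(ap_filter_t *f,
                                               apr_bucket_brigade *bb)
    {
        /* Hand every brigade to the next filter untouched. */
        return ap_pass_brigade(f->next, bb);
    }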


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Dec 10, 2003 at 05:23:14PM -0600, William A. Rowe, Jr. wrote:
>...
> It's NOT the proxy - I've been through it many times - and AFAICT we have
> a simple leak in that we don't reuse the individual pool buckets, so memory
> creeps up over time.  It isn't even the end of the world, until someone at
> ApacheCon pointed out continuous HTML proxied streams (e.g. video) really
> gobble memory; even at 8kb/min+ this isn't acceptable.
> 
> So it's not the proxy or the core output filter.  The bug lies in the Filter itself.
> Is it Chris's own filter or one of ours?  Whichever it is, it would be nice to
> get this fixed.  This is why we ought not to flip subject headers, Stas; I'm
> really too short on time to go fumbling for the original posts.  We need to know
> which filters are inserted, and therefore possibly suspect.

The brigade structure is allocated in a pool, along with a cleanup. The
*buckets* might get returned to the allocator when the brigade is cleared, but
the brigade itself won't.
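
A minimal sketch of what that implies for a filter, assuming the stock
httpd-2.0 brigade API (illustrative only): create the working brigade once per
request and empty it with apr_brigade_cleanup() after each pass, instead of
creating a fresh brigade, whose structure then sits in r->pool until the end
of the request, on every invocation.

    static apr_status_t reuse_brigade_filter(ap_filter_t *f,
                                             apr_bucket_brigade *bb)
    {
        apr_bucket_brigade *out = f->ctx;
        apr_status_t rv;

        if (out == NULL) {
            /* Created once; the struct stays in r->pool until the request
             * is over, so don't do this per invocation. */
            f->ctx = out = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
        }

        /* A real filter would transform buckets as it moves them across. */
        APR_BRIGADE_CONCAT(out, bb);
        rv = ap_pass_brigade(f->next, out);

        /* Frees any remaining buckets but keeps the brigade for reuse. */
        apr_brigade_cleanup(out);
        return rv;
    }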

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: filtering huge request bodies (like 650MB files)

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 10 Dec 2003, William A. Rowe, Jr. wrote:

> Is it Chris's own filter or one of ours?  Whichever it is, it would be nice to
> get this fixed.

Can I suggest Chris insert mod_diagnostics at different points in his
chain to identify exactly where it's buffering (if indeed that's where
his memory is going)?

I had a very similar situation to this, when a bug in a third-party
library caused it to buffer everything in my filter.  mod_diagnostics
rapidly tracked that down, for a 300x performance improvement.

-- 
Nick Kew



Re: filtering huge request bodies (like 650MB files)

Posted by "William A. Rowe, Jr." <wr...@apache.org>.
At 04:57 PM 12/10/2003, Cliff Woolley wrote:
>On Wed, 10 Dec 2003, Stas Bekman wrote:
>
>> Obviously it's not how things work at the moment, as the memory is never
>> freed (which could probably be dealt with), but the real problem is that
>> no data will leave the server out before it was completely read in.
>
>Yes, that would be the real problem.  So somewhere there is a filter (or
>maybe the proxy itself) buffering the entire data stream before sending
>it.  That is a bug.

It's NOT the proxy - I've been through it many times - and AFAICT we have
a simple leak in that we don't reuse the individual pool buckets, so memory
creeps up over time.  It isn't even the end of the world, until someone at
ApacheCon pointed out continuous HTML proxied streams (e.g. video) really
gobble memory; even at 8kb/min+ this isn't acceptable.

So it's not the proxy or the core output filter.  The bug lies in the Filter itself.
Is it Chris's own filter or one of ours?  Whichever it is, it would be nice to
get this fixed.  This is why we ought not to flip subject headers, Stas; I'm
really too short on time to go fumbling for the original posts.  We need to know
which filters are inserted, and therefore possibly suspect.

Bill 


Re: filtering huge request bodies (like 650MB files)

Posted by Glenn <gs...@gluelogic.com>.
On Thu, Dec 11, 2003 at 10:35:28PM -0600, William A. Rowe, Jr. wrote:
> I've been thinking the same thing driving around this evening...
> 
> One major goal of httpd-3.0 is *finally* arriving at something that starts
> looking async.  We've kicked it around some time, but perhaps it's time
> to start looking at the async poll implementation, to get some idea of how
> we can 'poll' on multiple sorts of events.
> 
> The one thing that is clear to me, pre-1.0:  win32 needs to be able to poll
> pipes and sockets, *even* if it means a really lame 100ms timeout (perhaps
> configurable) on the socket poll to look sideways at the pipe info.  There is
> no way to solve any of these problems without clearing that first hurdle.

On platforms where pipes are not an advantage over sockets, why not use
socketpair() instead of pipe()?

Cheers,
Glenn

Re: filtering huge request bodies (like 650MB files)

Posted by "William A. Rowe, Jr." <wr...@apache.org>.
I've been thinking the same thing driving around this evening...

One major goal of httpd-3.0 is *finally* arriving at something that starts
looking async.  We've kicked it around some time, but perhaps it's time
to start looking at the async poll implementation, to get some idea of how
we can 'poll' on multiple sorts of events.

The one thing that is clear to me, pre-1.0:  win32 needs to be able to poll
pipes and sockets, *even* if it means a really lame 100ms timeout (perhaps
configurable) on the socket poll to look sideways at the pipe info.  There is
no way to solve any of these problems without clearing that first hurdle.

But you brought up a great point - what about some notification signals?
How do we extend 'poll'?  It sure looks like we need something more clever
than a wrapper around posix poll/select.

Bill

At 02:52 PM 12/11/2003, Aaron Bannert wrote:
>On Thu, Dec 11, 2003 at 01:50:46PM -0600, William A. Rowe, Jr. wrote:
>> But the 2.0 architecture is entirely different.  We need a poll but it's not entirely
>> obvious where to put one...
>> 
>> One suggestion raised in a poll bucket: when a connection level filter cannot
>> read anything more, it passed back a bucket containing a poll descriptor as
>> metadata.  Each filter passes this metadata bucket back up.  Some filters
>> like mod_ssl would move it from the connection brigade to the data brigade.
>
>At one level we'll have to fit whatever I/O multiplexer we come
>up with in the filters. I'm going to stay out of that discussion.
>
>At a lower level, ignoring filters for a moment, we still need a
>way for applications to be able to multiplex I/O between different
>I/O types: pipes, files, sockets, IPC, etc... I think this is the
>root of the problem (and something we should probably move over
>to the dev@apr list, and also something we might want to take up
>after APR 1.0 is released).
>
>-aaron


Re: filtering huge request bodies (like 650MB files)

Posted by Aaron Bannert <aa...@clove.org>.
[we really should move this to the dev@apr list]

On Fri, Dec 12, 2003 at 11:53:53AM +0000, Ben Laurie wrote:
> This was exactly the conversation we were having at the hackathon. As 
> always, Windows was the problem, but I thought Bill had it licked?

Well, there are two things we have to solve. I think we know how to solve
the first one: portable IPC that works on Windows. This is not easy to
solve in a portable way, but given enough energy I think this is solvable.

The second part is getting all the different I/O types to work within
the same poll() or poll-like mechanism. This seems like a much more
difficult task to me, but it all depends on how it works under Windows.

-aaron

Re: filtering huge request bodies (like 650MB files)

Posted by Ben Laurie <be...@algroup.co.uk>.
Aaron Bannert wrote:

> On Thu, Dec 11, 2003 at 01:50:46PM -0600, William A. Rowe, Jr. wrote:
> 
>>But the 2.0 architecture is entirely different.  We need a poll but it's not entirely
>>obvious where to put one...
>>
>>One suggestion raised in a poll bucket: when a connection level filter cannot
>>read anything more, it passed back a bucket containing a poll descriptor as
>>metadata.  Each filter passes this metadata bucket back up.  Some filters
>>like mod_ssl would move it from the connection brigade to the data brigade.
> 
> 
> At one level we'll have to fit whatever I/O multiplexer we come
> up with in the filters. I'm going to stay out of that discussion.
> 
> At a lower level, ignoring filters for a moment, we still need a
> way for applications to be able to multiplex I/O between different
> I/O types: pipes, files, sockets, IPC, etc... I think this is the
> root of the problem (and something we should probably move over
> to the dev@apr list, and also something we might want to take up
> after APR 1.0 is released).

This was exactly the conversation we were having at the hackathon. As 
always, Windows was the problem, but I thought Bill had it licked?

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff

Re: filtering huge request bodies (like 650MB files)

Posted by Aaron Bannert <aa...@clove.org>.
On Thu, Dec 11, 2003 at 01:50:46PM -0600, William A. Rowe, Jr. wrote:
> But the 2.0 architecture is entirely different.  We need a poll but it's not entirely
> obvious where to put one...
> 
> One suggestion raised in a poll bucket: when a connection level filter cannot
> read anything more, it passed back a bucket containing a poll descriptor as
> metadata.  Each filter passes this metadata bucket back up.  Some filters
> like mod_ssl would move it from the connection brigade to the data brigade.

At one level we'll have to fit whatever I/O multiplexer we come
up with in the filters. I'm going to stay out of that discussion.

At a lower level, ignoring filters for a moment, we still need a
way for applications to be able to multiplex I/O between different
I/O types: pipes, files, sockets, IPC, etc... I think this is the
root of the problem (and something we should probably move over
to the dev@apr list, and also something we might want to take up
after APR 1.0 is released).

-aaron

Re: filtering huge request bodies (like 650MB files)

Posted by Glenn <gs...@gluelogic.com>.
[snip]
(wrowe's exposition of a possible non-blocking filter chain implementation)

Your poll bucket idea would be welcome for input, although it would
only save a bit of work since we already have APR_NONBLOCK_READ
   apr_bucket_read(b, &bdata, &blen, APR_NONBLOCK_READ);

For polling output, I had the idea of an ap_pass_brigade_nb() for a
non-blocking brigade pass.  The return value would indicate success,
failure, or would-block.  (Or modify ap_pass_brigade to take a flag like
APR_NONBLOCK_WRITE or APR_BLOCK_WRITE.)
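
A sketch of what that first, hypothetical interface might look like (none of
this exists in httpd today; APR_EAGAIN is just one plausible "would block"
status):

    /* Hypothetical, as proposed above; not an existing httpd API. */
    AP_DECLARE(apr_status_t) ap_pass_brigade_nb(ap_filter_t *next,
                                                apr_bucket_brigade *bb);
    /*
     * Would return APR_SUCCESS if the brigade was passed, APR_EAGAIN if
     * passing it would have blocked (poll and retry), or another error
     * status on failure.
     */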

When I first looked at the brigade passing code, I mistakenly assumed that
that was what AP_NOBODY_WROTE was for.  My current understanding of AP_NOBODY_WROTE
is that it should be an ap_assert() fatal error because the filter chain
went off into thin air, and so did the brigade that was passed to it.

Anyway, ap_pass_brigade_nb() would require that filters learn an extra
return value, and that they check the return value from
ap_pass_brigade_nb().  That might be too much to ask.


Currently, I'm playing with finding the connection socket by searching
the output filters for f->frec->name as "core" and then stealing the
connection socket from ((core_net_rec *)f->ctx)->client_socket.  Polling
on that will reveal if the network connection is blocking.  Until
then, I can send data.  If it is blocking, I assume that I should stop
reading data (e.g. CGI output) until it clears.  This theoretically
works for all output, even if there is an intermediate output filter
that is buffering all my data (mod_deflate or similar).
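
A sketch of that workaround (it pokes at httpd-2.0 internals, namely
core_net_rec from http_core.h and the core filter's registered name, and is
not a sanctioned interface):

    #include <strings.h>       /* strcasecmp */
    #include "httpd.h"
    #include "util_filter.h"
    #include "http_core.h"     /* core_net_rec */

    /* Walk the output filter chain down to the core network filter and
     * borrow its client socket, so it can be polled to see whether a
     * write to the network would block. */
    static apr_socket_t *find_client_socket(request_rec *r)
    {
        ap_filter_t *f = r->output_filters;

        while (f != NULL) {
            if (strcasecmp(f->frec->name, "core") == 0) {
                core_net_rec *net = f->ctx;
                return net->client_socket;
            }
            f = f->next;
        }
        return NULL;
    }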

I'd be very happy if there was a sanctioned routine which I could call 
to get a poll descriptor to use to determine if output would block.
This would allow the returned descriptor to be inserted anywhere in
the filter chain by more knowledgeable filters downstream (e.g. if we're
a subrequest that is reverse-proxied that is mod_ext_filter that is...)
instead of me just pulling the socket all the way at the end.

Cheers,
Glenn

Re: filtering huge request bodies (like 650MB files)

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 07:01 PM 12/10/2003, Bill Stoddard wrote:
>Aaron Bannert wrote:
>>
>>[slightly off-topic]
>>Actually, I believe that mod_cgi and mod_cgid are currently broken
>>WRT the CGI spec. The spec says that a CGI may read as much of an
>>incoming request body as it wishes and may return data as soon as
>>it wishes (all AIUI). 

I agree with your reading; it's the first bug report I ever filed on Apache.

>>That means that right now if you send a big
>>body to a CGI script that does not read the request body (which
>>is perfectly valid according to the CGI spec) then mod_cgi[d] will
>>deadlock trying to write the rest of the body to the script.
>>The best way to fix this would be to support a poll()-like multiplexing
>>I/O scheme where data could be written to and read from the CGI process
>>at the same time. Unfortunately, that's not currently supported by
>>the Apache filter code.
>>-aaron
>
>Interesting. Then Apache 1.3 is broken too. I believe Jeff posted a patch not too long ago to enable a full duplex interface between Apache and CGI scripts.

Unfortunately they are entirely unrelated.  The 1.3 patch would be terrific, since
on Win32 especially the pipe buffers were pretty small (until I increased them
at least to 64k inbound/outbound.)

But the 2.0 architecture is entirely different.  We need a poll but it's not entirely
obvious where to put one...

One suggestion raised in a poll bucket: when a connection level filter cannot
read anything more, it passed back a bucket containing a poll descriptor as
metadata.  Each filter passes this metadata bucket back up.  Some filters
like mod_ssl would move it from the connection brigade to the data brigade.

When a module like mod_cgi saw the last apr_brigade_read, it could then
multiplex what it wants to do with more data.  Even with things like a charset
conversion filter containing an incomplete sequence, or mod_ssl with some
data but an incomplete packet, the module could continue to do 'something
else' until that poll descriptor was signalled, then call back down the filter
chain to read more data.

Now poll buckets are a simple solution to read, but they don't work at all
for write.  mod_cgi[d] simply passes the pipe bucket out the filter chain
and that operation is always blocking.  The only valid result under today's
filter design is sent, or could not send [fatal].  The first filter that cares
reads from the cgi pipe, and transforms or writes that data.  At that point
we are deep in the output filter chain.

The only sane solution I can think of would be a hybrid.  On the read from
client/write to pipe side, we implement a poll bucket.  On the read from
pipe side, we have to actually buffer the data instead of passing the pipe
bucket down the filter chain.  So we are polling on several events:

  CGI stdin pipe ready-to-write?
    \yes - write to the pipe, and also start polling again;
    Network (pipe bucket) ready to read?
       \yes - Read again (nonblock) from the input filters
  CGI stdout pipe data-to-read?    
    \yes - read the available data (nonblock), and pass the brigade out

This ignores if the network is ready to write because we just won't *do*
anything till the CGI results have been written out.
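
A skeleton of how those events could be multiplexed with the APR pollset API
(purely illustrative: 'script_in' and 'script_out' stand for the apr_file_t
pipes to the CGI's stdin and from its stdout, 'r' is the request_rec, and the
actual read/write work, error handling and the network-side descriptor are
all elided):

    apr_pollset_t *pollset;
    apr_pollfd_t pfd = { 0 };
    const apr_pollfd_t *signalled;
    apr_int32_t num, i;

    apr_pollset_create(&pollset, 2, r->pool, 0);

    pfd.p = r->pool;
    pfd.desc_type = APR_POLL_FILE;
    pfd.desc.f = script_in;            /* CGI stdin: ready for more body? */
    pfd.reqevents = APR_POLLOUT;
    apr_pollset_add(pollset, &pfd);

    pfd.desc.f = script_out;           /* CGI stdout: output to read? */
    pfd.reqevents = APR_POLLIN;
    apr_pollset_add(pollset, &pfd);

    for (;;) {                         /* until the CGI's stdout hits EOF */
        apr_pollset_poll(pollset, -1, &num, &signalled);
        for (i = 0; i < num; i++) {
            if (signalled[i].desc.f == script_in) {
                /* nonblocking write of the next chunk of the request body */
            }
            else if (signalled[i].desc.f == script_out) {
                /* nonblocking read of CGI output; pass a brigade onward */
            }
        }
    }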

This also ignores a filter like mod_ext_filter.  That filter implies that our
poll buckets must allow for a collection of sockets/pipes to poll on.  Two
things can happen within mod_ext_filter_in: either it's blocked for more
data or it is truly taking its time computing some results.  We just don't
(can't) know the answer to that puzzle from inside Apache.

So consider mod_ext_filter_in.  Let's presume there are three things that
can trigger more labor in our hypothetical input filter...

 * it needs more input to continue.  Solution: poll the network.
 * it is churning away at its data.  Solution: poll the ext filter's stdout pipe

and the most complex case:

 * it is churning away, but the ext filter's stdin pipe is *full* still!  Even
   with more network data, we have to ignore the fact that we have more
   data to give to the ext filter till it either empties the stdin pipe or has
   more stdout pipe results for us to process.
   Solution: the mod_ext_filter_in looks at the full stdin pipe and declines
   to read more from the network.  It sets aside the current network input
   and does *not* return the network poll bucket, but instead passes its
   own poll bucket of *both* the stdin and stdout ext filter pipes.
   
Imagine a chain of such things - we really define the problem in terms of
a set of filters that would trigger another nonblocking attempt to get the 
input chain moving again.

So that's the input side - now to consider the output side.  We can make
one assumption here for the sake of the handler - we don't need the handler
to do *anything* more until we can shoot its cumulative results to the network.

mod_ext_filter has the same stdin/stdout blocking problems of mod_cgi, so
let's consider that complex filter case.  If mod_ext_filter sees that the filter
can accept more data, obviously that data should be written to the pipe
(nonblocking.)  So long as the ext filter's stdout pipe has data, we can read
it and pass it out to the network.  It may be blocked on stdin because it is
blocked in a write attempting to return the stdout results.  So priority one is to
pull the data off the stdout and pass it out to the network.

What do we do when there is nothing on the ext_filter's stdout?  Unlike other
filters, we don't know if it's stuck for sufficient data to continue processing,
or it is just taking its lazy time trying to compute results.

  * Brigade ended in EOS?  Well our caller will never try calling again,
    so ALWAYS poll on write to the ext filter's stdin and its stdout, pulling 
    the stdout from the filter and feeding it data as it's ready for more.
    This is the *only* faux-blocking case.

Otherwise...
    
  * stdin is not full and we've written all our data to the filter?  
    Solution:  return immediately, we can let that filter keep churning away
    or stall for more input - we don't care as long as the ext filter's stdout
    has been cleared.

  * stdin is full and all the stdout results have been read and passed down?
    Solution: return immediately.  The pipe is full so the caller can't be blocked
    on us (if it contained 25 bytes and was attempting a block-read of 32 bytes
    of course it is blocked on us.)  But presume the caller can't look for more
    bytes from the stdin pipe than the pipe is capable of holding.  The CGI
    will continue to churn, but we can keep composing data.

This is *one* solution - you can probably see some alternative rules and come
up with good justifications either way.  The one side effect is that we would
only attempt to flush down more data to the client when the core handler is
ready to send more data.  I'd like to see a proof of concept where this would
be a large obstacle.

Finally, you can see other permutations with mod_proxy - I'll leave those up
to someone else to explore - and determine if they fit within the scope I've
outlined above, or if my outline was insufficient to cover some edge cases.

Bill



Re: filtering huge request bodies (like 650MB files)

Posted by Bill Stoddard <bi...@wstoddard.com>.
Aaron Bannert wrote:

> On Wed, Dec 10, 2003 at 06:29:28PM -0500, Glenn wrote:
> 
>>On Wed, Dec 10, 2003 at 03:18:44PM -0800, Stas Bekman wrote:
>>
>>>Are you saying that if I POST N MBytes of data to the server and just have 
>>>the server send it back to me, it won't grow by that N MBytes of memory for 
>>>the duration of that request? Can you pipe the data out as it comes in? I 
>>>thought that you must read the data in before you can send it out (at least 
>>>if it's the same client who sends and receives the data).
>>
>>Well, in the case of CGI, mod_cgi and mod_cgid currently require that
>>the CGI read in the entire body before mod_cgi(d?) will read the
>>response from the CGI.  So a CGI "echo" program must buffer the whole
>>response before mod_cgi(d?) will read the CGI output and send it back
>>to the client.  If the CGI buffers to disk, no problem, but if the
>>CGI buffers in memory, it will take a lot of memory (but not in Apache).
>>
>>Obviously :-), that's a shortcoming of mod_cgi(d?), but might also be
>>a problem for modules such as mod_php which preprocesses the CGI POST
>>info before running the PHP script.
> 
> 
> [slightly off-topic]
> 
> Actually, I believe that mod_cgi and mod_cgid are currently broken
> WRT the CGI spec. The spec says that a CGI may read as much of an
> incoming request body as it wishes and may return data as soon as
> it wishes (all AIUI). That means that right now if you send a big
> body to a CGI script that does not read the request body (which
> is perfectly valid according to the CGI spec) then mod_cgi[d] will
> deadlock trying to write the rest of the body to the script.
> 
> The best way to fix this would be to support a poll()-like multiplexing
> I/O scheme where data could be written to and read from the CGI process
> at the same time. Unfortunately, that's not currently supported by
> the Apache filter code.
> 
> -aaron

Interesting. Then Apache 1.3 is broken too. I believe Jeff posted a patch not too long ago to enable a full 
duplex interface between Apache and CGI scripts.

Bill



Re: filtering huge request bodies (like 650MB files)

Posted by Aaron Bannert <aa...@clove.org>.
On Wed, Dec 10, 2003 at 06:29:28PM -0500, Glenn wrote:
> On Wed, Dec 10, 2003 at 03:18:44PM -0800, Stas Bekman wrote:
> > Are you saying that if I POST N MBytes of data to the server and just have 
> > the server send it back to me, it won't grow by that N MBytes of memory for 
> > the duration of that request? Can you pipe the data out as it comes in? I 
> > thought that you must read the data in before you can send it out (at least 
> > if it's the same client who sends and receives the data).
> 
> Well, in the case of CGI, mod_cgi and mod_cgid currently require that
> the CGI read in the entire body before mod_cgi(d?) will read the
> response from the CGI.  So a CGI "echo" program must buffer the whole
> response before mod_cgi(d?) will read the CGI output and send it back
> to the client.  If the CGI buffers to disk, no problem, but if the
> CGI buffers in memory, it will take a lot of memory (but not in Apache).
> 
> Obviously :-), that's a shortcoming of mod_cgi(d?), but might also be
> a problem for modules such as mod_php which preprocesses the CGI POST
> info before running the PHP script.

[slightly off-topic]

Actually, I believe that mod_cgi and mod_cgid are currently broken
WRT the CGI spec. The spec says that a CGI may read as much of an
incoming request body as it wishes and may return data as soon as
it wishes (all AIUI). That means that right now if you send a big
body to a CGI script that does not read the request body (which
is perfectly valid according to the CGI spec) then mod_cgi[d] will
deadlock trying to write the rest of the body to the script.
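
For instance, a hypothetical CGI like this (just to illustrate the scenario,
not taken from any real report) can trigger it when it receives a large POST:

    /* Never reads its request body, but produces a lot of output.  The
     * server blocks writing the body to the script's stdin while the
     * script blocks writing its response to a stdout pipe the server is
     * not reading yet. */
    #include <stdio.h>

    int main(void)
    {
        long i;

        printf("Content-Type: text/plain\n\n");
        for (i = 0; i < 1000000; i++) {
            printf("never read the body\n");
        }
        return 0;
    }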

The best way to fix this would be to support a poll()-like multiplexing
I/O scheme where data could be written to and read from the CGI process
at the same time. Unfortunately, that's not currently supported by
the Apache filter code.

-aaron

Re: filtering huge request bodies (like 650MB files)

Posted by Glenn <gs...@gluelogic.com>.
On Wed, Dec 10, 2003 at 03:18:44PM -0800, Stas Bekman wrote:
> Are you saying that if I POST N MBytes of data to the server and just have 
> the server send it back to me, it won't grow by that N MBytes of memory for 
> the duration of that request? Can you pipe the data out as it comes in? I 
> thought that you must read the data in before you can send it out (at least 
> if it's the same client who sends and receives the data).

Well, in the case of CGI, mod_cgi and mod_cgid currently require that
the CGI read in the entire body before mod_cgi(d?) will read the
response from the CGI.  So a CGI "echo" program must buffer the whole
response before mod_cgi(d?) will read the CGI output and send it back
to the client.  If the CGI buffers to disk, no problem, but if the
CGI buffers in memory, it will take a lot of memory (but not in Apache).

Obviously :-), that's a shortcoming of mod_cgi(d?), but might also be
a problem for modules such as mod_php which preprocesses the CGI POST
info before running the PHP script.

As Cliff said, with bucket brigades, it is possible to avoid such
problems and to process things in bite-size buckets.  Not all modules
do so, though.

Cheers,
Glenn

Re: filtering huge request bodies (like 650MB files)

Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Wed, 10 Dec 2003, Stas Bekman wrote:
> 
> 
>>Chris is trying to filter a 650MB file coming in through a proxy. Obviously he
>>sees that httpd-2.0 is allocating > 650MB of memory, since each bucket will
>>use the request's pool memory and won't free it until after the request is
>>over.
> 
> 
> Whoa.  Obviously?  It is NOT supposed to do that.  Buckets do not use pool
> memory for that very reason (well, that's one of the two or three big
> reasons).
> 
> 
>>could theoretically reuse that memory for the next brigade.
> 
> 
> Which is exactly what is supposed to happen.

Ah, cool, I thought that pools were used everywhere. Thanks for correcting me, 
Cliff.

>>Obviously it's not how things work at the moment, as the memory is never
>>freed (which could probably be dealt with), but the real problem is that
>>no data will leave the server out before it was completely read in.
> 
> 
> Yes, that would be the real problem.  So somewhere there is a filter (or
> maybe the proxy itself) buffering the entire data stream before sending
> it.  That is a bug.

Are you saying that if I POST N MBytes of data to the server and just have the 
server send it back to me, it won't grow by that N MBytes of memory for the 
duration of that request? Can you pipe the data out as it comes in? I thought 
that you must read the data in before you can send it out (at least if it's 
the same client who sends and receives the data).

p.s. obviously I should stop using the word 'obviously' ;)

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Cliff Woolley <jw...@virginia.edu> writes:

[...]

> Yes, that would be the real problem.  So somewhere there is a filter
> (or maybe the proxy itself) buffering the entire data stream before
> sending it.  That is a bug.

From the proxy_http.c source in HEAD, it looks to me like 
mod_proxy buffers the entire incoming request body unless 
it's able to send it via chunked encoding.  However, 
"proxy-sendchunks" needs to appear in the r->subprocess_env 
table for that to happen, and AFAICT there's no code in 
httpd-2.0 which does that.

-- 
Joe Schaefer


Re: filtering huge request bodies (like 650MB files)

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 10 Dec 2003, Stas Bekman wrote:

> > But doesn't unsetting the C-L header cause the C-L filter to automatically
> > attempt to generate a new C-L value?
>
> I thought that bug had been fixed a long time ago. Dynamic handlers used to bump

Ryan would know.  :-)

Re: filtering huge request bodies (like 650MB files)

Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Wed, 10 Dec 2003, Stas Bekman wrote:
> 
> 
>>No, there is no C-L header. The complete filter looks like so:
>>
>>sub handler {
>>	# Get the filter object
>>	my($f) = @_;
>>
>>	# Only done on the FIRST pass of the filter
>>	unless($f->ctx)
>>	{
>>		$f->r->headers_out->unset('Content-Length');
>>		$f->ctx('');
>>	}
>>
>>	return Apache::DECLINED;
>>
>>} # handler
> 
> 
> 
> But doesn't unsetting the C-L header cause the C-L filter to automatically
> attempt to generate a new C-L value?

I thought that bug had been fixed a long time ago. Dynamic handlers used to bump 
into the problem, where the output would be buffered up until EOS and then 
sent out with a C-L. This is no longer the case. We advise always unsetting the 
C-L header in filters that modify the length of the data (which is not the case 
with this filter, but he did it just in case).

> A HEAD response for a broken URL would help.  :-)

I hope Chris will take over at this point. I was just trying to report on his 
behalf.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 10 Dec 2003, Brian Pane wrote:

> > But doesn't unsetting the C-L header cause the C-L filter to
> > automatically attempt to generate a new C-L value?
>
> Not unless the C-L filter sees the entire response in the first brigade
> passed through it.  It used to buffer the entire response in order to
> compute the C-L, but I changed that last year.

Ah.  Thanks for the reminder, Brian.  I knew it was that way once upon a
time.  :-)

Re: filtering huge request bodies (like 650MB files)

Posted by Brian Pane <br...@cnet.com>.
On Dec 10, 2003, at 5:15 PM, Cliff Woolley wrote:

> On Wed, 10 Dec 2003, Stas Bekman wrote:
>
>
> But doesn't unsetting the C-L header cause the C-L filter to 
> automatically
> attempt to generate a new C-L value?

Not unless the C-L filter sees the entire response in the first brigade
passed through it.  It used to buffer the entire response in order to
compute the C-L, but I changed that last year.

Brian


Re: filtering huge request bodies (like 650MB files)

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 10 Dec 2003, Stas Bekman wrote:

> No, there is no C-L header. The complete filter looks like so:
>
> sub handler {
> 	# Get the filter object
> 	my($f) = @_;
>
> 	# Only done on the FIRST pass of the filter
> 	unless($f->ctx)
> 	{
> 		$f->r->headers_out->unset('Content-Length');
> 		$f->ctx('');
> 	}
>
> 	return Apache::DECLINED;
>
> } # handler


But doesn't unsetting the C-L header cause the C-L filter to automatically
attempt to generate a new C-L value?

A HEAD response for a broken URL would help.  :-)

Re: filtering huge request bodies (like 650MB files)

Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Thu, 11 Dec 2003, Ian Holsman wrote:
> 
> 
>>does the server's reply to you have a content-length header?
>>if so .. this is probably what is holding up the request in the server.
> 
> 
> Yah, I was going to guess it was probably the C-L filter.  But I thought
> we had logic in the C-L filter to avoid buffering "too much".

No, there is no C-L header. The complete filter looks like so:

sub handler {
	# Get the filter object
	my($f) = @_;

	# Only done on the FIRST pass of the filter
	unless($f->ctx)
	{
		$f->r->headers_out->unset('Content-Length');
		$f->ctx('');
	}

	return Apache::DECLINED;
	
} # handler

Apache::DECLINED tells modperl to just call pass_brigade unmodified.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: filtering huge request bodies (like 650MB files)

Posted by Cliff Woolley <jw...@virginia.edu>.
On Thu, 11 Dec 2003, Ian Holsman wrote:

> does the server's reply to you have a content-length header?
> if so .. this is probably what is holding up the request in the server.

Yah, I was going to guess it was probably the C-L filter.  But I thought
we had logic in the C-L filter to avoid buffering "too much".

--Cliff

Re: filtering huge request bodies (like 650MB files)

Posted by Ian Holsman <Ia...@apache.org>.
Cliff Woolley wrote:

> 
> Which is exactly what is supposed to happen.
> 
> 
>>Obviously it's not how things work at the moment, as the memory is never
>>freed (which could probably be dealt with), but the real problem is that
>>no data will leave the server out before it was completely read in.
> 
> 
> Yes, that would be the real problem.  So somewhere there is a filter (or
> maybe the proxy itself) buffering the entire data stream before sending
> it.  That is a bug.
> 

Chris...
does the server's reply to you have a content-length header?
if so .. this is probably what is holding up the request in the server.

> --Cliff
> 


Re: filtering huge request bodies (like 650MB files)

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 10 Dec 2003, Stas Bekman wrote:

> Chris is trying to filter a 650MB file coming in through a proxy. Obviously he
> sees that httpd-2.0 is allocating > 650MB of memory, since each bucket will
> use the request's pool memory and won't free it until after the request is
> over.

Whoa.  Obviously?  It is NOT supposed to do that.  Buckets do not use pool
memory for that very reason (well, that's one of the two or three big
reasons).

> could theoretically reuse that memory for the next brigade.

Which is exactly what is supposed to happen.

> Obviously it's not how things work at the moment, as the memory is never
> freed (which could probably be dealt with), but the real problem is that
> no data will leave the server out before it was completely read in.

Yes, that would be the real problem.  So somewhere there is a filter (or
maybe the proxy itself) buffering the entire data stream before sending
it.  That is a bug.

--Cliff