Posted to dev@httpd.apache.org by "Roy T. Fielding" <fi...@kiwi.ICS.UCI.EDU> on 2000/07/10 22:36:47 UTC

Re: filtering patches

Pools work well for HTTP requests because their operational overhead
matches the nature of interactions -- pool overhead occurs before and
after the critical path of handling the request, and the stateless
interaction of HTTP means the entire pool can be cleaned between requests.

Aside from the above coolness for HTTP serving, there is nothing magic
about pools.  They can be used for filter-local storage provided that
the pool is as persistent as the filter, but they can't be used for
bucket data.  Buckets require their own memory allocation/sharing/freeing.
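Roy's lifetime argument can be sketched with a toy model (Python purely for illustration; the real code is C with APR pools, and every name below -- `Pool`, `Bucket`, their methods -- is hypothetical, not the actual Apache API):

```python
class Pool:
    """Toy pool: allocations live exactly as long as the pool."""
    def __init__(self):
        self.allocations = []
    def alloc(self, data):
        buf = [data]              # stand-in for pool-owned memory
        self.allocations.append(buf)
        return buf
    def destroy(self):
        for buf in self.allocations:
            buf[0] = None         # everything vanishes at pool cleanup

class Bucket:
    """A bucket owns its payload via a refcount, so the data can outlive
    the filter (and the filter's pool) that created it."""
    def __init__(self, data):
        self.data = data
        self.refs = 1
    def destroy(self):
        self.refs -= 1
        if self.refs == 0:
            self.data = None      # released when the last holder lets go

# Hazard: payload allocated from the filter's own pool...
filter_pool = Pool()
payload = filter_pool.alloc(b"hello")
filter_pool.destroy()             # filter torn down before data hits the wire
assert payload[0] is None         # downstream would now read freed memory

# Safe: the bucket carries its own storage, independent of any pool.
b = Bucket(b"hello")
assert b.data == b"hello"         # still valid after any pool is destroyed
b.destroy()
```

The point of the model: the pool frees on *its* schedule, while bucket data must be freed on the *consumer's* schedule, which is why buckets need their own allocation discipline.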

....Roy

Re: filtering patches

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Greg Stein wrote:
> 
> Ryan is popping the filters off the stack as he sends the data
> through the chain.

I saw that but forgot.  That's silly.  Maybe we shouldn't think of
it as a stack, because of the other uses of the term, but more
like the fractionating tower at a cracking plant.  Or maybe a
leaching field. :-)
-- 
#ken    P-)}

Ken Coar                    <http://Golux.Com/coar/>
Apache Software Foundation  <http://www.apache.org/>
"Apache Server for Dummies" <http://Apache-Server.Com/>
"Apache Server Unleashed"   <http://ApacheUnleashed.Com/>

Re: filtering patches

Posted by Greg Stein <gs...@lyra.org>.
On Mon, Jul 10, 2000 at 03:14:14PM -0700, Roy T. Fielding wrote:
>...
> The data within the buckets pass down the filter chain and outside
> the control of the filter.  The filter's lifetime may be shorter than
> the lifetime of the data that it has sent.  Therefore, any data that
> it sends downstream cannot be from a pool that the filter creates,
> and that includes the bucket brigade structure that houses the data.
> 
> >Regardless of whether I do eventually 'get it,' I propose that
> >each bucket have a pool pointer for a sandbox for the filters,
> 
> Each filter will need a pointer to a local pool that matches its
> own lifetime and a pointer to the stream to which it is attached.
> New buckets are obtained from the stream allocator, which is most
> like a connection-lifetime pool with a free list and backing store.
> 
> The interface to all this is extremely simple.  We don't need to remove
> any of the ap_w* routines -- they are just a different interface that is
> only capable of writing const strings to the output stack.

I think that all incoming data should be scoped as "this goes away when you
return." Only when a filter needs to alter that assumption ("set-aside" is
the term that I've been using), should explicit work be done. IOW, when a
filter needs to keep the data around longer than its return-from-function,
it should take the appropriate steps (e.g. copy the memory, dup the file).

The filters that I've seen so far do not require extended lifetimes. For the
few that do, I think they should take the extra work (rather than imposing
the work on the rest that don't).
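Greg's set-aside rule can be modeled with a hypothetical sketch (toy Python, not the actual filter API): incoming data is valid only for the duration of the call, and only a filter that must hold data past its return pays the copying cost:

```python
class BufferingFilter:
    """Only a filter that must hold data past its return does the
    set-aside work -- here, an explicit copy."""
    def __init__(self):
        self.saved = []
    def __call__(self, chunk):
        self.saved.append(bytes(chunk))   # set-aside: copy before returning

# The caller owns the buffer and may reuse it after each call returns.
buf = bytearray(b"first")
f = BufferingFilter()
f(buf)
buf[:] = b"secnd"        # buffer reused; only a copy preserves the old bytes
f(buf)
assert f.saved == [b"first", b"secnd"]
```

Without the copy, both saved entries would alias the caller's reused buffer and read `b"secnd"` -- the bug the set-aside convention prevents.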

> I need to look at Ryan's latest patch to understand why he thinks it is
> less efficient than hooks -- my guess is that we just aren't thinking
> of the same thing when I say stack.

Ryan is popping the filters off the stack as he sends the data through the
chain. Because of the "pop", they would need to be re-inserted after each
pass through the chain.

Needless to say, I much prefer a simple linked list :-)
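The linked-list alternative might look like this toy model (hypothetical names, not Apache's actual structures): passing data down the chain never removes a filter, so the chain is intact for the next pass:

```python
class Filter:
    """Filters as a singly linked list: sending data down the chain never
    pops a filter, so the chain can be traversed again and again."""
    def __init__(self, transform, next_filter=None):
        self.transform = transform
        self.next = next_filter
    def pass_down(self, data):
        data = self.transform(data)
        return self.next.pass_down(data) if self.next else data

sink = Filter(lambda d: d)               # terminal "network" filter
upper = Filter(str.upper, sink)
chain = Filter(lambda d: d + "!", upper)

assert chain.pass_down("ok") == "OK!"
assert chain.pass_down("again") == "AGAIN!"   # still works: nothing popped
```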

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: filtering patches

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Greg Stein wrote:
> 
> In other words, a gzip filter would not have to set aside the entire
> request. As the data arrives, it compresses it and sends the (partial)
> result down the wire.

Well, it needs to keep them if it's going to emit a valid
Content-length response header field, which is the point.
You can't send that field after the body transmission has
begun. :-)
-- 
#ken    P-)}


Re: filtering patches

Posted by rb...@covalent.net.
On Tue, 11 Jul 2000, Greg Stein wrote:

> On Tue, Jul 11, 2000 at 08:24:40AM -0400, Rodent of Unusual Size wrote:
> >...
> > Except that the black-box filter use of that will cause it to
> > grow and grow until the request is completed.  If a bucket
> > has its own pool, the pool is destroyed and the memory made
> > available for re-allocation as soon as the bucket is dumped
> > to the wire.  For pathological cases like mod_gzip, in which
> > all the buckets hang around until the filtering is done, the
> 
> Isn't gzip a streaming protocol? I really believe it is:
> 
> $ cat foo | gzip -c > foo.gz
> $ zcat foo.gz | diff - foo
> $
> 
> In other words, a gzip filter would not have to set aside the entire
> request. As the data arrives, it compresses it and sends the (partial)
> result down the wire.

Content-length.

It can be streaming, but it will most likely be a caching
filter.  Regardless, it is an easy example that people can understand and
talk about intelligently.  If the exact example is a bit off, just having
a name for modules like this is useful.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------


Re: filtering patches

Posted by Greg Stein <gs...@lyra.org>.
On Tue, Jul 11, 2000 at 08:24:40AM -0400, Rodent of Unusual Size wrote:
>...
> Except that the black-box filter use of that will cause it to
> grow and grow until the request is completed.  If a bucket
> has its own pool, the pool is destroyed and the memory made
> available for re-allocation as soon as the bucket is dumped
> to the wire.  For pathological cases like mod_gzip, in which
> all the buckets hang around until the filtering is done, the

Isn't gzip a streaming protocol? I really believe it is:

$ cat foo | gzip -c > foo.gz
$ zcat foo.gz | diff - foo
$

In other words, a gzip filter would not have to set aside the entire
request. As the data arrives, it compresses it and sends the (partial)
result down the wire.
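The streaming property can also be shown programmatically; this Python sketch feeds one compressor chunk by chunk, so no step ever needs the whole body at once (`wbits=31` is the standard `zlib` way to request the gzip container):

```python
import gzip
import zlib

# One streaming compressor, fed incrementally -- the filter analogue of
# "compress as the data arrives, send partial results down the wire".
compressor = zlib.compressobj(wbits=31)   # 31 = gzip container format
chunks = [b"a" * 1000, b"b" * 1000, b"c" * 1000]
out = b"".join(compressor.compress(c) for c in chunks) + compressor.flush()

# The concatenated partial outputs form one valid gzip stream.
assert gzip.decompress(out) == b"".join(chunks)
```

What streaming does *not* give you, as Ken and Ryan note above, is the final compressed size up front -- hence the Content-length problem.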

Cheers,
-g


Re: filtering patches

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
rbb@covalent.net wrote:
> 
> I'm confused.  With pools, each filter will have to associate itself with
> its own pool, and thus create and destroy its own working area.  With
> malloc/free the filters have to create/destroy the memory used to pass
> around the data (done through the bucket API), but the working area
> can be the pool in the request_rec.

Except that the black-box filter use of that will cause it to
grow and grow until the request is completed.  If a bucket
has its own pool, the pool is destroyed and the memory made
available for re-allocation as soon as the bucket is dumped
to the wire.  For pathological cases like mod_gzip, in which
all the buckets hang around until the filtering is done, the
effect is essentially the same as having the filters use the
request_rec pool.  For other cases, in which the buckets get
dumped in sequence before the request is completed, having
each bucket have a pool has a lower memory footprint.

Consider a filter that uses 12K of working storage for each
bucket it processes, and a request with 1,000 buckets in the
output sequence.  If the filter uses the request_rec, that's
12MB allocated before the request is completed.  In the
mod_gzip case, it's also 12MB.  In a non-pathological case,
it's 12K*n, where n is the number of buckets between the
filter and the wire.
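Ken's arithmetic, spelled out (the in-flight depth of 3 below is an arbitrary illustration, not a measured number):

```python
PER_BUCKET_KB = 12
BUCKETS = 1000

# Working storage drawn from the request_rec pool is not reclaimed until
# the request finishes, so every bucket's 12K stays allocated at once.
request_pool_peak_kb = PER_BUCKET_KB * BUCKETS        # 12,000K, i.e. ~12MB

# With a pool per bucket and buckets dumped in sequence, only the buckets
# still in flight between this filter and the wire are live at once.
in_flight = 3                                         # illustrative depth n
per_bucket_peak_kb = PER_BUCKET_KB * in_flight        # 12K * n

assert request_pool_peak_kb == 12_000
assert per_bucket_peak_kb == 36
```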

> I do not believe each bucket needs its own pool, because the
> bucket_brigade has its own pool.

This has the same problem as using the request_rec.
-- 
#ken    P-)}


Re: filtering patches

Posted by rb...@covalent.net.
On Mon, 10 Jul 2000, Rodent of Unusual Size wrote:
> "Roy T. Fielding" wrote:
> > 
> > Aside from the above coolness for HTTP serving, there is nothing magic
> > about pools.  They can be used for filter-local storage provided that
> > the pool is as persistent as the filter, but they can't be used for
> > bucket data.  Buckets require their own memory allocation/sharing/
> > freeing.
> 
> Please substantiate so I 'get it..'
> 
> Regardless of whether I do eventually 'get it,' I propose that
> each bucket have a pool pointer for a sandbox for the filters,
> just as we now provide in just about every other passed-around
> structure.  If the data can't be kept in the pool, all right
> (though that's the part I hope to 'get') -- but each filter
> shouldn't have to create/destroy its own working area.

I'm confused.  With pools, each filter will have to associate itself with
its own pool, and thus create and destroy its own working area.  With
malloc/free the filters have to create/destroy the memory used to pass
around the data (done through the bucket API), but the working area can be
the pool in the request_rec.

I do not believe each bucket needs its own pool, because the
bucket_brigade has its own pool.  In general, buckets shouldn't be passed
around without a bucket_brigade.  This is why buckets don't have their own
pools.  I do mention in the comments that the pool in the bucket brigade
is used to limit the brigade's lifetime, although I can't remember right
now if I actually registered a cleanup for it.
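The lifetime-limiting cleanup Ryan describes can be modeled like this (toy Python with hypothetical names; the real mechanism is a C pool-cleanup registration):

```python
class Pool:
    """Toy pool supporting cleanup registration, in the spirit of
    APR-style pools."""
    def __init__(self):
        self.cleanups = []
    def register_cleanup(self, fn):
        self.cleanups.append(fn)
    def destroy(self):
        for fn in reversed(self.cleanups):
            fn()                  # run cleanups in LIFO order

class Brigade:
    """A brigade limits its lifetime by registering a cleanup with the
    pool it was created from."""
    def __init__(self, pool):
        self.buckets = []
        pool.register_cleanup(self.destroy)
    def destroy(self):
        self.buckets.clear()      # brigade torn down with the pool

p = Pool()
bb = Brigade(p)
bb.buckets.append(b"data")
p.destroy()
assert bb.buckets == []           # pool cleanup reached the brigade
```

If no cleanup is registered (the case Ryan says he can't recall), destroying the pool would leave the brigade's buckets untouched, which is exactly the gap he is flagging.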

Ryan



Re: filtering patches

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
"Roy T. Fielding" wrote:
> 
> Aside from the above coolness for HTTP serving, there is nothing magic
> about pools.  They can be used for filter-local storage provided that
> the pool is as persistent as the filter, but they can't be used for
> bucket data.  Buckets require their own memory allocation/sharing/
> freeing.

Please substantiate so I 'get it.'

Regardless of whether I do eventually 'get it,' I propose that
each bucket have a pool pointer for a sandbox for the filters,
just as we now provide in just about every other passed-around
structure.  If the data can't be kept in the pool, all right
(though that's the part I hope to 'get') -- but each filter
shouldn't have to create/destroy its own working area.
-- 
#ken    P-)}


Re: filtering patches

Posted by rb...@covalent.net.
> Pools work well for HTTP requests because their operational overhead
> matches the nature of interactions -- pool overhead occurs before and
> after the critical path of handling the request, and the stateless
> interaction of HTTP means the entire pool can be cleaned between requests.
> 
> Aside from the above coolness for HTTP serving, there is nothing magic
> about pools.  They can be used for filter-local storage provided that
> the pool is as persistent as the filter, but they can't be used for
> bucket data.  Buckets require their own memory allocation/sharing/freeing.

Roy, I've been trying to explain this, but failing miserably.  Perhaps you
could take a stab at explaining why pools won't work for the actual data.

Thanks,

Ryan
