You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by "Roy T. Fielding" <fi...@kiwi.ICS.UCI.EDU> on 1999/11/13 08:57:43 UTC

bucket brigades and IOL

About two years ago I wasted a lot of time writing an Ada95 library
called Onions that provides a stackable stream abstraction for files,
sockets, etc.  It is at <http://www.ics.uci.edu/pub/websoft/libwww-ada95/>
if you want to take a look at it, but I don't recommend looking at the
code since it is almost all just working around Ada95's lack of a
system interface.  I'll describe the worthwhile bits here.

The heart of Onions is the input and output stream object
classes and classwide types for building a data stream via a
stack of stream objects (Input_Pipe and Output_Pipe).  Reading
from the head of an input pipe causes the head stream object
to read from the next outbound stream object, and on down the line.
Likewise for writing to the head of an output pipe.  One of the
main features of streams is that they can filter the data as it
passes, converting, adding to, and/or removing from the data
before giving it to the next object.  Since multiple streams can be
cascaded, the complete data conversion is the sum of the individual
data conversions performed by the stream objects.

So far, no big deal -- this can be manually created by stacking ap_iol
types in a meaningful way.  But, the one unique thing I did in Onions was 
abstract the memory handling into something called Buckets and moved them
around in Bucket_Brigades.  A bucket is an allocated segment of memory
with pointers to its allocation address and current size.  If I were doing
this in C, I'd also add a pointer to current start address and allocated
size, so that a single bucket could be shrunk from both ends without
copying, and a function pointer for freeing it at the stream end.
Note that this is the same type of memory structure that IO-Lite uses,
though developed independently and for different reasons.

A bucket brigade is a list-queue of buckets.  Each of the stream read/write
calls would pass a bucket brigade instead of single bucket, since this
made insertion by filters more efficient, with the general idea being that
the outbound end of the sream would be writing them out using writev
or reading them in using readv, which is about as efficient as I could
get with Ada95.  [I call it a list-queue instead of just queue because you
have the choice of removing buckets from (or adding to) the queue one
bucket at a time or an entire linked list of buckets.]

But we could go one step further.  A bucket is an ADT, and as such can
be used as a general handle for read-only memory, read-write memory,
cache object, file handle, mmap handle, file name, URL, whatever.
What if, instead of just a stream of memory, it could pass around a
stream of memory interspersed with file handles or references to
remote objects?  A filter could then add stuff around the stream without
causing too much parsing overhead, and if it needed to look at all the
bytes in the stream it would just replace the bucket handle with a stream
of memory sucked from that handle.  Something like this was talked about
last year (see threads on "Stacking up Response Handling" on 23 Sep 1998
and "I/O filters & reference counts" in late December 1998 and January 1999).
And Dean started something with ap_buf.h, but I don't know how he meant
to finish it.

What I was thinking of was

   typedef enum {
       AP_BUCKET_rwmem,
       AP_BUCKET_rmem,
       AP_BUCKET_file_t,
       AP_BUCKET_mmap_t,
       AP_BUCKET_filename,
       AP_BUCKET_cached_entity,
       AP_BUCKET_URI,
   } ap_bucket_color_t;

   typedef struct ap_bucket_t ap_bucket_t;
   struct ap_bucket_t {
       ap_bucket_color_t color;
       void              *content;
       ap_status_t       (*free)(ap_bucket_t *bucket);
       unsigned int      refcount;
   };

   typedef struct ap_bucket_rwmem_t ap_bucket_rwmem_t;
   struct ap_bucket_rwmem_t {
       void   *alloc_addr;
       size_t  alloc_len;
       void   *addr;
       size_t  len;
   };

   typedef struct ap_bucket_rmem_t ap_bucket_rmem_t;
   struct ap_bucket_rmem_t {
       void   *addr;
       size_t  len;
   };

   typedef struct ap_bucket_filename ap_bucket_filename;
   struct ap_bucket_filename {
       ap_context_t *ctx;
       char         *name;
       ap_stat_t    *stat;  /* useful if already stat'ed */
       ap_aa_t      *conf;  /* access control structure for this file */
   };

   ...

and then

   typedef struct ap_bucket_list_t ap_bucket_list_t;
   struct ap_bucket_list_t {
       ap_bucket_t *bucket;
       ap_bucket_list_t *prev;
       ap_bucket_list_t *next;
   };

   typedef struct ap_brigade_t ap_brigade_t;
   struct ap_brigade_t {
       ap_context_t     *ctx;
       ap_bucket_list_t *first;
       ap_bucket_list_t *last;
       unsigned int     count;
   };

and then construct the input and output streams as pushing these
bucket brigades to or from the client.  The streams would have to
be a little more complicated than Onions, since I learned later that
you also need a parallel stream of header fields (in tokenized form)
in order for it to work with anything HTTP-like.

Why use an enum instead of a bunch of file pointers for each type
of bucket, kind of like ap_iol?  Because it allows adjacent memory
buckets (the most frequent kind after a filter operation) to be
gathered into a single writev.  Also, we need a way to be able to
set up an operation and figure out what it will produce without
actually performing the operation -- this is for OPTIONS and HEAD.

Note that this would completely change the way we handle internal
redirects, subrequests, server-side include, mod_proxy, access control, etc.
And then most of the API hooks would need to change.  I think that is why
Dean was putting it off until 2.1.  The annoying thing is that this is the
most useful rearchitecting of the server -- the MPM, APR, and hook changes
make 2.0 easier/cleaner/faster to port to other platforms, but layering
enables in one fell swoop almost every significant non-config feature
that our users have requested.  A cache would just be a hash table or
btree of file buckets, complete with AA info.

Anyway, that was stuck in the back of my head and had to get out.
I won't be able to work on it until after the dissertation is done,
which every day seems to be further away.  Maybe 3.0, with rHTTP/2.0.

....Roy

Re: bucket brigades and IOL

Posted by Ben Laurie <be...@algroup.co.uk>.
Dean Gaudet wrote:
> 
> On Sat, 13 Nov 1999, Ben Laurie wrote:
> 
> > Also, the usual objections still apply - i.e. it is awkward to do things
> > like searching for particular strings, since they may cross boundaries.
> > I'm beginning to think that the right answer to this is to provide nice
> > matching functions that know about the chunked structures, and last
> > resort functions that'll glue it all back into one chunk...
> 
> yeah, we use a zero-copy library at criticalpath and we frequently run
> into the case where we want to do some string-like operation on data in
> the zero-copy datastructure.  you end up having to either copy it to a
> regular C string, or re-write all of the string functions.  consider
> flex... or a regex library... neither work terribly well in the face of a
> zero-copy abstraction because they don't have the equivalent of
> writev()/readv().
> 
> but.  if apache had it, maybe we would see more libraries start to adopt
> iovec-like interfaces... dunno.

Well, its not as if there's a really huge number of string functions,
and regex is far and away the hardest.

Someone needs to bite that bullet!

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: bucket brigades and IOL

Posted by Dean Gaudet <dg...@arctic.org>.

On Sat, 13 Nov 1999, Ben Laurie wrote:

> Also, the usual objections still apply - i.e. it is awkward to do things
> like searching for particular strings, since they may cross boundaries.
> I'm beginning to think that the right answer to this is to provide nice
> matching functions that know about the chunked structures, and last
> resort functions that'll glue it all back into one chunk...

yeah, we use a zero-copy library at criticalpath and we frequently run
into the case where we want to do some string-like operation on data in
the zero-copy datastructure.  you end up having to either copy it to a
regular C string, or re-write all of the string functions.  consider
flex... or a regex library... neither work terribly well in the face of a
zero-copy abstraction because they don't have the equivalent of
writev()/readv().

but.  if apache had it, maybe we would see more libraries start to adopt
iovec-like interfaces... dunno.

Dean


Re: bucket brigades and IOL

Posted by Ben Laurie <be...@algroup.co.uk>.
"Roy T. Fielding" wrote:
> Anyway, that was stuck in the back of my head and had to get out.
> I won't be able to work on it until after the dissertation is done,
> which every day seems to be further away.  Maybe 3.0, with rHTTP/2.0.

I've got to say that this is the most coherent suggestion along these
lines that I've seen yet. I rather like it. One thing I'd add is that if
you are going to have a movable "start of block" pointer, and changeable
length, it can be nice to allocate extra around the edges under some
circumstances, so that lower layers can expand the block without having
to add extra chunks.

Also, the usual objections still apply - i.e. it is awkward to do things
like searching for particular strings, since they may cross boundaries.
I'm beginning to think that the right answer to this is to provide nice
matching functions that know about the chunked structures, and last
resort functions that'll glue it all back into one chunk...

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi