You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Greg Stein <gs...@lyra.org> on 2001/01/19 04:17:01 UTC

[PATCH] ap_r* performance patch

I've attached the patch and an strace of my earlier-described design. Greg
Ames ran an "ab" test to measure the performance improvement for the other
patch and saw a 2.6x increase in speed. The patch below gives 3.5x :-)

I'll describe the design in detail:

*) there is a new filter named OLD_WRITE (to support the old writing style;
   the name isn't important, though :-)

*) OLD_WRITE is installed as an AP_FTYPE_CONTENT-1 filter so that it will
   stay at the head of the output chain whenever possible.

*) the filter keeps a buffer of AP_MIN_BYTES_TO_WRITE (9K) in its context,
   allocated using malloc(). Later on, this will be inserted into a HEAP
   bucket and sent down the chain. Since it is on the HEAP, it will be
   returned to storage after delivery down the chain.

*) whenever *any* brigade arrives in the filter chain, this filter will
   prepend its buffer contents (if any) to the brigade and pass all the
   content on down the chain. This provides correct ordering in a
   mixed-output environment. (the buffer contents arrived before the new
   brigade)

*) the flush_buffer() routine is used deliver the filter's buffer to the
   output chain. an optional "extra" data may be delivered, too. This latter
   feature is to support large writes -- they go straight into the chain
   rather than getting copied into a buffer.

*) buffer_output() is the main function to buffer the output for delivery.

   a) it searches for whether OLD_WRITE has been inserted, and inserts it if
      it hasn't been inserted already.

   b) if OLD_WRITE isn't the first filter, then the data is delivered
      normally into the output chain.
      - if we did a regular buffer-store, then the content would skip
        filters in between the top and the OLD_WRITE filter
      - this implies that we return to current bit-by-bit behavior if
        something gets in the way. (nothing should given our filter type)

   c) the buffer is allocated, if necessary, and the data is placed into the
      buffer if room is available. if the buffer becomes full, then it is
      flushed to the output.

   d) if the new data is "big", then we flush the existing contents plus the
      big data (as "extra").

   e) otherwise, the data is "small" but doesn't fit in the buffer. we max
      out the buffer, flush it down the chain, and put the remaining into a
      new buffer.

*) the ap_r* functions simply call buffer_output() to buffer/deliver
   content.

*) a couple further improvements have been noted within the patch.


Of course, please pipe up if you have questions.

Benefits of this patch:

1) 3.5x speedup in requests/sec and thruput on my box
2) strace shows a single writev and hugely improved alloc behavior
3) mixed-style output is possible
4) network congestion is used
5) memory is NOT proportional to output size: it mallocs/frees at each
   network delivery
6) ap_r* API is unchanged

Yesterday, Ryan posted a list of five "priorities", which are all met by the
above list of benefits.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: perf number stuff (was: Re: [PATCH] ap_r* performance patch)

Posted by rb...@covalent.net.

> The strace came from a "telnet" session, so there will definitely be some
> extra read() calls as Apache looks for the rest of the request, fails, then
> moves into a select() to wait for it.

But we are seeing extra read calls even with HTTP/1.0 requests.

> But for the rest: yes, a side-by-side will be great. Didn't Dean do one of
> those already, but then we got sidetracked by the autoindex ugliness?

He ran a couple 2.0 strace's, but he doesn't still have the 1.3 strace's
as far as I know.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

perf number stuff (was: Re: [PATCH] ap_r* performance patch)

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Jan 18, 2001 at 07:47:12PM -0800, rbb@covalent.net wrote:
> 
> > > > [ I presume 2.0 has higher latency in handling the request, and then we slam
> > > >   it all onto the network at once ]
> > > 
> > > I've got a feeling it's the 9K buffer we use as opposed to the 4K 1.3
> > > uses.  It would be interesting to tweak that.
> > 
> > Yup. I'll give it a quick run and see what happens....
> > 
> > *blink* ... the numbers are the same. The strace shows that the writev() has
> > an extra item (since one is limited to the 4k length), so the change
> > definitely got in there. It is simply that there is no discernible speed
> > difference (in my testing, at least).
> 
> might be interesting to see a strace from 2.0 and 1.3 side by side.  There
> are a couple of read calls that we don't need, but I haven't had a chance
> to track them down yet.  hmmmmm.....

The strace came from a "telnet" session, so there will definitely be some
extra read() calls as Apache looks for the rest of the request, fails, then
moves into a select() to wait for it.

But for the rest: yes, a side-by-side will be great. Didn't Dean do one of
those already, but then we got sidetracked by the autoindex ugliness?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: [PATCH] ap_r* performance patch

Posted by rb...@covalent.net.

> > > [ I presume 2.0 has higher latency in handling the request, and then we slam
> > >   it all onto the network at once ]
> > 
> > I've got a feeling it's the 9K buffer we use as opposed to the 4K 1.3
> > uses.  It would be interesting to tweak that.
> 
> Yup. I'll give it a quick run and see what happens....
> 
> *blink* ... the numbers are the same. The strace shows that the writev() has
> an extra item (since one is limited to the 4k length), so the change
> definitely got in there. It is simply that there is no discernible speed
> difference (in my testing, at least).

might be interesting to see a strace from 2.0 and 1.3 side by side.  There
are a couple of read calls that we don't need, but I haven't had a chance
to track them down yet.  hmmmmm.....

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: [PATCH] ap_r* performance patch

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Jan 18, 2001 at 07:37:45PM -0800, rbb@covalent.net wrote:
> 
> > In a mail yesterday, Greg Ames asked about a 1.3 comparison. I just ran some
> > 1.3 numbers:
> > 
> >              req/sec   thruput (kb/s)
> >         1.3   30.20     74.27
> >         2.0    4.17     23.18
> > 2.0 patched   14.51     80.75
> > 
> > 
> > 2.0 is using the prefork MPM. 2.0's "ab" was used to measure all three. The
> > numbers were quite stable. ab and httpd were on the same box.
> > 
> > You can see that 2.0's requests per second is quite a bit lower. But the
> > throughput is higher! I'll let Dean tell us why :-)
> > 
> > [ I presume 2.0 has higher latency in handling the request, and then we slam
> >   it all onto the network at once ]
> 
> I've got a feeling it's the 9K buffer we use as opposed to the 4K 1.3
> uses.  It would be interesting to tweak that.

Yup. I'll give it a quick run and see what happens....

*blink* ... the numbers are the same. The strace shows that the writev() has
an extra item (since one is limited to the 4k length), so the change
definitely got in there. It is simply that there is no discernible speed
difference (in my testing, at least).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: [PATCH] ap_r* performance patch

Posted by rb...@covalent.net.

> In a mail yesterday, Greg Ames asked about a 1.3 comparison. I just ran some
> 1.3 numbers:
> 
>              req/sec   thruput (kb/s)
>         1.3   30.20     74.27
>         2.0    4.17     23.18
> 2.0 patched   14.51     80.75
> 
> 
> 2.0 is using the prefork MPM. 2.0's "ab" was used to measure all three. The
> numbers were quite stable. ab and httpd were on the same box.
> 
> You can see that 2.0's requests per second is quite a bit lower. But the
> throughput is higher! I'll let Dean tell us why :-)
> 
> [ I presume 2.0 has higher latency in handling the request, and then we slam
>   it all onto the network at once ]

I've got a feeling it's the 9K buffer we use as opposed to the 4K 1.3
uses.  It would be interesting to tweak that.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: [PATCH] ap_r* performance patch

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Jan 18, 2001 at 07:17:01PM -0800, Greg Stein wrote:
> I've attached the patch and an strace of my earlier-described design. Greg
> Ames ran an "ab" test to measure the performance improvement for the other
> patch and saw a 2.6x increase in speed. The patch below gives 3.5x :-)

In a mail yesterday, Greg Ames asked about a 1.3 comparison. I just ran some
1.3 numbers:

             req/sec   thruput (kb/s)
        1.3   30.20     74.27
        2.0    4.17     23.18
2.0 patched   14.51     80.75

2.0 is using the prefork MPM. 2.0's "ab" was used to measure all three. The
numbers were quite stable. ab and httpd were on the same box.

You can see that 2.0's requests per second is quite a bit lower. But the
throughput is higher! I'll let Dean tell us why :-)

[ I presume 2.0 has higher latency in handling the request, and then we slam
  it all onto the network at once ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: [PATCH] ap_r* performance patch

Posted by rb...@covalent.net.

> I am very unlikely to modify my patch at this point.  Not because I don't
> think it is a useful patch, but because I have shown the concept.  If I
> modify my patch to cover the details, then we are just going back and
> forth on patches again.  The concepts are all in the patch, please review
> the concepts, we can get the details perfect later.  I am not too
> concerned about separating concept from detail here, since the prototype
> has already proven itself.

Obviously, I'll get it compiling again, and change the symbol names, but
other than that, I doubt I'll implement anything else about it.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: [PATCH] ap_r* performance patch

Posted by rb...@covalent.net.

> > I would expect the output to be:
> > 
> > 	mumble foobar baz
> 
> Not me. Brigades hold data. They aren't an output mechanism. Never have
> been, and never will be. The data that you accumulate into a brigade is
> completely separate from everything else, until you call a function to
> *send* that data.

That's fine, that isn't what you expect, but you said that bucket brigades
and ap_r* were going to work seamlessly together.  _I_ expected exactly
what I posted would work.  I have a feeling that if you tell module
authors that the two work seamlessly, they will expect the same thing.

> Nope. I never expect insertion-into-a-brigade to actually deliver data to
> the client. I have no idea where that idea came from.

It came from my understanding of what you were saying.  If I (as somebody
who is in the code everyday) misunderstood, what at the chances that
module authors will too?

> > Now, take the same idea in my patch:
> > 
> > b = ap_bucket_create_heap(..., "mumble", ...);
> > AP_BRIGADE_INSERT_TAIL(r->bb, b);
> 
> Wait a sec. You're using r->bb. You're comparing apples to oranges. If you
> want an accurate and fair comparison, then your code should *also*
> manipulate "bb" rather than "r->bb".

This is an accurate and fair comparison.  With the shift I made to put a
brigade in the request_rec, this is how handlers would work.  We still
have to create it the same as we do in your patch, it is just stored
someplace else.

I guess you could do the same thing pretty easily, I hadn't really thought
that through last night.

> > I am also assuming that ap_vrprintf would be re-written to not copy the
> > data before passing it to buffer_output.
> 
> Absolutely. ap_vrprintf() and apr_bucket_vprintf() will both need to be
> rewritten at some point to properly deal with more than 4K of data.
> 
> At the moment, nobody actually uses apr_brigade_vprintf. If we use your
> patch, then we schedule a fix for that. If we don't use your patch, then we
> can decide to keep the function and fix it, or simply toss it.
> 
> > I'm not really sure how you
> > would do that, but I am assuming you have some thoughts.
> 
> Yes. We fix it the same way for both functions. Use the apr_vformatter
> function and hook up the flush_func to call buffer_output or to insert the
> value into a brigade.

But isn't that two copies?  apr_vformatter makes a copy, and then you will
have another copy in the old_write_filter functions (I lumped them all
together here).  If you just insert the bucket into the brigade, then
haven't you lost the coalesce features?  I guess you could create a new
function like flush_buffer, that does an apr_vformatter under the
covers.  I would need to really look at how this would be done.  This is
just a pointer to say "I don't see this", not a request for a change.

> >...
> > Also, this isn't a general fix, so it only works for Apache,
> 
> It is a fix for our problems with the legacy ap_r* interface. It fixes it
> quite well :-)

I just thought of something else.  Mod_perl 2.0 has it's own set of
functions that basically do exactly what the ap_brigade_* functions
do.  This patch doesn't help modules like that at all.  So, now there is
duplicate coalesce code, one in the server and one in mod_perl.  I am
wondering if mod_perl could insert a copy of old_write_filter so that we
can remove some duplication there.


I am very unlikely to modify my patch at this point.  Not because I don't
think it is a useful patch, but because I have shown the concept.  If I
modify my patch to cover the details, then we are just going back and
forth on patches again.  The concepts are all in the patch, please review
the concepts, we can get the details perfect later.  I am not too
concerned about separating concept from detail here, since the prototype
has already proven itself.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: [PATCH] ap_r* performance patch

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Jan 18, 2001 at 10:36:12PM -0800, rbb@covalent.net wrote:
> I have begun to review the patch, and I disagree that this solves the
> problem of mixing the two APIs.  Take this code snippet (ignore the
> bucket type, just look at the concept, any bucket type will have the same 
> problem):
> 
> b = ap_bucket_create_heap(..., "mumble", ...);
> AP_BRIGADE_INSERT_TAIL(bb, b);

Okay. Now you have a data structure sitting in memory.

> ap_rputs(r, " foobar ");

And you sent some output.

> b = ap_bucket_create_heap(..., "baz", ...);
> AP_BRIGADE_INSERT_TAIL(bb, b);

You modified your data structure.

> ap_pass_brigade(r->output_filters, bb);

And you sent this output.

> I would expect the output to be:
> 
> 	mumble foobar baz

Not me. Brigades hold data. They aren't an output mechanism. Never have
been, and never will be. The data that you accumulate into a brigade is
completely separate from everything else, until you call a function to
*send* that data.

>...
> The problem is that there is nothing in this API to mark the distinction
> between the bucket_brigade and the request API's.  Just move the ap_rputs
> call into a separate function, and you have the exact case that you
> described earlier today.

Nope. I never expect insertion-into-a-brigade to actually deliver data to
the client. I have no idea where that idea came from.

You manipulate a brigade. You send the brigade.

Manipulation does not and never did imply "send this data."

> Now, take the same idea in my patch:
> 
> b = ap_bucket_create_heap(..., "mumble", ...);
> AP_BRIGADE_INSERT_TAIL(r->bb, b);

Wait a sec. You're using r->bb. You're comparing apples to oranges. If you
want an accurate and fair comparison, then your code should *also*
manipulate "bb" rather than "r->bb".

And guess what? It has the same behavior as above. You manipulate a brigade.
You send data with ap_rputs(), manipulate the brigade some more, then you
deliver the brigade.

Exactly as a person would expect.

>...
> I am also assuming that ap_vrprintf would be re-written to not copy the
> data before passing it to buffer_output.

Absolutely. ap_vrprintf() and apr_bucket_vprintf() will both need to be
rewritten at some point to properly deal with more than 4K of data.

At the moment, nobody actually uses apr_brigade_vprintf. If we use your
patch, then we schedule a fix for that. If we don't use your patch, then we
can decide to keep the function and fix it, or simply toss it.

> I'm not really sure how you
> would do that, but I am assuming you have some thoughts.

Yes. We fix it the same way for both functions. Use the apr_vformatter
function and hook up the flush_func to call buffer_output or to insert the
value into a brigade.

Reasonably straight-forward, but it just needs a Round Tuit.

[ I'm not sure what the copy behavior would be for this; I'm reasonably
  confident that we could get away with a single copy (once into the output
  buffer). note that BUFF also use vformatter ]

> If not, then for
> any ap_rprintf or ap_rvprintf call, you are copying twice before passing
> the data to the filter stack, which I really dislike.

I would dislike it, too :-)

> IMHO this patch has not solved the problem you were most interested in
> solving,

I think it does. Your "counter-example" never mixed the output -- which was
my issue. The counter-example simply built a brigade and called ap_rputs
while it was doing so. Not the same :-)

>...
> Also, this isn't a general fix, so it only works for Apache,

It is a fix for our problems with the legacy ap_r* interface. It fixes it
quite well :-)

Granted, it does not assist others. I'm not so worried about that -- we
don't have another app yet that uses buckets/brigades, so it is hard to
determine the true requirements. We could build something "for another app"
only to miss. We've done it already with Apache :-) and had to build in a
number of concepts that weren't in there initially (e.g. copy).

>...
> I look forward to hearing your thoughts on this, this idea is very
> interesting, and it is VERY different than what I originally envisioned
> when you first posted what you were thinking.

Yah, sorry about the change. Like I said: I ran into a problem with free'ing
of buckets. We may want to think on that topic post 2.0. The filter
remained, though.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: [PATCH] ap_r* performance patch

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Jan 18, 2001 at 07:53:08PM -0800, rbb@covalent.net wrote:
>...
> I am actually about to run a test on your patch, my patch, and the current
> 2.0.  Along with that, I will be generating strace's, and comparing them.

Kick ass.

> > I hope the post explains it well enough, but just ask if something is
> > unclear.
> 
> This is definately very different from what I had envisioned given your
> explanation.

I started down a path of reusing a bucket, but realized that our design
simply doesn't allow that. apr_bucket_destroy() will free() the bucket --
you can't do anything about it. The destroy() bucket-type function is only
for the per-bucket data. So... that blew away the thought of reusing a
brigade and bucket for transmitting data into the buffering filter. But I
found a better way :-)

[ we could probably still reuse brigades in many places ]

The outline I posted does correspond to the new pattern, but the stuff I
was talking about yesterday/today would only work to a point (the filter is
the same, the only diff is how you get data into it).

> I really need to spend a lot more time looking at it, but
> that will have to come later tonight.  :-)

Not a problem. There isn't a rush on this stuff right now. I've got a big
patch from John Vasta (against mod_dav) to review and check in.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: [PATCH] ap_r* performance patch

Posted by rb...@covalent.net.

> > A few comments before I review in great detail.  The actual numbers (2 vs
> > 3.5) can't be compared here.  I used a 4K buffer while you used a 9KK
> > buffer.  That means I always had one more allocation than you do.  Not a
> > big deal, but it is there.
> 
> Agreed. I just had to throw it out, though :-)  It is entirely conceivable
> that your patch would be a 4x on my box. *shrug*  Maybe Greg Ames will do a
> comparison of the different patches.

I am actually about to run a test on your patch, my patch, and the current
2.0.  Along with that, I will be generating strace's, and comparing them.

> I hope the post explains it well enough, but just ask if something is
> unclear.

This is definately very different from what I had envisioned given your
explanation.  I really need to spend a lot more time looking at it, but
that will have to come later tonight.  :-)

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: [PATCH] ap_r* performance patch

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Jan 18, 2001 at 07:35:16PM -0800, rbb@covalent.net wrote:
> 
> A few comments before I review in great detail.  The actual numbers (2 vs
> 3.5) can't be compared here.  I used a 4K buffer while you used a 9KK
> buffer.  That means I always had one more allocation than you do.  Not a
> big deal, but it is there.

Agreed. I just had to throw it out, though :-)  It is entirely conceivable
that your patch would be a 4x on my box. *shrug*  Maybe Greg Ames will do a
comparison of the different patches.

> This only solves the Apache problem, which doesn't solve the main problem
> that Dean pointed out, namely that the brigades don't buffer.  You have
> put buffering into the core.

Agreed. That is the only problem that I am interested in at this point. Our
first priority is a high-performance Apache 2.0. Secondary is to have stuff
in APRUTIL that other apps can use.

> Item 5 is easy to get with the other patch, I used palloc, because I
> wanted to avoid the malloc/free whenever possible.

Agreed. You had everything from that list (or could easily do so), in your
patch, except for (3).

> More detailed review later.

Thanks!

I hope the post explains it well enough, but just ask if something is
unclear.

Cheers,
-g


> Ryan
> 
> > 1) 3.5x speedup in requests/sec and thruput on my box
> > 2) strace shows a single writev and hugely improved alloc behavior
> > 3) mixed-style output is possible
> > 4) network congestion is used
> > 5) memory is NOT proportional to output size: it mallocs/frees at each
> >    network delivery
> > 6) ap_r* API is unchanged
> > 
> > Yesterday, Ryan posted a list of five "priorities", which are all met by the
> > above list of benefits.
> > 
> > Cheers,
> > -g
> > 
> > -- 
> > Greg Stein, http://www.lyra.org/
> > 
> 
> 
> _______________________________________________________________________________
> Ryan Bloom                        	rbb@apache.org
> 406 29th St.
> San Francisco, CA 94131
> -------------------------------------------------------------------------------

-- 
Greg Stein, http://www.lyra.org/

Re: [PATCH] ap_r* performance patch

Posted by rb...@covalent.net.

A few comments before I review in great detail.  The actual numbers (2 vs
3.5) can't be compared here.  I used a 4K buffer while you used a 9KK
buffer.  That means I always had one more allocation than you do.  Not a
big deal, but it is there.

This only solves the Apache problem, which doesn't solve the main problem
that Dean pointed out, namely that the brigades don't buffer.  You have
put buffering into the core.

Item 5 is easy to get with the other patch, I used palloc, because I
wanted to avoid the malloc/free whenever possible.

More detailed review later.

Ryan

> 1) 3.5x speedup in requests/sec and thruput on my box
> 2) strace shows a single writev and hugely improved alloc behavior
> 3) mixed-style output is possible
> 4) network congestion is used
> 5) memory is NOT proportional to output size: it mallocs/frees at each
>    network delivery
> 6) ap_r* API is unchanged
> 
> Yesterday, Ryan posted a list of five "priorities", which are all met by the
> above list of benefits.
> 
> Cheers,
> -g
> 
> -- 
> Greg Stein, http://www.lyra.org/
> 


_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: [PATCH] ap_r* performance patch

Posted by rb...@covalent.net.

I have begun to review the patch, and I disagree that this solves the
problem of mixing the two APIs.  Take this code snippet (ignore the
bucket type, just look at the concept, any bucket type will have the same 
problem):

b = ap_bucket_create_heap(..., "mumble", ...);
AP_BRIGADE_INSERT_TAIL(bb, b);

ap_rputs(r, " foobar ");

b = ap_bucket_create_heap(..., "baz", ...);
AP_BRIGADE_INSERT_TAIL(bb, b);

ap_pass_brigade(r->output_filters, bb);

I would expect the output to be:

	mumble foobar baz

But it's going to be:

	foobar mumble baz

The problem is that there is nothing in this API to mark the distinction
between the bucket_brigade and the request API's.  Just move the ap_rputs
call into a separate function, and you have the exact case that you
described earlier today.

Now, take the same idea in my patch:

b = ap_bucket_create_heap(..., "mumble", ...);
AP_BRIGADE_INSERT_TAIL(r->bb, b);

ap_rputs(r, " foobar ");

ap_brigade_standardize(r->bb);   /* Could be removed, see below */

b = ap_bucket_create_heap(..., "baz", ...);
AP_BRIGADE_INSERT_TAIL(r->bb, b);

ap_pass_brigade(r->output_filters, bb);

This code does give:

	mumble foobar baz

The standardize call marks the difference between the two APIs.  However,
if you absolutely must, we could actually remove that call (I'm not
writing that code snippet, just remove that line.  :-)

The standardize call is removed by adding a function call to some of the
AP_BRIGADE_* macros.  I am currently thinking of AP_BRIGADE_INSERT_TAIL
and AP_BRIGADE_CONCATENATE.  It needs to be any macro that inserts a
bucket at the end of the brigade, those are the two I have seen.

Basically, it changes AP_BRIGADE_INSERT_TAIL from:

#define AP_BRIGADE_INSERT_TAIL(b, e)                                    \
        AP_RING_INSERT_TAIL(&(b)->list, (e), ap_bucket, link)

to:

#define AP_BRIGADE_INSERT_TAIL(b, e)                                    \
	ap_brigade_standardize(b)					\
        AP_RING_INSERT_TAIL(&(b)->list, (e), ap_bucket, link)

Now, I really dislike this.  I want people to realize that they are switch
APIs, because it forces them to consider what they are doing.

I am also assuming that ap_vrprintf would be re-written to not copy the
data before passing it to buffer_output.  I'm not really sure how you
would do that, but I am assuming you have some thoughts.  If not, then for
any ap_rprintf or ap_rvprintf call, you are copying twice before passing
the data to the filter stack, which I really dislike.

IMHO this patch has not solved the problem you were most interested in
solving, and it hides it in such a way that it is more difficult to find
and fix.  Also, this isn't a general fix, so it only works for Apache,
which means that it basically needs to be duplicated in every bucket
program, and potentially every protocol module (still need to finish
abstracting some stuff out of http_protocol).

I look forward to hearing your thoughts on this, this idea is very
interesting, and it is VERY different than what I originally envisioned
when you first posted what you were thinking.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------