You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Rici Lake <ri...@ricilake.net> on 2005/04/25 19:47:40 UTC

Proposed patch: always cleanup brigades in ap_pass_brigade

Regardless of any other changes to the brigade API, this seems to me to 
be a good idea:

Index: util_filter.c
===================================================================
--- util_filter.c       (revision 158730)
+++ util_filter.c       (working copy)
@@ -500,6 +500,7 @@
  AP_DECLARE(apr_status_t) ap_pass_brigade(ap_filter_t *next,
                                           apr_bucket_brigade *bb)
  {
+    apr_status_t result = AP_NOBODY_WROTE;
      if (next) {
          apr_bucket *e;
          if ((e = APR_BRIGADE_LAST(bb)) && APR_BUCKET_IS_EOS(e) && 
next->r) {
@@ -523,9 +524,10 @@
                  }
              }
          }
-        return next->frec->filter_func.out_func(next, bb);
+        result = next->frec->filter_func.out_func(next, bb);
+        apr_brigade_cleanup(bb);
      }
-    return AP_NOBODY_WROTE;
+    return result;
  }

  AP_DECLARE(apr_status_t) ap_save_brigade(ap_filter_t *f,



Index: util_filter.h
===================================================================
--- util_filter.h       (revision 158730)
+++ util_filter.h       (working copy)
@@ -299,7 +299,8 @@
   * Pass the current bucket brigade down to the next filter on the 
filter
   * stack.  The filter returns an apr_status_t value.  If the 
bottom-most
   * filter doesn't write to the network, then ::AP_NOBODY_WROTE is 
returned.
- * The caller relinquishes ownership of the brigade.
+ * The caller retains ownership of the brigade. The brigade will be
+ * empty on return.
   * @param filter The next filter in the chain
   * @param bucket The current bucket brigade
   */

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Paul Querna <ch...@force-elite.com>.

Rici Lake wrote:
> Regardless of any other changes to the brigade API, this seems to me to
> be a good idea:
> 

+1.  I agree, while the discussion on other 'rules' for brigades is
good, I think this patch should be applied regardless.

It seems all of the other discussion has revolved around the cleanup or
free()ing of brigades.

Several of our own filters rely upon the core input filter doing a
brigade_cleanup(), but with the current code, this is not guaranteed to
happen. Putting the brigade_cleanup at the end of ap_pass_brigade ensure
that our own filters don't accidentally break.

-Paul

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, April 25, 2005 9:34 PM -0700 Paul Querna <ch...@force-elite.com> 
wrote:

> Plus, I hear those new dual core opterons are really nice.  I would
> rather have a better and cleaner API, than save a few CPU cycles here.

The posted patch does nothing to make the brigades or filter a cleaner API. 
All we'd have then is that ap_pass_brigade() does some 'magic' that has 
nothing at all to do with passing a brigade.  Yuck.  -- justin

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Paul Querna <ch...@force-elite.com>.

Justin Erenkrantz wrote:
> --On Monday, April 25, 2005 12:47 PM -0500 Rici Lake <ri...@ricilake.net>
> wrote:
....
>> -        return next->frec->filter_func.out_func(next, bb);
>> +        result = next->frec->filter_func.out_func(next, bb);
>> +        apr_brigade_cleanup(bb);
>>       }
>> -    return AP_NOBODY_WROTE;
>> +    return result;
>>   }
> 
> 
> This introduces a really large overhead as we'd be calling the cleanup
> for *every* output filter we invoke.  So, -0.9999.  -- justin

I disagree on 'really large overhead'.

The supposed common case, of an empty brigade, is one extra function
call, and one pointer compare inside brigade_cleanup.  If this is too
much, it could be inlined, to do the compare, inside ap_pass_brigade.
Put in some good branch prediction, and its effective overhead for an
already empty brigade is 0.

There are so many other 'really large overheads' that completely dwarf
this in comparison, I don't expect it to make any difference in a
'benchmark'.

Plus, I hear those new dual core opterons are really nice.  I would
rather have a better and cleaner API, than save a few CPU cycles here.

-Paul

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Joe Orton <jo...@redhat.com> writes:

> But in the latter two cases, adding a apr_brigade_cleanup() call to
> ap_pass_brigade() is completely redundant.  To me, you need to make the
> argument that the former case is so prevalent that it's worth an
> additional API guarantee, and the additional (small) overhead in in
> ap_pass_brigade.  I don't see it really...
>
> (and the gcc tail-call optimisation of ap_pass_brigade *is* such a handy
> feature when reading backtraces anyway :)

+1.  Once possibile situation to consider is where there are 
lots of filters, each of which is only handing off the 
original brigade passed in by a content handler.  IMO 
apr_brigade_cleanup should be invoked at most once in 
that case.

-- 
Joe Schaefer

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Rici Lake <ri...@ricilake.net>.

On 26-Apr-05, at 5:28 AM, Joe Orton wrote:
>
> This is really just a "should we provide extra API guarantees to 
> prevent
> users shooting themselves in the foot" debate, right?  Let's agree to
> disagree :)

Fair enough. I guess where I'm coming from is that I'm a user of the
API, and my feet have been shot at enough times already :)

>> In any event, a draft docstring. This does not reflect my view, and I
>> don't know if there is actually a consensus view, but it may reflect
>> the current state of affairs:
>
> It got linewrapped (format=flowed?) but otherwise I think this looks
> good - thanks a lot.

Sorry about that. The first time I saw RFC 2646, I checked the date to 
see whether it was a prank. Sadly, no -- it's standards track. And the 
problem with standards is that people *will* go and implement them, no 
matter how dumb they might be...

> Final chance for any dissenter to object? Else
> I'll commit this...

My final thoughts: We've agreed, I think, on the following:

1) The caller retains ownership of the brigade. The callee *must not* 
destroy it.

2) The caller has made a contract that every bucket in the brigade will 
remain valid until the callee returns. It does not guarantee anything 
beyond that, so the callee *must* explicitly setaside any buckets it 
wants to retain.

3) The callee may manipulate the contents of the brigade as it sees 
fit, during the span of the call. That is, it may insert and remove 
buckets, pass the brigade to another filter, read buckets (possibly 
morphing the brigade), etc. However, if it removes a bucket, that 
bucket becomes its responsibility.

I hope that's clear from the docstring I wrote.

So the only point of disagreement is the state of the brigade on return.

Number 3 above does not permit any semantics to be attached to the 
state of the brigade on return; any such semantics would have to be a 
private agreement between caller and callee, and the filter mechanism 
does not lend itself to such private agreements because it is not easy 
to guarantee that no other filter intervenes in the filter chain. If 
someone wanted to stack their own filters in this manner, they could 
just call the filter function directly rather than using 
ap_pass_brigade.

So the current state of play is: the callee MAY empty the brigade; the 
caller MUST cleanup the brigade if it wishes to reuse it. This leads to 
precisely the duplication of calls to apr_brigade_cleanup that people 
seem to be objecting to.

Better in that case would be that the callee MUST empty the brigade 
(and, in accordance with (2) and (3) above, either delete or setaside 
the buckets it removes.) In that case, the caller need not cleanup the 
brigade; it is permitted to assume that it is empty on return and need 
not waste cycles on a pointless apr_brigade_cleanup.

If there is any chance of agreement on this, I can produce a modified 
docstring on a moment's notice. :)

Rici

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Joe Orton <jo...@redhat.com>.

On Mon, Apr 25, 2005 at 06:12:04PM -0500, Rici Lake wrote:
> I guess that depends on whether most filters use and recycle pass 
> brigades. I suspect that most internal filters don't (and possibly have 
> the model which Joe Schaefer describes, where the brigade is simply 
> passed through), whereas most third-party filters probably either do or 
> should. Having now looked at quite a few output filters, I've come to 
> the conclusion that the normal model is to create and destroy brigades 
> on every invocation, which I can't help but think is a lot less 
> efficient than it could be.
> 
> But I actually disagree with the premise: the issue is not how 
> prevalent the case is, it is how likely it is to create a bug which is 
> triggered by an interaction between unrelated modules. Mitigating the 
> creation of such bugs is, in my mind, an important part of component 
> API design; particularly if the mitigation is low-cost. Of course, 
> everyone has their own concept of beauty, and mine is probably the 
> least important of anyone participating in this debate.

This is really just a "should we provide extra API guarantees to prevent
users shooting themselves in the foot" debate, right?  Let's agree to
disagree :)

> In any event, a draft docstring. This does not reflect my view, and I 
> don't know if there is actually a consensus view, but it may reflect 
> the current state of affairs:

It got linewrapped (format=flowed?) but otherwise I think this looks
good - thanks a lot.  Final chance for any dissenter to object? Else
I'll commit this...

(attached un-line-wrapped version)

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Rici Lake <ri...@ricilake.net>.

On 25-Apr-05, at 4:26 PM, Joe Orton wrote:

> On Mon, Apr 25, 2005 at 03:58:59PM -0500, Rici Lake wrote:
>> If we accept that the contents of a brigade are "undefined" when
>> ap_pass_brigade returns, the caller has three options:
>> -- call cleanup and reuse the brigade
>> -- call destroy (which will first call cleanup)
>> -- drop it on the floor and let destroy be called by the cleanup 
>> function
>>
>> In no case can it make any use of the brigade's contents (they're
>> "undefined"), so cleanup must be called just after ap_pass_brigade
>> returns.
>
> But in the latter two cases, adding a apr_brigade_cleanup() call to
> ap_pass_brigade() is completely redundant.  To me, you need to make the
> argument that the former case is so prevalent that it's worth an
> additional API guarantee, and the additional (small) overhead in in
> ap_pass_brigade.  I don't see it really...

I guess that depends on whether most filters use and recycle pass 
brigades. I suspect that most internal filters don't (and possibly have 
the model which Joe Schaefer describes, where the brigade is simply 
passed through), whereas most third-party filters probably either do or 
should. Having now looked at quite a few output filters, I've come to 
the conclusion that the normal model is to create and destroy brigades 
on every invocation, which I can't help but think is a lot less 
efficient than it could be.

But I actually disagree with the premise: the issue is not how 
prevalent the case is, it is how likely it is to create a bug which is 
triggered by an interaction between unrelated modules. Mitigating the 
creation of such bugs is, in my mind, an important part of component 
API design; particularly if the mitigation is low-cost. Of course, 
everyone has their own concept of beauty, and mine is probably the 
least important of anyone participating in this debate.

In any event, a draft docstring. This does not reflect my view, and I 
don't know if there is actually a consensus view, but it may reflect 
the current state of affairs:

Index: util_filter.h
===================================================================
--- util_filter.h       (revision 158730)
+++ util_filter.h       (working copy)
@@ -103,8 +103,8 @@
   * @name Filter callbacks
   *
   * This function type is used for filter callbacks. It will be passed a
- * pointer to "this" filter, and a "bucket" containing the content to 
be
- * filtered.
+ * pointer to "this" filter, and a "bucket brigade" containing the 
content
+ * to be filtered.
   *
   * In filter->ctx, the callback will find its context. This context is
   * provided here, so that a filter may be installed multiple times, 
each
@@ -120,10 +120,15 @@
   * or output filter chains and before any data is generated to allow 
the
   * filter to prepare for processing.
   *
- * The *bucket structure (and all those referenced by ->next and 
->prev)
- * should be considered "const". The filter is allowed to modify the
- * next/prev to insert/remove/replace elements in the bucket list, but
- * the types and values of the individual buckets should not be 
altered.
+ * The bucket brigade always belongs to the caller, but the filter
+ * is free to use the buckets within it as it sees fit. Normally,
+ * the brigade will be returned empty. Buckets *may not* be retained
+ * between successive calls to the filter unless they have been
+ * "set aside" with a call apr_bucket_setaside. Typically this will
+ * be done with ap_save_brigade(). Buckets removed from the brigade
+ * become the responsibility of the filter, which must arrange for
+ * them to be deleted, either by doing so directly or by inserting
+ * them in a brigade which will subsequently be destroyed.
   *
   * For the input and output filters, the return value of a filter 
should be
   * an APR status value.  For the init function, the return value should
@@ -299,9 +304,13 @@
   * Pass the current bucket brigade down to the next filter on the 
filter
   * stack.  The filter returns an apr_status_t value.  If the 
bottom-most
   * filter doesn't write to the network, then ::AP_NOBODY_WROTE is 
returned.
- * The caller relinquishes ownership of the brigade.
   * @param filter The next filter in the chain
   * @param bucket The current bucket brigade
+ *
+ * @remark Ownership of the brigade is retained by the caller. On 
return,
+ *         the contents of the brigade are UNDEFINED, and the caller 
must
+ *         either call apr_brigade_cleanup or apr_brigade_destroy on
+ *         the brigade.
   */
  AP_DECLARE(apr_status_t) ap_pass_brigade(ap_filter_t *filter,
                                           apr_bucket_brigade *bucket);

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Joe Orton <jo...@redhat.com>.

On Mon, Apr 25, 2005 at 03:58:59PM -0500, Rici Lake wrote:
> If we accept that the contents of a brigade are "undefined" when 
> ap_pass_brigade returns, the caller has three options:
> -- call cleanup and reuse the brigade
> -- call destroy (which will first call cleanup)
> -- drop it on the floor and let destroy be called by the cleanup 
> function
> 
> In no case can it make any use of the brigade's contents (they're 
> "undefined"), so cleanup must be called just after ap_pass_brigade 
> returns.

But in the latter two cases, adding a apr_brigade_cleanup() call to
ap_pass_brigade() is completely redundant.  To me, you need to make the
argument that the former case is so prevalent that it's worth an
additional API guarantee, and the additional (small) overhead in in
ap_pass_brigade.  I don't see it really...

(and the gcc tail-call optimisation of ap_pass_brigade *is* such a handy
feature when reading backtraces anyway :)

> /me continues the tradition of creating flames out of bucket brigades. 
> You'd think it would be the reverse :)

Hehe ;)

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Rici Lake <ri...@ricilake.net>.

On 25-Apr-05, at 3:36 PM, Justin Erenkrantz wrote:

> --On Monday, April 25, 2005 3:06 PM -0500 Rici Lake 
> <ri...@ricilake.net> wrote:
>
>> Can you quantify that "large overhead"? In the case of a 'compliant'
>> filter, it consists of:
>
> With Joe's suggestion for 'creator must destroy', this step is wholly 
> unnecessary and only overcomplicates things.   -- justin

The creator must destroy the brigade when it's done with it -- if I 
understand Joe correctly.

Creating and destroying a brigade on every invocation of a filter is a 
lot more overhead than a few extraneous unfulfilled while loops -- even 
if the brigade is allocated in a bucket_allocator. Even without the 
buildup of pool memory for the brigade objects, creating a brigade 
requires the registration (and later deregistration, during 
brigade_destroy) of a pool cleanup function.

If we accept that the contents of a brigade are "undefined" when 
ap_pass_brigade returns, the caller has three options:
-- call cleanup and reuse the brigade
-- call destroy (which will first call cleanup)
-- drop it on the floor and let destroy be called by the cleanup 
function

In no case can it make any use of the brigade's contents (they're 
"undefined"), so cleanup must be called just after ap_pass_brigade 
returns.

If ap_pass_brigade does the call itself, it reduces the number of 
places in the code where cleanup is called, and possibly frees 
resources slightly earlier in the case where the brigade creator 
dropped the brigade on the floor. The cost is a few extraneous if 
(!APR_BRIGADE_EMPTY(bb)) statements (or moral equivalent). This is 
actually a macro which expands to a fairly simple pointer comparison.

That's hardly a complication -- it's a simplification (from the 
perspective of the caller of ap_pass_brigade. And it's hardly an 
enormous overhead.

/me continues the tradition of creating flames out of bucket brigades. 
You'd think it would be the reverse :)

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, April 25, 2005 3:06 PM -0500 Rici Lake <ri...@ricilake.net> 
wrote:

> Can you quantify that "large overhead"? In the case of a 'compliant'
> filter, it consists of:

With Joe's suggestion for 'creator must destroy', this step is wholly 
unnecessary and only overcomplicates things.   -- justin

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Rici Lake <ri...@ricilake.net>.

On 25-Apr-05, at 1:30 PM, Justin Erenkrantz wrote:

> --On Monday, April 25, 2005 12:47 PM -0500 Rici Lake 
> <ri...@ricilake.net> wrote:
>
>> Regardless of any other changes to the brigade API, this seems to me 
>> to
>> be a good idea:
>>
>> Index: util_filter.c
>> ===================================================================
>> --- util_filter.c       (revision 158730)
>> +++ util_filter.c       (working copy)
>> @@ -500,6 +500,7 @@
>>   AP_DECLARE(apr_status_t) ap_pass_brigade(ap_filter_t *next,
>>                                            apr_bucket_brigade *bb)
>>   {
>> +    apr_status_t result = AP_NOBODY_WROTE;
>>       if (next) {
>>           apr_bucket *e;
>>           if ((e = APR_BRIGADE_LAST(bb)) && APR_BUCKET_IS_EOS(e) &&
>> next->r) {
>> @@ -523,9 +524,10 @@
>>                   }
>>               }
>>           }
>> -        return next->frec->filter_func.out_func(next, bb);
>> +        result = next->frec->filter_func.out_func(next, bb);
>> +        apr_brigade_cleanup(bb);
>>       }
>> -    return AP_NOBODY_WROTE;
>> +    return result;
>>   }
>
> This introduces a really large overhead as we'd be calling the cleanup 
> for *every* output filter we invoke.  So, -0.9999.  -- justin

Can you quantify that "large overhead"? In the case of a 'compliant' 
filter, it consists of:
-- a function call
-- an if (!APR_BRIGADE_EMPTY) which fails
-- a return
-- and the loss of a possible tailcall optimization.

I can't imagine that all that amounts to as much as 100 cycles; if 
there were 20 filters in the chain and that were called every 4k of 
output (and both of those figures are wildly conservative), that would 
amount to half a cycle per byte. If you wanted to save the function 
call, you could replace

+        apr_brigade_cleanup(bb);

with an open coded version of cleanup.

The point is that it's fine to say somewhere "filters must clean their 
input brigades", but what if one doesn't? That will create a bug which 
pops up occasionally in the interaction between two unrelated modules. 
Good interfaces don't have such easily eliminated gotchas, imho.

In any event, if everyone knows that ap_pass_brigade is going to 
cleanup the brigade, then other redundant calls to apr_brigade_cleanup 
can be removed.

Re: Proposed patch: always cleanup brigades in ap_pass_brigade

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, April 25, 2005 12:47 PM -0500 Rici Lake <ri...@ricilake.net> 
wrote:

> Regardless of any other changes to the brigade API, this seems to me to
> be a good idea:
>
> Index: util_filter.c
> ===================================================================
> --- util_filter.c       (revision 158730)
> +++ util_filter.c       (working copy)
> @@ -500,6 +500,7 @@
>   AP_DECLARE(apr_status_t) ap_pass_brigade(ap_filter_t *next,
>                                            apr_bucket_brigade *bb)
>   {
> +    apr_status_t result = AP_NOBODY_WROTE;
>       if (next) {
>           apr_bucket *e;
>           if ((e = APR_BRIGADE_LAST(bb)) && APR_BUCKET_IS_EOS(e) &&
> next->r) {
> @@ -523,9 +524,10 @@
>                   }
>               }
>           }
> -        return next->frec->filter_func.out_func(next, bb);
> +        result = next->frec->filter_func.out_func(next, bb);
> +        apr_brigade_cleanup(bb);
>       }
> -    return AP_NOBODY_WROTE;
> +    return result;
>   }

This introduces a really large overhead as we'd be calling the cleanup for 
*every* output filter we invoke.  So, -0.9999.  -- justin