Posted to dev@httpd.apache.org by Rici Lake <ri...@ricilake.net> on 2005/04/25 23:24:47 UTC

Pool buckets, transient buckets and bucket_setaside

Why do pool buckets do an automatic setaside? Should they?

No other bucket type does this; consequently, if you want to hold on to 
a bucket beyond the scope of the call in which you received the 
brigade, you must explicitly call bucket_setaside.

Without examining the bucket type, a practice which should normally be 
discouraged, it is impossible to know whether a given bucket will 
auto-setaside or not; consequently, bucket_setaside should always be 
called (possibly via ap_save_brigade).

So it would seem that either all bucket types should do auto-setaside 
or that none of them should. It would undoubtedly be convenient if all 
buckets did this -- it would remove the necessity for ap_save_brigade 
-- but I believe there is an issue with the model.

Consider a pool bucket which is placed into an existing brigade. If 
this brigade is not explicitly cleaned up, which would defuse the pool 
bucket's setaside mechanism, then both the brigade and the pool 
bucket's cleanup functions will run when the pool is cleaned up. The 
bucket's cleanup function is likely to run first, however, since it is 
likely to be newer than the brigade it was put into. That will create 
an unnecessary heap copy of the bucket, which will then be freed when 
the pool cleanup for the brigade runs, slightly later.
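
The ordering problem described above can be illustrated with a toy model. This is only a sketch, not APR code: toy_pool, toy_bucket and every function below are hypothetical names invented for the illustration. The one property assumed from APR is that pool cleanups run newest-first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLEANUPS 8
#define MAX_EVENTS 8

typedef void (*toy_cleanup_fn)(void *data);

/* Hypothetical stand-in for a pool: just a stack of cleanups. */
typedef struct {
    toy_cleanup_fn fns[MAX_CLEANUPS];
    void *data[MAX_CLEANUPS];
    int count;
} toy_pool;

/* Hypothetical stand-in for a pool bucket's state. */
typedef struct {
    int on_heap;   /* set when the auto-setaside copied the data */
    int freed;     /* set when the brigade cleanup destroyed the bucket */
} toy_bucket;

/* Event log, so the order of the cleanups can be inspected. */
static const char *events[MAX_EVENTS];
static int event_count;

static void log_event(const char *e)
{
    if (event_count < MAX_EVENTS)
        events[event_count++] = e;
}

static void toy_cleanup_register(toy_pool *p, toy_cleanup_fn fn, void *data)
{
    assert(p->count < MAX_CLEANUPS);
    p->fns[p->count] = fn;
    p->data[p->count] = data;
    p->count++;
}

/* Run cleanups newest-first (the assumption taken from APR). */
static void toy_pool_destroy(toy_pool *p)
{
    while (p->count > 0) {
        p->count--;
        p->fns[p->count](p->data[p->count]);
    }
}

/* The pool bucket's auto-setaside: copy the data to the heap. */
static void bucket_cleanup(void *data)
{
    toy_bucket *b = data;
    b->on_heap = 1;
    log_event("bucket: copied to heap");
}

/* The brigade's cleanup: destroy every bucket still in it. */
static void brigade_cleanup(void *data)
{
    toy_bucket *b = data;
    b->freed = 1;
    log_event("brigade: bucket freed");
}

/* Returns 1 if the bucket was copied to the heap and then freed. */
static int run_scenario(void)
{
    toy_pool pool = {0};
    toy_bucket b = {0, 0};

    event_count = 0;
    toy_cleanup_register(&pool, brigade_cleanup, &b); /* brigade is older */
    toy_cleanup_register(&pool, bucket_cleanup, &b);  /* bucket is newer */
    toy_pool_destroy(&pool);
    return b.on_heap && b.freed;
}
```

After run_scenario() the log shows the copy happening before the free, which is the wasted work in question: the heap copy exists only long enough to be destroyed.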

If brigades are (almost) always explicitly cleaned or destroyed, and 
buckets are always explicitly setaside, then what is the point of the 
overhead of registering auto-setaside functions for every pool bucket?

Conversely, is there a reasonable mechanism for avoiding the 
unnecessary setaside and allowing the removal of the requirement to 
explicitly setaside buckets? 


Re: Pool buckets, transient buckets and bucket_setaside

Posted by Cliff Woolley <jw...@virginia.edu>.
On Mon, 25 Apr 2005, Rici Lake wrote:

> Surely this is equally true of any pool-related resource. For example,
> a file which has been opened and has a pool cleanup registered would
> have exactly the same lifetime as a pool-allocated string.

Hmm... for some reason I thought file buckets morphed themselves upon pool
cleanup, too.  That they don't could be viewed as a bug in file buckets.

As an alternative, we could just add pool cleanups to the other types and
make them all exist only in APR_BUCKET_DEBUG mode and assert(0) if the
cleanup ever actually gets called.

Let's move this discussion to dev@apr.apache.org.

--Cliff

Re: Pool buckets, transient buckets and bucket_setaside

Posted by Rici Lake <ri...@ricilake.net>.
On 25-Apr-05, at 5:04 PM, Cliff Woolley wrote:

>> Why do pool buckets do an automatic setaside? Should they?
>
> They set themselves aside when the pool in which their data was allocated
> is in the process of being destroyed.  This is necessary because no other
> bucket type has a similarly unpredictable lifetime.

Surely this is equally true of any pool-related resource. For example, 
a file which has been opened and has a pool cleanup registered would 
have exactly the same lifetime as a pool-allocated string.

> Even transient
> buckets have a lifetime that is well-known: they last until the call chain
> returns.  Pool buckets might not even live that long.  Don't think just in
> terms of httpd's usage of buckets here -- this is an APR construct.

I hadn't considered the case of a pool being released within the call 
chain :) But that would still apply equally to a file, no?

Is that actually likely, though, even leaving httpd aside? Or, to put 
it another way, if you're not using bucket brigades in the httpd style, 
at *exactly what* point must you set aside a bucket?

> If there were any way to have transient buckets automatically set
> themselves aside, then I would argue that that should be done, too.  Of
> course there isn't, since there is no such thing as a callback that says
> "the call chain has returned now".  But at least as it stands now, we have
> a minimum guaranteed time that all of the buckets' data will live -- as
> long as the call chain lives.

Actually, you could achieve this very easily if a brigade had a cleanup 
list -- in other words, if it were a sort of degenerate pool. Then you 
could register the transient's auto-setaside function against the 
brigade's cleanup list, as long as you were prepared to ensure that 
brigade_cleanup were called on the brigade when the call chain 
returned. That's what got me started thinking about this in the first 
place.
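
A rough sketch of that idea, under heavy assumptions: nothing below exists in APR, and toy_brigade, toy_transient and all of the functions are hypothetical names made up for the illustration. The brigade carries its own small cleanup list, a transient bucket registers a setaside hook against it, and running the brigade's cleanup when the call chain returns copies the data before the caller's buffer is reused.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HOOKS 4

typedef void (*toy_hook_fn)(void *arg);

/* Hypothetical transient bucket: borrows the caller's memory. */
typedef struct {
    const char *data;  /* caller-owned until set aside */
    size_t len;
    char *heap_copy;   /* non-NULL once the data has been copied */
} toy_transient;

/* Hypothetical brigade acting as a "degenerate pool". */
typedef struct {
    toy_hook_fn fns[MAX_HOOKS];
    void *args[MAX_HOOKS];
    int count;
} toy_brigade;

/* Auto-setaside hook: copy the borrowed bytes to the heap, once. */
static void transient_setaside(void *arg)
{
    toy_transient *t = arg;
    char *copy;

    if (t->heap_copy != NULL)
        return;                 /* already set aside */
    copy = malloc(t->len);
    if (copy == NULL)
        return;                 /* sketch only: no error reporting */
    memcpy(copy, t->data, t->len);
    t->heap_copy = copy;
    t->data = copy;
}

/* Insert a transient and register its hook on the brigade's list. */
static void toy_brigade_add_transient(toy_brigade *bb, toy_transient *t)
{
    assert(bb->count < MAX_HOOKS);
    bb->fns[bb->count] = transient_setaside;
    bb->args[bb->count] = t;
    bb->count++;
}

/* To be called when the call chain returns: run and clear the hooks. */
static void toy_brigade_cleanup(toy_brigade *bb)
{
    int i;

    for (i = 0; i < bb->count; i++)
        bb->fns[i](bb->args[i]);
    bb->count = 0;
}

/* Returns 1 if the bucket's data survives reuse of the caller's buffer. */
static int transient_survives(void)
{
    toy_brigade bb = {0};
    char buf[] = "hello";
    toy_transient t = { buf, sizeof buf, NULL };
    int ok;

    toy_brigade_add_transient(&bb, &t);
    toy_brigade_cleanup(&bb);           /* "the call chain has returned" */
    memset(buf, 'x', sizeof buf - 1);   /* caller reuses its buffer */
    ok = t.heap_copy != NULL && strcmp(t.data, "hello") == 0;
    free(t.heap_copy);
    return ok;
}
```

The sketch ignores the hard part: if a downstream filter moves the bucket into a different brigade, the hook would have to move with it.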

I think that lifetime and datatype are probably independent aspects of 
a bucket, and the design really ought, in some idealistic long-term 
definition of 'ought', to be refactored accordingly. *Any* resource 
might be pool-'allocated', might be transient, might be immortal. The 
games played with file buckets and the pool parameter to 
bucket_setaside are a demonstration, I think. Taking that view, pool 
buckets are not particularly special, or maybe I'm missing something 
obvious.

If file and socket buckets (and other such bucket types) could be 
taught to do auto-setaside, then ap_save_brigade could disappear and 
you could actually retain a partial brigade between filter calls with a 
BRIGADE_CONCAT. That has a certain charm, but I have the niggling 
feeling that it wouldn't actually work out in practice, because of the 
timing of cleanup calls I referred to earlier.

>> If brigades are (almost) always explicitly cleaned or destroyed, and
>> buckets are always explicitly setaside, then what is the point of the
>> overhead of registering auto-setaside functions for every pool bucket?
>
> If registering a pool cleanup that doesn't get called (on a bucket type
> that is rarely used in httpd, no less) has a measurable performance
> impact, I'd be extremely surprised.

Well, me too, but then I was surprised at how much the overhead of an 
"if empty" seems to matter. :) However, if they were used a lot, the 
overhead might actually be measurable. It wouldn't be registering the 
cleanup, so much as cancelling it -- registering a cleanup only 
requires snapping a new link onto a chain, but cancelling it involves a 
linear scan through the cleanups, which could easily turn quadratic.
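
To put rough numbers on that, here is a toy model of a cleanup chain. It is a sketch, not APR's implementation, and all names in it are hypothetical. It assumes only what is described above: registration prepends to a singly linked list, and cancellation scans from the head. It counts how many nodes are visited when N cleanups are cancelled oldest-first versus newest-first.

```c
#include <assert.h>
#include <stddef.h>

#define N 100

/* Hypothetical cleanup record; a real one would hold a function and data. */
typedef struct cleanup {
    struct cleanup *next;
    int id;
} cleanup;

static cleanup nodes[N];

/* Registering is O(1): snap a new link onto the head of the chain. */
static void cleanup_register(cleanup **head, cleanup *c)
{
    c->next = *head;
    *head = c;
}

/* Cancelling scans from the head; returns the number of nodes visited. */
static long cleanup_kill(cleanup **head, int id)
{
    long visited = 0;
    cleanup **link = head;

    while (*link != NULL) {
        visited++;
        if ((*link)->id == id) {
            *link = (*link)->next;  /* unlink the matching record */
            return visited;
        }
        link = &(*link)->next;
    }
    return visited;
}

/* Register N cleanups, cancel them all, and total the nodes visited. */
static long total_visits(int oldest_first)
{
    cleanup *head = NULL;
    long total = 0;
    int i;

    for (i = 0; i < N; i++) {
        nodes[i].id = i;
        cleanup_register(&head, &nodes[i]);
    }
    for (i = 0; i < N; i++) {
        int id = oldest_first ? i : N - 1 - i;
        total += cleanup_kill(&head, id);
    }
    assert(head == NULL);  /* every cleanup was found and removed */
    return total;
}
```

With N = 100, cancelling oldest-first visits N(N+1)/2 = 5050 nodes in total, while cancelling newest-first visits only 100, so the cost depends heavily on the order in which buckets are destroyed relative to the order in which they were created.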


Re: Pool buckets, transient buckets and bucket_setaside

Posted by Cliff Woolley <jw...@virginia.edu>.
On Mon, 25 Apr 2005, Rici Lake wrote:

> Why do pool buckets do an automatic setaside? Should they?

They set themselves aside when the pool in which their data was allocated
is in the process of being destroyed.  This is necessary because no other
bucket type has a similarly unpredictable lifetime.  Even transient
buckets have a lifetime that is well-known: they last until the call chain
returns.  Pool buckets might not even live that long.  Don't think just in
terms of httpd's usage of buckets here -- this is an APR construct.

If there were any way to have transient buckets automatically set
themselves aside, then I would argue that that should be done, too.  Of
course there isn't, since there is no such thing as a callback that says
"the call chain has returned now".  But at least as it stands now, we have
a minimum guaranteed time that all of the buckets' data will live -- as
long as the call chain lives.

> If brigades are (almost) always explicitly cleaned or destroyed, and
> buckets are always explicitly setaside, then what is the point of the
> overhead of registering auto-setaside functions for every pool bucket?

If registering a pool cleanup that doesn't get called (on a bucket type
that is rarely used in httpd, no less) has a measurable performance
impact, I'd be extremely surprised.

--Cliff