You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2004/05/19 07:46:45 UTC

Re: is it possible to mark buckets to be copied only when to be set-aside?

It's a bug in mod_xslt, if that module trys to set aside a transient bucket.

Bill

At 12:09 AM 5/19/2004, Stas Bekman wrote:
>We have the following situation in mod_perl 2 land: we use the same buffer to allocate data in buckets which are passed to the filters. That bucket is created once per request. It works perfectly fine and effective most of the time (we copy from user's program perl space into the re-usable buffer, but no extra allocation happens). But just now one user has reported that it breaks mod_xslt filter, which sets aside the buckets sent from the modperl handler, and then uses them after seeing EOS. By that time the data in all but last bucket is corrupted. Obviously the straightforward solution is to allocate a new buffer for each bucket that mod_perl sends to the filter chain. But this is a huge waste for most users, which don't use this particular kind of output filters that setaside buckets.
>
>My question is: Is it possible to mark the bucket's data as volatile or something, so if a downstream filter wants to set them aside it will have to do the copying?
>
>At the moment the bucket is created as:
>    bucket = apr_bucket_transient_create(buf, len, ba);
>and buf is static for the length of the request.
>
>Thank you.
>
>
>-- 
>__________________________________________________________________
>Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
>http://stason.org/     mod_perl Guide ---> http://perl.apache.org
>mailto:stas@stason.org http://use.perl.org http://apacheweek.com
>http://modperlbook.org http://apache.org   http://ticketmaster.com



Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 19 May 2004, Stas Bekman wrote:

> I'm sure Cliff and other filter gurus

For the record, I make no claims as to being a filter guru.  Some parts of
the filtering system baffle me as much as anybody else.  I know buckets.
I more or less leave the filtering to somebody else.  :)

> will be more than helpful to answer any questions arising while working
> on such a document.

... but yes, both I and others would, I'm sure, be more than willing to
have our brains picked.

--Cliff


PS: I begin my drive toward California at the crack o' dawn in the morning
(stops for a few days in Kentucky and St. Louis)... see you SF types soon!

Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Wed, 19 May 2004, Stas Bekman wrote:
> 
> 
>>And a lot of pain and wasted time could be saved if all this was documented.
> 
> 
> I know.  :(
> 
> 
>>I'm not aware of such documentation's existance, besides very scarce notes in
>>the header files.
> 
> 
> The best documentation that exists, besides what's in the header files, is
> in the talk I gave on Bucket Brigades at ApacheCon 2002; see
> http://www.cs.virginia.edu/~jcw5q/talks/ for links to the slides (ppt)
> and notes pages (pdf).  There are also links there to my slides from
> ApacheCon 2003, which discuss debugging buckets and brigades.  The second
> best bucket reference is a section in Chapter 13 of Ryan's book Apache
> Server 2.0: The Complete Reference, pp 310-322.  If anybody else knows of
> other resources, I'd love to know about them... but I suspect that's more
> or less it.  Right from the horses' mouths.  But unfortunately neither
> Ryan nor I were verbose enough in either case to really go into the kind
> of depth you're looking for.  Not that I don't agree it would be useful to
> have written down somewhere.

Thanks Cliff, I've been to both of your great talks :)

Perhaps someone on this list will take up an initiative to write such a 
document, by first putting together the existing resources and then beef it up 
to add the missing stuff. I'm sure Cliff and other filter gurus will be more 
than helpful to answer any questions arising while working on such a document.

I'd have taken up that project by myself, but I'm already spread too thin and 
must finish first the mod_perl 2.0 API docs before we can release 2.0.

To add to Cliff's list of resources, mod_perl 2.0 filter documentation is 
pretty extensive and perhaps it's the easiest to start with filters by writing 
them in perl. For more information see:
http://perl.apache.org/docs/2.0/user/handlers/filters.html

p.s. may be rbb will be willing to contribute parts of his book's filters chapter.

-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 19 May 2004, Stas Bekman wrote:

> And a lot of pain and wasted time could be saved if all this was documented.

I know.  :(

> I'm not aware of such documentation's existance, besides very scarce notes in
> the header files.

The best documentation that exists, besides what's in the header files, is
in the talk I gave on Bucket Brigades at ApacheCon 2002; see
http://www.cs.virginia.edu/~jcw5q/talks/ for links to the slides (ppt)
and notes pages (pdf).  There are also links there to my slides from
ApacheCon 2003, which discuss debugging buckets and brigades.  The second
best bucket reference is a section in Chapter 13 of Ryan's book Apache
Server 2.0: The Complete Reference, pp 310-322.  If anybody else knows of
other resources, I'd love to know about them... but I suspect that's more
or less it.  Right from the horses' mouths.  But unfortunately neither
Ryan nor I were verbose enough in either case to really go into the kind
of depth you're looking for.  Not that I don't agree it would be useful to
have written down somewhere.

--Cliff

Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Stas Bekman <st...@stason.org>.
Cliff Woolley wrote:
> On Wed, 19 May 2004, Cliff Woolley wrote:
> 
> 
>>can't just ignore APR_ENOTIMPL for bucket types, because pipe and socket
>>buckets never had setaside implemented on them.  I thought I remembered
>>that that was because Greg and Ryan and I had some huge debate about it
>>and it was decided for some reason that setting aside a pipe or socket
>>didn't make sense.  But now I can't find where we talked about that in the
>>archives.
> 
> 
> BTW - I remembered the reason why it was decided that it didn't make sense
> to implement setaside for pipe/socket buckets.  It's because for all other
> bucket types, setaside takes one bucket in and produces one bucket out,
> whereas for pipes and sockets it would take one bucket in and either
> produce one bucket if the lifetime request was already satisfied or it
> would produce a chain of n buckets out where n is proportional to the size
> of the data that had been waiting around to be read in the pipe/socket.
> It was that inconsistency that was disliked.  At this point, I'm willing
> to agree to whatever makes things work and would probably accept this
> inconsistency as long as it's documented.  What we have now is bug-prone
> beyond belief.

And a lot of pain and wasted time could be saved if all this was documented. 
I'm not aware of such documentation's existance, besides very scarce notes in 
the header files. Filters are a very complex area, and w/o proper user docs 
it's no wonder that we see all these problems.

The worst part is that some of the developers who originally wrote things are 
no longer here to help, so their knowledge and reasons, for doing things in 
certain ways, get lost.

I wish someone who groks filters well could write a proper document explaining 
most of (all?) the ins and outs of writing a proper filter. The old article 
written by rbb is nice, but a way too short to cover all the nuances.

-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 19 May 2004, Cliff Woolley wrote:

> can't just ignore APR_ENOTIMPL for bucket types, because pipe and socket
> buckets never had setaside implemented on them.  I thought I remembered
> that that was because Greg and Ryan and I had some huge debate about it
> and it was decided for some reason that setting aside a pipe or socket
> didn't make sense.  But now I can't find where we talked about that in the
> archives.

BTW - I remembered the reason why it was decided that it didn't make sense
to implement setaside for pipe/socket buckets.  It's because for all other
bucket types, setaside takes one bucket in and produces one bucket out,
whereas for pipes and sockets it would take one bucket in and either
produce one bucket if the lifetime request was already satisfied or it
would produce a chain of n buckets out where n is proportional to the size
of the data that had been waiting around to be read in the pipe/socket.
It was that inconsistency that was disliked.  At this point, I'm willing
to agree to whatever makes things work and would probably accept this
inconsistency as long as it's documented.  What we have now is bug-prone
beyond belief.

--Cliff

Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 19 May 2004, Cliff Woolley wrote:

> Note: ap_save_brigade() handles the setaside and brigade concatenation for
> you.  I suspect there are a number of places in the code that we are
> incorrectly calling APR_BRIGADE_CONCAT() instead of ap_save_brigade().
> That's bug city right there.

Worse, there are probably places in the code where we don't even use
APR_BRIGADE_CONCAT() and simply append each bucket one at a time from the
source brigade to the "saved" brigade using APR_BRIGADE_INSERT_TAIL() (or
even APR_BUCKET_INSERT_AFTER(), though it's hard for me to imagine a case
where doing it that way would have made sense to the programmer).

Furthermore, I just found a bug in ap_save_brigade().  On line 534 of
util_filter.c:

        rv = apr_bucket_setaside(e, p);
        if (rv != APR_SUCCESS
            /* ### this ENOTIMPL will go away once we implement setaside
               ### for all bucket types. */
            && rv != APR_ENOTIMPL) {
            return rv;
        }

Note that ### comment?  Yeah it denotes what at this point is a bug.  You
can't just ignore APR_ENOTIMPL for bucket types, because pipe and socket
buckets never had setaside implemented on them.  I thought I remembered
that that was because Greg and Ryan and I had some huge debate about it
and it was decided for some reason that setting aside a pipe or socket
didn't make sense.  But now I can't find where we talked about that in the
archives.  Anyway, when Greg originally wrote the lines in question (see
http://marc.theaimsgroup.com/?l=apr-dev&m=99190977519715&w=2), he
obviously did so assuming that setaside would be implemented on all bucket
types.  It was not.

I recommend that the interested reader go back and take a look at the
following thread (this kind of starts somewhere in the middle but this one
message and its followups are particularly relevant):

http://marc.theaimsgroup.com/?l=apr-dev&m=99178798619777&w=2

As it is, given the current State Of The Buckets: If rv == APR_ENOTIMPL
when you call apr_bucket_setaside(), then you're supposed to call
apr_bucket_read(), get the data out of the bucket, and stick it into a
heap bucket.  ap_save_brigade() could possibly do a little optimizing here
in the special cases of socket and pipe buckets, and grab the actual APR
socket or pipe out of the bucket's internal data and compare the
socket/pipe's pool to pool p.  If they already match, you don't have to do
the read and make-into-heap-bucket process; the reason being is that the
point of setaside is to tell the buckets code that you want it to
guarantee that the data storage behind bucket e will live at least as long
as pool p.

I thought I could just refer for some of this back to either of my sets of
ApacheCon slides, but I can see I discussed setaside() insufficiently well
at both ApacheCons.  :(

--Cliff

Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 08:33 AM 5/19/2004, Cliff Woolley wrote:
>On Wed, 19 May 2004, William A. Rowe, Jr. wrote:
>
>> It's a bug in mod_xslt, if that module trys to set aside a transient bucket.
>
>Huh?  No it isn't.  Half of the reason setaside() even exists is to handle
>transient buckets.

I didn't say setaside(), and yes - you identifed the proper behavior (which I
doubt the mod_xslt was doing.)  Perhaps if I said "trys to hold a reference"
it might have been less ambiguous.

Bill  


Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 19 May 2004, William A. Rowe, Jr. wrote:

> It's a bug in mod_xslt, if that module trys to set aside a transient bucket.

Huh?  No it isn't.  Half of the reason setaside() even exists is to handle
transient buckets.

static apr_status_t transient_bucket_setaside(apr_bucket *b, apr_pool_t *pool)
{
    b = apr_bucket_heap_make(b, (char *)b->data + b->start, b->length, NULL);
    if (b == NULL) {
        return APR_ENOMEM;
    }
    return APR_SUCCESS;
}

In other words, setting aside a transient bucket does exactly what you
want -- it copies the data into a new buffer allocated on the heap.  So I
think mod_perl is doing the right thing by using a transient bucket here,
and I think mod_xslt is doing the right thing by calling setaside.
However, setaside is supposed to be used prior to returning from any
function in the current call stack, not just when you want to keep a
bucket past the end of the current request.

In other words, if any filter wants to keep a bucket and return to its
caller rather than passing that bucket to the next filter, it should call
setaside on that bucket.  Failing to do so won't be a problem for most
bucket types, but it most assuredly WILL be a problem for transient
buckets, by the definition of "transient": the storage behind a transient
bucket is only guaranteed to exist and have the desired data in it until
you return back into the function that created that bucket.

Note: ap_save_brigade() handles the setaside and brigade concatenation for
you.  I suspect there are a number of places in the code that we are
incorrectly calling APR_BRIGADE_CONCAT() instead of ap_save_brigade().
That's bug city right there.

--Cliff

Re: is it possible to mark buckets to be copied only when to be set-aside?

Posted by Stas Bekman <st...@stason.org>.
William A. Rowe, Jr. wrote:
> It's a bug in mod_xslt, if that module trys to set aside a transient bucket.

Ah, cool, for some reason I thought that transient still has to have a copy :) 
Thanks Bill!

> At 12:09 AM 5/19/2004, Stas Bekman wrote:
> 
>>We have the following situation in mod_perl 2 land: we use the same buffer to allocate data in buckets which are passed to the filters. That bucket is created once per request. It works perfectly fine and effective most of the time (we copy from user's program perl space into the re-usable buffer, but no extra allocation happens). But just now one user has reported that it breaks mod_xslt filter, which sets aside the buckets sent from the modperl handler, and then uses them after seeing EOS. By that time the data in all but last bucket is corrupted. Obviously the straightforward solution is to allocate a new buffer for each bucket that mod_perl sends to the filter chain. But this is a huge waste for most users, which don't use this particular kind of output filters that setaside buckets.
>>
>>My question is: Is it possible to mark the bucket's data as volatile or something, so if a downstream filter wants to set them aside it will have to do the copying?
>>
>>At the moment the bucket is created as:
>>   bucket = apr_bucket_transient_create(buf, len, ba);
>>and buf is static for the length of the request.
>>
>>Thank you.
>>
>>
>>-- 
>>__________________________________________________________________
>>Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
>>http://stason.org/     mod_perl Guide ---> http://perl.apache.org
>>mailto:stas@stason.org http://use.perl.org http://apacheweek.com
>>http://modperlbook.org http://apache.org   http://ticketmaster.com
> 
> 


-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com