Posted to dev@httpd.apache.org by Greg Stein <gs...@lyra.org> on 2001/02/27 16:53:26 UTC

Re: Bucket API cleanup issues

On Tue, Feb 27, 2001 at 09:54:25AM -0500, Cliff Woolley wrote:
> On Mon, 26 Feb 2001 rbb@covalent.net wrote:
> > > 3) pool_bucket_cleanup() is completely bogus AFAICT.  I've added this
> > > comment to the code, which describes the problems pretty well:
> > > 4) The same problem applies to file buckets that have been split/copied
> > > when APR_HAS_MMAP: when one of them gets read it gets made into an MMAP.
> > > ALL of the file buckets should be converted at the same time to reference
> > > the same MMAP.
> >
> > I disagree about how to fix this.  The simple solution is to just abstract
> > one level, such as:
> >
> >   bucket   ->     bucket   ->   bucket
> >     |               |             |
> >   shared          shared        shared
> >     |               |             |
> >      -----------------------------
> >                     |
> >                 pool_bucket
> 
> So is what you're saying that struct apr_bucket should be a member of TWO
> rings... one brigade as usual plus one ring ("brigade") of sibling
> buckets that point to the same resource?  I'd thought of that, but didn't
> think anyone would buy it.  I'm fine with the idea.  But I *must* be
> missing something... how does that keep us from traversing a list of
> buckets when we convert types?  (PS: Remember that the "shared" level is
> going away.)  If I've missed your point, please elaborate.

Euh... I don't think we want another ring.

A simpler idea is to have the apr_bucket_pool structure contain a pointer to
an apr_bucket_heap structure. At cleanup time, you do the following:

*) if h->heap_ptr is non-NULL, then use it for your new s->data field.
   do appropriate incref/decref for the two shared structures. toss the
   private pool stuff when refcount==0

*) if h->heap_ptr is NULL, then create a new apr_bucket_heap from the pool
   bucket. leave a ptr to this in the pool bucket. change self to point to
   the new heap and do the right incref/decref. zero-out the ptrs in the
   pool structure (they'll be invalid momentarily; this is for safety in
   case somebody tries to access them)

*) in the read() function (and others), if h->heap_ptr is non-NULL, then
   rebuild the bucket around the (shared) apr_bucket_heap structure. adjust
   incref/decref as appropriate.

So what you get here is a single allocation of a heap bucket. Then, you
lazily repoint all other buckets to the new heap bucket.

This does imply that the pool bucket structures must be malloc'd rather than
come from the pool (because the structs must live until the last reference,
even though the pool may disappear sooner).

Strictly speaking, you'll never get a read where h->heap_ptr is non-NULL.
When the pool is cleaned up, the first bucket hit will construct the heap
bucket. All the rest participating in the cleanup will get repointed. By the
end of the cleanup, nobody will be referring to the old pool substructure,
and it will go away.

But logically, it is better to make allowances for h->heap_ptr in case some
fool wants to read() during a cleanup.
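
To pin down the incref/decref dance, here's a rough sketch in code. The
structure and field names are invented for illustration; this is the shape
of the idea, not the real apr_bucket_pool/apr_bucket_heap declarations:

#include <stdlib.h>
#include <string.h>

/* Invented layouts, for illustration only. */
typedef struct heap_data {
    int     refcount;
    char   *base;                /* malloc'd copy of the data           */
    size_t  length;
} heap_data;

typedef struct pool_data {
    int         refcount;        /* buckets still sharing this struct   */
    const char *base;            /* the data, sitting in the pool       */
    size_t      length;
    heap_data  *heap_ptr;        /* NULL until the pool cleanup fires   */
} pool_data;

/* Run for each pool bucket when the pool is torn down. */
static void pool_cleanup_one(pool_data *p)
{
    if (p->heap_ptr == NULL) {
        /* First bucket hit: build the single shared heap structure
         * and copy the data out of the doomed pool. */
        heap_data *h = malloc(sizeof(*h));
        h->base = malloc(p->length);
        memcpy(h->base, p->base, p->length);
        h->length = p->length;
        h->refcount = 0;
        p->heap_ptr = h;
        p->base = NULL;          /* pool memory is about to go away     */
    }
}

/* Each remaining bucket (during the cleanup, or in a later read())
 * lazily repoints itself at the shared heap structure. */
static heap_data *pool_repoint_one(pool_data *p)
{
    heap_data *h = p->heap_ptr;
    h->refcount++;               /* incref the heap side...             */
    if (--p->refcount == 0) {    /* ...decref the pool side             */
        free(p);                 /* pool structs are malloc'd, as noted */
    }
    return h;                    /* becomes the bucket's new data ptr   */
}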

>...
> > > 6) Should mmap_destroy() call apr_mmap_delete(m->mmap) when the last
> > > reference to the mmap bucket is destroyed?
> >
> > No.  If you do that, then caching MMAP's becomes impossible.  The
> > apr_mmap_delete should be left up to the pool cleanup.
> 
> Okay, no problem.  I'd just noticed it when I was going through the code
> and thought I'd bring it up, but I wasn't married to the idea.  I figured
> it was left out for a reason but didn't know what it was.  Scratch #6.

A better approach is to have the cache insert the MMAP bucket with a
refcount starting at 2 (one for the cache, one for the bucket). When the
last bucket goes away, you'll still have a refcount of 1 and the MMAP will
remain alive.

When we auto-convert a FILE to an MMAP, it will start with refcount==1. This
allows the MMAP to be tossed when the bucket goes away. (pool cleanup could
be a *long* ways away)

This gives us immediate cleanup when possible, and deferred cleanup when
needed [by a cache].
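
In rough code (the struct and helper names here are invented; only
apr_mmap_delete() is the real APR call):

#include <stdlib.h>
#include "apr_mmap.h"

typedef struct mmap_data {
    int         refcount;
    apr_mmap_t *mmap;
} mmap_data;

/* A cache holds its own reference, so a cached MMAP starts at 2:
 * one for the cache, one for the bucket it hands out. */
static mmap_data *cache_insert_mmap(apr_mmap_t *mm)
{
    mmap_data *m = malloc(sizeof(*m));
    m->refcount = 2;
    m->mmap = mm;
    return m;
}

/* An auto-converted FILE->MMAP starts at refcount 1, so the mapping is
 * deleted as soon as the last bucket referencing it goes away.  A
 * cached mapping bottoms out at 1 and survives until the cache (or the
 * pool cleanup) drops its reference. */
static void mmap_data_destroy(mmap_data *m)
{
    if (--m->refcount == 0) {
        apr_mmap_delete(m->mmap);
        free(m);
    }
}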

> > > 7) socket_read() and pipe_read() should return APR_SUCCESS even in an EOF
> > > condition, as discussed on new-httpd a week or so ago.  I haven't made
> > > this change yet, though I have put an XXX comment in the code about it,
> > > because it requires fixing the input filters in Apache to figure out EOS on
> > > their own.  Looking for suggestions on how to do that.

If you do a blocking read on a **SOCKET bucket** for one byte, then you'll
end up with one of three situations:

1) EOF was hit (internally) and the socket is turned into a zero-length HEAP
   bucket.
   
   Apache doesn't do anything special to recognize this situation.

2) Timeout was reached. APR_TIMEUP is returned by the read operation. This
   is an actual error situation, so Apache handles it as appropriate.

3) At least one byte is returned in a HEAP bucket.

Apache will determine when to insert the EOS based on when the protocol says
the request has ended. This is based on chunking or Content-Length or
end-of-headers. The client cannot close the connection to denote the end
marker.

Note that situation (1) should only occur in one of two cases:

a) the client aborted the connection
b) attempting to read a second (or Nth) request, and the client closed it
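
In code, a caller of the blocking read would sort out the three situations
something like this (the exact call spelling is from memory; treat it as a
sketch):

#include "apr_buckets.h"

static apr_status_t check_one_bucket(apr_bucket *b)
{
    const char *data;
    apr_size_t len;
    apr_status_t rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);

    (void)data;             /* the data itself isn't used in this sketch */

    if (rv == APR_TIMEUP) {
        return rv;          /* (2) a real error; handle as appropriate   */
    }
    if (rv != APR_SUCCESS) {
        return rv;          /* some other error                          */
    }
    if (len == 0) {
        /* (1) EOF was hit internally; the SOCKET bucket is now a
         *     zero-length HEAP bucket.  Nothing special to do here:
         *     the protocol code decides when the request has ended.     */
        return APR_SUCCESS;
    }
    /* (3) at least one byte came back in a HEAP bucket */
    return APR_SUCCESS;
}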


Okay. The above is for reading an individual bucket. That is, the read()
function.

Reading from the input filter **stack** is different. Let's say that case
(1) occurs and you've "read" the zero bytes out of the brigade. The brigade
is now empty, so you do an ap_get_brigade() for more. This drops all the way
down to the core_input_filter. It says, "sorry, I gave you everything I had"
and returns APR_EOF. *NOW* is where the EOS is detected.

core_input_filter can do one of two things:

 i) insert an EOS right after the SOCKET bucket at setup time. continue to
    return APR_EOF if the SOCKET bucket (and EOS bucket) is consumed and
    core_input_filter is called for more.

ii) don't return APR_EOF, but return EOS instead

I prefer the former. In the second case, the upper layers cannot
distinguish "end of this request" from "there is nothing more". They'll go
back for more and keep getting EOS, and that will end up as a 400 (Bad
Request) because the next request looks empty.
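
The setup for (i) would look something like this (creation-call spellings
are from memory and approximate; the shape is the point):

#include "apr_buckets.h"

/* Put an EOS bucket right behind the SOCKET bucket when the core input
 * brigade is first built.  Once both are consumed, core_input_filter
 * just keeps answering APR_EOF. */
static apr_bucket_brigade *core_input_setup(apr_socket_t *csd, apr_pool_t *p)
{
    apr_bucket_brigade *bb = apr_brigade_create(p);

    APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_socket_create(csd));
    APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_eos_create());
    return bb;
}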

In case (2) above, Apache tears down the connection, so EOS is never an
issue.

In case (3) above, you have data, so EOS is not an issue.

And when you're successfully fetching data, the EOS is inserted by the upper
level filters (HTTP_IN? DECHUNK? dunno which it is) when it determines the
logical request has ended (despite what the lower level filters may say
about more data being present).
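
For example, a body filter that knows the Content-Length needs nothing more
than this once the count runs out (names are illustrative, not the real
HTTP_IN/DECHUNK code):

#include "apr_buckets.h"

/* Append EOS when the protocol says the body has ended, regardless of
 * what the connection underneath is doing. */
static void body_filter_account(apr_bucket_brigade *bb,
                                apr_off_t *remaining, apr_off_t seen)
{
    *remaining -= seen;
    if (*remaining <= 0) {
        APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_eos_create());
    }
}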


The big gimmick is to not confuse return values from read() and those of
ap_get_brigade(). They have different semantics.


That should do the trick. Save this MsgID somewhere :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Bucket API cleanup issues

Posted by Cliff Woolley <cl...@yahoo.com>.
On Wed, 28 Feb 2001, Greg Stein wrote:

> 1) it is hard to tell that "p" and "h" refer to the same bucket.
>    "p->heap.foo" is more obvious than "h->foo".

Fine by me.

> 2) Nit: I'm getting tired of these single letters for the bucket stuff. "a"
>    as a variable for a bucket? "e"? Blarg. Using "h" and "p" falls right into
>    that camp; especially p, given its use as a pool or char pointer in so
>    many other contexts. I might suggest "bh" and "bp".

We've more or less standardized on one letter meaning a bucket, though...
fixing this would be a massive change.  For now, I think we need to stick
with one letter for consistency.  I'd be just as happy if that letter were
'b' (I don't know where a and e came from, but they're definitely there),
or at least a letter mnemonic for the type of bucket it is (as I've done
here).

> 3) You have two "base" members. That will be confusing as hell over the long
>    haul. I'd recommend using just heap.base. Guess how to detect when a pool
>    bucket is no longer in a pool? Right in one! Just set pool to NULL. And
>    *please* put some comments to this effect in the apr_bucket_pool
>    structure declaration(!)

Done.

> 4) And no... I don't think we need to check malloc() results. If NULL is
>    returned, then we'll segfault right away. That's fine in my book since
>    we'd otherwise just call abort(). (if you really want to check, then call
>    abort() on error or a pool's abort function)

Yeah... I just put that comment in because there are comments just like it
throughout the buckets code where there's a possibility of failure that
we're not checking for, in case we decide at some point down the road to
go back and put in checks for them.  <shrug>

> That's it for now. I'll re-review when you check it in.

Thanks.

--Cliff


Re: Bucket API cleanup issues

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Feb 28, 2001 at 03:03:08PM -0500, Cliff Woolley wrote:
>...
> Basically, instead of having the pool bucket contain a pointer to a heap
> bucket and having to fool with manipulating reference counts and keeping
> track of when to destroy the old apr_bucket_pool struct, the
> apr_bucket_pool *contains* the apr_bucket_heap struct that it will become
> as its first element.

Good call!

Some comments on the patch:

1) it is hard to tell that "p" and "h" refer to the same bucket.
   "p->heap.foo" is more obvious than "h->foo".

2) Nit: I'm getting tired of these single letters for the bucket stuff. "a"
   as a variable for a bucket? "e"? Blarg. Using "h" and "p" falls right into
   that camp; especially p, given its use as a pool or char pointer in so
   many other contexts. I might suggest "bh" and "bp".

3) You have two "base" members. That will be confusing as hell over the long
   haul. I'd recommend using just heap.base. Guess how to detect when a pool
   bucket is no longer in a pool? Right in one! Just set pool to NULL. And
   *please* put some comments to this effect in the apr_bucket_pool
   structure declaration(!)

4) And no... I don't think we need to check malloc() results. If NULL is
   returned, then we'll segfault right away. That's fine in my book since
   we'd otherwise just call abort(). (if you really want to check, then call
   abort() on error or a pool's abort function)

That's it for now. I'll re-review when you check it in.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Bucket API cleanup issues

Posted by Cliff Woolley <cl...@yahoo.com>.
On Tue, 27 Feb 2001, Greg Stein wrote:

> Euh... I don't think we want another ring.
>
> A simpler idea is to have the apr_bucket_pool structure contain a pointer to
> an apr_bucket_heap structure. At cleanup time, you do the following:

Ahh, (he says as the little light over his head comes on).  This makes
sense.  I like.

> > > > 6) Should mmap_destroy() call apr_mmap_delete(m->mmap) when the last
> > > > reference to the mmap bucket is destroyed?
>
> A better approach is to have the cache insert the MMAP bucket with a
> refcount starting at 2 (one for the cache, one for the bucket). When the
> last bucket goes away, you'll still have a refcount of 1 and the MMAP will
> remain alive.

Good idea.

> Reading from the input filter **stack** is different. Let's say that case
> (1) occurs and you've "read" the zero bytes out of the brigade. The brigade
> is now empty, so you do an ap_get_brigade() for more. This drops all the way
> down to the core_input_filter. It says, "sorry, I gave you everything I had"
> and returns APR_EOF. *NOW* is where the EOS is detected.

I confused the issue by even mentioning the filters.  All I care about
right now is what should be returned by the bucket read functions, which
we've determined is *never* APR_EOF.  If the filters want to return
APR_EOF, then they're welcome to, but the buckets themselves don't.

> That should do the trick. Save this MsgID somewhere :-)

;-]  I figured it was more expedient to do a little research and quote
what was said than to do the "well, I had this impression," "no, I had
another impression" back-and-forth routine.  hehe


Thanks,
Cliff


Re: Bucket API cleanup issues

Posted by Cliff Woolley <cl...@yahoo.com>.
On Tue, 27 Feb 2001, Greg Stein wrote:

> Euh... I don't think we want another ring.
>
> A simpler idea is to have the apr_bucket_pool structure contain a pointer to
> an apr_bucket_heap structure. At cleanup time, you do the following:
...
> So what you get here is a single allocation of a heap bucket. Then, you
> lazily repoint all other buckets to the new heap bucket.

I came up with a really cool idea that's an even greater simplification of
this.  It's coded up and ready to go, but I figured I'd give people a
chance to comment before I commit.

Basically, instead of having the pool bucket contain a pointer to a heap
bucket and having to fool with manipulating reference counts and keeping
track of when to destroy the old apr_bucket_pool struct, the
apr_bucket_pool *contains* the apr_bucket_heap struct that it will become
as its first element.

Since the apr_bucket_heap has an apr_bucket_refcount as ITS first element,
the pool and heap buckets share an apr_bucket_refcount (i.e., no more
twiddling with reference counts).
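
In other words, the layout is roughly this (a simplified sketch; member
names are approximate, the attached patch has the real declarations):

#include "apr_pools.h"

typedef struct apr_bucket_refcount {
    int refcount;                 /* shared by the pool and heap views   */
} apr_bucket_refcount;

typedef struct apr_bucket_heap {
    apr_bucket_refcount refcount; /* MUST be first                       */
    char       *base;             /* malloc'd data, once off the pool    */
    apr_size_t  alloc_len;
} apr_bucket_heap;

typedef struct apr_bucket_pool {
    apr_bucket_heap heap;         /* MUST be first: what we morph into   */
    const char *base;             /* the data while it lives in the pool */
    apr_pool_t *pool;             /* which pool the data lives in        */
} apr_bucket_pool;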

All you have to do in the cleanup routine is malloc/memcpy the data out of
the pool and onto the heap and set p->data to NULL, which is the flag
telling later pool_read() calls that they should morph their buckets.  The
morph entails nothing more than changing b->type to &apr_bucket_type_heap...
even the data pointer is already correct, because the apr_bucket_heap is
the first element of apr_bucket_pool.

After the morph, heap_read() and heap_destroy() take right over where
pool_read() and pool_destroy() left off.  When the last reference is
heap_destroy'ed, heap_destroy() frees the b->data.  As far as it knows,
b->data is just an apr_bucket_heap, but it unwittingly cleans up the
entire apr_bucket_pool struct for us since the b->data pointer never
changed.
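
In rough code, the flow is something like this (a simplified sketch rather
than the patch itself: the "already moved to the heap" flag shown here is
the pool pointer, error handling is skipped, and alloc_len is assumed to
have been filled in at create time):

#include <stdlib.h>
#include <string.h>
#include "apr_buckets.h"

/* Pool cleanup: copy the data out of the dying pool and onto the heap.
 * The embedded apr_bucket_heap is already the bucket's view of the
 * world, so only its base pointer has to change. */
static apr_status_t pool_bucket_cleanup(void *data)
{
    apr_bucket_pool *p = data;

    if (p->pool != NULL) {
        p->heap.base = malloc(p->heap.alloc_len);
        memcpy(p->heap.base, p->base, p->heap.alloc_len);
        p->pool = NULL;           /* flag: no longer pool-backed */
    }
    return APR_SUCCESS;
}

/* The morph a later pool_read() performs: b->data keeps pointing at the
 * same malloc'd apr_bucket_pool, which from now on masquerades as an
 * apr_bucket_heap. */
static void pool_bucket_morph(apr_bucket *b)
{
    apr_bucket_pool *p = b->data;

    if (p->pool == NULL) {
        b->type = &apr_bucket_type_heap;
        /* heap_read()/heap_destroy() take over; the final heap_destroy()
         * free()s b->data and, with it, the whole apr_bucket_pool. */
    }
}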

=-)

Patch attached for comments.

--Cliff