You are viewing a plain text version of this content. The canonical link for it is here.

Posted to apreq-dev@httpd.apache.org by Bojan Smojver <bo...@rexursive.com> on 2004/08/31 04:41:13 UTC

Re: Recall of input filter module on completion of output filterprocessing for a request???

What I had a problem with (and this is why I asked this) is the lifetime
of pools. Brigades (and the buckets within them) that go through filters
should have (at least) connection lifetime, not request lifetime.
Otherwise, the memory area you are referencing may be
released/overwritten by the time the network connection filter gets to
actually serving the request (which is not necessarily at the end of the
request). Once I made sure that everything I put into the brigade has
connection lifetime, all was fine.

Maybe unrelated to your problem, but nevertheless, there it is.

On Tue, 2004-08-31 at 12:31, Anthony Wells wrote:

> Yes, this is over a keep-alive connection.  However, there is nothing else
> to be served for the connection.  So, there should only be one request.

-- 
Bojan

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Bojan Smojver <bo...@rexursive.com>.

On Tue, 2004-08-31 at 17:15, Joe Schaefer wrote:

> Pool buckets are just lazy heap buckets.  The cleanup they use
> just copies the bucket data into the heap if the pool vanishes
> before the bucket does.

Right. I must have been doing something really silly in my code then.
Not sure what, but I'll test it again, just to make sure.

Thanks for your explanation. Nothing beats shortcuts!

-- 
Bojan

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Bojan Smojver <bo...@rexursive.com>.

Quoting Bojan Smojver <bo...@rexursive.com>:

> On Tue, 2004-08-31 at 22:06, Joe Schaefer wrote:
>
> > Unusual or not, that sounds like a bug in httpd (or possibly apr-util).
> > You might want to report the bug to dev@httpd.
>
> I think I'll be checking my own code before I do that. My bet is that
> I'm doing something stupid somewhere, not Apache.

Just for the record, after reorganising my code a little bit (for instance, not
keeping files open for longer than one request), it seems that even the
multiple file buckets can be allocated from the request pool successfully and
passed through the chain (i.e. the bugs are mine and mine alone). The memory
pressure is far lower that way too.

So, everyone should simply disregard what I said in this thread. It's just
rubbish.

Once again, thanks Joe for pointing all this out.

--
Bojan

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Bojan Smojver <bo...@rexursive.com>.

On Tue, 2004-08-31 at 22:06, Joe Schaefer wrote:

> Unusual or not, that sounds like a bug in httpd (or possibly apr-util).
> You might want to report the bug to dev@httpd.

I think I'll be checking my own code before I do that. My bet is that
I'm doing something stupid somewhere, not Apache.

In any event, I really appreciate you taking up this discussion. It
cleared up many things I was not really sure about (in fact, completely
wrong about). The docs on Apache API usually contain simple examples and
nothing in-depth as this.

-- 
Bojan

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Bojan Smojver <bo...@rexursive.com> writes:

> I just did a quick test in my code. The problem there is related to FILE
> buckets. I have an unusual handler that puts multiple file buckets into
> the brigade. If I turn those FILE buckets into POOL buckets (and still
> use request pool) all is well. FILE buckets, however, cause segfaults on
> pipelined requests.
> 
> I know that this multiple file bucket thing is unusual, as some Apache
> developers told me.

Unusual or not, that sounds like a bug in httpd (or possibly apr-util).
You might want to report the bug to dev@httpd.

-- 
Joe Schaefer

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Bojan Smojver <bo...@rexursive.com>.

I just did a quick test in my code. The problem there is related to FILE
buckets. I have an unusual handler that puts multiple file buckets into
the brigade. If I turn those FILE buckets into POOL buckets (and still
use request pool) all is well. FILE buckets, however, cause segfaults on
pipelined requests.

I know that this multiple file bucket thing is unusual, as some Apache
developers told me.

-- 
Bojan

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Bojan Smojver <bo...@rexursive.com> writes:


[...]

> 
> I thought that the connection filters may delay delivery of a particular
> request output data, particularly if the payload is small. In other
> words, they may group all of it together to deliver in a single packet.
>
> Wouldn't that mean that by the time the connection filter (on the
> network level, for instance) gets to it, the brigade, including its
> contents, may be gone?

In theory: no, the core_output_filter calls ap_save_brigade() in those 
circumstances.  Take a look at the implementation...

AP_DECLARE(apr_status_t) ap_save_brigade(ap_filter_t *f, 
                                         apr_bucket_brigade **saveto,
                                         apr_bucket_brigade **b, apr_pool_t *p)
{
    apr_bucket *e;
    apr_status_t rv;

    /* If have never stored any data in the filter, then we had better
     * create an empty bucket brigade so that we can concat.
     */
    if (!(*saveto)) {
        *saveto = apr_brigade_create(p, f->c->bucket_alloc);
    }
    
    for (e = APR_BRIGADE_FIRST(*b);
         e != APR_BRIGADE_SENTINEL(*b);
         e = APR_BUCKET_NEXT(e))
    {
        rv = apr_bucket_setaside(e, p);
        if (rv != APR_SUCCESS
            /* ### this ENOTIMPL will go away once we implement setaside
               ### for all bucket types. */
            && rv != APR_ENOTIMPL) {
            return rv;
        }
    }
    APR_BRIGADE_CONCAT(*saveto, *b);
    return APR_SUCCESS;
}

 
> Actually, when I was testing a lot of pipelined requests, that's exactly
> the kind of problems I was experiencing. I guess I didn't something else
> wrong somewhere...

[...]

> > OTOH, the bucket's setaside function takes a pool argument for just
> > this reason, so in principle you should be able to use buckets created
> > from the request pool if you really want to.  Perhaps the problem
> > is really a wrong choice of bucket type, not the wrong initial pool?
> 
> Isn't the pool buckets' setaside also a NOOP?

Pool buckets are just lazy heap buckets.  The cleanup they use
just copies the bucket data into the heap if the pool vanishes
before the bucket does.


-- 
Joe Schaefer

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Bojan Smojver <bo...@rexursive.com>.

On Tue, 2004-08-31 at 15:24, Joe Schaefer wrote:

> > Brigades are actually allocated from the heap directly, not from the
> > pool.
> 

[... code ...]

> I would describe this as creating the brigade from the pool, and
> registering a cleanup that will delete any buckets which remain
> in the brigade.  *Those remaining buckets* may be heap-allocated,
> which is why the cleanup is important.

OUCH! I've been had! This is what the docs say about that function:

-------------------------------
apr_bucket_brigade* apr_brigade_create 
( 
apr_pool_t * 
  p, 

apr_bucket_alloc_t
* 
  list

) 

Create a new bucket brigade. The
bucket brigade is originally empty. 

Parameters: 
        p 
        The pool to
        associate with
        the brigade.
        Data is not
        allocated out
        of the pool,
        but a cleanup
        is registered.
        list 
        The bucket
        allocator to
        use 

Returns: 
        The empty bucket brigade 

-------------------------------

I never actually had look into the code to verify and this looked to me
like the brigade isn't actually allocated from the pool, only registered
with it. Obviously, I should have looked :-( The data refers to buckets,
not brigade itself.

> [...]
> 
> > When the request is dealt with, the brigade may get hosed, if it's
> > registered with the request pool. 
> 
> But this does not happen *within* a call to ap_pass_brigade.  Consider
> a request filter that calls ap_pass_brigade(f->next, bb).  The request
> filter won't have a problem with bb->p == r->pool, and neither will
> a downstream connection filter that dispenses with bb immediately.
> However, that downstream connection filter will have a problem if 
> it keeps a pointer to bb for future use.  IMO this would be a bug 
> in the downstream connection filter, not the request filter.

I thought that the connection filters may delay delivery of a particular
request output data, particularly if the payload is small. In other
words, they may group all of it together to deliver in a single packet.
Wouldn't that mean that by the time the connection filter (on the
network level, for instance) gets to it, the brigade, including its
contents, may be gone?

Actually, when I was testing a lot of pipelined requests, that's exactly
the kind of problems I was experiencing. I guess I didn't something else
wrong somewhere...

> > pool), of course, your interpretation is again correct. They have to
> > have lifetime of at least connection. Same problem again - if they
> > don't, by the time the output filter gets to deal with them, they may be
> > gone. Again, instant segfault.
> 
> OTOH, the bucket's setaside function takes a pool argument for just
> this reason, so in principle you should be able to use buckets created
> from the request pool if you really want to.  Perhaps the problem
> is really a wrong choice of bucket type, not the wrong initial pool?

Isn't the pool buckets' setaside also a NOOP?

I had some request pool buckets inside the brigade and on occasion a
pipelined requests would segfault Apache. On the other hand, when I
started using connection pool buckets, all worked out fine.

Anyhow, I just thought that the problem may be related to the pool
lifetime. But then again, maybe it isn't. Sorry if I wasted everyone's
time.

-- 
Bojan

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Bojan Smojver <bo...@rexursive.com> writes:

> On Tue, 2004-08-31 at 13:23, Joe Schaefer wrote:
> 
> > Here your talking about the "buckets" in the brigade, not the 
> > brigade itself, right?  (If so, I totally agree with you).
> 
> Yeah, I stuffed that bit up, didn't I? 

It's not your fault.  There's no clear consensus on dev@httpd 
on how filters and brigades are even *supposed* to work together.
Ask two different committers and you'll get two completely 
different answers (assuming you can even get an answer).

> Brigades are actually allocated from the heap directly, not from the
> pool.

I still don't quite follow.  Here's the definition from
apr-util/buckets/apr_brigade.c:

  APU_DECLARE(apr_bucket_brigade *) apr_brigade_create(apr_pool_t *p,
                                               apr_bucket_alloc_t *list)
  {
      apr_bucket_brigade *b;

      b = apr_palloc(p, sizeof(*b));
      b->p = p;
      b->bucket_alloc = list;

      APR_RING_INIT(&b->list, apr_bucket, link);

      apr_pool_cleanup_register(b->p, b, brigade_cleanup, 
                                apr_pool_cleanup_null);
      return b;
  }

I would describe this as creating the brigade from the pool, and
registering a cleanup that will delete any buckets which remain
in the brigade.  *Those remaining buckets* may be heap-allocated,
which is why the cleanup is important.

[...]

> When the request is dealt with, the brigade may get hosed, if it's
> registered with the request pool. 

But this does not happen *within* a call to ap_pass_brigade.  Consider
a request filter that calls ap_pass_brigade(f->next, bb).  The request
filter won't have a problem with bb->p == r->pool, and neither will
a downstream connection filter that dispenses with bb immediately.
However, that downstream connection filter will have a problem if 
it keeps a pointer to bb for future use.  IMO this would be a bug 
in the downstream connection filter, not the request filter.

> pool), of course, your interpretation is again correct. They have to
> have lifetime of at least connection. Same problem again - if they
> don't, by the time the output filter gets to deal with them, they may be
> gone. Again, instant segfault.

OTOH, the bucket's setaside function takes a pool argument for just
this reason, so in principle you should be able to use buckets created
from the request pool if you really want to.  Perhaps the problem
is really a wrong choice of bucket type, not the wrong initial pool?

For example, here's something that works fine on the input side,
(taken from the mfd_parser in apreq_parsers.c) but is totally 
broken on the output side:

             name = apr_pstrmemdup(r->pool, name, nlen);
             e = apr_bucket_immortal_create(name, nlen,
                                            bb->bucket_alloc);

Here "name" is allocated in the request pool, but setaside
is a noop for immortal buckets. So a connection filter
that tried to setaside the bucket would segfault if it
read the bucket after the request pool was cleared.

-- 
Joe Schaefer

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Bojan Smojver <bo...@rexursive.com>.

On Tue, 2004-08-31 at 13:23, Joe Schaefer wrote:

> Here your talking about the "buckets" in the brigade, not the 
> brigade itself, right?  (If so, I totally agree with you).

Yeah, I stuffed that bit up, didn't I? Brigades are actually allocated
from the heap directly, not from the pool. The cleanup is registered
with the pool. So, what I meant is that the brigade's cleanup gets
registered with the connection pool, rather than request pool. That's
all for output filters, which were mentioned in the original post. But,
I didn't mention that either. Sorry :-)

When the request is dealt with, the brigade may get hosed, if it's
registered with the request pool. So, when the output connection filter
wants to access that memory, it's gone or allocated to something
completely different. Instant segfault.

In terms of buckets (which can be allocated among other things from the
pool), of course, your interpretation is again correct. They have to
have lifetime of at least connection. Same problem again - if they
don't, by the time the output filter gets to deal with them, they may be
gone. Again, instant segfault.

So, if the brigade from the input filter is somehow used by connection
processing later on, once the request cleanup whacks it, the connection
part may be causing the segfault. That's what I was trying to point to.
But, it may be something completely different, of course.

-- 
Bojan

Re: Recall of input filter module on completion of output filterprocessing for a request???

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Bojan Smojver <bo...@rexursive.com> writes:

> What I had a problem with (and this is why I asked this) is the lifetime
> of pools. Brigades (and the buckets within them) that go through filters
> should have (at least) connection lifetime, 

I agree wrt "buckets", but are you certain about the "brigades" part?
I can tell you that it is not a good idea to have a perl-request input
filter create a per-connection brigade: because the brigade's pool 
cleanup won't delete any internal buckets until the connection closes, 
thereby introducing per-request memory leaks.

I suspect the reasoning is more-or-less identical on the output filter
side, *except* for the brigade you hand off to ap_pass_brigade(),
because of this cryptic statement in the ap_pass_brigade docs:

    "The caller relinquishes ownership of the brigade"

Do you have any idea what sort "ownership" they're talking about? I can 
think of a few interpretations:

  1)  pointers are invalid (the brigade itself may have been freed),
  2)  the brigade may not be modified by the caller ever again,
  3)  the brigade may not be modified by the caller again, during the
      rest of this call.

> not request lifetime.  Otherwise, the memory area you are referencing
> may be released/overwritten by the time the network connection filter
> gets to actually serving the request (which is not necessarily at the
> end of the request). Once I made sure that everything I put into the
> brigade has connection lifetime, all was fine.

Here your talking about the "buckets" in the brigade, not the 
brigade itself, right?  (If so, I totally agree with you).

-- 
Joe Schaefer