You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Greg Stein <gs...@lyra.org> on 2001/06/07 04:07:51 UTC

Re: cvs commit: httpd-2.0/server request.c

Woah... I've been busy today, so just catching up.

This is the wrong way to solve the problem. If the subrequest sends a
brigade with the following two buckets:

    FILE(1 megabyte) -> EOS

Then you are going to do Very Bad Things.

If your platform has MMAP, then the file bucket will be transformed into an
MMAP bucket and the read() will return the entire file contents. Then, for a
Debug build, you'll crap out at the assert. In a non-debug build, you'll end
up creating a TRANSIENT bucket in tmpbb which then becomes bogus when you
apr_destroy_brigade(bb).

If your platform does not have MMAP, then the loop will read the entire
contents of the file into memory and insert them into the tmpbb brigade.
(you won't hit the assert because FILE->read() happens to return blocks of
APR_BUCKET_BUFF_SIZE)). But even worse, some nuances of apr_brigade_write()
will *again* create a bunch of transient buckets. And again, those will die.

Unless I am mistaken in my analysis, this code is simply wrong and should be
backed out.

Further, it is against the spirit of the filter stack. It is explicitly
copying the data, rather than letting it flow down the stack. *Only* if
something down the stack needs to set it aside, should it get copied. In our
example brigade (FILE + EOS), the next filter should get a FILE, which will
(hopefully) travel all the way to the CORE output filter and do a sendfile()
onto the network.

This routine works well for the CORE filter because when it hits that point,
it knows it has a small amount of data and will never trigger the TRANSIENT
bucket creation. It is also doing a *setaside*, so the copy is by-intent.
The subrequest filter isn't trying to do a set aside. It has seen an EOS --
it *knows* the subrequest is complete. You want to get the data into the
right pool; not copy it. And if you *do* want to copy it, then the patch
below isn't going to do.

So... please explain if my analysis was wrong, or please back out (I can
also back it out, too, if you agree).

Cheers,
-g

On Thu, Jun 07, 2001 at 01:24:44AM -0000, trawick@apache.org wrote:
> trawick     01/06/06 18:24:44
> 
>   Modified:    server   request.c
>   Log:
>   implement Ryan's suggested fix for buckets associated with a subrequest
>   having private data in the wrong (i.e., subrequest) pool, leading to
>   a segfault later in processing the main request
>   
>   in the patch posted on new-httpd, the temporary brigade was allocated from
>   the connection pool; the committed code allocates the brigade from the
>   main-request pool, as suggested by Ian Holsman
>   
>   Revision  Changes    Path
>   1.5       +33 -0     httpd-2.0/server/request.c
>   
>   Index: request.c
>   ===================================================================
>   RCS file: /home/cvs/httpd-2.0/server/request.c,v
>   retrieving revision 1.4
>   retrieving revision 1.5
>   diff -u -r1.4 -r1.5
>   --- request.c	2001/06/06 12:51:21	1.4
>   +++ request.c	2001/06/07 01:24:44	1.5
>   @@ -808,7 +808,40 @@
>        apr_bucket *e = APR_BRIGADE_LAST(bb);
>    
>        if (APR_BUCKET_IS_EOS(e)) {
>   +        apr_bucket_brigade *tmpbb;
>   +
>            apr_bucket_delete(e);
>   +
>   +        if (!APR_BRIGADE_EMPTY(bb)) { /* avoid brigade create/destroy */
>   +
>   +            /* We need to be certain that any data in a bucket is valid
>   +             * after the subrequest pool is cleared.
>   +             */ 
>   +            tmpbb = apr_brigade_create(f->r->main->pool);
>   +            
>   +            APR_BRIGADE_FOREACH(e, bb) {
>   +                const char *str;
>   +                apr_size_t n;
>   +                apr_status_t rv;
>   +                
>   +                rv = apr_bucket_read(e, &str, &n, APR_BLOCK_READ);
>   +                /* XXX handle rv! */
>   +                
>   +                /* This apr_brigade_write does not use a flush function
>   +                   because we assume that we will not write enough data
>   +                   into it to cause a flush. However, if we *do* write
>   +                   "too much", then we could end up with transient
>   +                   buckets which would suck. This works for now, but is
>   +                   a bit shaky if changes are made to some of the
>   +                   buffering sizes. Let's do an assert to prevent
>   +                   potential future problems... */
>   +                AP_DEBUG_ASSERT(AP_MIN_BYTES_TO_WRITE <=
>   +                                APR_BUCKET_BUFF_SIZE);
>   +                apr_brigade_write(tmpbb, NULL, NULL, str, n);
>   +            }
>   +            apr_brigade_destroy(bb);
>   +            bb = tmpbb;
>   +        }
>        }
>        return ap_pass_brigade(f->next, bb);
>    }
>   
>   
>   

-- 
Greg Stein, http://www.lyra.org/

Re: cvs commit: httpd-2.0/server request.c

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Jun 06, 2001 at 10:58:25PM -0400, Jeff Trawick wrote:
> Greg Stein <gs...@lyra.org> writes:
>...
> > If your platform has MMAP, then the file bucket will be transformed into an
> > MMAP bucket and the read() will return the entire file contents. Then, for a
> > Debug build, you'll crap out at the assert. In a non-debug build, you'll end
> > up creating a TRANSIENT bucket in tmpbb which then becomes bogus when you
> > apr_destroy_brigade(bb).
> 
> The assert never fails, debug or not...  I've been through that loop :)

Did you shove in a BIG file? ie. bigger than 8 or 9k? (whatever the value is
nowadays)

That should have sparked the assert.

[reviewing code again]

Okay. The subreq FILE bucket would need to be between 8k and 4M. The
apr_bucket_read() will then return a ptr/len for the entire mmap. That
should trigger the assert.

[ if *not*, then I'd like to figure out why, to ensure that some subtle bug
  isn't hiding from us ]

> Where does the TRANSIENT bucket come from?  It should be a HEAP bucket.

In apr_brigade_write(). If it can't shove the str/len data into a HEAP
bucket at the end of the brigade (an existing one, or a new one; where the
new one is size-limited), then it will create a TRANSIENT bucket for the
data and signal that a "flush" should be done. But we don't pass a flush
function, so the transient bucket just sits in there.

It is some complicated logic in apr_brigade_write and check_brigade_flush in
apr_brigade.c. (those two functions should be combined; having them separate
was somewhat reasonable when apr_brigade_* all need to call it, but they all
use _write now)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: cvs commit: httpd-2.0/server request.c

Posted by Jeff Trawick <tr...@bellsouth.net>.
Greg Stein <gs...@lyra.org> writes:

> This is the wrong way to solve the problem. If the subrequest sends a
> brigade with the following two buckets:
> 
>     FILE(1 megabyte) -> EOS
> 
> Then you are going to do Very Bad Things.

I believe you...  I'll back it out.  Unlike the place where the logic
resides in the core output filter there can be no assumption that
there isn't much data.

> If your platform has MMAP, then the file bucket will be transformed into an
> MMAP bucket and the read() will return the entire file contents. Then, for a
> Debug build, you'll crap out at the assert. In a non-debug build, you'll end
> up creating a TRANSIENT bucket in tmpbb which then becomes bogus when you
> apr_destroy_brigade(bb).

The assert never fails, debug or not...  I've been through that loop :)

Where does the TRANSIENT bucket come from?  It should be a HEAP bucket.


-- 
Jeff Trawick | trawickj@bellsouth.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...