Posted to dev@httpd.apache.org by Brian Pane <bp...@pacbell.net> on 2001/11/26 18:28:03 UTC

Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

The current logic for concatenating small buckets in core_output_filter()
will have performance problems in two cases that I can think of:

* If you have a brigade consisting of several dozen small buckets,
  each one can get copied several times (because we only concatenate
  MAX_IOVEC_TO_WRITE of them at a time).

* If the brigade consists of, say, MAX_IOVEC_TO_WRITE+1 buckets of
  size 1MB each, the code will do a huge memory copy to (needlessly)
  concatenate the first MAX_IOVEC_TO_WRITE of them.

My proposed solution is to change the logic as follows:

* Skip the concatenation if there's >= 8KB of data already
  referenced in the iovec.

* Rather than creating a temporary brigade for concatenation,
  create a heap bucket.  Make it big enough to hold 8KB.  Pop
  the small buckets from the brigade, concatenate their contents
  into the heap bucket, and push the heap bucket onto the brigade.

* If we end up in the concatenation again during the foreach loop
  through the brigade, add the small buckets to the end of the
  previously allocated heap bucket.  If the heap bucket size is
  about to exceed 8KB, stop.
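
Concretely, the loop I'm picturing is something like the sketch below.
(Hand sketch only: the function name is made up, and the exact
heap-bucket ctor arguments vary by APR version, so don't read this as
actual core.c code.)

    #include <stdlib.h>
    #include <string.h>
    #include "apr_buckets.h"

    #define COMPACT_THRESHOLD 8192   /* the 8KB limit proposed above */

    /* Sketch of the proposed compaction.  bytes_in_iovec is the
     * amount of data already referenced by the iovec. */
    static void compact_small_buckets(apr_bucket_brigade *bb,
                                      apr_size_t bytes_in_iovec)
    {
        apr_size_t used = 0;
        char *buf;
        apr_bucket *b;

        /* skip the concatenation if >= 8KB is already queued */
        if (bytes_in_iovec >= COMPACT_THRESHOLD) {
            return;
        }

        buf = malloc(COMPACT_THRESHOLD);
        if (buf == NULL) {
            return;
        }

        b = APR_BRIGADE_FIRST(bb);
        while (b != APR_BRIGADE_SENTINEL(bb)) {
            const char *data;
            apr_size_t len;
            apr_bucket *next = APR_BUCKET_NEXT(b);

            /* stop at a bucket of unknown length, or when the heap
             * bucket is about to exceed 8KB */
            if (b->length == (apr_size_t)-1
                || used + b->length > COMPACT_THRESHOLD) {
                break;
            }
            if (apr_bucket_read(b, &data, &len,
                                APR_BLOCK_READ) != APR_SUCCESS) {
                break;
            }
            memcpy(buf + used, data, len);
            used += len;
            apr_bucket_delete(b);    /* pop the small bucket */
            b = next;
        }

        if (used > 0) {
            /* push one heap bucket holding the concatenated data
             * back onto the front of the brigade */
            APR_BRIGADE_INSERT_HEAD(bb,
                apr_bucket_heap_create(buf, used, free,
                                       bb->bucket_alloc));
        }
        else {
            free(buf);
        }
    }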

Comments?

Thanks,
--Brian




Re: Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

Posted by Ryan Bloom <rb...@covalent.net>.
On Monday 26 November 2001 12:06 pm, Roy T. Fielding wrote:
> In my earlier attempts at buffering filters, I would allocate a single
> block of memory within the filter's state and use that as fill-area in
> the same way that the old buff.c code managed its buffers (though simpler
> because it wasn't trying to do so many different things in one routine).
>
> This requires that a setaside function take a storage area and max length
> as an argument, rather than a pool.

But setaside shouldn't always be concatenating data.  It only makes sense
to concatenate the data if you have more than MAX_IOVEC buckets and less
than 8K of data.  If I have a heap bucket with 400 bytes and another heap bucket
with 2K, there is no reason to concatenate the data yet.
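
Something like this, in other words (sketch; MAX_IOVEC_TO_WRITE is the
existing constant from core.c, the function itself is just
illustrative):

    /* Concatenate only when one writev() can't take all the buckets
     * AND the total payload is small enough that copying beats a
     * second syscall. */
    static int should_concatenate(apr_size_t nbuckets,
                                  apr_size_t nbytes)
    {
        return nbuckets > MAX_IOVEC_TO_WRITE && nbytes < 8192;
    }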

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

Posted by "Roy T. Fielding" <fi...@ebuilt.com>.
In my earlier attempts at buffering filters, I would allocate a single
block of memory within the filter's state and use that as fill-area in
the same way that the old buff.c code managed its buffers (though simpler
because it wasn't trying to do so many different things in one routine).

This requires that a setaside function take a storage area and max length
as an argument, rather than a pool.
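
From memory, the shape was roughly as follows (all of the names here
are hypothetical -- this is the idea, not surviving code):

    #include <string.h>
    #include "apr_buckets.h"

    /* The filter owns one fixed fill area in its state, like the
     * old buff.c buffer. */
    typedef struct {
        char       buf[8192];   /* the fill area */
        apr_size_t used;        /* bytes currently buffered */
    } buffer_ctx;

    /* A setaside that takes a storage area and a max length instead
     * of a pool; APR_ENOSPC tells the caller to flush. */
    static apr_status_t setaside_into(apr_bucket *b, char *area,
                                      apr_size_t max_len,
                                      apr_size_t *written)
    {
        const char *data;
        apr_size_t len;
        apr_status_t rv = apr_bucket_read(b, &data, &len,
                                          APR_BLOCK_READ);

        if (rv != APR_SUCCESS) {
            return rv;
        }
        if (len > max_len) {
            return APR_ENOSPC;   /* fill area full: flush first */
        }
        memcpy(area, data, len);
        *written = len;
        return APR_SUCCESS;
    }

The filter would call setaside_into(b, ctx->buf + ctx->used,
sizeof(ctx->buf) - ctx->used, &n) and flush whenever it got
APR_ENOSPC back.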

....Roy


Re: Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

Posted by Ryan Bloom <rb...@covalent.net>.
On Monday 26 November 2001 09:46 am, Brian Pane wrote:
> Ryan Bloom wrote:
> >On Monday 26 November 2001 09:39 am, Brian Pane wrote:
> >>Ryan Bloom wrote:
> >>>On Monday 26 November 2001 09:28 am, Brian Pane wrote:
> >>
> >>[...]
> >>
> >>>>* If we end up in the concatenation again during the foreach loop
> >>>> through the brigade, add the small buckets to the end of the
> >>>> previously allocated heap bucket.  If the heap bucket size is
> >>>> about to exceed 8KB, stop.
> >>>
> >>>This should be done by moving the first bucket into a separate brigade
> >>>and using the brigade_write functions again.
> >>
> >>Thanks for the pointer; using apr_brigade_write for this will make the
> >>code a lot cleaner.  I'll implement this later today if nobody beats
> >>me to it.
> >
> >That is what is being done now.
>
> >But in the current implementation, we end up allocating new space
> each time the buckets are compacted, because we start with an
> empty brigade, right?  So even if the first bucket in the list
> to be compacted happens to have been created by a previous
> apr_brigade_write (because we end up compacting twice on the
> same request, due to a long stream of small buckets produced by
> a CGI or mod_include), we can't take advantage of the extra space
> within that bucket.  Or am I missing something?

You've got it right, but I want to be sure that you know we are currently using
those functions. We just need to split the brigade correctly so that we have
a brigade that ends with the previously allocated bucket.
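That is, something like the sketch below (error checks elided);
"heap" stands for the heap bucket left over from the previous
apr_brigade_write() call, and finding it is the part that still needs
to be written:

    /* Split just after the previously allocated heap bucket so that
     * bb ends with it; apr_brigade_write() then fills that bucket's
     * unused tail before allocating a new 8K block. */
    static void write_after_heap(apr_bucket_brigade *bb,
                                 apr_bucket *heap,
                                 const char *str, apr_size_t n)
    {
        apr_bucket_brigade *rest =
            apr_brigade_split(bb, APR_BUCKET_NEXT(heap));

        apr_brigade_write(bb, NULL, NULL, str, n);

        /* put the untouched remainder back */
        APR_BRIGADE_CONCAT(bb, rest);
    }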

Ryan

______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

Posted by Brian Pane <bp...@pacbell.net>.
Ryan Bloom wrote:

>On Monday 26 November 2001 09:39 am, Brian Pane wrote:
>
>>Ryan Bloom wrote:
>>
>>>On Monday 26 November 2001 09:28 am, Brian Pane wrote:
>>>
>>[...]
>>
>>>>* If we end up in the concatenation again during the foreach loop
>>>> through the brigade, add the small buckets to the end of the
>>>> previously allocated heap bucket.  If the heap bucket size is
>>>> about to exceed 8KB, stop.
>>>>
>>>This should be done by moving the first bucket into a separate brigade
>>>and using the brigade_write functions again.
>>>
>>Thanks for the pointer; using apr_brigade_write for this will make the
>>code a lot cleaner.  I'll implement this later today if nobody beats
>>me to it.
>>
>
>That is what is being done now.
>

But in the current implementation, we end up allocating new space
each time the buckets are compacted, because we start with an
empty brigade, right?  So even if the first bucket in the list
to be compacted happens to have been created by a previous
apr_brigade_write (because we end up compacting twice on the
same request, due to a long stream of small buckets produced by
a CGI or mod_include), we can't take advantage of the extra space
within that bucket.  Or am I missing something?

--Brian


Re: Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

Posted by Ryan Bloom <rb...@covalent.net>.
On Monday 26 November 2001 09:39 am, Brian Pane wrote:
> Ryan Bloom wrote:
> >On Monday 26 November 2001 09:28 am, Brian Pane wrote:
>
> [...]
>
> >>* If we end up in the concatenation again during the foreach loop
> >>  through the brigade, add the small buckets to the end of the
> >>  previously allocated heap bucket.  If the heap bucket size is
> >>  about to exceed 8KB, stop.
> >
> >This should be done by moving the first bucket into a separate brigade
> >and using the brigade_write functions again.
>
> Thanks for the pointer; using apr_brigade_write for this will make the
> code a lot cleaner.  I'll implement this later today if nobody beats
> me to it.

That is what is being done now.

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

Posted by Brian Pane <bp...@pacbell.net>.
Ryan Bloom wrote:

>On Monday 26 November 2001 09:28 am, Brian Pane wrote:
>
[...]

>>* If we end up in the concatenation again during the foreach loop
>>  through the brigade, add the small buckets to the end of the
>>  previously allocated heap bucket.  If the heap bucket size is
>>  about to exceed 8KB, stop.
>>
>
>This should be done by moving the first bucket into a separate brigade
>and using the brigade_write functions again.
>

Thanks for the pointer; using apr_brigade_write for this will make the
code a lot cleaner.  I'll implement this later today if nobody beats
me to it.

--Brian




Re: Concatenation of small buckets in core_output_filter Re: cvs commit: httpd-2.0/server core.c

Posted by Ryan Bloom <rb...@covalent.net>.
On Monday 26 November 2001 09:28 am, Brian Pane wrote:
> The current logic for concatenating small buckets in core_output_filter()
> will have performance problems in two cases that I can think of:
>
> * If you have a brigade consisting of several dozen small buckets,
>   each one can get copied several times (because we only concatenate
>   MAX_IOVEC_TO_WRITE of them at a time).
>
> * If the brigade consists of, say, MAX_IOVEC_TO_WRITE+1 buckets of
>   size 1MB each, the code will do a huge memory copy to (needlessly)
>   concatenate the first MAX_IOVEC_TO_WRITE of them.
>
> My proposed solution is to change the logic as follows:
>
> * Skip the concatenation if there's >= 8KB of data already
>   referenced in the iovec.

Do we really concatenate in that case?  Whoops, yeah we do.  That is
a mistake, and should be fixed.

>
> * Rather than creating a temporary brigade for concatenation,
>   create a heap bucket.  Make it big enough to hold 8KB.  Pop
>   the small buckets from the brigade, concatenate their contents
>   into the heap bucket, and push the heap bucket onto the brigade.

This is exactly what the apr_brigade_write function does. If you stop
using that function, you are just duplicating logic.
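
For reference, the whole pattern is just the following (a sketch with
the error paths trimmed, not the literal core.c code; "tmp" is a
scratch brigade):

    /* Drain buckets through apr_brigade_write(), which packs the
     * bytes into heap buckets -- i.e. the buffering already exists. */
    static apr_status_t pack_into(apr_bucket_brigade *tmp,
                                  apr_bucket_brigade *in)
    {
        while (!APR_BRIGADE_EMPTY(in)) {
            apr_bucket *b = APR_BRIGADE_FIRST(in);
            const char *data;
            apr_size_t len;
            apr_status_t rv = apr_bucket_read(b, &data, &len,
                                              APR_BLOCK_READ);

            if (rv == APR_SUCCESS) {
                rv = apr_brigade_write(tmp, NULL, NULL, data, len);
            }
            if (rv != APR_SUCCESS) {
                return rv;
            }
            apr_bucket_delete(b);
        }
        return APR_SUCCESS;
    }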

> * If we end up in the concatenation again during the foreach loop
>   through the brigade, add the small buckets to the end of the
>   previously allocated heap bucket.  If the heap bucket size is
>   about to exceed 8KB, stop.

This should be done by moving the first bucket into a separate brigade
and using the brigade_write functions again.

Please don't create separate logic to create a buffering bucket, we already
have it.

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: Concatenation of small buckets in core_output_filter

Posted by Greg Stein <gs...@lyra.org>.
On Mon, Nov 26, 2001 at 09:28:03AM -0800, Brian Pane wrote:
>...
> * Rather than creating a temporary brigade for concatenation,
>   create a heap bucket.  Make it big enough to hold 8KB.  Pop
>   the small buckets from the brigade, concatenate their contents
>   into the heap bucket, and push the heap bucket onto the brigade.
> 
> * If we end up in the concatenation again during the foreach loop
>   through the brigade, add the small buckets to the end of the
>   previously allocated heap bucket.  If the heap bucket size is
>   about to exceed 8KB, stop.
> 
> Comments?

You could actually scan the whole brigade and replace small buckets with a
large heap bucket. For example: compress the first 20 5-byte buckets into a
single 100-byte bucket, skip the 10k file bucket, then compress the next 50
10-byte buckets into a 500-byte bucket.

The rule of thumb is simply "stop concatenating if you find a bucket bigger
than your threshold." Whether you skip over those looking for more is a
different question.

Note that apr_brigade_write() is fine if you're writing to the "end" of a
brigade. It doesn't work well "within", and it could also pose problems if
you trigger its flushing behavior. That said, if you're concatenating and
*moving* buckets (i.e. from the input brigade to a holding brigade), then
you're always writing at the end.

Also: why would this ever "exceed 8KB" ?? If that was the case, then you'd
just send the brigade to the network rather than try to hold onto it.

Finally: since you probably know the size of your concat-buffer, then you
can allocate the heap bucket to be the proper size, rather than use a
default 8k heap bucket.
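
Put together, the scan could look something like this (a sketch: the
names and the heap-bucket ctor spelling are illustrative and vary by
APR version):

    #include <stdlib.h>
    #include <string.h>
    #include "apr_buckets.h"

    /* Replace each run of small buckets with one right-sized heap
     * bucket; anything at or over the threshold (a big file bucket,
     * say) just ends the run and gets skipped. */
    static void compact_runs(apr_bucket_brigade *bb,
                             apr_size_t threshold)
    {
        apr_bucket *b = APR_BRIGADE_FIRST(bb);

        while (b != APR_BRIGADE_SENTINEL(bb)) {
            apr_size_t run_len = 0;
            int run_count = 0;
            apr_bucket *run_start = b;

            /* measure the run; unknown-length buckets (pipes,
             * sockets) end it too */
            while (b != APR_BRIGADE_SENTINEL(bb)
                   && b->length != (apr_size_t)-1
                   && b->length < threshold) {
                run_len += b->length;
                run_count++;
                b = APR_BUCKET_NEXT(b);
            }

            if (run_count > 1) {
                /* e.g. twenty 5-byte buckets -> one 100-byte bucket:
                 * allocate exactly run_len, not a default 8K */
                char *buf = malloc(run_len);
                apr_size_t off = 0;
                apr_bucket *e = run_start;
                apr_bucket *heap;

                if (buf == NULL) {
                    return;
                }
                while (e != b) {
                    const char *data;
                    apr_size_t len;
                    apr_bucket *next = APR_BUCKET_NEXT(e);

                    apr_bucket_read(e, &data, &len, APR_BLOCK_READ);
                    memcpy(buf + off, data, len);
                    off += len;
                    apr_bucket_delete(e);
                    e = next;
                }
                heap = apr_bucket_heap_create(buf, run_len, free,
                                              bb->bucket_alloc);
                if (b == APR_BRIGADE_SENTINEL(bb)) {
                    APR_BRIGADE_INSERT_TAIL(bb, heap);
                }
                else {
                    APR_BUCKET_INSERT_BEFORE(b, heap);
                }
            }

            if (b != APR_BRIGADE_SENTINEL(bb)) {
                b = APR_BUCKET_NEXT(b);   /* skip the big bucket */
            }
        }
    }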

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/