Posted to dev@httpd.apache.org by Konstantin Chuguev <Ch...@Clickstream.com> on 2008/03/26 17:23:11 UTC

Excessive chunking [was: mod_disk_cache and atimes]

Thanks for the clarification.

A small correction: I meant writev() calls instead of sendfile() when  
working with small-size buckets.

The filter I'm developing provisionally splits the supplied buckets  
into relatively small buckets during content parsing. It then removes  
some of them and inserts some other buckets. Before passing the  
resulting brigade further down the filter chain, it merges all buckets  
that have their data in contiguous memory regions back together. So I  
guess I'm doing my bit in preventing excessive chunking.
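
To illustrate the merge step described above, here is a minimal sketch in Python rather than against the real APR bucket API. The Bucket tuple, its fields and merge_contiguous() are hypothetical names made up for this example; the only idea taken from the description is that neighbouring buckets get joined when their data sits back to back in the same memory region.

```python
from typing import List, NamedTuple


class Bucket(NamedTuple):
    """Hypothetical stand-in for a memory-backed bucket (not the APR type)."""
    buffer_id: int  # identifies the underlying memory block
    start: int      # offset of the first byte within that block
    length: int     # number of bytes the bucket covers


def merge_contiguous(brigade: List[Bucket]) -> List[Bucket]:
    """Join neighbouring buckets whose bytes are adjacent in the same block."""
    merged: List[Bucket] = []
    for bucket in brigade:
        if merged:
            prev = merged[-1]
            same_block = prev.buffer_id == bucket.buffer_id
            adjacent = prev.start + prev.length == bucket.start
            if same_block and adjacent:
                # Extend the previous bucket instead of emitting a new one.
                merged[-1] = prev._replace(length=prev.length + bucket.length)
                continue
        merged.append(bucket)
    return merged


if __name__ == "__main__":
    brigade = [Bucket(1, 0, 4), Bucket(1, 4, 6), Bucket(2, 0, 3), Bucket(1, 10, 5)]
    for bucket in merge_contiguous(brigade):
        print(bucket)
```

Only buckets that are neighbours in the brigade are merged, so the order of the output data is never changed.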

I've done some research on the source files of httpd-2.2.6. The CORE  
filter seems to do de-chunking in the case when 16 or more buckets are  
passed to it (actually, the brigade is split if it contains flush  
buckets and each split part is checked for 16 buckets) AND the total  
amount of bytes in the 16 buckets does not exceed 8000. The filter  
then buffers the buckets together. Very clever.
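
As a rough model of the behaviour described above (not the actual httpd source), the following Python sketch splits a brigade at flush markers and coalesces any part that has 16 or more buckets and no more than 8000 bytes in total. The names FLUSH, split_on_flush() and coalesce() are invented for the illustration, and applying the byte limit to the whole part is a simplifying assumption.

```python
from typing import List, Union

MAX_BUCKETS = 16   # bucket-count threshold quoted in the message
MAX_BYTES = 8000   # byte threshold quoted in the message

FLUSH = object()   # hypothetical marker standing in for a flush bucket


def split_on_flush(brigade: List[Union[bytes, object]]) -> List[List[bytes]]:
    """Split a brigade into runs of data buckets separated by flush markers."""
    parts: List[List[bytes]] = []
    current: List[bytes] = []
    for bucket in brigade:
        if bucket is FLUSH:
            parts.append(current)
            current = []
        else:
            current.append(bucket)
    parts.append(current)
    return [part for part in parts if part]  # drop empty runs


def coalesce(brigade: List[Union[bytes, object]]) -> List[List[bytes]]:
    """Buffer many small buckets of each part into a single bucket."""
    out: List[List[bytes]] = []
    for part in split_on_flush(brigade):
        total = sum(len(bucket) for bucket in part)
        if len(part) >= MAX_BUCKETS and total <= MAX_BYTES:
            out.append([b"".join(part)])  # one buffered bucket
        else:
            out.append(part)              # passed through unchanged
    return out


if __name__ == "__main__":
    tiny = [b"x"] * 20
    print(coalesce(tiny + [FLUSH] + [b"a", b"b"]))
```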

	KC


On 26 Mar 2008, at 15:22, Dirk-Willem van Gulik wrote:
>
> On Mar 26, 2008, at 4:15 PM, Konstantin Chuguev wrote:
>
>> Can you please clarify what you meant by the bucket-brigade
>> footprint? Are they so slow that they make a memory-based cache no
>> more efficient than a disk-based one? Or the opposite: sendfile()
>> works so well that serving content from memory is not any faster?
>>
> No - they are very fast (in an absolute sense) - and your approach  
> is almost certainly the right one.
>
> However, all in all there is a lot of logic surrounding them; and if
> you are trying to squeeze out the very last drop (e.g. the 1x1 gif
> example) you run into all sorts of artificial limits, specifically
> on Linux and 2x2 core machines, as the memory that needs to be
> accessed is a little more scattered than one would prefer, and there
> is all sorts of competition around the IRQ handling in the kernel and
> so on.
>
> Or in other words - in a pure static case where you are serving very  
> small files which rarely if ever change, have no variance to any  
> inbound headers, etc - things are not ideal.
>
> But that is a small price to pay - i.e. Apache is more of a Swiss
> army knife, which saws OK, but a proper hacksaw is 'better'.
>
>> I'm developing an Apache output filter for highly loaded servers  
>> and proxies that juggles small-size buckets and brigades  
>> extensively. I'm not at the stage yet where I can do performance  
>> tests but if I knew this would definitely impact performance, I  
>> would perhaps switch to fixed-size buffers straight away...
>
>
> I'd bet you are on the right track. However, there is -one- small
> concern: sometimes, if you have looooots of buckets and very chunked
> output, you get lots and lots of 1-5 byte chunks, each prefixed by
> its own chunk-size line. And this can get really inefficient.
>
> Perhaps we need a de-bucketer to 'dechunk' when outputting chunked.
>
> Dw
>

Konstantin Chuguev
Software Developer

Mobile: +44 7734 955973
Fax: + 44 20 7509 9600
Clickstream Technologies PLC, 58 Davies Street, London, W1K 5JF,  
Registered in England No. 3774129



Re: Excessive chunking [was: mod_disk_cache and atimes]

Posted by Dirk-Willem van Gulik <di...@webweaving.org>.
On Mar 26, 2008, at 5:45 PM, Dirk-Willem van Gulik wrote:
>
> On Mar 26, 2008, at 5:23 PM, Konstantin Chuguev wrote:
>
>> A small correction: I meant writev() calls instead of sendfile()  
>> when working with small-size buckets.
>>
>> The filter I'm developing provisionally splits the supplied buckets  
>> into relatively small buckets during content parsing. It then  
>> removes some of them and inserts some other buckets. Before passing  
>> the resulting brigade further down the filter chain, it merges all  
>> buckets that have their data in contiguous memory regions back  
>> together. So I guess I'm doing my bit in preventing excessive  
>> chunking.
>
>> I've done some research on the source files of httpd-2.2.6. The  
>> CORE filter seems to do de-chunking in the case when 16 or more  
>> buckets are passed to it (actually, the brigade is split if it  
>> contains flush buckets and each split part is checked for 16  
>> buckets) AND the total amount of bytes in the 16 buckets does not  
>> exceed 8000. The filter then buffers the buckets together. Very  
>> clever.
>
> Hmm - I am not sure that this always works - i.e. try this :)
> 	
> $ cat test.shtml

OK - Thunderbird's quoting detection ate the cut and paste, so it
should be

	[!--#echo var="error" --] [!--#echo var="error" --] .. repeated umpteen times.

Replace [] by <>.

> (make sure that all spaces between the > and < are gone) and have  
> the usual:

And that is a typo - make sure that there is exactly ONE space
between the > and the <.

> 	AddType text/html .shtml	
> 	AddOutputFilter INCLUDES .shtml
> 	<Directory ..
> 		Options Includes
> 		..
>
> in your config. You then get the output below.
>
> Dw.
>
> (echo GET /test.shtml HTTP/1.1; echo Host: localhost; echo; echo;
> sleep 10) | telnet localhost 80
> Connected to localhost.
> Escape character is '^]'.
> HTTP/1.1 200 OK
> Date: Wed, 26 Mar 2008 16:39:35 GMT
> Server: Apache/2.2.8 (Unix) mod_ssl/2.2.8 OpenSSL/0.9.7l DAV/2 PHP/5.2.5
> Accept-Ranges: bytes
> Transfer-Encoding: chunked
> Content-Type: text/html
>
>
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 1
>
> 2
>
>


Re: Excessive chunking [was: mod_disk_cache and atimes]

Posted by Dirk-Willem van Gulik <di...@webweaving.org>.
On Mar 26, 2008, at 5:23 PM, Konstantin Chuguev wrote:

> A small correction: I meant writev() calls instead of sendfile()  
> when working with small-size buckets.
>
> The filter I'm developing provisionally splits the supplied buckets  
> into relatively small buckets during content parsing. It then  
> removes some of them and inserts some other buckets. Before passing  
> the resulting brigade further down the filter chain, it merges all  
> buckets that have their data in contiguous memory regions back  
> together. So I guess I'm doing my bit in preventing excessive  
> chunking.

> I've done some research on the source files of httpd-2.2.6. The CORE  
> filter seems to do de-chunking in the case when 16 or more buckets  
> are passed to it (actually, the brigade is split if it contains  
> flush buckets and each split part is checked for 16 buckets) AND the  
> total amount of bytes in the 16 buckets does not exceed 8000. The  
> filter then buffers the buckets together. Very clever.

Hmm - I am not sure that this always works - i.e. try this :)
	
$ cat test.shtml
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" --> <!--#set var="foo" value="bar" -->  
<!--#set var="foo" value="bar" -->

(make sure that all spaces between the > and < are gone) and have the  
usual:

	AddType text/html .shtml	
	AddOutputFilter INCLUDES .shtml
	<Directory ..
		Options Includes
		..

in your config. You then get the output below.

Dw.

(echo GET /test.shtml HTTP/1.1; echo Host: localhost; echo; echo;
sleep 10) | telnet localhost 80
Connected to localhost.
Escape character is '^]'.
HTTP/1.1 200 OK
Date: Wed, 26 Mar 2008 16:39:35 GMT
Server: Apache/2.2.8 (Unix) mod_ssl/2.2.8 OpenSSL/0.9.7l DAV/2 PHP/5.2.5
Accept-Ranges: bytes
Transfer-Encoding: chunked
Content-Type: text/html



1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

2
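
To put a number on the overhead visible in the trace above, here is a small Python sketch of HTTP/1.1 chunk framing (hex size, CRLF, data, CRLF, then a terminating zero-size chunk). The function names are made up for the example, and the payload bytes are an assumption: the trace only shows 23 chunks of size 1 followed by one of size 2, so single spaces and a final space plus newline are used as placeholders.

```python
from typing import Iterable

LAST_CHUNK = b"0\r\n\r\n"  # zero-size chunk that ends a chunked body


def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: size in hex, CRLF, the data itself, CRLF."""
    size_line = f"{len(data):x}\r\n".encode("ascii")
    return size_line + data + b"\r\n"


def encode_body(pieces: Iterable[bytes]) -> bytes:
    """Encode each non-empty piece as its own chunk and add the terminator."""
    return b"".join(encode_chunk(p) for p in pieces if p) + LAST_CHUNK


if __name__ == "__main__":
    # Assumed payload: 23 one-byte pieces and one two-byte piece, as in the trace.
    pieces = [b" "] * 23 + [b" \n"]
    payload = b"".join(pieces)

    fragmented = encode_body(pieces)    # one chunk per tiny bucket
    coalesced = encode_body([payload])  # everything in a single chunk

    print("payload bytes:   ", len(payload))
    print("fragmented bytes:", len(fragmented))
    print("coalesced bytes: ", len(coalesced))
```

Each one-byte chunk costs five bytes of framing on top of its single byte of data, which is why buffering small buckets together before the chunk filter pays off.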