You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Nick Kew <ni...@webthing.com> on 2009/07/13 23:46:24 UTC

Re: mod_deflate DoS using HEAD

Eric Covener wrote:

>          /* For a 304 response, only change the headers */
> -        if (r->status == HTTP_NOT_MODIFIED) {
> +        if (r->status == HTTP_NOT_MODIFIED || r->header_only) {

Technically speaking, screws up the protocol.

IMHO it would be acceptable provided:
   (a) it's an option for the admin, rather than enforced
   (b) it's documented
   (c) the headers are correct: either Content-Encoding is
       unset (uncompressed response) or Content-Length is
       unset.  Probably the former.

-- 
Nick Kew

Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Roy T. Fielding wrote:
> On Jul 14, 2009, at 10:02 AM, Nick Kew wrote:
>> That, on the other hand, stands.  In the case of an HTTP/1.0
>> request, we'd be closing the connection to signal end-of-response.
> 
> Not on a HEAD request.

But on the GET request with the deflate filter installed, we would not
send a C-L but would close the connection upon transmission of the body,
so we can drop the pretense of querying the true C-L.

Re: mod_deflate DoS using HEAD

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
On Jul 14, 2009, at 10:02 AM, Nick Kew wrote:
> That, on the other hand, stands.  In the case of an HTTP/1.0
> request, we'd be closing the connection to signal end-of-response.

Not on a HEAD request.

....Roy


Re: mod_deflate DoS using HEAD

Posted by Nick Kew <ni...@webthing.com>.
Nick Kew wrote:

> The content-length could've been set anyway - the simplest case being
> a static file that's been "stat"ed.  Have we definitely unset it?

D'oh.  Of course we have.

>> Is this really an optimization?  Sounds like correctness :)  And do we 
>> want
>> to also validate that Accept-Encoding: chunked is present?
> 
> No, deflate doesn't imply chunked encoding.

That, on the other hand, stands.  In the case of an HTTP/1.0
request, we'd be closing the connection to signal end-of-response.

-- 
Nick Kew

Re: mod_deflate DoS using HEAD

Posted by Nick Kew <ni...@webthing.com>.
William A. Rowe, Jr. wrote:
> Plüm, Rüdiger, VF-Group wrote:
>>  
>> +        /*
>> +         * Optimization: If we are a HEAD request and bytes_sent is not zero
>> +         * it means that we have passed the content-length filter once and
>> +         * have more data to sent. This means that the content-length filter
>> +         * could not determine our content-length for the response to the
>> +         * HEAD request anyway (the associated GET request would deliver the
>> +         * body in chunked encoding) and we can stop compressing.
>> +         */

The content-length could've been set anyway - the simplest case being
a static file that's been "stat"ed.  Have we definitely unset it?

> Is this really an optimization?  Sounds like correctness :)  And do we want
> to also validate that Accept-Encoding: chunked is present?

No, deflate doesn't imply chunked encoding.

>> +        if (r->header_only && r->bytes_sent) {
>> +            ap_remove_output_filter(f);
>> +            return ap_pass_brigade(f->next, bb);
>> +        }
> 
> Other than comments above - +1!

Other than comments above, +0.5.  The other +half is still
contemplating giving control to the sysop - c.f. my previous
mail on the subject.

-- 
Nick Kew

Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Plüm, Rüdiger, VF-Group wrote:
>  
> +        /*
> +         * Optimization: If we are a HEAD request and bytes_sent is not zero
> +         * it means that we have passed the content-length filter once and
> +         * have more data to sent. This means that the content-length filter
> +         * could not determine our content-length for the response to the
> +         * HEAD request anyway (the associated GET request would deliver the
> +         * body in chunked encoding) and we can stop compressing.
> +         */

Is this really an optimization?  Sounds like correctness :)  And do we want
to also validate that Accept-Encoding: chunked is present?

> +        if (r->header_only && r->bytes_sent) {
> +            ap_remove_output_filter(f);
> +            return ap_pass_brigade(f->next, bb);
> +        }

Other than comments above - +1!

Re: mod_deflate DoS using HEAD

Posted by Dan Poirier <po...@pobox.com>.
"William A. Rowe, Jr." <wr...@rowe-clan.net> writes:

> Joe Orton wrote:
>> 
>> Does 2616 mandate that a resource must always 
>> exactly the same set of content-codings across methods and time?  
>> (AFAICT there is no MUST on that front; it's a SHOULD if anything)
>
> Read through to the end, it breaks all proxied content;
>
> 9.4 HEAD
>
>    The HEAD method is identical to GET except that the server MUST NOT
>    return a message-body in the response. The metainformation contained
>    in the HTTP headers in response to a HEAD request SHOULD be identical
>    to the information sent in response to a GET request. This method can
>    be used for obtaining metainformation about the entity implied by the
>    request without transferring the entity-body itself. This method is
>    often used for testing hypertext links for validity, accessibility,
>    and recent modification.
>
>    The response to a HEAD request MAY be cacheable in the sense that the
>    information contained in the response MAY be used to update a
>    previously cached entity from that resource. If the new field values
>    indicate that the cached entity differs from the current entity (as
>    would be indicated by a change in Content-Length, Content-MD5, ETag
>    or Last-Modified), then the cache MUST treat the cache entry as
>    stale.

Doesn't that last sentence just indicate that the cache entry will be
invalidated?  That would add some possibly unnecessary work fetching the
content again the next time it's requested, but I wouldn't think it
would break anything.

-- 
Dan Poirier <po...@pobox.com>


Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Joe Orton wrote:
> 
> I'm confused.  Why do this check so late, and why does r->bytes_sent 
> matter?  Why does it "screw up the protocol" if the DEFLATE filter does 
> nothing for a HEAD request?  Because of the concern that a HEAD will 
> return a different C-L & C-E to a GET on the same resource with the same 
> Accept-Encoding header?  Does 2616 mandate that a resource must always 
> exactly the same set of content-codings across methods and time?  
> (AFAICT there is no MUST on that front; it's a SHOULD if anything)

Read through to the end, it breaks all proxied content;

9.4 HEAD

   The HEAD method is identical to GET except that the server MUST NOT
   return a message-body in the response. The metainformation contained
   in the HTTP headers in response to a HEAD request SHOULD be identical
   to the information sent in response to a GET request. This method can
   be used for obtaining metainformation about the entity implied by the
   request without transferring the entity-body itself. This method is
   often used for testing hypertext links for validity, accessibility,
   and recent modification.

   The response to a HEAD request MAY be cacheable in the sense that the
   information contained in the response MAY be used to update a
   previously cached entity from that resource. If the new field values
   indicate that the cached entity differs from the current entity (as
   would be indicated by a change in Content-Length, Content-MD5, ETag
   or Last-Modified), then the cache MUST treat the cache entry as
   stale.


Re: mod_deflate DoS using HEAD

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Tuesday 22 June 2010, Plüm, Rüdiger, VF-Group wrote:
> I am currently +0 on wether to use the patch above or my original
> proposal. Both have its pros and cons (Saving more CPU vs. be more
> picky about caching and implement an RFC SHOULD).

I have now commited your original patch because it is less likely to 
break something. If the CPU usage for compressing the first buffer 
turns out to be a problem, we can still change it later.

Cheers,
Stefan

RE: mod_deflate DoS using HEAD

Posted by "Plüm, Rüdiger, VF-Group" <ru...@vodafone.com>.
 

> -----Original Message-----
> From: Eric Covener 
> Sent: Dienstag, 22. Juni 2010 15:33
> To: dev@httpd.apache.org
> Subject: Re: mod_deflate DoS using HEAD
> 
> On Tue, Jun 22, 2010 at 2:37 AM, Ruediger Pluem 
> <rp...@apache.org> wrote:
> > Does someone still have Eric's patch? Seems that I can't 
> find it right now.
> 
> This is what I have, but I never got off the fence back then 
> much less now:
> 
> Index: modules/filters/mod_deflate.c
> ===================================================================
> --- modules/filters/mod_deflate.c       (revision 793619)
> +++ modules/filters/mod_deflate.c       (working copy)
> @@ -578,7 +578,7 @@
>         deflate_check_etag(r, "gzip");
> 
>         /* For a 304 response, only change the headers */
> -        if (r->status == HTTP_NOT_MODIFIED) {
> +        if (r->status == HTTP_NOT_MODIFIED || r->header_only) {
>             ap_remove_output_filter(f);
>             return ap_pass_brigade(f->next, bb);
>         }
> 
> 

Thanks for that. I guess the patch is not complete for current trunk.
IMHO it should look like:

Index: modules/filters/mod_deflate.c
===================================================================
--- modules/filters/mod_deflate.c       (revision 955960)
+++ modules/filters/mod_deflate.c       (working copy)
@@ -562,7 +562,7 @@
          * send out the headers).
          */

-        if (r->status != HTTP_NOT_MODIFIED) {
+        if ((r->status != HTTP_NOT_MODIFIED) && !r->header_only) {
             ctx = f->ctx = apr_pcalloc(r->pool, sizeof(*ctx));
             ctx->bb = apr_brigade_create(r->pool, f->c->bucket_alloc);
             ctx->buffer = apr_palloc(r->pool, c->bufferSize);
@@ -616,7 +616,7 @@
         deflate_check_etag(r, "gzip");

         /* For a 304 response, only change the headers */
-        if (r->status == HTTP_NOT_MODIFIED) {
+        if ((r->status == HTTP_NOT_MODIFIED) || r->header_only) {
             ap_remove_output_filter(f);
             return ap_pass_brigade(f->next, bb);
         }

I am currently +0 on wether to use the patch above or my original
proposal. Both have its pros and cons (Saving more CPU vs. be more
picky about caching and implement an RFC SHOULD).

Regards

Rüdiger



Re: mod_deflate DoS using HEAD

Posted by Eric Covener <co...@gmail.com>.
On Tue, Jun 22, 2010 at 2:37 AM, Ruediger Pluem <rp...@apache.org> wrote:
> Does someone still have Eric's patch? Seems that I can't find it right now.

This is what I have, but I never got off the fence back then much less now:

Index: modules/filters/mod_deflate.c
===================================================================
--- modules/filters/mod_deflate.c       (revision 793619)
+++ modules/filters/mod_deflate.c       (working copy)
@@ -578,7 +578,7 @@
        deflate_check_etag(r, "gzip");

        /* For a 304 response, only change the headers */
-        if (r->status == HTTP_NOT_MODIFIED) {
+        if (r->status == HTTP_NOT_MODIFIED || r->header_only) {
            ap_remove_output_filter(f);
            return ap_pass_brigade(f->next, bb);
        }




-- 
Eric Covener
covener@gmail.com

Re: mod_deflate DoS using HEAD

Posted by Ruediger Pluem <rp...@apache.org>.

On 06/21/2010 11:37 PM, William A. Rowe Jr. wrote:
> On 6/21/2010 4:00 PM, Stefan Fritsch wrote:
>> As I understand it, Rüdiger's patch may be better for caching but uses 
>> more CPU cycles. But it uses way less CPU than no patch at all. 
>> Therefore I propose to include that patch unless there is clear 
>> consensus that Eric's patch is to be preferred.
> 
> Not a significant number, and Rüdiger's patch gathered +1's from myself,
> gregames, nick is on the wall with a +.5 - I think your question is to
> Rüdiger, with the emphasis on 'what is your decision?' based on this
> last rather indecisive posting.

Does someone still have Eric's patch? Seems that I can't find it right now.

Regards

Rüdiger


Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.
On 6/21/2010 4:00 PM, Stefan Fritsch wrote:
> 
> As I understand it, Rüdiger's patch may be better for caching but uses 
> more CPU cycles. But it uses way less CPU than no patch at all. 
> Therefore I propose to include that patch unless there is clear 
> consensus that Eric's patch is to be preferred.

Not a significant number, and Rüdiger's patch gathered +1's from myself,
gregames, nick is on the wall with a +.5 - I think your question is to
Rüdiger, with the emphasis on 'what is your decision?' based on this
last rather indecisive posting.


On 7/16/2009 9:24 AM, "Plüm, Rüdiger, VF-Group" wrote:
>>
>> For a large static file, Ruedigers patch suppresses the C-L entirely
>> (it gets added back in down the chain for my patch, for static files
>> at least) which I thought would make that prefered, if we're confident
>> that we'll never do more than a zlib buffer worth of work the first
>> go-round.
>
> Good point. So your patch would invalidate a cached entity if the
> response to a GET delivered a C-L header, since HEAD and GET would
> deliver different C-L headers.
> OTOH I think only very small or extremely compressable responses (whether
> static or not) would have a C-L in the response to a GET, because everything
> that exceeeds a zlib buffer would be delivered chunked anyway.



Re: mod_deflate DoS using HEAD

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Thursday 16 July 2009, William A. Rowe, Jr. wrote:
> Plüm, Rüdiger, VF-Group wrote:
> > Good point. So your patch would invalidate a cached entity if the
> > response to a GET delivered a C-L header, since HEAD and GET
> > would deliver different C-L headers.
> > OTOH I think only very small or extremely compressable responses
> > (whether static or not) would have a C-L in the response to a
> > GET, because everything that exceeeds a zlib buffer would be
> > delivered chunked anyway.
> 
> We don't really want to gzip that single buffer though, either. 
> The prime concern here is CPU cycles.  In this case, there is no
> advantage to performing that compression, and inconsistent
> behavior leads cache and proxy authors down unfortunate
> assumptions.

Going back to that old thread, was there consensus on which patch is 
preferable?

As I understand it, Rüdiger's patch may be better for caching but uses 
more CPU cycles. But it uses way less CPU than no patch at all. 
Therefore I propose to include that patch unless there is clear 
consensus that Eric's patch is to be preferred.

As an added data point, Rüdiger's patch is in Debian 5.0 and various 
Ubuntus and AFAIK hasn't caused any issues. 

Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Plüm, Rüdiger, VF-Group wrote:
> 
> Good point. So your patch would invalidate a cached entity if the
> response to a GET delivered a C-L header, since HEAD and GET would
> deliver different C-L headers.
> OTOH I think only very small or extremely compressable responses (whether
> static or not) would have a C-L in the response to a GET, because everything
> that exceeeds a zlib buffer would be delivered chunked anyway.

We don't really want to gzip that single buffer though, either.  The prime
concern here is CPU cycles.  In this case, there is no advantage to
performing that compression, and inconsistent behavior leads cache and
proxy authors down unfortunate assumptions.

RE: mod_deflate DoS using HEAD

Posted by "Plüm, Rüdiger, VF-Group" <ru...@vodafone.com>.
 

> -----Original Message-----
> From: Eric Covener 
> Sent: Donnerstag, 16. Juli 2009 16:13
> To: dev@httpd.apache.org
> Subject: Re: mod_deflate DoS using HEAD
> 
> On Wed, Jul 15, 2009 at 5:19 PM, Nick Kew<ni...@webthing.com> wrote:
> > William A. Rowe, Jr. wrote:
> >
> >> So +1 to the proposed patch; in fact, +1 on unsetting C-L 
> and treating
> >> HEAD to the same processing as 304.
> >
> > +1.  Since it's a SHOULD not a MUST, we can be pragmatic
> > with the headers.
> >
> > That's back to Eric's original patch, isn't it?
> 
> For a large static file, Ruedigers patch suppresses the C-L entirely
> (it gets added back in down the chain for my patch, for static files
> at least) which I thought would make that prefered, if we're confident
> that we'll never do more than a zlib buffer worth of work the first
> go-round.

Good point. So your patch would invalidate a cached entity if the
response to a GET delivered a C-L header, since HEAD and GET would
deliver different C-L headers.
OTOH I think only very small or extremely compressable responses (whether
static or not) would have a C-L in the response to a GET, because everything
that exceeeds a zlib buffer would be delivered chunked anyway.

Regards

Rüdiger


Re: mod_deflate DoS using HEAD

Posted by Eric Covener <co...@gmail.com>.
On Wed, Jul 15, 2009 at 5:19 PM, Nick Kew<ni...@webthing.com> wrote:
> William A. Rowe, Jr. wrote:
>
>> So +1 to the proposed patch; in fact, +1 on unsetting C-L and treating
>> HEAD to the same processing as 304.
>
> +1.  Since it's a SHOULD not a MUST, we can be pragmatic
> with the headers.
>
> That's back to Eric's original patch, isn't it?

For a large static file, Ruedigers patch suppresses the C-L entirely
(it gets added back in down the chain for my patch, for static files
at least) which I thought would make that prefered, if we're confident
that we'll never do more than a zlib buffer worth of work the first
go-round.

-- 
Eric Covener
covener@gmail.com

Re: mod_deflate DoS using HEAD

Posted by Nick Kew <ni...@webthing.com>.
William A. Rowe, Jr. wrote:

> So +1 to the proposed patch; in fact, +1 on unsetting C-L and treating
> HEAD to the same processing as 304.

+1.  Since it's a SHOULD not a MUST, we can be pragmatic
with the headers.

That's back to Eric's original patch, isn't it?

-- 
Nick Kew

Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Joe Orton wrote:
> On Wed, Jul 15, 2009 at 11:03:24AM +0200, "Plüm, Rüdiger, VF-Group" wrote:
>>> I'm confused.  Why do this check so late, and why does r->bytes_sent 
>>> matter?  Why does it "screw up the protocol" if the DEFLATE 
>> All depends on the first brigade that passes mod_deflate. If this brigade
>> contains the whole response *and* mod_deflate can compress this response
>> in one go meaning calling ap_passbrigade only once to sent the fully compressed
>> response then the content-length filter can determine the length of the content
>> and add it to the HEAD response as the same GET request would be delivered
>> with a C-L.
> 
> I understand that, but, the whole point of doing the check early, as 
> Eric's proposed patch did, is to *avoid* doing all that work because it 
> may be exhorbitantly expensive, even in the case of a single brigade 
> containing the whole response - that's the issue in hand.
> 
> The GET/304 case already omits the C-L from the response, so, I'm still 
> struggling to see how doing the same for HEAD somehow breaks HTTP 
> caching in any way.

Ok, rereading 9.4, it discusses "new field values indicate that the cached
entity differs from the current entity" - new; not omitted.

In the absence of the header, the entity doesn't "change".  It is unknown.
So a cache author can/should/would[?] treat it as fresh, as you point out
with respect to 304.

So +1 to the proposed patch; in fact, +1 on unsetting C-L and treating
HEAD to the same processing as 304.





RE: mod_deflate DoS using HEAD

Posted by "Plüm, Rüdiger, VF-Group" <ru...@vodafone.com>.
 

> -----Original Message-----
> From: Joe Orton [mailto:jorton@redhat.com] 
> Sent: Mittwoch, 15. Juli 2009 15:29
> To: dev@httpd.apache.org
> Subject: Re: mod_deflate DoS using HEAD
> 
> On Wed, Jul 15, 2009 at 11:03:24AM +0200, "Plüm, Rüdiger, 
> VF-Group" wrote:
> > > I'm confused.  Why do this check so late, and why does 
> r->bytes_sent 
> > > matter?  Why does it "screw up the protocol" if the DEFLATE 
> > 
> > All depends on the first brigade that passes mod_deflate. 
> If this brigade
> > contains the whole response *and* mod_deflate can compress 
> this response
> > in one go meaning calling ap_passbrigade only once to sent 
> the fully compressed
> > response then the content-length filter can determine the 
> length of the content
> > and add it to the HEAD response as the same GET request 
> would be delivered
> > with a C-L.
> 
> I understand that, but, the whole point of doing the check early, as 
> Eric's proposed patch did, is to *avoid* doing all that work 
> because it 
> may be exhorbitantly expensive, even in the case of a single brigade 
> containing the whole response - that's the issue in hand.

It isn't really this expensive, because of the second condition.
Each time the internal zlib outgoing buffer with the compressed data
within is full we pass this data down the chain. We don't buffer this
data in the filter. It is a good filter :-).
This means if this passing is not the *only* passing mod_deflate does
down the chain we cannot go the C-L road anyway. So most larger responses
even in one brigade if not extremely compressable will cause mod_deflate
aborting compression in this case.
IMHO the rule is that if the C-L of the compressed response is larger than
the size of zlibs outgoing buffer we will sent the response in T-E
chunked anyway. Thus we can stop compressing for a HEAD request once
we got over this limit which should happen fairly early.

> 
> The GET/304 case already omits the C-L from the response, so, 
> I'm still 
> struggling to see how doing the same for HEAD somehow breaks HTTP 
> caching in any way.

IMHO it can cause an unnecessary invalidation of a cache entry. 
This is not nice. Whether this could be called a "break" is another story.

Regards

Rüdiger


Re: mod_deflate DoS using HEAD

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Jul 15, 2009 at 11:03:24AM +0200, "Plüm, Rüdiger, VF-Group" wrote:
> > I'm confused.  Why do this check so late, and why does r->bytes_sent 
> > matter?  Why does it "screw up the protocol" if the DEFLATE 
> 
> All depends on the first brigade that passes mod_deflate. If this brigade
> contains the whole response *and* mod_deflate can compress this response
> in one go meaning calling ap_passbrigade only once to sent the fully compressed
> response then the content-length filter can determine the length of the content
> and add it to the HEAD response as the same GET request would be delivered
> with a C-L.

I understand that, but, the whole point of doing the check early, as 
Eric's proposed patch did, is to *avoid* doing all that work because it 
may be exhorbitantly expensive, even in the case of a single brigade 
containing the whole response - that's the issue in hand.

The GET/304 case already omits the C-L from the response, so, I'm still 
struggling to see how doing the same for HEAD somehow breaks HTTP 
caching in any way.

Regards, Joe

RE: mod_deflate DoS using HEAD

Posted by "Plüm, Rüdiger, VF-Group" <ru...@vodafone.com>.
 

> -----Original Message-----
> From: Joe Orton [mailto:jorton@redhat.com] 
> Sent: Mittwoch, 15. Juli 2009 09:51
> To: dev@httpd.apache.org
> Subject: Re: mod_deflate DoS using HEAD
> 
> On Tue, Jul 14, 2009 at 05:47:16PM +0200, "Plüm, Rüdiger, 
> VF-Group" wrote:
> >  
> > 
> > > -----Original Message-----
> > > From: William A. Rowe, Jr. 
> > > Sent: Montag, 13. Juli 2009 23:58
> > > To: dev@httpd.apache.org
> > > Subject: Re: mod_deflate DoS using HEAD
> > > 
> > > Nick Kew wrote:
> > > > Eric Covener wrote:
> > > > 
> > > >>          /* For a 304 response, only change the headers */
> > > >> -        if (r->status == HTTP_NOT_MODIFIED) {
> > > >> +        if (r->status == HTTP_NOT_MODIFIED || 
> r->header_only) {
> > > > 
> > > > Technically speaking, screws up the protocol.
> > > > 
> > > > IMHO it would be acceptable provided:
> > > >   (a) it's an option for the admin, rather than enforced
> > > >   (b) it's documented
> > > >   (c) the headers are correct: either Content-Encoding is
> > > >       unset (uncompressed response) or Content-Length is
> > > >       unset.  Probably the former.
> > > 
> > > Agreed.  It's not a DoS.  If the admin wants to conserve CPU
> > > resources, they must either;
> > > 
> > >  * cache the deflated pages (avoid user-agent header if there
> > >    are multiples, which reminds me we need a module to unset the
> > >    accept deflate trigger on non-compliant browsers running
> > >    very-first in the quick_handler.)
> > > 
> > >  * create gzip'ed content, navigate the choice of content through
> > >    multiviews.
> > > 
> > >  * do not do server-side deflation (it is expensive).
> > > 
> > 
> > All very true. But how about the following patch. It should do no
> > harm and should solve the issue in at least some cases (I think
> > in most cases):
> 
> I'm confused.  Why do this check so late, and why does r->bytes_sent 
> matter?  Why does it "screw up the protocol" if the DEFLATE 

All depends on the first brigade that passes mod_deflate. If this brigade
contains the whole response *and* mod_deflate can compress this response
in one go meaning calling ap_passbrigade only once to sent the fully compressed
response then the content-length filter can determine the length of the content
and add it to the HEAD response as the same GET request would be delivered
with a C-L. If the above is not true the according GET response would be
delivered with T-E chunked anyway (in the HTTP/1.1 case or without anything
but closing the connection after sending the response in the HTTP/1.0 case).
So there is no point in doing further compression. And r->bytes_sent != 0
indicates that we have been already in the content length filter and that it cannot
measure the length of the response for creating a C-L header as I tried to explain
in my comment.
Well one might argue that the number of cases where the first brigade
contains the whole response *and* mod_deflate can compress this response
in one go meaning calling ap_passbrigade only once to sent the fully compressed
response is so low that this doesn't justify this approach and that we should
just get out of the way in the case of HEAD requests. Especially as providing
the correct C-L in a HEAD response is only a SHOULD (see below)

> filter does 
> nothing for a HEAD request?  Because of the concern that a HEAD will 
> return a different C-L & C-E to a GET on the same resource 
> with the same 
> Accept-Encoding header?  Does 2616 mandate that a resource 
> must always 
> exactly the same set of content-codings across methods and time?  
> (AFAICT there is no MUST on that front; it's a SHOULD if anything)

Exactly this is my impression and it is backed by 9.4 in RFC2616.
But you are correct it is only a SHOULD.

Regards

Rüdiger

Re: mod_deflate DoS using HEAD

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Jul 14, 2009 at 05:47:16PM +0200, "Plüm, Rüdiger, VF-Group" wrote:
>  
> 
> > -----Original Message-----
> > From: William A. Rowe, Jr. 
> > Sent: Montag, 13. Juli 2009 23:58
> > To: dev@httpd.apache.org
> > Subject: Re: mod_deflate DoS using HEAD
> > 
> > Nick Kew wrote:
> > > Eric Covener wrote:
> > > 
> > >>          /* For a 304 response, only change the headers */
> > >> -        if (r->status == HTTP_NOT_MODIFIED) {
> > >> +        if (r->status == HTTP_NOT_MODIFIED || r->header_only) {
> > > 
> > > Technically speaking, screws up the protocol.
> > > 
> > > IMHO it would be acceptable provided:
> > >   (a) it's an option for the admin, rather than enforced
> > >   (b) it's documented
> > >   (c) the headers are correct: either Content-Encoding is
> > >       unset (uncompressed response) or Content-Length is
> > >       unset.  Probably the former.
> > 
> > Agreed.  It's not a DoS.  If the admin wants to conserve CPU
> > resources, they must either;
> > 
> >  * cache the deflated pages (avoid user-agent header if there
> >    are multiples, which reminds me we need a module to unset the
> >    accept deflate trigger on non-compliant browsers running
> >    very-first in the quick_handler.)
> > 
> >  * create gzip'ed content, navigate the choice of content through
> >    multiviews.
> > 
> >  * do not do server-side deflation (it is expensive).
> > 
> 
> All very true. But how about the following patch. It should do no
> harm and should solve the issue in at least some cases (I think
> in most cases):

I'm confused.  Why do this check so late, and why does r->bytes_sent 
matter?  Why does it "screw up the protocol" if the DEFLATE filter does 
nothing for a HEAD request?  Because of the concern that a HEAD will 
return a different C-L & C-E to a GET on the same resource with the same 
Accept-Encoding header?  Does 2616 mandate that a resource must always 
exactly the same set of content-codings across methods and time?  
(AFAICT there is no MUST on that front; it's a SHOULD if anything)

Regards, Joe

Re: mod_deflate DoS using HEAD

Posted by Greg Ames <am...@gmail.com>.
On Tue, Jul 14, 2009 at 11:47 AM, "Plüm, Rüdiger, VF-Group" <
ruediger.pluem@vodafone.com> wrote:

>
> All very true. But how about the following patch. It should do no
> harm and should solve the issue in at least some cases (I think
> in most cases):
>
> Index: modules/filters/mod_deflate.c
>
> +        if (r->header_only && r->bytes_sent) {
> +            ap_remove_output_filter(f);
> +            return ap_pass_brigade(f->next, bb);
> +        }
>

+1

It doesn't break HTTP or caching, is simple, and any "extra" overhead (if
you want to call it that) is limited to one zlib buffer as you pointed out
later.

Greg

RE: mod_deflate DoS using HEAD

Posted by "Plüm, Rüdiger, VF-Group" <ru...@vodafone.com>.
 

> -----Original Message-----
> From: William A. Rowe, Jr. 
> Sent: Montag, 13. Juli 2009 23:58
> To: dev@httpd.apache.org
> Subject: Re: mod_deflate DoS using HEAD
> 
> Nick Kew wrote:
> > Eric Covener wrote:
> > 
> >>          /* For a 304 response, only change the headers */
> >> -        if (r->status == HTTP_NOT_MODIFIED) {
> >> +        if (r->status == HTTP_NOT_MODIFIED || r->header_only) {
> > 
> > Technically speaking, screws up the protocol.
> > 
> > IMHO it would be acceptable provided:
> >   (a) it's an option for the admin, rather than enforced
> >   (b) it's documented
> >   (c) the headers are correct: either Content-Encoding is
> >       unset (uncompressed response) or Content-Length is
> >       unset.  Probably the former.
> 
> Agreed.  It's not a DoS.  If the admin wants to conserve CPU
> resources, they must either;
> 
>  * cache the deflated pages (avoid user-agent header if there
>    are multiples, which reminds me we need a module to unset the
>    accept deflate trigger on non-compliant browsers running
>    very-first in the quick_handler.)
> 
>  * create gzip'ed content, navigate the choice of content through
>    multiviews.
> 
>  * do not do server-side deflation (it is expensive).
> 

All very true. But how about the following patch. It should do no
harm and should solve the issue in at least some cases (I think
in most cases):

Index: modules/filters/mod_deflate.c
===================================================================
--- modules/filters/mod_deflate.c       (revision 793927)
+++ modules/filters/mod_deflate.c       (working copy)
@@ -629,6 +629,19 @@
         apr_bucket *b;
         apr_size_t len;

+        /*
+         * Optimization: If we are a HEAD request and bytes_sent is not zero
+         * it means that we have passed the content-length filter once and
+         * have more data to sent. This means that the content-length filter
+         * could not determine our content-length for the response to the
+         * HEAD request anyway (the associated GET request would deliver the
+         * body in chunked encoding) and we can stop compressing.
+         */
+        if (r->header_only && r->bytes_sent) {
+            ap_remove_output_filter(f);
+            return ap_pass_brigade(f->next, bb);
+        }
+
         e = APR_BRIGADE_FIRST(bb);

         if (APR_BUCKET_IS_EOS(e)) {

Regards

Rüdiger


Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
William A. Rowe, Jr. wrote:
> 
>  * do not do server-side deflation (it is expensive).

Whoops - forgot one more

  * do not do content deflation, only transfer deflation, which
    should not metered by the content-length, right?

Re: mod_deflate DoS using HEAD

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Nick Kew wrote:
> Eric Covener wrote:
> 
>>          /* For a 304 response, only change the headers */
>> -        if (r->status == HTTP_NOT_MODIFIED) {
>> +        if (r->status == HTTP_NOT_MODIFIED || r->header_only) {
> 
> Technically speaking, screws up the protocol.
> 
> IMHO it would be acceptable provided:
>   (a) it's an option for the admin, rather than enforced
>   (b) it's documented
>   (c) the headers are correct: either Content-Encoding is
>       unset (uncompressed response) or Content-Length is
>       unset.  Probably the former.

Agreed.  It's not a DoS.  If the admin wants to conserve CPU
resources, they must either;

 * cache the deflated pages (avoid user-agent header if there
   are multiples, which reminds me we need a module to unset the
   accept deflate trigger on non-compliant browsers running
   very-first in the quick_handler.)

 * create gzip'ed content, navigate the choice of content through
   multiviews.

 * do not do server-side deflation (it is expensive).

These two flaw reports are truly no more DoS than most CGI pages.