Posted to dev@httpd.apache.org by Nicholas Sherlock <n....@gmail.com> on 2009/07/25 08:56:08 UTC

mod_cache sends 200 code instead of 304

Hi everyone,

If you make a conditional request for a cached document, but the 
document is expired in the cache, mod_cache currently passes on the 
conditional request to the backend. If the backend responds with a "304 
Not Modified" response that indicates that the cached copy is still up 
to date, mod_cache serves the contents of the cache to the client with a 
200 code.

But couldn't it just send a 304 Not Modified code instead? At the moment 
it ends up wasting large amounts of bandwidth on my website in the case 
where you press refresh on an unmodified object in Firefox, which sends 
these request headers:

If-None-Match="My ETag"
Cache-Control=max-age=0

I do not want the behaviour given by "CacheIgnoreCacheControl yes". I 
still want mod_cache to validate the request against the backend, but I 
don't want it to waste bandwidth by sending a 200 response code.

To test it, I have these cache-related lines in my virtual host definition:

CacheRoot C:/temp
CacheEnable disk /

And this PHP file called index.php in /:

<?php

/* Generate our ETag. Assume that generating the ETag is
  * a whole lot less expensive than generating the content
  * (e.g. it could be based on revision counts for documents
  * from a database).
  */
$etag="\"ComputedETag\"";

header("Etag: $etag");
//Expires ages away
header("Expires: " . gmdate("D, d M Y H:i:s", time()
	+ 60 * 60 * 24 * 30) . " GMT");

if (isset($_SERVER['HTTP_IF_NONE_MATCH']) &&
	$_SERVER['HTTP_IF_NONE_MATCH'] == $etag) {

	/* At a user's request, the cache has been bypassed, but the
	 * document is still the same. Avoid costly response generation
	 * and waste of bandwidth by just sending not-modified.
	 */
	header('HTTP/1.0 304 Not Modified');
	
	error_log(date('r')." - Response: 304 Not Modified\n");
	exit(); //Don't generate or send the body
}	

error_log(date('r')." - Response: 200. Generated document.\n");

echo "Document body goes here";

?>

When my web browser requests the document for the first time, the 
(trimmed) response is:

Status=OK - 200
Date=Mon, 20 Jul 2009 07:16:05 GMT
Expires=Wed, 19 Aug 2009 07:16:05 GMT
Etag="ComputedETag"

The log written by index.php shows:

Mon, 20 Jul 2009 19:16:05 +1200 - Response: 200. Generated document.

So far so good. But now I press refresh in my web browser. This makes a 
conditional request for the document:

If-None-Match="ComputedETag"
Cache-Control=max-age=0

With a max-age of 0, the cache will be bypassed, which is the desired 
behaviour. The cache passes this conditional request on to the backend, 
and the backend logs it:

Mon, 20 Jul 2009 19:16:12 +1200 - Response: 304 Not Modified

So the backend is trying to tell the client that it already has an 
up-to-date body. But the response sent to the browser by the caching 
system is:

Status=OK - 200
Date=Mon, 20 Jul 2009 07:16:12 GMT
Etag="ComputedETag"
Expires=Wed, 19 Aug 2009 07:16:12 GMT

My Apache is:

Apache/2.2.11 (Win32) DAV/2 mod_ssl/2.2.11 OpenSSL/0.9.8i SVN/1.6.3 
PHP/5.3.0

I found that I correctly got the 304 response code in my situation if I 
changed these lines (mod_cache.c:741):

/* We found a stale entry which wasn't really stale. */
if (cache->stale_handle) {
    /* Load in the saved status and clear the status line. */
    r->status = info->status;
    r->status_line = NULL;

To:

/* We found a stale entry which wasn't really stale. */
if (cache->stale_handle) {
    /* Load in the saved status and clear the status line. */
    r->status = 304;
    r->status_line = NULL;

But that clearly doesn't work: even if the client sends an unconditional 
request, it can end up getting a 304 response as a reply. I don't 
understand what is happening, because this is the first time I've looked 
at the Apache codebase, let alone the mod_cache module. Can anyone who 
is more experienced with this module (and Apache in general) comment?
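
To make the problem with the hard-coded 304 concrete, here is a minimal, 
self-contained sketch of the decision being asked for. It is not 
mod_cache code: the structs and the function name 
status_after_revalidation are hypothetical stand-ins, and the ETag 
comparison is a plain string match, which is an assumption that only 
covers the single-ETag header shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins. These are NOT httpd's
 * request_rec or mod_cache's cache info structures. */
struct client_request {
    const char *if_none_match;  /* NULL when the request is unconditional */
};

struct cached_entity {
    int status;        /* status saved with the cached entity, e.g. 200 */
    const char *etag;  /* ETag saved with the cached entity */
};

/* Pick the status to send to the client after the backend has answered
 * 304 to the cache's own revalidation request. */
int status_after_revalidation(const struct client_request *req,
                              const struct cached_entity *ent)
{
    assert(req != NULL && ent != NULL);

    /* Unconditional request: the client has no copy, so it needs the
     * saved status together with the cached body. */
    if (req->if_none_match == NULL) {
        return ent->status;
    }

    /* Conditional request whose validator matches the cached entity:
     * the client's copy is current, so 304 without a body is enough. */
    if (ent->etag != NULL && strcmp(req->if_none_match, ent->etag) == 0) {
        return 304;
    }

    /* Validator does not match: send the full cached response. */
    return ent->status;
}
```

Forcing r->status to 304 corresponds to skipping the first and last 
branches, which is why an unconditional request can end up with a 304.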

Cheers,
Nicholas Sherlock


Re: mod_cache sends 200 code instead of 304

Posted by Nicholas Sherlock <n....@gmail.com>.
Graham Leggett wrote:
> Nicholas Sherlock wrote:
>> But couldn't it just send a 304 Not Modified code instead? At the moment
>> it ends up wasting large amounts of bandwidth on my website in the case
>> where you press refresh on an unmodified object in Firefox, which sends
>> these request headers:
> 
> I kept this back to investigate as I have been ENOTIME, but I've noticed
> a small detail:

Actually, this problem was traced to a bug in PHP's Apache filter. It 
sets "no_local_copy" to 1 in its response to Apache, which prevents 
mod_cache from generating its own 304 Not Modified response.
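
A rough model of that effect, as a self-contained sketch rather than 
httpd source: the struct request_model and the function apply_conditions 
are hypothetical, and only the field name no_local_copy comes from the 
description above. httpd's real check, ap_meets_conditions(), handles 
more request headers and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the few request fields that matter here. */
struct request_model {
    int status;                 /* status about to be sent */
    int no_local_copy;          /* non-zero: never answer 304 locally */
    const char *if_none_match;  /* client's validator, or NULL */
    const char *etag;           /* ETag of the response */
};

/* Return 304 when the conditional check may be answered locally,
 * otherwise return the status unchanged. */
int apply_conditions(const struct request_model *r)
{
    assert(r != NULL);

    /* Only successful (2xx) responses are candidates, and the
     * no_local_copy flag vetoes the check outright. */
    if (r->status < 200 || r->status >= 300 || r->no_local_copy) {
        return r->status;
    }

    /* Exact string match is an assumption made for this sketch. */
    if (r->if_none_match != NULL && r->etag != NULL
        && strcmp(r->if_none_match, r->etag) == 0) {
        return 304;
    }
    return r->status;
}
```

With no_local_copy set, a matching If-None-Match still yields the saved 
200, which is the symptom reported at the start of the thread.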

> Etags and If-None-Match are HTTP/1.1 caching concepts, and yet you're
> sending a response back to the cache telling the cache that you are an
> HTTP/1.0 server.
> 
> I suspect what is happening is that the cache is seeing an HTTP/1.0
> response with HTTP/1.1 headers in it, and is in turn ignoring your 304
> not modified response.
> 
> Try changing your response to 'HTTP/1.1 304 Not Modified' instead.

I think I changed it to HTTP/1.0 as a last resort after I had exhausted 
all my other options. I changed it back to HTTP/1.1 and nothing changed; 
it still gives the same behaviour.

> Another thing to check, you're using a function called "header" to set
> what is really the response status line, I'm not a php person, but that
> looks wrong to me.

header() is correct for this in PHP: a string that starts with "HTTP/" 
sets the response status line :).

> Check you aren't sending back a 200 OK without realising it (which will
> cause the cache to go "oh, the entity just got refreshed, send 200 back
> to the original client", which is in turn the symptom you are seeing).

The PHP script is definitely sending a 304, it logs it to a file to 
confirm (and I've verified that file). You can actually tell that 
mod_cache is getting the 304 response code, because mod_cache serves the 
document body from the cache along with the incorrect 200 code (the body 
of the 304 response from PHP itself is of course empty). Using that test 
code, if the branch that was supposed to set a 304 code set a 200 code 
instead, you would expect an empty document body.
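
The inference in the paragraph above can be written down as a tiny 
sketch. The enum and the function body_after_revalidation are 
hypothetical names for illustration only, assuming a cache that simply 
reuses its stored entity on a 304 and forwards the backend's body 
otherwise.

```c
#include <assert.h>

enum body_source { BODY_FROM_CACHE, BODY_FROM_BACKEND };

/* Which body a revalidating cache forwards to the client, given the
 * status the backend returned for the revalidation request. */
enum body_source body_after_revalidation(int backend_status)
{
    assert(backend_status >= 100 && backend_status <= 599);

    /* A 304 carries no body, so the stored entity is reused. Any other
     * status means the backend's own body is forwarded. */
    return backend_status == 304 ? BODY_FROM_CACHE : BODY_FROM_BACKEND;
}
```

Since the test script sends an empty body in its 304 branch, a client 
that receives the full document must have been served from the cache, 
so the backend status seen by mod_cache was 304.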

I'm currently running an unmodified Apache on my production server, with 
PHP patched so that it does not set no_local_copy=1 in its response 
constructor, and mod_cache works flawlessly: the 304 code is correctly 
sent to the client instead of the 200 code.

Cheers,
Nicholas Sherlock


Re: mod_cache sends 200 code instead of 304

Posted by Graham Leggett <mi...@sharp.fm>.
Nicholas Sherlock wrote:

> If you make a conditional request for a cached document, but the
> document is expired in the cache, mod_cache currently passes on the
> conditional request to the backend. If the backend responds with a "304
> Not Modified" response that indicates that the cached copy is still up
> to date, mod_cache serves the contents of the cache to the client with a
> 200 code.
> 
> But couldn't it just send a 304 Not Modified code instead? At the moment
> it ends up wasting large amounts of bandwidth on my website in the case
> where you press refresh on an unmodified object in Firefox, which sends
> these request headers:

I kept this back to investigate as I have been ENOTIME, but I've noticed
a small detail:

> if (isset($_SERVER['HTTP_IF_NONE_MATCH']) &&
>     $_SERVER['HTTP_IF_NONE_MATCH'] == $etag) {
> 
>     /* At a user's request, the cache has been bypassed, but the
>      * document is still the same. Avoid costly response generation
>      * and waste of bandwidth by just sending not-modified.
>      */
>     header('HTTP/1.0 304 Not Modified');
              ^^^^^^^^
>     
>     error_log(date('r')." - Response: 304 Not Modified\n");
>     exit(); //Don't generate or send the body
> }   

Etags and If-None-Match are HTTP/1.1 caching concepts, and yet you're
sending a response back to the cache telling the cache that you are an
HTTP/1.0 server.

I suspect what is happening is that the cache is seeing an HTTP/1.0
response with HTTP/1.1 headers in it, and is in turn ignoring your 304
not modified response.

Try changing your response to 'HTTP/1.1 304 Not Modified' instead.

Another thing to check: you're using a function called "header" to set
what is really the response status line. I'm not a PHP person, but that
looks wrong to me.

Check you aren't sending back a 200 OK without realising it (which will
cause the cache to go "oh, the entity just got refreshed, send 200 back
to the original client", which is in turn the symptom you are seeing).

Regards,
Graham
--

Re: mod_cache sends 200 code instead of 304

Posted by Nicholas Sherlock <n....@gmail.com>.
Nicholas Sherlock wrote:
> Thanks, I wasn't certain if the behaviour I wanted was HTTP-correct, but 
> it seems that it is (and anyway it'll save me on bandwidth costs, so I 
> really want to fix it). I'll go add it now.

This is now bug report #47580

https://issues.apache.org/bugzilla/show_bug.cgi?id=47580

Cheers,
Nicholas Sherlock


Re: mod_cache sends 200 code instead of 304

Posted by Nicholas Sherlock <n....@gmail.com>.
Dan Poirier wrote:
> Nicholas Sherlock <n....@gmail.com> writes:
> 
>> If you make a conditional request for a cached document, but the
>> document is expired in the cache, mod_cache currently passes on the
>> conditional request to the backend. If the backend responds with a
>> "304 Not Modified" response that indicates that the cached copy is
>> still up to date, mod_cache serves the contents of the cache to the
>> client with a 200 code.
> 
> This wouldn't surprise me.  There's currently a bug open for the
> opposite case, returning a 304 to an unconditional request (45341).
> 
> I believe this violates a SHOULD in 14.25 of RFC 2616, which isn't as
> strong as a MUST, but certainly would indicate it's worthwhile to try to
> fix it.
> 
> I'd suggest opening a bug report
> (http://httpd.apache.org/bug_report.html), including all the details
> from your original message, so this doesn't fall through the cracks
> before someone gets to look at it in more depth.

Thanks, I wasn't certain if the behaviour I wanted was HTTP-correct, but 
it seems that it is (and anyway it'll save me on bandwidth costs, so I 
really want to fix it). I'll go add it now.

Cheers,
Nicholas Sherlock


Re: mod_cache sends 200 code instead of 304

Posted by Dan Poirier <po...@pobox.com>.
Nicholas Sherlock <n....@gmail.com> writes:

> If you make a conditional request for a cached document, but the
> document is expired in the cache, mod_cache currently passes on the
> conditional request to the backend. If the backend responds with a
> "304 Not Modified" response that indicates that the cached copy is
> still up to date, mod_cache serves the contents of the cache to the
> client with a 200 code.

This wouldn't surprise me.  There's currently a bug open for the
opposite case, returning a 304 to an unconditional request (45341).

I believe this violates a SHOULD in 14.25 of RFC 2616, which isn't as
strong as a MUST, but certainly would indicate it's worthwhile to try to
fix it.

I'd suggest opening a bug report
(http://httpd.apache.org/bug_report.html), including all the details
from your original message, so this doesn't fall through the cracks
before someone gets to look at it in more depth.

Dan