You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@httpd.apache.org by Stefan Podskubka <s....@hcs.at> on 2013/08/16 17:18:11 UTC

[users@httpd] Multiple problems with mod_cache (httpd 2.4.6) on Windows

I'm sorry that this mail is long, but I wanted to include the necessary 
details.

This is my scenario:

Software:
Apache httpd 2.4.6 VC10 Windows 32 bits binary built with OpenSL 0.9.8y 
from www.apachelounge.com.

Frontend server:
lots of bandwidth
caching reverse proxy setup using httpd 2.4.6 + mod_cache + 
mod_cache_disk + mod_proxy + mod_proxy_httpd on Windows 2003 Server

Backend server:
very limited bandwidth (so it is very important for me to minimize the 
data transferred between the frontend and backend servers)

Frontend server virtual host configuration:
<VirtualHost 192.168.10.8:443>
   ServerName front.example.com

   CacheEnable disk /dir/
   CacheIgnoreHeaders Set-Cookie
   CacheLock On
   CacheLockPath C:/ApacheCache/Locks

   CacheRoot C:/ApacheCache/Cache
   CacheDirLevels 1
   CacheDirLength 1
   CacheMaxFileSize 10000000

   <Directory C:/ApacheRoot/dir>
     RewriteEngine on
     RewriteRule ^update/$ http://back.example.com/update.cfm [P,L]
     Header unset Set-Cookie
   </Directory>
</VirtualHost>

I have studied the documentation of mod_cache as well as different 
documents regarding HTTP caching but I am encountering some behaviour I 
cannot explain with this setup and hope that someone is able to help me.

Problem 1: the second and subsequent frontend requests during the 
refreshment of a stale entry don't return stale data but cause multiple 
backend requests (the "Thundering Herd" is not kept at bay although the 
cache lock is enabled)

The requests are completely unconditional and look like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com

This is the mod_cache logging for the first (tid 2628) and second (tid 
11048) request:
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] 
cache_storage.c(664): [client 192.168.2.234:49770] AH00698: cache: Key 
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] 
cache_util.c(732): [client 192.168.2.234:49770] AH00782: Cache lock 
obtained for stale cached URL, revalidating entry: /dir/update/
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] 
cache_storage.c(358): [client 192.168.2.234:49770] AH00695: Cached 
response for /dir/update/ isn't fresh. Adding conditional request headers.
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] 
mod_cache.c(192): [client 192.168.2.234:49770] AH00750: Adding 
CACHE_SAVE filter for /dir/update/
[Fri Aug 16 13:06:30.188500 2013] [cache:debug] [pid 2284:tid 2628] 
mod_cache.c(202): [client 192.168.2.234:49770] AH00751: Adding 
CACHE_REMOVE_URL filter for /dir/update/
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] 
cache_storage.c(664): [client 192.168.2.234:49771] AH00698: cache: Key 
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] 
cache_util.c(757): (OS 80)Die Datei ist vorhanden.  : [client 
192.168.2.234:49771] AH00784: Attempt to obtain a cache lock for stale 
cached URL failed, revalidating entry anyway: /dir/update/
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] 
cache_storage.c(358): [client 192.168.2.234:49771] AH00695: Cached 
response for /dir/update/ isn't fresh. Adding conditional request headers.
[Fri Aug 16 13:06:30.922875 2013] [cache:debug] [pid 2284:tid 11048] 
mod_cache.c(218): (OS 80)Die Datei ist vorhanden.  : [client 
192.168.2.234:49771] AH00752: Cache locked for url, not caching 
response: /dir/update/
[Fri Aug 16 13:06:31.438500 2013] [cache:debug] [pid 2284:tid 2628] 
mod_cache.c(1325): [client 192.168.2.234:49770] AH00769: cache: Caching 
url: /dir/update/
[Fri Aug 16 13:06:31.438500 2013] [cache:debug] [pid 2284:tid 2628] 
mod_cache.c(1331): [client 192.168.2.234:49770] AH00770: cache: Removing 
CACHE_REMOVE_URL filter.

I took a quick look at the source of mod_cache and I guess there may be 
something wrong with the locking code, probably only on Windows systems.
The first request correctly obtains a cache lock (line 2).
The second request cannot obtain a cache lock (line 7) because the lock 
file already exists (the germain Windows error message translates to 
"The file already exists.") but in that case the log message should 
actually read "Cache already locked for stale cached URL, pretend it is 
fresh: /dir/update/" and stale content should be returned. Obviously the 
call to cache_try_lock(conf, cache, r) in cache_util.c does not return 
APR_EEXIST, like is expected in this case, but something else, which is 
not correctly detected as an existing lock...

Additionally to the multiple backend requests there is also the critical 
problem that the second request receives a 304 Not Modified response 
(passed through from the backend), although it was unconditional and the 
first, lock-obtaining, request received an expected 200 OK response with 
the cached data.

Do you think my assumption is correct that this is a problem with 
mod_cache on Windows? Is the 304 response to the unconditional request 
also expected or is it maybe some kind of aftereffect of the cache lock 
problems?

Problem 2: conditional frontend requests where the If-Modified-Since 
request header field is earlier than the cached Last-Modified response 
header field always cause backend requests

I am starting with an empty cache and issue the same request twice 
(waiting until the first request finishes before executing the second 
one). The requests contain the same If-Modified-Since header field 
value, which is one second before the Last-Modified header field 
returned by the backend.

The first and second requests look like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com
If-Modified-Since: Fri, 16 Aug 2013 09:40:00 GMT

After that I issue a third request where I set the If-Modified-Since 
request header field to the exact value of the Last-Modified header 
field returned by the backend.

The third request looks like this:
GET /dir/update/ HTTP/1.1
User-Agent: xyz
Host: front.example.com
If-Modified-Since: Fri, 16 Aug 2013 09:40:01 GMT

As I understand it the first request should result in a request to the 
backend server. The returned data should be cached and returned to the 
second request without causing a second request to the backend. The 
third request should receive 304 Not Modifed from the cache, also 
without a backend request. But that is not what happens.

This is the mod_cache logging for the three requests:
[Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344] 
cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key 
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(192): [client 192.168.2.234:50220] AH00750: Adding 
CACHE_SAVE filter for /dir/update/
[Fri Aug 16 16:49:09.610375 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(202): [client 192.168.2.234:50220] AH00751: Adding 
CACHE_REMOVE_URL filter for /dir/update/
[Fri Aug 16 16:49:10.579125 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(1325): [client 192.168.2.234:50220] AH00769: cache: Caching 
url: /dir/update/
[Fri Aug 16 16:49:10.579125 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(1331): [client 192.168.2.234:50220] AH00770: cache: Removing 
CACHE_REMOVE_URL filter.
[Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344] 
cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key 
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?
[Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(192): [client 192.168.2.234:50220] AH00750: Adding 
CACHE_SAVE filter for /dir/update/
[Fri Aug 16 16:49:23.813500 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(202): [client 192.168.2.234:50220] AH00751: Adding 
CACHE_REMOVE_URL filter for /dir/update/
[Fri Aug 16 16:49:25.610375 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(1325): [client 192.168.2.234:50220] AH00769: cache: Caching 
url: /dir/update/
[Fri Aug 16 16:49:25.610375 2013] [cache:debug] [pid 1860:tid 10344] 
mod_cache.c(1331): [client 192.168.2.234:50220] AH00770: cache: Removing 
CACHE_REMOVE_URL filter.
[Fri Aug 16 16:49:35.344750 2013] [cache:debug] [pid 1860:tid 10344] 
cache_storage.c(664): [client 192.168.2.234:50220] AH00698: cache: Key 
for entity /dir/update/?(null) is https://front.example.com:443/dir/update/?

This is the backend server's access log:
192.168.10.8 - - [16/Aug/2013:16:49:10 +0200] "GET /update.cfm HTTP/1.1" 
200 359261 "-" "xyz"
192.168.10.8 - - [16/Aug/2013:16:49:25 +0200] "GET /update.cfm HTTP/1.1" 
200 359261 "-" "xyz"

As you can see the second request also issues the same backend request 
as the first request although the data was already in the cache at the 
time. The third request is handled correctly (the If-Modified-Since 
header field is matched with the cached Last-Modified header field and 
304 Not Modified is returned without a backend request). Is there a 
reason why the second request behaves like this?


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org