Posted to dev@httpd.apache.org by Paul Querna <ch...@force-elite.com> on 2004/10/26 04:02:02 UTC

mod_cache: Content Generation Dependencies?

I have been doing some stuff with mod_transform (XSLT processor) and 
mod_cache.

The problem is, mod_cache doesn't have any easy way to know if a request 
needs to be regenerated.  Right now, it just blindly caches until a 
timeout.  What I would prefer is that it knows what files or URLs a 
specific request depends upon, and if any of those change, then 
regenerate the request.

An example:

cache_add_depends(r, "/home/httpd/site/xsl/foo.xsl");

This would add 'foo.xsl' as a dependency of the current request.  If the 
file's mtime changes, mod_cache would invalidate the cache of the 
current request.

Any opinions or suggestions?

A stat() call on several files is hundreds of times faster than having 
mod_transform re-generate the output.  While I would hate to stat() 
hundreds of files on every request, this method could eliminate all 
unnecessary regeneration of cached content.
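
As a rough illustration of the idea above, here is a minimal sketch of mtime-based dependency tracking in plain C. It is not mod_cache code: dep_list, dep_add() and deps_fresh() are hypothetical names invented for this example, and it uses POSIX stat() directly rather than anything from httpd or APR.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

#define MAX_DEPS 16

/* Hypothetical record of the files one cached response depends on. */
typedef struct {
    const char *path[MAX_DEPS];
    time_t mtime[MAX_DEPS];   /* mtime seen when the response was cached */
    size_t count;
} dep_list;

/* Record a dependency together with its current mtime.
 * Returns 0 on success, -1 if the list is full or stat() fails. */
static int dep_add(dep_list *deps, const char *path)
{
    struct stat sb;

    if (deps->count >= MAX_DEPS || stat(path, &sb) != 0)
        return -1;
    deps->path[deps->count] = path;
    deps->mtime[deps->count] = sb.st_mtime;
    deps->count++;
    return 0;
}

/* Returns 1 if every dependency still has its recorded mtime,
 * 0 if any file changed or can no longer be stat()ed. */
static int deps_fresh(const dep_list *deps)
{
    size_t i;

    for (i = 0; i < deps->count; i++) {
        struct stat sb;

        if (stat(deps->path[i], &sb) != 0 || sb.st_mtime != deps->mtime[i])
            return 0;
    }
    return 1;
}
```

With something like this, the cache would call deps_fresh() before serving a stored response and regenerate only when it returns 0, at a cost of one stat() per dependency per request.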

-Paul Querna

Re: mod_cache: Content Generation Dependencies?

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Tuesday, October 26, 2004 4:32 AM +0200 Graham Leggett <mi...@sharp.fm> 
wrote:

> If mod_transform isn't supporting Etag properly, then I'd say mod_transform
> was broken, and fixing it would probably solve your problem.

+1.  If the content changes, so should the ETag.  mod_transform could also set 
some Cache-Control headers.

In short, think about what an intermediary caching HTTP proxy would do with 
that request.  It'd check the expiration header ('freshness' tests), and, if 
that fails, then the external cache would try to send an If-Modified-Since (or 
some variant) request to the upstream server: if httpd responds it hasn't 
changed, then it'd serve from the cache until the timeout.  So it'd act 
exactly the same as mod_cache.  Hence, adding 'private' hooks for mod_cache would 
still allow stale responses from external HTTP caches.  -- justin
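
The two steps described above (a freshness test, then a conditional request) can be sketched roughly as follows. This is a simplified illustration in plain C, not the logic of mod_cache or of any particular proxy: cache_entry, cache_decide() and origin_status() are hypothetical names, and the freshness test ignores details such as the Age header.

```c
#include <time.h>

typedef enum { SERVE_FROM_CACHE, REVALIDATE } cache_action;

/* Hypothetical stored response metadata. */
typedef struct {
    time_t stored_at;      /* when the response was cached */
    long max_age;          /* freshness lifetime in seconds */
    time_t last_modified;  /* Last-Modified value stored with the response */
} cache_entry;

/* Step 1: the freshness test. While the entry is fresh, the cache
 * answers on its own and the origin is never asked. */
static cache_action cache_decide(const cache_entry *e, time_t now)
{
    return (now - e->stored_at < e->max_age) ? SERVE_FROM_CACHE : REVALIDATE;
}

/* Step 2: what the origin answers to If-Modified-Since.
 * 304 means "not changed, keep using your copy"; 200 means a new body. */
static int origin_status(time_t if_modified_since, time_t last_modified)
{
    return (last_modified <= if_modified_since) ? 304 : 200;
}
```

Note that in step 1 the origin is not consulted at all, which is why a hook that only mod_cache knows about cannot stop an external cache from serving its own copy while that copy is still fresh.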

Re: mod_cache: Content Generation Dependencies?

Posted by Graham Leggett <mi...@sharp.fm>.
Paul Querna wrote:

> I have been doing some stuff with mod_transform (XSLT processor) and 
> mod_cache.
> 
> The problem is, mod_cache doesn't have any easy way to know if a request 
> needs to be regenerated.  Right now, it just blindly caches until a 
> timeout.  What I would prefer is that it knows what files or URLs a 
> specific request depends upon, and if any of those change, then 
> regenerate the request.
> 
> An example:
> 
> cache_add_depends(r, "/home/httpd/site/xsl/foo.xsl");
> 
> This would add 'foo.xsl' as a dependency of the current request.  If the 
> file's mtime changes, mod_cache would invalidate the cache of the 
> current request.
> 
> Any opinions or suggestions?
> 
> A stat() call on several files is hundreds of times faster than having 
> mod_transform re-generate the output.  While I would hate to stat() 
> hundreds of files on every request, this method could eliminate all 
> unnecessary regeneration of cached content.

The cache has no knowledge about underlying files, never mind multiple 
dependencies. It relies on HTTP/1.1 to work out cache freshness.

Dependencies should be tracked by mod_transform, not mod_cache - if 
either the source file or the XSL file changes, then the Etag should 
change, which will signal mod_cache (and any other caching proxies along 
the way) that the content is no longer fresh.
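
One way to get that behaviour is to derive the validator from both inputs. The sketch below is an assumption about how it could be done, not what mod_transform actually does: make_etag() is a hypothetical helper, and it uses POSIX stat() and snprintf() rather than the APR equivalents a real module would likely use.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Build a quoted entity tag from the mtime and size of both the source
 * document and the stylesheet, so a change to either yields a new tag.
 * Returns 0 on success, -1 if a stat() fails or buf is too small. */
static int make_etag(const char *src_path, const char *xsl_path,
                     char *buf, size_t buflen)
{
    struct stat src, xsl;
    int n;

    if (stat(src_path, &src) != 0 || stat(xsl_path, &xsl) != 0)
        return -1;

    n = snprintf(buf, buflen, "\"%llx-%llx-%llx-%llx\"",
                 (unsigned long long)src.st_mtime,
                 (unsigned long long)src.st_size,
                 (unsigned long long)xsl.st_mtime,
                 (unsigned long long)xsl.st_size);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The resulting string would be sent as the Etag response header. A stylesheet that pulls in other files would need each of those folded into the tag as well.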

If mod_transform isn't supporting Etag properly, then I'd say 
mod_transform was broken, and fixing it would probably solve your problem.

Regards,
Graham
--