Posted to dev@httpd.apache.org by Niklas Edmundsson <ni...@acc.umu.se> on 2007/01/17 11:16:35 UTC

mod_disk_cache jumbopatch - new revision

I uploaded a new version of our mod_disk_cache jumbopatch for httpd 
2.2.4 to http://issues.apache.org/bugzilla/show_bug.cgi?id=39380

It's what we've been using for a couple of months now (modulo the upgrade 
to httpd 2.2.4) and should be considered fairly stable. It has 
survived all sorts of pathetic load cases on http://ftp.acc.umu.se/ 
(also known as ftp.se.debian.org, ftp.gnome.org, 
se.releases.ubuntu.com, se.archive.ubuntu.com, releases.mozilla.org), 
including our NFS backend going berserk and bottoming out at a few MB/s 
when all frontends wanted to cache 300GB of new 
debian-weekly-build-isos.

Highlights since the previous patch:
* Reverted to separate files for header and data. There were too many
   corner cases, and having the data file separate allows us to reuse
   the cached data for other purposes (for example rsync).
* Fixed on-disk headers to be stored in an easily machine-parseable
   format, which allows for error checking, instead of a human-readable
   form that doesn't.
* Attaching the background thread to the connection pool instead of the
   request pool allows restarts to work; the thing doesn't crash when
   you do apachectl graceful anymore :)
* Lots of fixes for error handling and corner cases; we triggered most
   of them when our backend went berserk-go-slow-mode.
* Deletes cached files when the cache decides that the object is stale
   for real. Previously it only NULLed the data structure in memory,
   causing other requests to read headers etc.
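
To illustrate the point about machine-parseable headers, here is a minimal sketch of a fixed-layout header record that can be validated when read back. The layout, the magic value and the field names are all made up for this example; it is not the format the patch actually writes.

```python
import struct

# Hypothetical layout, NOT the patch's real on-disk format:
# magic (4 bytes), format version (uint32), expiry time (uint64),
# body size (uint64), URL length (uint32), followed by the URL bytes.
MAGIC = b"DCH1"
VERSION = 1
FIXED = struct.Struct("<4sIQQI")


def pack_header(url: str, expires: int, body_size: int) -> bytes:
    """Serialize the header fields into one binary record."""
    raw_url = url.encode("utf-8")
    return FIXED.pack(MAGIC, VERSION, expires, body_size, len(raw_url)) + raw_url


def unpack_header(blob: bytes) -> dict:
    """Parse a record, rejecting anything truncated or unrecognized."""
    if len(blob) < FIXED.size:
        raise ValueError("header file truncated")
    magic, version, expires, body_size, url_len = FIXED.unpack_from(blob)
    if magic != MAGIC or version != VERSION:
        raise ValueError("unknown header format")
    raw_url = blob[FIXED.size:]
    if len(raw_url) != url_len:
        raise ValueError("URL length mismatch")
    return {"url": raw_url.decode("utf-8"),
            "expires": expires,
            "body_size": body_size}
```

A reader that hits a half-written or corrupt header file gets a ValueError it can treat as a cache miss, which is hard to do reliably with a free-form text dump.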

Not mentioned in bugzilla, but probably also relevant:
* Cache file paths for headers are hashed on the URL, and for bodies
   on r->filename if present. This allows the same cache to be used
   with external programs (for example rsync).
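
A rough sketch of that two-key scheme, to show why it lets several URLs and an external program share one body file. The hash function (SHA-256 here), the directory layout, the file suffixes, the example paths and the fallback to the URL when no filename exists are all assumptions for illustration, not what the patch does internally.

```python
import hashlib
from typing import Optional, Tuple


def _hashed_path(cache_root: str, key: str, suffix: str) -> str:
    """Map a key to a file path under cache_root (assumed layout)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # Two directory levels so no single directory grows too large.
    return f"{cache_root}/{digest[:2]}/{digest[2:4]}/{digest[4:]}{suffix}"


def cache_paths(cache_root: str, url: str,
                filename: Optional[str] = None) -> Tuple[str, str]:
    """Return (header_path, body_path) for one cached object.

    Headers are keyed on the URL. The body is keyed on the backing
    filename when there is one; falling back to the URL otherwise is
    an assumption of this sketch.
    """
    header = _hashed_path(cache_root, url, ".header")
    body = _hashed_path(cache_root, filename if filename else url, ".body")
    return header, body


# Two hostnames serving the same file get separate headers but one body.
h1, b1 = cache_paths("/cache", "http://ftp.acc.umu.se/debian/x.iso",
                     "/srv/ftp/debian/x.iso")
h2, b2 = cache_paths("/cache", "http://ftp.se.debian.org/debian/x.iso",
                     "/srv/ftp/debian/x.iso")
```

Since the body path depends only on the filename, a program that never sees the URL can still compute it and find the cached data.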

For those interested in using the same cache for rsync, we have 
whipped up an open-wrapper (uses LD_PRELOAD) which seems to be doing 
the job nicely. It can't cache as much metadata as mod_disk_cache, 
but it is able to reuse the cached bodies at least, which is a good 
thing if you have a lot of client sites that rsync the same trees 
daily.

We're awaiting some progress on mod_ftp to be able to cache ftp too; 
all usable ftpds we have seen use chroot(), which causes trouble when 
trying to wrap open() and friends to access files outside the chroot 
;)


/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  "What?  Hey.  Beverly." - Picard
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=