You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Justin Erenkrantz <ju...@erenkrantz.com> on 2004/08/01 10:54:06 UTC

[PATCH] mod_cache fixes

While at OSCON last week, Madhu and I had a chat about mod_cache - and how
Zeus is beating httpd 2.x all over the place because they cache and we still
don't by default and about how poor a job mod_cache is doing in the real
world.  Madhu has a bunch of gprof numbers that he said he'll be posting soon.
Anyway, during the course of our conversation, I got to looking at the actual
code and took a pass over mod_cache, mod_disk_cache, and mod_mem_cache to get
it to:

1) Allow for caching of pages in an internal redirect cycle
   [Added quick_handler to ap_internal_redirect]
   (MultiView pages wouldn't be cached at all: the cache is keyed off
    the final internal URI - so the redirects still need to happen, but we
    don't need to read the final file or walk its handlers...)
2) Shrunk code in mod_cache's quick_handler by eliminating dead branches
3) Allow mod_disk_cache to use sendfile
4) Always use atomics in mod_mem_cache [mandatory presence with APR 1.0!]
5) Try to reduce mod_cache's log verbosity [it was a significant bottleneck]

That's an overview of the functional changes.  The other aspect of this patch
is that mod_cache is in almost complete disarray - so there's a load of style
nits to get it readable and maintainable.  I also renamed the 'read_body' and
'write_body' to 'load_body' and 'save_body' respectively.  read and write in
the context of a network cache is *really* confusing.  I spent my first 30
minutes assuming that write_body was used for writing to the network - not to
the cache storage.  'save' and 'load' seem far better terminology in this
context.  (Don't even get me started on CACHE_IN not being an input filter!
That was another 45 minutes wasted - CACHE_SAVE is better terminology, IMHO.)

On a cursory look, a more efficient memory management scheme for mod_mem_cache
may need to be implemented (pools?) - but I need to get a better handle on
what the memory strategy is there right now.  mod_disk_cache could probably
have a more efficient on-disk format (and somehow skip stat calls).  Madhu
also pointed out that mod_disk_cache should try to leave the file handles open
if possible instead of closing them per request.  So, that probably needs some
thinking through.

I've only tested it locally with a variety of browsers (Safari, Firefox, wget,
curl) - seems to work.  Comments welcome.  I'd like to commit these fixes to
httpd 2.1 and work towards getting the caching code to comply to the RFC and
increasing our performance significantly (with the goal of getting cache out
of experimental for 2.2).  IMHO, it's one of the major components of RFC 2616
we've yet to solidly nail.  -- justin

P.S.  Here's hoping my mailer doesn't munge the patch.

Index: modules/http/http_request.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/http/http_request.c,v
retrieving revision 1.164
diff -u -r1.164 http_request.c
--- modules/http/http_request.c	9 Feb 2004 20:29:20 -0000	1.164
+++ modules/http/http_request.c	1 Aug 2004 08:24:53 -0000
@@ -455,16 +455,19 @@
         return;
     }

-    access_status = ap_process_request_internal(new);
-    if (access_status == OK) {
-        if ((access_status = ap_invoke_handler(new)) != 0) {
+    access_status = ap_run_quick_handler(new, 0);  /* Not a look-up request */
+    if (access_status == DECLINED) {
+        access_status = ap_process_request_internal(new);
+        if (access_status == OK) {
+            if ((access_status = ap_invoke_handler(new)) != 0) {
+                ap_die(access_status, new);
+                return;
+            }
+            ap_finalize_request_protocol(new);
+        }
+        else {
             ap_die(access_status, new);
-            return;
         }
-        ap_finalize_request_protocol(new);
-    }
-    else {
-        ap_die(access_status, new);
     }
 }

Index: modules/experimental/mod_cache.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/experimental/mod_cache.c,v
retrieving revision 1.83
diff -u -r1.83 mod_cache.c
--- modules/experimental/mod_cache.c	25 May 2004 18:01:02 -0000	1.83
+++ modules/experimental/mod_cache.c	1 Aug 2004 08:24:52 -0000
@@ -26,7 +26,7 @@
 /* Handles for cache filters, resolved at startup to eliminate
  * a name-to-function mapping on each request
  */
-static ap_filter_rec_t *cache_in_filter_handle;
+static ap_filter_rec_t *cache_save_filter_handle;
 static ap_filter_rec_t *cache_out_filter_handle;
 static ap_filter_rec_t *cache_conditional_filter_handle;

@@ -40,7 +40,7 @@
  * If no:
  *   check whether we're allowed to try cache it
  *   If yes:
- *     add CACHE_IN filter
+ *     add CACHE_SAVE filter
  *   If No:
  *     oh well.
  */
@@ -48,24 +48,30 @@
 static int cache_url_handler(request_rec *r, int lookup)
 {
     apr_status_t rv;
-    const char *cc_in, *pragma, *auth;
-    apr_uri_t uri = r->parsed_uri;
-    char *url = r->unparsed_uri;
+    const char *pragma, *auth;
+    apr_uri_t uri;
+    char *url;
     apr_size_t urllen;
-    char *path = uri.path;
+    char *path;
     const char *types;
-    cache_info *info = NULL;
+    cache_info *info;
     cache_request_rec *cache;
     cache_server_conf *conf;
+    apr_bucket_brigade *out;

-    conf = (cache_server_conf *) 
ap_get_module_config(r->server->module_config,
-                                                      &cache_module);
-
-    /* we don't handle anything but GET */
+    /* Delay initialization until we know we are handling a GET */
     if (r->method_number != M_GET) {
         return DECLINED;
     }

+    uri = r->parsed_uri;
+    url = r->unparsed_uri;
+    path = uri.path;
+    info = NULL;
+
+    conf = (cache_server_conf *) 
ap_get_module_config(r->server->module_config,
+                                                      &cache_module);
+
     /*
      * Which cache module (if any) should handle this request?
      */
@@ -73,19 +79,8 @@
         return DECLINED;
     }

-    urllen = strlen(url);
-    if (urllen > MAX_URL_LENGTH) {
-        ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                     "cache: URL exceeds length threshold: %s", url);
-        return DECLINED;
-    }
-    /* DECLINE urls ending in / ??? EGP: why? */
-    if (url[urllen-1] == '/') {
-        return DECLINED;
-    }
-
     /* make space for the per request config */
-    cache = (cache_request_rec *) ap_get_module_config(r->request_config,
+    cache = (cache_request_rec *) ap_get_module_config(r->request_config,
                                                        &cache_module);
     if (!cache) {
         cache = apr_pcalloc(r->pool, sizeof(cache_request_rec));
@@ -100,7 +95,6 @@
      */

     /* find certain cache controlling headers */
-    cc_in = apr_table_get(r->headers_in, "Cache-Control");
     pragma = apr_table_get(r->headers_in, "Pragma");
     auth = apr_table_get(r->headers_in, "Authorization");

@@ -123,13 +117,14 @@
                      "%s, but we know better and are ignoring it", url);
     }
     else {
-        if (ap_cache_liststr(NULL, cc_in, "no-store", NULL) ||
-            ap_cache_liststr(NULL, pragma, "no-cache", NULL) || (auth != 
NULL)) {
+        if (ap_cache_liststr(NULL, pragma, "no-cache", NULL) ||
+            auth != NULL) {
             /* delete the previously cached file */
             cache_remove_url(r, cache->types, url);

             ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                         "cache: no-store forbids caching of %s", url);
+                         "cache: no-cache or authorization forbids caching "
+                         "of %s", url);
             return DECLINED;
         }
     }
@@ -137,171 +132,106 @@
     /*
      * Try to serve this request from the cache.
      *
-     * If no existing cache file
-     *   add cache_in filter
-     * If stale cache file
-     *   If conditional request
-     *     add cache_in filter
-     *   If non-conditional request
-     *     fudge response into a conditional
-     *     add cache_conditional filter
-     * If fresh cache file
-     *   clear filter stack
-     *   add cache_out filter
+     * If no existing cache file (DECLINED)
+     *   add cache_save filter
+     * If cached file (OK)
+     *   If fresh cache file
+     *     clear filter stack
+     *     add cache_out filter
+     *     return OK
+     *   If stale cache file
+     *       add cache_conditional filter (which updates cache)
      */

     rv = cache_select_url(r, cache->types, url);
-    if (DECLINED == rv) {
-        if (!lookup) {
-            /* no existing cache file */
-            ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                         "cache: no cache - add cache_in filter and DECLINE");
-            /* add cache_in filter to cache this request */
-            ap_add_output_filter_handle(cache_in_filter_handle, NULL, r,
-                                        r->connection);
+    if (rv != OK) {
+        if (rv == DECLINED) {
+            if (!lookup) {
+                /* add cache_save filter to cache this request */
+                ap_add_output_filter_handle(cache_save_filter_handle, NULL, r,
+                                            r->connection);
+            }
+        }
+        else {
+            /* error */
+            ap_log_error(APLOG_MARK, APLOG_ERR, rv, r->server,
+                         "cache: error returned while checking for cached "
+                         "file by %s cache", cache->type);
         }
         return DECLINED;
     }
-    else if (OK == rv) {
-        /* RFC2616 13.2 - Check cache object expiration */
-        cache->fresh = ap_cache_check_freshness(cache, r);
-        if (cache->fresh) {
-            /* fresh data available */
-            apr_bucket_brigade *out;
-            conn_rec *c = r->connection;

-            if (lookup) {
-                return OK;
-            }
+    /* We have located a suitable cache file now. */

+    /* RFC2616 13.2 - Check cache object expiration */
+    cache->fresh = ap_cache_check_freshness(cache, r);
+
+    /* What we have in our cache isn't fresh. */
+    if (!cache->fresh) {
+        /* If our stale cached response was conditional... */
+        if (!lookup && ap_cache_request_is_conditional(r)) {
             info = &(cache->handle->cache_obj->info);

-            if (info && info->lastmod) {
-                ap_update_mtime(r, info->lastmod);
+            /* fudge response into a conditional */
+            if (info && info->etag) {
+                /* if we have a cached etag */
+                apr_table_set(r->headers_in, "If-None-Match", info->etag);
             }
-
-            rv = ap_meets_conditions(r);
-            if (rv != OK) {
-                ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                             "cache: fresh cache - returning status %d", rv);
-                return rv;
+            else if (info && info->lastmods) {
+                /* if we have a cached IMS */
+                apr_table_set(r->headers_in, "If-Modified-Since",
+                              info->lastmods);
             }
+        }

-            /*
-             * Not a conditionl request. Serve up the content
-             */
-            ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                         "cache: fresh cache - add cache_out filter and "
-                         "handle request");
+        /* Add cache_conditional_filter to see if we can salvage
+         * later.
+         */
+        ap_add_output_filter_handle(cache_conditional_filter_handle,
+                                    NULL, r, r->connection);
+        return DECLINED;
+    }

-            /* We are in the quick handler hook, which means that no output
-             * filters have been set. So lets run the insert_filter hook.
-             */
-            ap_run_insert_filter(r);
-            ap_add_output_filter_handle(cache_out_filter_handle, NULL,
-                                        r, r->connection);
-
-            /* kick off the filter stack */
-            out = apr_brigade_create(r->pool, c->bucket_alloc);
-            if (APR_SUCCESS
-                != (rv = ap_pass_brigade(r->output_filters, out))) {
-                ap_log_error(APLOG_MARK, APLOG_ERR, rv, r->server,
-                             "cache: error returned while trying to return %s 
"
-                             "cached data",
-                             cache->type);
-                return rv;
-            }
-            return OK;
-        }
-        else {
-            if (!r->err_headers_out) {
-                r->err_headers_out = apr_table_make(r->pool, 3);
-            }
-            /* stale data available */
-            if (lookup) {
-                return DECLINED;
-            }
+    /* fresh data available */

-            ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                         "cache: stale cache - test conditional");
-            /* if conditional request */
-            if (ap_cache_request_is_conditional(r)) {
-                ap_log_error(APLOG_MARK, APLOG_DEBUG, 0,
-                             r->server,
-                             "cache: conditional - add cache_in filter and "
-                             "DECLINE");
-                /* Why not add CACHE_CONDITIONAL? */
-                ap_add_output_filter_handle(cache_in_filter_handle, NULL,
-                                            r, r->connection);
+    info = &(cache->handle->cache_obj->info);

-                return DECLINED;
-            }
-            /* else if non-conditional request */
-            else {
-                /* Temporarily hack this to work the way it had been. Its 
broken,
-                 * but its broken the way it was before. I'm working on 
figuring
-                 * out why the filter add in the conditional filter doesn't 
work. pjr
-                 *
-                 * info = &(cache->handle->cache_obj->info);
-                 *
-                 * Uncomment the above when the code in 
cache_conditional_filter_handle
-                 * is properly fixed...  pjr
-                 */
-
-                /* fudge response into a conditional */
-                if (info && info->etag) {
-                    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0,
-                                 r->server,
-                                 "cache: nonconditional - fudge conditional "
-                                 "by etag");
-                    /* if we have a cached etag */
-                    apr_table_set(r->headers_in, "If-None-Match", info->etag);
-                }
-                else if (info && info->lastmods) {
-                    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0,
-                                 r->server,
-                                 "cache: nonconditional - fudge conditional "
-                                 "by lastmod");
-                    /* if we have a cached IMS */
-                    apr_table_set(r->headers_in,
-                                  "If-Modified-Since",
-                                  info->lastmods);
-                }
-                else {
-                    /* something else - pretend there was no cache */
-                    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0,
-                                 r->server,
-                                 "cache: nonconditional - no cached "
-                                 "etag/lastmods - add cache_in and DECLINE");
-
-                    ap_add_output_filter_handle(cache_in_filter_handle, NULL,
-                                                r, r->connection);
-
-                    return DECLINED;
-                }
-                /* add cache_conditional filter */
-                ap_log_error(APLOG_MARK, APLOG_DEBUG, 0,
-                             r->server,
-                             "cache: nonconditional - add cache_conditional "
-                             "and DECLINE");
-                ap_add_output_filter_handle(cache_conditional_filter_handle,
-                                            NULL,
-                                            r,
-                                            r->connection);
+    if (info && info->lastmod) {
+        ap_update_mtime(r, info->lastmod);
+    }

-                return DECLINED;
-            }
-        }
+    rv = ap_meets_conditions(r);
+    if (rv != OK) {
+        /* Return cached status. */
+        return rv;
     }
-    else {
-        /* error */
-        ap_log_error(APLOG_MARK, APLOG_ERR, rv,
-                     r->server,
-                     "cache: error returned while checking for cached file by 
"
-                     "%s cache",
+
+    /* If we're a lookup, we can exit now instead of serving the content. */
+    if (lookup) {
+        return OK;
+    }
+
+    /* Serve up the content */
+
+    /* We are in the quick handler hook, which means that no output
+     * filters have been set. So lets run the insert_filter hook.
+     */
+    ap_run_insert_filter(r);
+    ap_add_output_filter_handle(cache_out_filter_handle, NULL,
+                                r, r->connection);
+
+    /* kick off the filter stack */
+    out = apr_brigade_create(r->pool, r->connection->bucket_alloc);
+    rv = ap_pass_brigade(r->output_filters, out);
+    if (rv != APR_SUCCESS) {
+        ap_log_error(APLOG_MARK, APLOG_ERR, rv, r->server,
+                     "cache: error returned while trying to return %s "
+                     "cached data",
                      cache->type);
-        return DECLINED;
+        return rv;
     }
+
+    return OK;
 }

 /*
@@ -327,17 +257,12 @@
         return ap_pass_brigade(f->next, bb);
     }

-    ap_log_error(APLOG_MARK, APLOG_DEBUG, APR_SUCCESS, r->server,
-                 "cache: running CACHE_OUT filter");
-
     /* cache_read_entity_headers() was called in cache_select_url() */
     cache_read_entity_body(cache->handle, r->pool, bb);

     /* This filter is done once it has served up its content */
     ap_remove_output_filter(f);

-    ap_log_error(APLOG_MARK, APLOG_DEBUG, APR_SUCCESS, r->server,
-                 "cache: serving %s", r->uri);
     return ap_pass_brigade(f->next, bb);
 }

@@ -349,9 +274,9 @@
  * Decide whether or not cached content should be delivered
  * based on our fudged conditional request.
  * If response HTTP_NOT_MODIFIED
- *   replace ourselves with cache_out filter
+ *   don't do anything (no body sent anyway)
  * Otherwise
- *   replace ourselves with cache_in filter
+ *   replace ourselves with cache_save filter to update our cache
  */

 static int cache_conditional_filter(ap_filter_t *f, apr_bucket_brigade *in)
@@ -359,14 +284,9 @@
     ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, f->r->server,
                  "cache: running CACHE_CONDITIONAL filter");

-    if (f->r->status == HTTP_NOT_MODIFIED) {
-        /* replace ourselves with CACHE_OUT filter */
-        ap_add_output_filter_handle(cache_out_filter_handle, NULL,
-                                    f->r, f->r->connection);
-    }
-    else {
-        /* replace ourselves with CACHE_IN filter */
-        ap_add_output_filter_handle(cache_in_filter_handle, NULL,
+    if (f->r->status != HTTP_NOT_MODIFIED) {
+        /* replace ourselves with CACHE_SAVE filter */
+        ap_add_output_filter_handle(cache_save_filter_handle, NULL,
                                     f->r, f->r->connection);
     }
     ap_remove_output_filter(f);
@@ -376,7 +296,7 @@


 /*
- * CACHE_IN filter
+ * CACHE_SAVE filter
  * ---------------
  *
  * Decide whether or not this content should be cached.
@@ -388,7 +308,7 @@
  *
  */

-static int cache_in_filter(ap_filter_t *f, apr_bucket_brigade *in)
+static int cache_save_filter(ap_filter_t *f, apr_bucket_brigade *in)
 {
     int rv;
     int date_in_errhdr = 0;
@@ -396,7 +316,7 @@
     cache_request_rec *cache;
     cache_server_conf *conf;
     char *url = r->unparsed_uri;
-    const char *cc_out, *cl;
+    const char *cc_in, *cc_out, *cl;
     const char *exps, *lastmods, *dates, *etag;
     apr_time_t exp, date, lastmod, now;
     apr_off_t size;
@@ -405,17 +325,19 @@
     apr_pool_t *p;

     /* check first whether running this filter has any point or not */
-    if(r->no_cache) {
+    /* If the user has Cache-Control: no-store from RFC 2616, don't store! */
+    cc_in = apr_table_get(r->headers_in, "Cache-Control");
+    if (r->no_cache || ap_cache_liststr(NULL, cc_in, "no-store", NULL)) {
         ap_remove_output_filter(f);
         return ap_pass_brigade(f->next, in);
     }

-    ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                 "cache: running CACHE_IN filter");
-
     /* Setup cache_request_rec */
-    cache = (cache_request_rec *) ap_get_module_config(r->request_config, 
&cache_module);
+    cache = (cache_request_rec *) ap_get_module_config(r->request_config,
+                                                       &cache_module);
     if (!cache) {
+        /* user likely configured CACHE_SAVE manually; they should really use
+         * mod_cache configuration to do that */
         cache = apr_pcalloc(r->pool, sizeof(cache_request_rec));
         ap_set_module_config(r->request_config, &cache_module, cache);
     }
@@ -1028,9 +950,9 @@
      * Make them AP_FTYPE_CONTENT for now.
      * XXX ianhH:they should run AFTER all the other content filters.
      */
-    cache_in_filter_handle =
-        ap_register_output_filter("CACHE_IN",
-                                  cache_in_filter,
+    cache_save_filter_handle =
+        ap_register_output_filter("CACHE_SAVE",
+                                  cache_save_filter,
                                   NULL,
                                   AP_FTYPE_CONTENT_SET-1);
     /* CACHE_OUT must go into the filter chain before SUBREQ_CORE to
Index: modules/experimental/mod_cache.h
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/experimental/mod_cache.h,v
retrieving revision 1.44
diff -u -r1.44 mod_cache.h
--- modules/experimental/mod_cache.h	9 Feb 2004 20:29:18 -0000	1.44
+++ modules/experimental/mod_cache.h	1 Aug 2004 08:24:52 -0000
@@ -64,11 +64,7 @@
 #include <arpa/inet.h>
 #endif

-/* USE_ATOMICS should be replaced with the appropriate APR feature macro */
-#define USE_ATOMICS
-#ifdef USE_ATOMICS
 #include "apr_atomic.h"
-#endif

 #ifndef MAX
 #define MAX(a,b)                ((a) > (b) ? (a) : (b))
@@ -175,11 +171,7 @@
     void *vobj;         /* Opaque portion (specific to the cache 
implementation) of the cache object */
     apr_size_t count;   /* Number of body bytes written to the cache so far */
     int complete;
-#ifdef USE_ATOMICS
     apr_uint32_t refcount;
-#else
-    apr_size_t refcount;
-#endif
     apr_size_t cleanup;
 };

Index: modules/experimental/mod_disk_cache.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/experimental/mod_disk_cache.c,v
retrieving revision 1.52
diff -u -r1.52 mod_disk_cache.c
--- modules/experimental/mod_disk_cache.c	18 Mar 2004 21:40:12 -0000	1.52
+++ modules/experimental/mod_disk_cache.c	1 Aug 2004 08:24:52 -0000
@@ -64,7 +64,7 @@
     apr_time_t gcinterval;       /* garbage collection interval, in msec */
     int dirlevels;               /* Number of levels of subdirectories */
     int dirlength;               /* Length of subdirectory names */
-    int        expirychk;               /* true if expiry time is observed 
for cached files */
+    int expirychk;               /* true if expiry time is observed for 
cached files */
     apr_size_t minfs;            /* minumum file size for cached files */
     apr_size_t maxfs;            /* maximum file size for cached files */
     apr_time_t mintm;            /* minimum time margin for caching files */
@@ -78,10 +78,12 @@

 /* Forward declarations */
 static int remove_entity(cache_handle_t *h);
-static apr_status_t write_headers(cache_handle_t *h, request_rec *r, 
cache_info *i);
-static apr_status_t write_body(cache_handle_t *h, request_rec *r, 
apr_bucket_brigade *b);
-static apr_status_t read_headers(cache_handle_t *h, request_rec *r);
-static apr_status_t read_body(cache_handle_t *h, apr_pool_t *p, 
apr_bucket_brigade *bb);
+static apr_status_t save_headers(cache_handle_t *h, request_rec *r,
+                                 cache_info *i);
+static apr_status_t save_body(cache_handle_t *h, request_rec *r,
+                              apr_bucket_brigade *b);
+static apr_status_t load_headers(cache_handle_t *h, request_rec *r);
+static apr_status_t load_body(cache_handle_t *h, apr_pool_t *p, 
apr_bucket_brigade *bb);

 /*
  * Local static functions
@@ -136,8 +138,9 @@
     if (dobj->fd) {
         apr_file_flush(dobj->fd);
         if (!dobj->datafile) {
-            dobj->datafile = data_file(r->pool, conf->dirlevels, 
conf->dirlength,
-                                       conf->cache_root, h->cache_obj->key);
+            dobj->datafile = data_file(r->pool, conf->dirlevels,
+                                       conf->dirlength, conf->cache_root,
+                                       h->cache_obj->key);
         }
         /* Remove old file with the same name. If remove fails, then
          * perhaps we need to create the directory tree where we are
@@ -249,11 +252,11 @@
     char *buf;
     apr_size_t amt;

-    char	dateHexS[sizeof(apr_time_t) * 2 + 1];
-    char	expireHexS[sizeof(apr_time_t) * 2 + 1];
-    char	verHexS[sizeof(apr_time_t) * 2 + 1];
-    char	requestHexS[sizeof(apr_time_t) * 2 + 1];
-    char	responseHexS[sizeof(apr_time_t) * 2 + 1];
+    char dateHexS[sizeof(apr_time_t) * 2 + 1];
+    char expireHexS[sizeof(apr_time_t) * 2 + 1];
+    char verHexS[sizeof(apr_time_t) * 2 + 1];
+    char requestHexS[sizeof(apr_time_t) * 2 + 1];
+    char responseHexS[sizeof(apr_time_t) * 2 + 1];
     cache_info *info = &(h->cache_obj->info);
     disk_cache_object_t *dobj = (disk_cache_object_t *) h->cache_obj->vobj;

@@ -337,17 +340,17 @@
     if (rv == APR_SUCCESS) {
         /* Populate the cache handle */
         h->cache_obj = obj;
-        h->read_body = &read_body;
-        h->read_headers = &read_headers;
-        h->write_body = &write_body;
-        h->write_headers = &write_headers;
+        h->read_body = &load_body;
+        h->read_headers = &load_headers;
+        h->write_body = &save_body;
+        h->write_headers = &save_headers;
         h->remove_entity = &remove_entity;

-        ap_log_error(APLOG_MARK, APLOG_INFO, 0, r->server,
+        ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
                      "disk_cache: Caching URL %s",  key);
     }
     else {
-        ap_log_error(APLOG_MARK, APLOG_INFO, 0, r->server,
+        ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
                      "disk_cache: Could not cache URL %s [%d]", key, rv);

         return DECLINED;
@@ -356,7 +359,8 @@
     return OK;
 }

-static int open_entity(cache_handle_t *h, request_rec *r, const char *type, 
const char *key)
+static int open_entity(cache_handle_t *h, request_rec *r, const char *type,
+                       const char *key)
 {
     apr_status_t rc;
     static int error_logged = 0;
@@ -370,6 +374,7 @@
     cache_object_t *obj;
     cache_info *info;
     disk_cache_object_t *dobj;
+    int flags;

     h->cache_obj = NULL;

@@ -382,7 +387,8 @@
         if (!error_logged) {
             error_logged = 1;
             ap_log_error(APLOG_MARK, APLOG_ERR, 0, r->server,
-                         "disk_cache: Cannot cache files to disk without a 
CacheRoot specified.");
+                         "disk_cache: Cannot cache files to disk without a "
+                         "CacheRoot specified.");
         }
         return DECLINED;
     }
@@ -393,14 +399,18 @@
                           conf->cache_root, key);

     /* Open the data file */
-    rc = apr_file_open(&fd, data, APR_READ|APR_BINARY, 0, r->pool);
+    flags = APR_READ|APR_BINARY;
+#ifdef APR_SENDFILE_ENABLED
+    flags |= APR_SENDFILE_ENABLED;
+#endif
+    rc = apr_file_open(&fd, data, flags, 0, r->pool);
     if (rc != APR_SUCCESS) {
         /* XXX: Log message */
         return DECLINED;
     }

     /* Open the headers file */
-    rc = apr_file_open(&hfd, headers, APR_READ|APR_BINARY, 0, r->pool);
+    rc = apr_file_open(&hfd, headers, flags, 0, r->pool);
     if (rc != APR_SUCCESS) {
         /* XXX: Log message */
         return DECLINED;
@@ -431,10 +441,10 @@
     }

     /* Initialize the cache_handle callback functions */
-    h->read_body = &read_body;
-    h->read_headers = &read_headers;
-    h->write_body = &write_body;
-    h->write_headers = &write_headers;
+    h->read_body = &load_body;
+    h->read_headers = &load_headers;
+    h->write_body = &save_body;
+    h->write_headers = &save_headers;
     h->remove_entity = &remove_entity;

     ap_log_error(APLOG_MARK, APLOG_INFO, 0, r->server,
@@ -456,7 +466,7 @@
  * @@@: XXX: FIXME: currently the headers are passed thru un-merged.
  * Is that okay, or should they be collapsed where possible?
  */
-static apr_status_t read_headers(cache_handle_t *h, request_rec *r)
+static apr_status_t load_headers(cache_handle_t *h, request_rec *r)
 {
     apr_status_t rv;
     char urlbuff[1034];
@@ -465,7 +475,7 @@
     apr_table_t * tmp;

     /* This case should not happen... */
-    if (!dobj->fd || !dobj->hfd) {
+    if (!dobj->hfd) {
         /* XXX log message */
         return APR_NOTFOUND;
     }
@@ -517,7 +527,7 @@
     return APR_SUCCESS;
 }

-static apr_status_t read_body(cache_handle_t *h, apr_pool_t *p, 
apr_bucket_brigade *bb)
+static apr_status_t load_body(cache_handle_t *h, apr_pool_t *p, 
apr_bucket_brigade *bb)
 {
     apr_bucket *e;
     disk_cache_object_t *dobj = (disk_cache_object_t*) h->cache_obj->vobj;
@@ -531,7 +541,8 @@
     return APR_SUCCESS;
 }

-static apr_status_t write_headers(cache_handle_t *h, request_rec *r, 
cache_info *info)
+static apr_status_t save_headers(cache_handle_t *h, request_rec *r,
+                                 cache_info *info)
 {
     disk_cache_conf *conf = ap_get_module_config(r->server->module_config,
                                                  &disk_cache_module);
@@ -611,9 +622,9 @@
         amt = strlen(buf);
         apr_file_write(hfd, buf, &amt);

-	/* Parse the vary header and dump those fields from the headers_in. */
-	/* Make call to the same thing cache_select_url calls to crack Vary. */
-	/* @@@ Some day, not today. */
+        /* Parse the vary header and dump those fields from the headers_in. */
+        /* Make call to the same thing cache_select_url calls to crack Vary. 
*/
+        /* @@@ Some day, not today. */
         if (r->headers_in) {
             int i;
             apr_table_entry_t *elts = (apr_table_entry_t *) 
apr_table_elts(r->headers_in)->elts;
@@ -639,7 +650,8 @@
     return APR_SUCCESS;
 }

-static apr_status_t write_body(cache_handle_t *h, request_rec *r, 
apr_bucket_brigade *b)
+static apr_status_t save_body(cache_handle_t *h, request_rec *r,
+                              apr_bucket_brigade *b)
 {
     apr_bucket *e;
     apr_status_t rv;
Index: modules/experimental/mod_mem_cache.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/experimental/mod_mem_cache.c,v
retrieving revision 1.109
diff -u -r1.109 mod_mem_cache.c
--- modules/experimental/mod_mem_cache.c	25 May 2004 18:22:58 -0000	1.109
+++ modules/experimental/mod_mem_cache.c	1 Aug 2004 08:24:52 -0000
@@ -57,13 +57,8 @@
     apr_os_file_t fd;
     apr_int32_t flags;  /* File open flags */
     long priority;      /**< the priority of this entry */
-    long total_refs;          /**< total number of references this entry has 
had */
-
-#ifdef USE_ATOMICS
+    long total_refs;    /**< total number of references this entry has had */
     apr_uint32_t pos;   /**< the position of this entry in the cache */
-#else
-    apr_ssize_t pos;
-#endif

 } mem_cache_object_t;

@@ -93,22 +88,25 @@

 /* Forward declarations */
 static int remove_entity(cache_handle_t *h);
-static apr_status_t write_headers(cache_handle_t *h, request_rec *r, 
cache_info *i);
-static apr_status_t write_body(cache_handle_t *h, request_rec *r, 
apr_bucket_brigade *b);
-static apr_status_t read_headers(cache_handle_t *h, request_rec *r);
-static apr_status_t read_body(cache_handle_t *h, apr_pool_t *p, 
apr_bucket_brigade *bb);
+static apr_status_t save_headers(cache_handle_t *h, request_rec *r,
+                                 cache_info *i);
+static apr_status_t save_body(cache_handle_t *h, request_rec *r,
+                              apr_bucket_brigade *b);
+static apr_status_t load_headers(cache_handle_t *h, request_rec *r);
+static apr_status_t load_body(cache_handle_t *h, apr_pool_t *p,
+                              apr_bucket_brigade *bb);

 static void cleanup_cache_object(cache_object_t *obj);

-static long memcache_get_priority(void*a)
+static long memcache_get_priority(void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;
     mem_cache_object_t *mobj = obj->vobj;

-    return  mobj->priority;
+    return mobj->priority;
 }

-static void memcache_inc_frequency(void*a)
+static void memcache_inc_frequency(void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;
     mem_cache_object_t *mobj = obj->vobj;
@@ -122,44 +120,39 @@
     cache_object_t *obj = (cache_object_t *)a;
     mem_cache_object_t *mobj = obj->vobj;

-#ifdef USE_ATOMICS
     apr_atomic_set32(&mobj->pos, pos);
-#else
-    mobj->pos = pos;
-#endif
 }
+
 static apr_ssize_t memcache_get_pos(void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;
     mem_cache_object_t *mobj = obj->vobj;

-#ifdef USE_ATOMICS
     return apr_atomic_read32(&mobj->pos);
-#else
-    return mobj->pos;
-#endif
 }

-static apr_size_t memcache_cache_get_size(void*a)
+static apr_size_t memcache_cache_get_size(void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;
     mem_cache_object_t *mobj = obj->vobj;
     return mobj->m_len;
 }
+
 /** callback to get the key of a item */
-static const char* memcache_cache_get_key(void*a)
+static const char* memcache_cache_get_key(void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;
     return obj->key;
 }
-/**
- * callback to free an entry
+
+/**
+ * callback to free an entry
  * There is way too much magic in this code. Right now, this callback
  * is only called out of cache_insert() which is called under protection
  * of the sconf->lock, which means that we do not (and should not)
- * attempt to obtain the lock recursively.
+ * attempt to obtain the lock recursively.
  */
-static void memcache_cache_free(void*a)
+static void memcache_cache_free(void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;

@@ -167,44 +160,34 @@
      * now. Increment the refcount before setting cleanup to avoid a race
      * condition. A similar pattern is used in remove_url()
      */
-#ifdef USE_ATOMICS
     apr_atomic_inc32(&obj->refcount);
-#else
-    obj->refcount++;
-#endif

     obj->cleanup = 1;

-#ifdef USE_ATOMICS
     if (!apr_atomic_dec32(&obj->refcount)) {
         cleanup_cache_object(obj);
     }
-#else
-    obj->refcount--;
-    if (!obj->refcount) {
-        cleanup_cache_object(obj);
-    }
-#endif
 }
+
 /*
  * functions return a 'negative' score since priority queues
  * dequeue the object with the highest value first
  */
-static long memcache_lru_algorithm(long queue_clock, void *a)
+static long memcache_lru_algorithm(long queue_clock, void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;
     mem_cache_object_t *mobj = obj->vobj;
     if (mobj->priority == 0)
         mobj->priority = queue_clock - mobj->total_refs;

-    /*	
+    /*
      * a 'proper' LRU function would just be
-     *  mobj->priority = mobj->total_refs;
+     *  mobj->priority = mobj->total_refs;
      */
     return mobj->priority;
 }

-static long memcache_gdsf_algorithm(long queue_clock, void *a)
+static long memcache_gdsf_algorithm(long queue_clock, void *a)
 {
     cache_object_t *obj = (cache_object_t *)a;
     mem_cache_object_t *mobj = obj->vobj;
@@ -244,7 +227,7 @@
     }

     free(obj);
-
+
     /* Cleanup the mem_cache_object_t */
     if (mobj) {
         if (mobj->type == CACHE_TYPE_HEAP && mobj->m) {
@@ -285,7 +268,8 @@
         free(mobj);
     }
 }
-static apr_status_t decrement_refcount(void *arg)
+
+static apr_status_t decrement_refcount(void *arg)
 {
     cache_object_t *obj = (cache_object_t *) arg;

@@ -310,29 +294,14 @@
     }

     /* Cleanup the cache object */
-#ifdef USE_ATOMICS
     if (!apr_atomic_dec32(&obj->refcount)) {
         if (obj->cleanup) {
             cleanup_cache_object(obj);
         }
     }
-#else
-    if (sconf->lock) {
-        apr_thread_mutex_lock(sconf->lock);
-    }
-    obj->refcount--;
-    /* If the object is marked for cleanup and the refcount
-     * has dropped to zero, cleanup the object
-     */
-    if ((obj->cleanup) && (!obj->refcount)) {
-        cleanup_cache_object(obj);
-    }
-    if (sconf->lock) {
-        apr_thread_mutex_unlock(sconf->lock);
-    }
-#endif
     return APR_SUCCESS;
 }
+
 static apr_status_t cleanup_cache_mem(void *sconfv)
 {
     cache_object_t *obj;
@@ -349,23 +318,18 @@
         apr_thread_mutex_lock(sconf->lock);
     }
     obj = cache_pop(co->cache_cache);
-    while (obj) {
-    /* Iterate over the cache and clean up each entry */
-    /* Free the object if the recount == 0 */
-#ifdef USE_ATOMICS
+    while (obj) {
+        /* Iterate over the cache and clean up each entry */
+        /* Free the object if the recount == 0 */
         apr_atomic_inc32(&obj->refcount);
         obj->cleanup = 1;
         if (!apr_atomic_dec32(&obj->refcount)) {
-#else
-        obj->cleanup = 1;
-        if (!obj->refcount) {
-#endif
             cleanup_cache_object(obj);
         }
         obj = cache_pop(co->cache_cache);
     }

-    /* Cache is empty, free the cache table */
+    /* Cache is empty, free the cache table */
     cache_free(co->cache_cache);

     if (sconf->lock) {
@@ -373,6 +337,7 @@
     }
     return APR_SUCCESS;
 }
+
 /*
  * TODO: enable directives to be overridden in various containers
  */
@@ -416,7 +381,7 @@
     if (len == -1) {
         /* Caching a streaming response. Assume the response is
          * less than or equal to max_streaming_buffer_size. We will
-         * correct all the cache size counters in write_body once
+         * correct all the cache size counters in save_body once
          * we know exactly know how much we are caching.
          */
         len = sconf->max_streaming_buffer_size;
@@ -427,11 +392,11 @@
      * exceeded. This means mod_mem_cache does not need to implement
      * max_cache_size checks.
      */
-    if (len < sconf->min_cache_object_size ||
+    if (len < sconf->min_cache_object_size ||
         len > sconf->max_cache_object_size) {
         ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
-                     "mem_cache: URL %s failed the size check and will not be 
cached.",
-                     key);
+                     "mem_cache: URL %s failed the size check and will not be 
"
+                     "cached.", key);
         return DECLINED;
     }

@@ -468,11 +433,7 @@
     }

     /* Finish initing the cache object */
-#ifdef USE_ATOMICS
     apr_atomic_set32(&obj->refcount, 1);
-#else
-    obj->refcount = 1;
-#endif
     mobj->total_refs = 1;
     obj->complete = 0;
     obj->cleanup = 0;
@@ -499,7 +460,7 @@
     tmp_obj = (cache_object_t *) cache_find(sconf->cache_cache, key);

     if (!tmp_obj) {
-        cache_insert(sconf->cache_cache, obj);
+        cache_insert(sconf->cache_cache, obj);
     }
     if (sconf->lock) {
         apr_thread_mutex_unlock(sconf->lock);
@@ -519,16 +480,17 @@

     /* Populate the cache handle */
     h->cache_obj = obj;
-    h->read_body = &read_body;
-    h->read_headers = &read_headers;
-    h->write_body = &write_body;
-    h->write_headers = &write_headers;
+    h->read_body = &load_body;
+    h->read_headers = &load_headers;
+    h->write_body = &save_body;
+    h->write_headers = &save_headers;
     h->remove_entity = &remove_entity;

     return OK;
 }

-static int open_entity(cache_handle_t *h, request_rec *r, const char *type, 
const char *key)
+static int open_entity(cache_handle_t *h, request_rec *r, const char *type,
+                       const char *key)
 {
     cache_object_t *obj;

@@ -543,11 +505,7 @@
     if (obj) {
         if (obj->complete) {
             request_rec *rmain=r, *rtmp;
-#ifdef USE_ATOMICS
             apr_atomic_inc32(&obj->refcount);
-#else
-            obj->refcount++;
-#endif
             /* cache is worried about overall counts, not 'open' ones */
             cache_update(sconf->cache_cache, obj);

@@ -578,13 +536,13 @@
     }

     /* Initialize the cache_handle */
-    h->read_body = &read_body;
-    h->read_headers = &read_headers;
-    h->write_body = &write_body;
-    h->write_headers = &write_headers;
+    h->read_body = &load_body;
+    h->read_headers = &load_headers;
+    h->write_body = &save_body;
+    h->write_headers = &save_headers;
     h->remove_entity = &remove_entity;
     h->cache_obj = obj;
-    h->req_hdrs = NULL;  /* Pick these up in read_headers() */
+    h->req_hdrs = NULL;  /* Pick these up in load_headers() */
     return OK;
 }

@@ -623,10 +581,10 @@
     apr_size_t len = 0;
     apr_size_t idx = 0;
     char *buf;
-
+
     *nelts = elts_arr->nelts;
-    if (*nelts == 0 ) {
-        *obj=NULL;
+    if (*nelts == 0) {
+        *obj = NULL;
         return APR_SUCCESS;
     }
     *obj = malloc(sizeof(cache_header_tbl_t) * elts_arr->nelts);
@@ -648,7 +606,8 @@

     for (i = 0; i < *nelts; ++i) {
         (*obj)[i].hdr = &buf[idx];
-        len = strlen(elts[i].key) + 1;              /* Include NULL 
terminator */
+        /* Include NULL terminator */
+        len = strlen(elts[i].key) + 1;
         memcpy(&buf[idx], elts[i].key, len);
         idx+=len;

@@ -659,9 +618,9 @@
     }
     return APR_SUCCESS;
 }
-static int unserialize_table( cache_header_tbl_t *ctbl,
-                              int num_headers,
-                              apr_table_t *t )
+
+static int unserialize_table(cache_header_tbl_t *ctbl, int num_headers,
+                             apr_table_t *t)
 {
     int i;

@@ -671,14 +630,16 @@

     return APR_SUCCESS;
 }
+
 /* Define request processing hook handlers */
-static int remove_url(const char *type, const char *key)
+static int remove_url(const char *type, const char *key)
 {
     cache_object_t *obj;

     if (strcasecmp(type, "mem") && strcasecmp(type, "fd")) {
         return DECLINED;
     }
+
     /* Order of the operations is important to avoid race conditions.
      * First, remove the object from the cache. Remember, all additions
      * deletions from the cache are protected by sconf->lock.
@@ -691,24 +652,17 @@
     if (sconf->lock) {
         apr_thread_mutex_lock(sconf->lock);
     }
-
-    obj = cache_find(sconf->cache_cache, key);
+
+    obj = cache_find(sconf->cache_cache, key);
     if (obj) {
         mem_cache_object_t *mobj;
         cache_remove(sconf->cache_cache, obj);
         mobj = (mem_cache_object_t *) obj->vobj;

-#ifdef USE_ATOMICS
-        /* Refcount increment in this case MUST be made under
+        /* Refcount increment in this case MUST be made under
          * protection of the lock
          */
         apr_atomic_inc32(&obj->refcount);
-#else
-        if (!obj->refcount) {
-            cleanup_cache_object(obj);
-            obj = NULL;
-        }
-#endif
         if (obj) {
             obj->cleanup = 1;
         }
@@ -716,21 +670,19 @@
     if (sconf->lock) {
         apr_thread_mutex_unlock(sconf->lock);
     }
-#ifdef USE_ATOMICS
     if (obj) {
         if (!apr_atomic_dec32(&obj->refcount)) {
             cleanup_cache_object(obj);
         }
     }
-#endif
     return OK;
 }

-static apr_status_t read_headers(cache_handle_t *h, request_rec *r)
+static apr_status_t load_headers(cache_handle_t *h, request_rec *r)
 {
     int rc;
     mem_cache_object_t *mobj = (mem_cache_object_t*) h->cache_obj->vobj;
-
+
     h->req_hdrs = apr_table_make(r->pool, mobj->num_req_hdrs);
     r->headers_out = apr_table_make(r->pool, mobj->num_header_out);
     r->err_headers_out = apr_table_make(r->pool, mobj->num_err_header_out);
@@ -754,14 +706,16 @@
                             r->notes);

     /* Content-Type: header may not be set if content is local since
-     * CACHE_IN runs before header filters....
+     * CACHE_SAVE runs before header filters....
      */
-    ap_set_content_type(r, apr_pstrdup(r->pool, 
h->cache_obj->info.content_type));
+    ap_set_content_type(r, apr_pstrdup(r->pool,
+                                       h->cache_obj->info.content_type));

     return rc;
 }

-static apr_status_t read_body(cache_handle_t *h, apr_pool_t *p, 
apr_bucket_brigade *bb)
+static apr_status_t load_body(cache_handle_t *h, apr_pool_t *p,
+                              apr_bucket_brigade *bb)
 {
     apr_bucket *b;
     mem_cache_object_t *mobj = (mem_cache_object_t*) h->cache_obj->vobj;
@@ -784,12 +738,13 @@
 }


-static apr_status_t write_headers(cache_handle_t *h, request_rec *r, 
cache_info *info)
+static apr_status_t save_headers(cache_handle_t *h, request_rec *r,
+                                 cache_info *info)
 {
     cache_object_t *obj = h->cache_obj;
     mem_cache_object_t *mobj = (mem_cache_object_t*) obj->vobj;
     int rc;
-
+
     /*
      * The cache needs to keep track of the following information:
      * - Date, LastMod, Version, ReqTime, RespTime, ContentLength
@@ -805,21 +760,22 @@
     }

     /* Precompute how much storage we need to hold the headers */
-    rc = serialize_table(&mobj->header_out,
-                         &mobj->num_header_out,
-                         ap_cache_cacheable_hdrs_out(r->pool, 
r->headers_out));
+    rc = serialize_table(&mobj->header_out,
+                         &mobj->num_header_out,
+                         ap_cache_cacheable_hdrs_out(r->pool, 
r->headers_out));
     if (rc != APR_SUCCESS) {
         return rc;
     }
-    rc = serialize_table(&mobj->err_header_out,
-                         &mobj->num_err_header_out,
-                         ap_cache_cacheable_hdrs_out(r->pool, 
r->err_headers_out));
+    rc = serialize_table(&mobj->err_header_out,
+                         &mobj->num_err_header_out,
+                         ap_cache_cacheable_hdrs_out(r->pool,
+                                                     r->err_headers_out));
     if (rc != APR_SUCCESS) {
         return rc;
     }
     rc = serialize_table(&mobj->subprocess_env,
-                         &mobj->num_subprocess_env,
-                         r->subprocess_env );
+                         &mobj->num_subprocess_env,
+                         r->subprocess_env);
     if (rc != APR_SUCCESS) {
         return rc;
     }
@@ -828,7 +784,7 @@
     if (rc != APR_SUCCESS) {
         return rc;
     }
-
+
     /* Init the info struct */
     if (info->date) {
         obj->info.date = info->date;
@@ -869,7 +825,7 @@
         }
         memcpy(obj->info.lastmods, info->lastmods, len);
     }
-    if ( info->filename) {
+    if (info->filename) {
         apr_size_t len = strlen(info->filename) + 1;
         obj->info.filename = (char*) malloc(len);
         if (!obj->info.filename ) {
@@ -881,7 +837,8 @@
     return APR_SUCCESS;
 }

-static apr_status_t write_body(cache_handle_t *h, request_rec *r, 
apr_bucket_brigade *b)
+static apr_status_t save_body(cache_handle_t *h, request_rec *r,
+                              apr_bucket_brigade *b)
 {
     apr_status_t rv;
     cache_object_t *obj = h->cache_obj;
@@ -923,7 +880,8 @@
             /* Open a new XTHREAD handle to the file */
             apr_file_name_get(&name, file);
             mobj->flags = ((APR_SENDFILE_ENABLED & apr_file_flags_get(file))
-                           | APR_READ | APR_BINARY | APR_XTHREAD | 
APR_FILE_NOCLEANUP);
+                           | APR_READ | APR_BINARY | APR_XTHREAD |
+                             APR_FILE_NOCLEANUP);
             rv = apr_file_open(&tmpfile, name, mobj->flags,
                                APR_OS_DEFAULT, r->pool);
             if (rv != APR_SUCCESS) {
@@ -934,7 +892,8 @@

             /* Open for business */
             ap_log_error(APLOG_MARK, APLOG_INFO, 0, r->server,
-                         "mem_cache: Cached file: %s with key: %s", name, 
obj->key);
+                         "mem_cache: Cached file: %s with key: %s", name,
+                         obj->key);
             obj->complete = 1;
             return APR_SUCCESS;
         }
@@ -943,10 +902,10 @@
         mobj->type = CACHE_TYPE_HEAP;
     }

-    /*
+    /*
      * FD cacheing is not enabled or the content was not
      * suitable for fd caching.
-     */
+     */
     if (mobj->m == NULL) {
         mobj->m = malloc(mobj->m_len);
         if (mobj->m == NULL) {
@@ -966,8 +925,8 @@

         if (APR_BUCKET_IS_EOS(e)) {
             if (mobj->m_len > obj->count) {
-                /* Caching a streamed response. Reallocate a buffer of the
-                 * correct size and copy the streamed response into that
+                /* Caching a streamed response. Reallocate a buffer of the
+                 * correct size and copy the streamed response into that
                  * buffer */
                 char *buf = malloc(obj->count);
                 if (!buf) {
@@ -979,20 +938,22 @@

                 /* Now comes the crufty part... there is no way to tell the
                  * cache that the size of the object has changed. We need
-                 * to remove the object, update the size and re-add the
+                 * to remove the object, update the size and re-add the
                  * object, all under protection of the lock.
                  */
                 if (sconf->lock) {
                     apr_thread_mutex_lock(sconf->lock);
                 }
                 if (obj->cleanup) {
-                    /* If obj->cleanup is set, the object has been prematurly
+                    /* If obj->cleanup is set, the object has been prematurely
                      * ejected from the cache by the garbage collector. Add 
the
-                     * object back to the cache. If an object with the same 
key is
-                     * found in the cache, eject it in favor of the completed 
obj.
+                     * object back to the cache. If an object with the same
+                     * key is found in the cache, eject it in favor of the
+                     * completed obj.
                      */
                     cache_object_t *tmp_obj =
-                      (cache_object_t *) cache_find(sconf->cache_cache, 
obj->key);
+                      (cache_object_t *) cache_find(sconf->cache_cache,
+                                                    obj->key);
                     if (tmp_obj) {
                         cache_remove(sconf->cache_cache, tmp_obj);
                         tmp_obj->cleanup = 1;
@@ -1006,7 +967,7 @@
                     cache_remove(sconf->cache_cache, obj);
                 }
                 mobj->m_len = obj->count;
-                cache_insert(sconf->cache_cache, obj);
+                cache_insert(sconf->cache_cache, obj);
                 if (sconf->lock) {
                     apr_thread_mutex_unlock(sconf->lock);
                 }
@@ -1039,6 +1000,7 @@
     }
     return APR_SUCCESS;
 }
+
 /**
  * Configuration and start-up
  */
@@ -1081,7 +1043,7 @@
     }

     sconf->cache_cache = cache_init(sconf->max_object_cnt,
-                                    sconf->max_cache_size, 

+                                    sconf->max_cache_size,
                                     memcache_get_priority,
                                     sconf->cache_remove_algorithm,
                                     memcache_get_pos,
@@ -1090,16 +1052,18 @@
                                     memcache_cache_get_size,
                                     memcache_cache_get_key,
                                     memcache_cache_free);
-    apr_pool_cleanup_register(p, sconf, cleanup_cache_mem, 
apr_pool_cleanup_null);
+    apr_pool_cleanup_register(p, sconf, cleanup_cache_mem,
+                              apr_pool_cleanup_null);

     if (sconf->cache_cache)
         return OK;

+    /* Shouldn't this be DECLINED not -1 */
     return -1;

 }
-
-static const char
+
+static const char
 *set_max_cache_size(cmd_parms *parms, void *in_struct_ptr, const char *arg)
 {
     apr_size_t val;
@@ -1110,7 +1074,8 @@
     sconf->max_cache_size = val*1024;
     return NULL;
 }
-static const char
+
+static const char
 *set_min_cache_object_size(cmd_parms *parms, void *in_struct_ptr, const char 
*arg)
 {
     apr_size_t val;
@@ -1121,7 +1086,8 @@
     sconf->min_cache_object_size = val;
     return NULL;
 }
-static const char
+
+static const char
 *set_max_cache_object_size(cmd_parms *parms, void *in_struct_ptr, const char 
*arg)
 {
     apr_size_t val;
@@ -1132,7 +1098,8 @@
     sconf->max_cache_object_size = val;
     return NULL;
 }
-static const char
+
+static const char
 *set_max_object_count(cmd_parms *parms, void *in_struct_ptr, const char *arg)
 {
     apr_size_t val;
@@ -1144,7 +1111,7 @@
     return NULL;
 }

-static const char
+static const char
 *set_cache_removal_algorithm(cmd_parms *parms, void *name, const char *arg)
 {
     if (strcasecmp("LRU", arg)) {
Index: modules/experimental/cache_storage.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/experimental/cache_storage.c,v
retrieving revision 1.34
diff -u -r1.34 cache_storage.c
--- modules/experimental/cache_storage.c	9 Feb 2004 20:29:18 -0000	1.34
+++ modules/experimental/cache_storage.c	1 Aug 2004 08:24:52 -0000
@@ -65,7 +65,8 @@
  * decide whether or not it wants to cache this particular entity.
  * If the size is unknown, a size of -1 should be set.
  */
-int cache_create_entity(request_rec *r, const char *types, char *url, 
apr_off_t size)
+int cache_create_entity(request_rec *r, const char *types, char *url,
+                        apr_off_t size)
 {
     cache_handle_t *h = apr_pcalloc(r->pool, sizeof(cache_handle_t));
     const char *next = types;
@@ -152,11 +153,12 @@

             /*
              * Check Content-Negotiation - Vary
-             *
-             * At this point we need to make sure that the object we found in 
the cache
-             * is the same object that would be delivered to the client, when 
the
-             * effects of content negotiation are taken into effect.
-             *
+             *
+             * At this point we need to make sure that the object we found in
+             * the cache is the same object that would be delivered to the
+             * client, when the effects of content negotiation are taken into
+             * effect.
+             *
              * In plain english, we want to make sure that a 
language-negotiated
              * document in one language is not given to a client asking for a
              * language negotiated document in a different language by 
mistake.
@@ -198,7 +200,8 @@
                 else {
                     /* headers do not match, so Vary failed */
                     ap_log_error(APLOG_MARK, APLOG_INFO, APR_SUCCESS, 
r->server,
-                                 "cache_select_url(): Vary header mismatch - 
Cached document cannot be used. \n");
+                                 "cache_select_url(): Vary header mismatch: "
+                                 "Cached document cannot be used. \n");
                     apr_table_clear(r->headers_out);
                     r->status_line = NULL;
                     cache->handle = NULL;
@@ -228,7 +231,9 @@
 {
     return (h->write_headers(h, r, info));
 }
-apr_status_t cache_write_entity_body(cache_handle_t *h, request_rec *r, 
apr_bucket_brigade *b)
+
+apr_status_t cache_write_entity_body(cache_handle_t *h, request_rec *r,
+                                     apr_bucket_brigade *b)
 {
     return (h->write_body(h, r, b));
 }
@@ -244,16 +249,19 @@
         return rv;
     }

-    r->filename = apr_pstrdup(r->pool, info->filename );
+    r->filename = apr_pstrdup(r->pool, info->filename);

     return APR_SUCCESS;
 }
-apr_status_t cache_read_entity_body(cache_handle_t *h, apr_pool_t *p, 
apr_bucket_brigade *b)
+
+apr_status_t cache_read_entity_body(cache_handle_t *h, apr_pool_t *p,
+                                    apr_bucket_brigade *b)
 {
     return (h->read_body(h, p, b));
 }

-apr_status_t cache_generate_key_default( request_rec *r, apr_pool_t*p, 
char**key )
+apr_status_t cache_generate_key_default(request_rec *r, apr_pool_t *p,
+                                        char **key)
 {
     if (r->hostname) {
         *key = apr_pstrcat(p, r->hostname, r->uri, "?", r->args, NULL);
@@ -264,15 +272,17 @@
     return APR_SUCCESS;
 }

-APR_IMPLEMENT_EXTERNAL_HOOK_RUN_FIRST(cache, CACHE, int, create_entity,
-                                      (cache_handle_t *h, request_rec *r, 
const char *type,
-                                      const char *urlkey, apr_off_t len),
+APR_IMPLEMENT_EXTERNAL_HOOK_RUN_FIRST(cache, CACHE, int, create_entity,
+                                      (cache_handle_t *h, request_rec *r,
+                                       const char *type, const char *urlkey,
+                                       apr_off_t len),
                                       (h, r, type,urlkey,len),DECLINED)
-APR_IMPLEMENT_EXTERNAL_HOOK_RUN_FIRST(cache, CACHE, int, open_entity,
-                                      (cache_handle_t *h, request_rec *r, 
const char *type,
-                                      const char *urlkey),(h,r,type,urlkey),
+APR_IMPLEMENT_EXTERNAL_HOOK_RUN_FIRST(cache, CACHE, int, open_entity,
+                                      (cache_handle_t *h, request_rec *r,
+                                       const char *type, const char *urlkey),
+                                      (h,r,type,urlkey),
                                       DECLINED)
-APR_IMPLEMENT_EXTERNAL_HOOK_RUN_ALL(cache, CACHE, int, remove_url,
+APR_IMPLEMENT_EXTERNAL_HOOK_RUN_ALL(cache, CACHE, int, remove_url,
                                     (const char *type, const char *urlkey),
                                     (type,urlkey),OK,DECLINED)

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Bill Stoddard wrote:

>
> Please, no more specialized knobs which 99.99999% of the world cares 
> nothing about.
>
How do you define that percentage? By domains?  In that case 99.999% 
probably care nothing about what we are doing.  If you look at total 
traffic, however, would not options that help the top 10% be "a good thing?"

Also, if the defaults are reasonable, what difference does it make?


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: [PATCH] mod_cache fixes: #8

Posted by Bill Stoddard <bi...@wstoddard.com>.

Mads Toftum wrote:

> On Mon, Aug 02, 2004 at 04:54:07PM -0400, Brian Akins wrote:
> 
>>Any reason md4 was not used in mod_cache?  In my ad hoc tests, it seems 
>>much faster.  I do not know the in-and-outs of encryption, but is there 
>>any compelling reason to use md5 over md4 in this case?  We don't care 
>>if someone "cracks" this and gets a url...
>>
> 
> The choice between different digests wouldn't be wether one is easier to
> "crack" than the other. The main concern should be how many collisions 
> that there will be and wether the hash leaves us with a more or less
> balanced tree, 
> Come to think of it, perhaps it would be an idea to make the choice of
> digest configurable - 

Please, no more specialized knobs which 99.99999% of the world cares nothing about.

Bill

Re: [PATCH] mod_cache fixes: #8

Posted by Mads Toftum <ma...@toftum.dk>.

On Mon, Aug 02, 2004 at 04:54:07PM -0400, Brian Akins wrote:
> Any reason md4 was not used in mod_cache?  In my ad hoc tests, it seems 
> much faster.  I do not know the in-and-outs of encryption, but is there 
> any compelling reason to use md5 over md4 in this case?  We don't care 
> if someone "cracks" this and gets a url...
> 
The choice between different digests wouldn't be wether one is easier to
"crack" than the other. The main concern should be how many collisions 
that there will be and wether the hash leaves us with a more or less
balanced tree, 
Come to think of it, perhaps it would be an idea to make the choice of
digest configurable - I can think of cases where the extra overhead of
a digest with fewer collisions would be worth the extra cycles.

vh

Mads Toftum
-- 
`Darn it, who spiked my coffee with water?!' - lwall

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

>
> MD5 has the possibility for collisions, too.  

Any reason md4 was not used in mod_cache?  In my ad hoc tests, it seems 
much faster.  I do not know the in-and-outs of encryption, but is there 
any compelling reason to use md5 over md4 in this case?  We don't care 
if someone "cracks" this and gets a url...

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: [PATCH] mod_cache fixes: #8

Posted by Graham Leggett <mi...@sharp.fm>.

Brian Akins wrote:

>> On one hand, I think doing MD5 is sort of silly - just use the URL 
>> itself.  *shrug*  -- justin

One of the goals of the new mod_cache was to support the caching of URL 
variants.

> I actually have somewhat of a solution:
> 
> URL encode the uri and any vary elements:
> www.cnn.com/index.html?this=that
> Accept-Encoding: gzip
> Cookie: Special=SomeValue
> 
> may become:
> 
> www.cnn.com%2Findex.html%3Fthis%3Dthat+Accept-Encoding%3A+gzip+Cookie%3A+Special%3DSomeValue 

Cookies can be up to 4kb in size - it might run into limitations on 
filename and pathname lengths.

Regards,
Graham
--

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

>
> MD5 has the possibility for collisions, too.  What do squid or other 
> proxies  do?

True. I'll see what others do.

> On one hand, I think doing MD5 is sort of silly - just use the URL 
> itself.  *shrug*  -- justin
>

With mod_disk_cache, how about urls such as:

http://www.cnn.com/2004/US/West/08/02/missing.woman/index.html?this=that&the%20other=something

or when you need to cache two different versions of the same URL because 
of Vary?

I actually have somewhat of a solution:

URL encode the uri and any vary elements:
www.cnn.com/index.html?this=that
Accept-Encoding: gzip
Cookie: Special=SomeValue

may become:

www.cnn.com%2Findex.html%3Fthis%3Dthat+Accept-Encoding%3A+gzip+Cookie%3A+Special%3DSomeValue

A very simple hashing function could put this in some directory 
structure, so the file on disk may be:

/var/cache/apache/00/89/www.cnn.com%2Findex.html%3Fthis%3Dthat+Accept-Encoding%3A+gzip+Cookie%3A+Special%3DSomeValue

Should be pretty fast (?) if the urlencode was effecient.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: [PATCH] mod_cache fixes: #8

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, August 2, 2004 2:54 PM -0400 Brian Akins 
<ba...@web.turner.com> wrote:

>> The other bottleneck I looked at was MD5 as the on-disk naming
>> scheme.  I think MD5 is a poor choice here because it's not very
>> fast.   Ideally, switching to a variant of the times-33 hash might
>> work out better. *shrug*
>>
>
> How to handle collisions?

MD5 has the possibility for collisions, too.  What do squid or other 
proxies  do?  On one hand, I think doing MD5 is sort of silly - just use 
the URL itself.  *shrug*  -- justin

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

>
> The other bottleneck I looked at was MD5 as the on-disk naming 
> scheme.  I think MD5 is a poor choice here because it's not very 
> fast.   Ideally, switching to a variant of the times-33 hash might 
> work out better. *shrug*
>

How to handle collisions?

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

>
> post_config's not a bad place for that.  But, I've yet to get a good 
> handle on my thoughts for the storage mechanism that mod_disk_cache is 
> using.  My hunch so far is that it's really inefficient.  (The reading 
> of the headers one-byte at a time with the brain-dead apr_file_gets() 
> just *killed* performance.) 

Could do something like reading the struct first.  Then just read the 
rest of the headers into memoyr and parse them there.  Also having an 
array of headers and just saving the index to the file is more effecient:

psuedo code:

headers = {
  "Cache-Control", "Expires", .... }

then while caching them, to store the Expires header, for example, just 
write 1 (the index of the expires header) and then the value to the 
headers file.  Then parsing is much faster as well.

> And, there might be a way to also save the entire response headers and 
> body and allow a straight sendfile from that.

Don't know if that will work.  Also, you still have to store and parse 
meta data as well.

> Sort of dislike the fact that you must pre-create the dirs.  Yet, I 
> understand the point here.  The extra stat calls might end up being 
> harmful.  Will chew on this and see if the dirs become a bottleneck or 
> not.  -- justin
>
It is a bottle neck on very busy server when the directories do not 
exist.  Once the server has ran for a while, it doesn't matter as the 
majority of the directories have been made...

Also, is it necessary to try to remove the file even when we are not 
sure it exists?

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: [PATCH] mod_cache fixes: #8

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, August 2, 2004 8:46 AM -0400 Brian Akins <ba...@web.turner.com> 
wrote:

> Another big speed-up may be to pre-make all of the directories.  A simple
> script could use CacheRoot, |CacheDirLength|, and |CacheDirLevels to  create
> them all.  Just require that this script be ran before starting a
> mod_disk_cache server.  Or if you wanted, you could generate/check the
> directories in post_config.

post_config's not a bad place for that.  But, I've yet to get a good handle on 
my thoughts for the storage mechanism that mod_disk_cache is using.  My hunch 
so far is that it's really inefficient.  (The reading of the headers one-byte 
at a time with the brain-dead apr_file_gets() just *killed* performance.) 
And, there might be a way to also save the entire response headers and body 
and allow a straight sendfile from that.  My guess is that it'll be a slightly 
different cache module though - mod_cache does its best to play nice with the 
other filters.  An option to disregard them may make sense, but we'd really 
have to pay attention to the Vary header then...

> The Squid proxy cache requires you to create all the directories beforehand
> by running it with a specific switch.  This save a ton of checking for
> and/or making directories.

Sort of dislike the fact that you must pre-create the dirs.  Yet, I understand 
the point here.  The extra stat calls might end up being harmful.  Will chew 
on this and see if the dirs become a bottleneck or not.  -- justin

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.

Graham Leggett wrote:

>
> mem cache and disk cache were created because not every platform 
> performs best using the same techniques.
>
> This competition between mem cache and disk cache will hopefully make 
> them both faster, and in turn faster than other caches out there.
>
True.  Competetion is good. 


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.

Brian Akins wrote:

>> mod_mem_cache is broken then. It used to kick the pants off of 'no 
>> cache' and mod_disk_cache.

> If mod_disk_cache was patched to use sendfile, it will perform better.

mem cache and disk cache were created because not every platform 
performs best using the same techniques.

This competition between mem cache and disk cache will hopefully make 
them both faster, and in turn faster than other caches out there.

Regards,
Graham
--

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.

Bill Stoddard wrote:

>
> mod_mem_cache is broken then. It used to kick the pants off of 'no 
> cache' and mod_disk_cache.


If mod_disk_cache was patched to use sendfile, it will perform better.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Tuesday, August 3, 2004 9:12 AM -0400 Brian Akins <ba...@web.turner.com> 
wrote:

> Propably not, because you would propably have to lock around it.  It just
> seems it's better to let the filesystem worry about alot of this stuff
> (locking, reference counting, etc.).

+1.  =)  -- justin

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.

Graham Leggett wrote:

>
> This is true - mem cache would probably improve drastically with a 
> shared memory cache.

Propably not, because you would propably have to lock around it.  It 
just seems it's better to let the filesystem worry about alot of this 
stuff (locking, reference counting, etc.). 

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.

Brian Akins wrote:

> The big hits for mem cache are:
> 
> The cache is not shared between processes, so you use alot more memory 
> and have a lot less "hits."

This is true - mem cache would probably improve drastically with a 
shared memory cache.

Regards,
Graham
--

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.

Eli Marmor wrote:

>Graham Leggett wrote:
>  
>
>>Brian Akins wrote:
>>
>>    
>>
>>>On an OS that supports sendfile, a disk based cache will almost always
>>>bury a memory based one.
>>>      
>>>
>>Quite probably. But on a system without a disk, chances are it won't. :(
>>    
>>
>
>It will.
>Unless mod_disk_cache + ram-disk + sendfile doesn't outperform
>mod_mem_cache.
>
>  
>
This setup performs quite nicely on Linux.

The big hits for mem cache are:

The cache is not shared between processes, so you use alot more memory 
and have a lot less "hits."

You have to copy data from user to kernel, which can be a huge hit.


Even without sendfile, mmap is generally faster than mem cache.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: mod_cache performance

Posted by Eli Marmor <ma...@netmask.it>.

Graham Leggett wrote:
> 
> Brian Akins wrote:
> 
> > On an OS that supports sendfile, a disk based cache will almost always
> > bury a memory based one.
> 
> Quite probably. But on a system without a disk, chances are it won't. :(

It will.
Unless mod_disk_cache + ram-disk + sendfile doesn't outperform
mod_mem_cache.

-- 
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.:   +972-9-766-1020          8 Yad-Harutzim St.
Fax.:   +972-9-766-1314          P.O.B. 7004
Mobile: +972-50-23-7338          Kfar-Saba 44641, Israel

Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.

Brian Akins wrote:

> On an OS that supports sendfile, a disk based cache will almost always 
> bury a memory based one.

Quite probably. But on a system without a disk, chances are it won't. :(

Regards,
Graham
--

Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Tuesday, August 3, 2004 2:35 PM -0400 David Nicklay 
<dn...@web.turner.com> wrote:

> Send us your squid.conf and your configure options from when you built it
> (as well as what squid version), and I can tell you how to optimize it.
> I've had a lot of practice......

I've posted the squid.conf from RHEL3 at:

<http://www.ics.uci.edu/~jerenkra/caching/>

There is also the output of 'squid -v' in squid-configure.

This is just the straight RHEL3 install.  I'm open to building from sources 
as long as someone tells me what config options and squid.conf to use.  ;-)

At that URL is the proxy.xml flood test case as well.  Plus, summary 
results from flood's analyze-relative report.  (mod_mem_cache and 
mod_disk_cache are maxing out the network now...)

Thanks!  -- justin

Re: mod_cache performance

Posted by David Nicklay <dn...@web.turner.com>.

Hi,

Send us your squid.conf and your configure options from when you built 
it (as well as what squid version), and I can tell you how to optimize 
it.  I've had a lot of practice......

Brian Akins wrote:
> Justin Erenkrantz wrote:
> 
>> --On Tuesday, August 3, 2004 8:11 AM -0400 Brian Akins 
>> <ba...@web.turner.com> wrote:
>>
>>> Under load, squid will always use 100% of the CPU.  This is because 
>>> it uses
>>> poll/select.
>>
>>
>>
>> Ouch.  That sucks.
>>
>> (But, httpd uses poll - so why does that force 100% CPU usage?)
>>
> 
> httpd blocks.  Squid doesn't in general.  Squid just calls poll over and 
> over and does lots of very small reads and writes.
> 
>>
>> Is it worth compiling my own squid then?  (Read that as 'reboot my box 
>> to FreeBSD and use the squid port.')
>>
> Check the configure and make sure you up open files and use poll.  Also 
> kill ident checks.
> 
> 

-- 
David Nicklay     O-
Location: CNN Center - SE0811A
Office: 404-827-2698	Cell: 404-545-6218

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

> --On Tuesday, August 3, 2004 8:11 AM -0400 Brian Akins 
> <ba...@web.turner.com> wrote:
>
>> Under load, squid will always use 100% of the CPU.  This is because 
>> it uses
>> poll/select.
>
>
> Ouch.  That sucks.
>
> (But, httpd uses poll - so why does that force 100% CPU usage?)
>

httpd blocks.  Squid doesn't in general.  Squid just calls poll over and 
over and does lots of very small reads and writes.

>
> Is it worth compiling my own squid then?  (Read that as 'reboot my box 
> to FreeBSD and use the squid port.')
>
Check the configure and make sure you up open files and use poll.  Also 
kill ident checks.


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Tuesday, August 3, 2004 8:11 AM -0400 Brian Akins <ba...@web.turner.com> 
wrote:

> Under load, squid will always use 100% of the CPU.  This is because it uses
> poll/select.

Ouch.  That sucks.

(But, httpd uses poll - so why does that force 100% CPU usage?)

> RHEL 3 sucks.  Fedora Core 2 would have been a much better choice.  Also,
> did you use poll?  I know a large website that does several dozen hits per
> day using squid :)

Heh.  RHEL3 is the Linux distribution we use within the ASF.  (My local box is 
a mirror of the ASF Linux and FreeBSD setups.)  Fedora Core 2 isn't an option.

Is it worth compiling my own squid then?  (Read that as 'reboot my box to 
FreeBSD and use the squid port.')

> On an OS that supports sendfile, a disk based cache will almost always bury
> a memory based one.

Agreed.

I don't think it's worth putting a lot of effort into mod_mem_cache.  Doing 
zero-copy is just going to scale better than memory caching.  -- justin

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

>  
> squid was really inefficient both CPU and network-wise.
>
Under load, squid will always use 100% of the CPU.  This is because it 
uses poll/select.

> The squid numbers *completely* baffle me.  I have to believe I've got 
> something stupid configured in squid (or I did something stupid with 
> flood; but the network traces and truss output convince me 
> otherwise).  My squid is using the default RHEL3 installation (Squid 
> Cache: Version 2.5.STABLE3).
>
RHEL 3 sucks.  Fedora Core 2 would have been a much better choice.  
Also, did you use poll?  I know a large website that does several dozen 
hits per day using squid :)


On an OS that supports sendfile, a disk based cache will almost always 
bury a memory based one.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: mod_cache performance

Posted by Ian Holsman <li...@holsman.net>.

Brian Akins wrote:

> Justin Erenkrantz wrote:
> 
>>
>> That brings it in line with mod_disk_cache in maxing out my network.  
>> Time to craft some better tests or find a faster network...  -- justin
>>
> 
> I can probably help with the latter :)
> 
> Can you send me details of your setup and I'll try to test later this week.
> 
> 
we have some boxes with a GigE network as well. (set up to use flood 
with 10 PC's generating the load)


also .. we might have 1-2 amd-64 boxes I could presuade the higher ups 
to use.

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

>
> That brings it in line with mod_disk_cache in maxing out my network.  
> Time to craft some better tests or find a faster network...  -- justin
>

I can probably help with the latter :)

Can you send me details of your setup and I'll try to test later this week.


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Tuesday, August 3, 2004 6:50 PM +0200 Graham Leggett <mi...@sharp.fm> 
wrote:

>>> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
>>> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
>
> The above result would suggest that mod_mem_cache isn't being used in this
> case. It could be that mem cache has decided not to cache the requested file
> for whatever reason, which is being served via the normal "no cache" path.

It'd help if I compiled mod_mem_cache in.  *duck*  (We need better error 
messages when the cache type isn't found!  Can't we error out at config time?)

Anyway, mod_mem_cache yields (after bumping up MCacheMaxObjectSize to 100k):

Requests: 35000 Time: 40.99 Req/Sec: 856.73

That brings it in line with mod_disk_cache in maxing out my network.  Time to 
craft some better tests or find a faster network...  -- justin

Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.

Bill Stoddard wrote:

>> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
>> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81

The above result would suggest that mod_mem_cache isn't being used in 
this case. It could be that mem cache has decided not to cache the 
requested file for whatever reason, which is being served via the normal 
"no cache" path.

Regards,
Graham
--

Re: mod_cache performance

Posted by Bill Stoddard <bi...@wstoddard.com>.

Bill Stoddard wrote:

> Justin Erenkrantz wrote:
> 
>> --On Monday, August 2, 2004 2:49 PM -0400 Bill Stoddard 
>> <bi...@wstoddard.com> wrote:
>>
>>> To get mod_cache/mod_mem_cache (I know little or nothing about
>>> mod_disk_cache) really performing competatively against best-of-breed 
>>> caches
>>> will require bypassing output filters (and prebuilding headers) and 
>>> possibly
>>
>>
>>
>> Here's some comparative numbers to chew on.
>>
>> One client and one server on 100Mbps network (cheapy 100Base-T switch);
>> 50 simulated users hitting 7 URLs 100 times with flood (35,000 requests).
>>
>> mod_disk_cache: Requests: 35000 Time: 40.91 Req/Sec: 856.78
>> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
>> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
>> squid:          Requests: 35000 Time: 105.35 Req/Sec: 332.25
>>
>> mod_disk_cache completely filled out the network at ~50% CPU usage.
>>    [Can't push through more than ~8MB/sec (~64Mb/sec) without GigE.]
>> mod_mem_cache filled up the CPU but not the network
>>    [Poor scaling characteristics.  It goes to 100% CPU with just 5 
>> users!]
> 
> 
> mod_mem_cache is broken 
Or mistuned?
Here are the defaults for the mem_cache directives:

MCacheSize               ~100 MB
MCacheMaxObjectCount     1009
MCacheMinObjectSize      0 (bytes)
MCacheMaxObjectSize      10000 (bytes)
MCacheRemovalAlgorithm    GDSF
MCacheMaxStreamingBuffer 100000 (bytes)

I have no idea if the urls ending in / are being served at all by mod_mem_cache. Wouldn;t suprise me if tehre 
is a bug there.

Bill

Re: mod_cache performance

Posted by Bill Stoddard <bi...@wstoddard.com>.

Justin Erenkrantz wrote:
> --On Monday, August 2, 2004 2:49 PM -0400 Bill Stoddard 
> <bi...@wstoddard.com> wrote:
> 
>> To get mod_cache/mod_mem_cache (I know little or nothing about
>> mod_disk_cache) really performing competatively against best-of-breed 
>> caches
>> will require bypassing output filters (and prebuilding headers) and 
>> possibly
> 
> 
> Here's some comparative numbers to chew on.
> 
> One client and one server on 100Mbps network (cheapy 100Base-T switch);
> 50 simulated users hitting 7 URLs 100 times with flood (35,000 requests).
> 
> mod_disk_cache: Requests: 35000 Time: 40.91 Req/Sec: 856.78
> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
> squid:          Requests: 35000 Time: 105.35 Req/Sec: 332.25
> 
> mod_disk_cache completely filled out the network at ~50% CPU usage.
>    [Can't push through more than ~8MB/sec (~64Mb/sec) without GigE.]
> mod_mem_cache filled up the CPU but not the network
>    [Poor scaling characteristics.  It goes to 100% CPU with just 5 users!]

mod_mem_cache is broken then. It used to kick the pants off of 'no cache' and mod_disk_cache.

Bill

mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, August 2, 2004 2:49 PM -0400 Bill Stoddard <bi...@wstoddard.com> 
wrote:

> To get mod_cache/mod_mem_cache (I know little or nothing about
> mod_disk_cache) really performing competatively against best-of-breed caches
> will require bypassing output filters (and prebuilding headers) and possibly

Here's some comparative numbers to chew on.

One client and one server on 100Mbps network (cheapy 100Base-T switch);
50 simulated users hitting 7 URLs 100 times with flood (35,000 requests).

mod_disk_cache: Requests: 35000 Time: 40.91 Req/Sec: 856.78
mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
squid:          Requests: 35000 Time: 105.35 Req/Sec: 332.25

mod_disk_cache completely filled out the network at ~50% CPU usage.
    [Can't push through more than ~8MB/sec (~64Mb/sec) without GigE.]
mod_mem_cache filled up the CPU but not the network
    [Poor scaling characteristics.  It goes to 100% CPU with just 5 users!]
No caching was better CPU-wise (less utilization) than mod_mem_cache
    [Still not as good network or CPU-wise as mod_disk_cache]
squid was really inefficient both CPU and network-wise.

The squid numbers *completely* baffle me.  I have to believe I've got 
something stupid configured in squid (or I did something stupid with flood; 
but the network traces and truss output convince me otherwise).  My squid is 
using the default RHEL3 installation (Squid Cache: Version 2.5.STABLE3).

squid and httpd are on the same box - I may try to move squid to another box - 
will see if I have time tomorrow to find a suitable target to move to...

For those playing along at home, I am hitting the following URLs with flood:

   /
   /apache_pb.gif
   /manual/
   /manual/images/left.gif
   /manual/images/feather.gif
   /manual/content-negotiation.html
   /icons/

HTH.  -- justin

Re: [PATCH] mod_cache fixes: #8

Posted by Graham Leggett <mi...@sharp.fm>.

Bill Stoddard wrote:

> To get mod_cache/mod_mem_cache (I know little or nothing about 
> mod_disk_cache) really performing competatively against best-of-breed 
> caches will require bypassing output filters (and prebuilding headers) 
> and possibly bypassing or at least reworking input filters.

The initial idea was that the CACHE_SAVE filter would run as close to 
the network as possible (to cache as much of the input filters as 
possible), but over time people moved filters in front of CACHE_SAVE for 
various reasons, which has performance issues.

The ability to cache variants was made so that both an uncompressed and 
compressed variant of the same resource could be cached at the same time.

This means that a browser that does not support compressed content isn't 
going to invalidate the contents of the cache, hurting performance.

Regards,
Graham
--

Re: [PATCH] mod_cache fixes: #8

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, August 2, 2004 2:49 PM -0400 Bill Stoddard 
<bi...@wstoddard.com> wrote:

> To get mod_cache/mod_mem_cache (I know little or nothing about
> mod_disk_cache) really performing competatively against best-of-breed
> caches will require bypassing output filters (and prebuilding headers)
> and possibly bypassing or at least reworking input filters. And the

*nod*  However, due to the filter design, it should be easy to move it 
around.  The catch with bypassing the protocol filters is that we have to 
be really smart about the Vary headers and such.

> prebuilt headers will need to be properly aligned in memory. You remember
> the patch I posted to the list, gee, maybe up to 2 years ago now to
> reimplement some aspects of how data was read into input filters? Dipping
> into the input stream to read one line at a time turns out to be a fairly
> significant bottleneck serving cached content. That patch improved
> performance servibg cached content by 5 maybe 10% if I recall correctly.

Is that just for proxies?  Not sure that reading input would be such a 
bottleneck here.  So far, I haven't seen anything to indicate it is.

> I also seem to recall mod_cache/mod_mem_cache making a couple of
> gratuitous calls to apr_time() which can be rather expensive on some
> platforms.  Maybe that has changed since last time i looked at it.

The only similar thing I saw was the ap_set_last_modified().  The explode 
code path had to call down to gmtime which was ridiculously expensive. 
(About 1-2% of the overall time was down in gmtime.)  -- justin

Re: [PATCH] mod_cache fixes: #8

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, August 2, 2004 3:01 PM -0400 Brian Akins 
<ba...@web.turner.com> wrote:

>> (and prebuilding headers)
>
> Is there a way to send pre built headers?  mod_asis uses
> ap_scan_script_header_er which is fairly slow.

That was what I fixed with apr_file_gets() last night.  The code there was 
really naive about reading buffered files.  After my changes, 
ap_scan_script_header_err, more specifically apr_file_gets(), dropped out 
of the top CPU hogs.  I'm not really sure it's worth optimizing the header 
caching right now.  -- justin

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Bill Stoddard wrote:

>
> To get mod_cache/mod_mem_cache (I know little or nothing about 
> mod_disk_cache) really performing competatively against best-of-breed 
> caches will require bypassing output filters 


Yes.  

> (and prebuilding headers) 

Is there a way to send pre built headers?  mod_asis uses 
ap_scan_script_header_er which is fairly slow.

> and possibly bypassing or at least reworking input filters. And the 
> prebuilt headers will need to be properly aligned in memory. 


If using disk cache, this obviously does not matter.


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: [PATCH] mod_cache fixes: #8

Posted by Bill Stoddard <bi...@wstoddard.com>.

Justin Erenkrantz wrote:
mod_cache isn't doing anything stupid or damaging
> performance-wise, I'd like to start being more aggressive about what we 
> can cache.  From my perspective, these patches I've posted (and started 
> to commit) are just the beginning of trying to get mod_cache on more 
> solid ground both performance and RFC-wise.  And, that needs to happen 
> before mod_cache can even get out of experimental...  -- justin
> 
To get mod_cache/mod_mem_cache (I know little or nothing about mod_disk_cache) really performing competatively 
against best-of-breed caches will require bypassing output filters (and prebuilding headers) and possibly 
bypassing or at least reworking input filters. And the prebuilt headers will need to be properly aligned in 
memory. You remember the patch I posted to the list, gee, maybe up to 2 years ago now to reimplement some 
aspects of how data was read into input filters? Dipping into the input stream to read one line at a time 
turns out to be a fairly significant bottleneck serving cached content. That patch improved performance 
servibg cached content by 5 maybe 10% if I recall correctly.

I also seem to recall mod_cache/mod_mem_cache making a couple of gratuitous calls to apr_time() which can be 
rather expensive on some platforms.  Maybe that has changed since last time i looked at it.

Bill

Re: [PATCH] mod_cache fixes: #8

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On Monday, August 2, 2004 9:21 AM -0400 Brian Akins <ba...@web.turner.com> 
wrote:

> In a low traffic site, yes.  In a very high traffic site, with lots of
> objects, the mkdir's kill you.  After a "while,"  most of the directories
> will be created.  However, bringing up a "fresh" server behind a very busy
> site is bad in this case.  Perhaps, what mod_disk_cache does now is best and
> the pre-make directories script should be "contrib."

*nod*

> Also, any thought son how to handle Vary?  I have a way to do it in our
> cache, but it seriously violates the RFC's.  There is a config directive
> that defines what aspects of the request are used to generate the MD5
> (virtual host, url, args, header values, environment values, etc.) and thats
> it.  Only useful when you know a lot about the content being served, which
> is the case for a reverse proxy usually.

After I know that mod_cache isn't doing anything stupid or damaging 
performance-wise, I'd like to start being more aggressive about what we can 
cache.  From my perspective, these patches I've posted (and started to commit) 
are just the beginning of trying to get mod_cache on more solid ground both 
performance and RFC-wise.  And, that needs to happen before mod_cache can even 
get out of experimental...  -- justin

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Jeff Trawick wrote:

>Wouldn't assuming that the directory is already there be sufficient
>(and then create the directory structure on the error path)?  It looks
>to me that we only assume the directory structure exists if we had
>this very same file cached previously.
>
>  
>
In a low traffic site, yes.  In a very high traffic site, with lots of 
objects, the mkdir's kill you.  After a "while,"  most of the 
directories will be created.  However, bringing up a "fresh" server 
behind a very busy site is bad in this case.  Perhaps, what 
mod_disk_cache does now is best and the pre-make directories script 
should be "contrib."

Also, any thought son how to handle Vary?  I have a way to do it in our 
cache, but it seriously violates the RFC's.  There is a config directive 
that defines what aspects of the request are used to generate the MD5 
(virtual host, url, args, header values, environment values, etc.) and 
thats it.  Only useful when you know a lot about the content being 
served, which is the case for a reverse proxy usually.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies

Re: [PATCH] mod_cache fixes: #8

Posted by Jeff Trawick <tr...@gmail.com>.

On Mon, 02 Aug 2004 08:46:07 -0400, Brian Akins <ba...@web.turner.com> wrote:
> Justin Erenkrantz wrote:
> 
> > -
> > static void mkdir_structure(disk_cache_conf *conf, char *file,
> > apr_pool_t *pool)
> 
> Another big speed-up may be to pre-make all of the directories.  A simple script could use CacheRoot, |CacheDirLength|, and |CacheDirLevels to  create them all.  Just require that this script be ran before starting a mod_disk_cache server.  Or if you wanted, you could generate/check the directories in post_config.
> 
> The Squid proxy cache requires you to create all the directories beforehand by running it with a specific switch.  This save a ton of checking for and/or making directories.
> 
> Trust me, it does speed things up a bit....

Wouldn't assuming that the directory is already there be sufficient
(and then create the directory structure on the error path)?  It looks
to me that we only assume the directory structure exists if we had
this very same file cached previously.

Re: [PATCH] mod_cache fixes: #8

Posted by Brian Akins <ba...@web.turner.com>.

Justin Erenkrantz wrote:

> -
> static void mkdir_structure(disk_cache_conf *conf, char *file, 
> apr_pool_t *pool)

Another big speed-up may be to pre-make all of the directories.  A simple script could use CacheRoot, |CacheDirLength|, and |CacheDirLevels to  create them all.  Just require that this script be ran before starting a mod_disk_cache server.  Or if you wanted, you could generate/check the directories in post_config.

The Squid proxy cache requires you to create all the directories beforehand by running it with a specific switch.  This save a ton of checking for and/or making directories.

Trust me, it does speed things up a bit....

|-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies