You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Dirk-Willem van Gulik <di...@webweaving.org> on 2008/02/10 14:12:48 UTC

cache - cleaning up mod_memcache and making other caches their live easier

Right now (only) mod_disk_cache is doing the 'right(tm)' thing w.r.t.  
to Vary - the other caches (mod_memcache as part of the distribution  
and a handful of memcached, distcache and commercial cache modules)  
are just acting on the URI (key).

Bringing them in line involves a bit of cut-and-paste from disk-cache.

So I am wondering if I should do the following -- but want to have  
some feedback of the folks who have been spending the last years on  
this -- as I may have missed something fundamental:

-	Move sundry like array_alphasort, tokens_to_array up into  
cache_utils or similar.

-	Perhaps add a extra function vector called 'make_key' -- which can  
be NULL to the
	cache_provider; which understands most of rfc2616; including case  
(insensitivity). In
	short - we'll propably end up with a struct which details the  
relevant headers, if
	they are int, date, case-insensitive or sensitive. Which allows us  
then to always
	do the right thing.

-	When we call store_* already pre-fillout cache_info (or add a param)  
which does the
	right things around checking for Vary. And perhaps even a precooked  
version of
	headers_out/in.

Before I embark on an experiment (without much design/planning) -- any  
thoughts ? Or has someone already done most of this and/or designed it  
properly ?

Thanks,

Dw

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by Plüm, Rüdiger, VF-Group <ru...@vodafone.com>.

> -----Ursprüngliche Nachricht-----
> Von: Dirk-Willem van Gulik  
> Gesendet: Montag, 11. Februar 2008 13:12
> An: dev@httpd.apache.org
> Betreff: Re: cache - cleaning up mod_memcache and making 
> other caches their live easier
> 
> 
> On Feb 11, 2008, at 12:58 PM, Plüm, Rüdiger, VF-Group wrote:
> 
> > The contents of the cache is not protected by any means. So I do not
> > see a security issue here. Somemone who has access to one 
> cache entity
> > has access to all.
> 
> Agreed. But what I worry about is that you get some subtle 
> interaction  
> with some obscure header;  which effectively is used by some site  
> builder as implying certain access - or used, say, for ensuring that  
> certain documents are only shown to, say, French people.
> 
> There is no doubt that this is 'wrong' on just about every level --  
> but given how careless some of the new web app frameworks are put to  

I agree that some web app frameworks might be careless, but the cache is
IMHO the wrong location to fix this kind of sloppyness. On the contrary
I think we must make clear explicitly that nothing in the cache is protected
from access. Keep in mind that none of the access / authz restrictions apply
to cached content. No deny from / require directive will be applied to cached
content once it is in the cache. It is open to *anyone*.
The only security issue we must take care of is to avoid cache poisoning.
This might be possible with the following kind of requests:

    	GET / HTTP/1.0
	User-Agent: enMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
      Accept-Language:

	GET / HTTP/1.0
	User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv: 1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
	Accept-Language: en

which may both have

    	Vary: Accept-Language User-Agent

in there response. But as we create the key of

[old_key][header name][header value].... both requests result in different cache keys (keys are hashes of the values below):

/Accept-LanguageUser-AgentenMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
/Accept-LanguageenUser-AgentMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4

So I see no danger for cache poisioning here.

Regards

Rüdiger

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by Dirk-Willem van Gulik <di...@webweaving.org>.

On Feb 11, 2008, at 12:58 PM, Plüm, Rüdiger, VF-Group wrote:

> The contents of the cache is not protected by any means. So I do not
> see a security issue here. Somemone who has access to one cache entity
> has access to all.

Agreed. But what I worry about is that you get some subtle interaction  
with some obscure header;  which effectively is used by some site  
builder as implying certain access - or used, say, for ensuring that  
certain documents are only shown to, say, French people.

There is no doubt that this is 'wrong' on just about every level --  
but given how careless some of the new web app frameworks are put to  
use - seems an easy/cheap thing to fix. Just not sure how.

Dw.

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by "Akins, Brian" <Br...@turner.com>.

If anyone cares, here's how we do keys and vary in our cache:

On store:

Generate key: using url and r->args (we can ignore r->args per server, if
needed) (http://www.domain.com/index.html?you=me)

If(vary) {
    store the following info in meta file:
      cache_version_t - ala disk_cache (ours includes expire time)
      length of key (apr_size_t) including \0
      key with \0
      length of "array" (apr_size_t)
      a \0 delimited array of Vary headers

     regenerate key (basically the original key + vary info:
 http://www.domain.com/index.html?you=meuser-agentmozillaaccept-encodinggzip
)   
}
  
Store key in meta file.  a normal meta file has this format:
    cache_version_t (ours includes expire time)
    length of key (apr_size_t) including \0
    key with \0   
    length of "table" (apr_size_t)
    a \0 delimited table (key\0value\0key\0value\0....) of response headers
    
Note: the reason we use \0 delimited arrays/tables is we read the entire
metafile info into memory on serving and then just apr_table_setn on the
values.  In theory we could mmap the meta files, but we actually found that
to be slower.


On serving:
Generate key: using url and r->args (we can ignore r->args per server, if
needed) (http://www.domain.com/index.html?you=me)

Open metafile

If(vary) {
   thaw vary array
   generate new key (vary values may be overridden by env)
   open new metafile
}

Thaw headers, etc.



So, we only store the headers that we use in vary key calculation.  On my
TODO list is to make key generation a hook bcs we have apps that would
benefit from that.

Of course, we have only one provider (disk) and ignore a lot of RFC stuff
(although we have made most of that configurable), but our key/vary handling
is pretty fast (I spent a lot of time with profilers when writing it).

I'm still working on my side to allow us to actually release the code.

-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by Plüm, Rüdiger, VF-Group <ru...@vodafone.com>.

 

> -----Ursprüngliche Nachricht-----
> Von: Dirk-Willem van Gulik 
> Gesendet: Montag, 11. Februar 2008 01:22
> An: dev@httpd.apache.org
> Betreff: Re: cache - cleaning up mod_memcache and making 
> other caches their live easier
> 

> >
> > I currently do not understand your worries here. Could you please  
> > explain this
> > in more detail?
> 
> Right now we simply concatenate values without any 
> 'separator'. So by  
> for example playing with the User-Agent - adding/prefixing another  
> Vary value - you could perhaps fool us in thinking that another  
> header was set - which was not set at all. I.e. with:
> 
> 	Vary: Content-Language User-Agent
> 
> and a value on disk of
> 
> 	EnMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; 
> rv:1.8.1.4)  
> Gecko/20070515 Firefox/2.0.0.4
> 
> then the question is did I pass
> 
> 	GET / HTTP/1.0
> 	User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; 
> en-US; rv: 
> 1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
> 	Accept-Language; en
> 	Host : foo
> 
> or
> 
> 	GET / HTTP/1.0
> 	User-Agent: EnMozilla/5.0 (Macintosh; U; Intel Mac OS 
> X; en-US; rv: 
> 1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
> 	Foo
> 
> or something along those lines. Not sure how bad this is -- but I've  
> been bitten by things like this in the past. What I worry about is  
> that a clever user can get something out of the cache we did 
> not expect.
> 
> Or am I way off here ?

Thanks for explaining.
The contents of the cache is not protected by any means. So I do not
see a security issue here. Somemone who has access to one cache entity
has access to all.
This doesn't mean that a separator is unneeded, but currently I for myself
see no need for it.

Regards

Rüdiger

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by Dirk-Willem van Gulik <di...@webweaving.org>.

On Feb 10, 2008, at 10:45 PM, Ruediger Pluem wrote:

> 1. Some style nitpicks (especially indenting, sometimes typos in  
> the comments) that
>    make the patch sometimes hard to read.

most files in cache seem out of sync with the .ident.pro file - so I  
guess I should do an
reformat run.

> 3. While it may be a worthwhile goal to improve the error handling  
> of mod_disk_cache
>    I would love to see this addressed in a separate patch as these  
> pieces of code distract
>    while reading the patch and are not strictly connected to your  
> goal of abstracting
>    logic out of mod_disk_cache.

Ok - will do separately - this patch was more thinking aloud than  
seriously - it has some holes still.

>> +/* Create a new table consisting of those elements from an input
>> + * headers table that are allowed to be stored in a cache.  
>> Optionally
>> + * filtered to just the fields passed in the vary_filter array.
>
> Why this? Do you want to save space by not storing the input  
> headers or
> only storing those relevant to varying?

No - what I'd like to do is 'minimize' what a storage needs to  
understand and do in order to be compliant. Specifically - I expect  
us to get more complex that just Vary at some point - and would  
really move this 'up' and out of the storage layer.

> From a first glance this looks like to be a good idea, but it may  
> be better
> separated to another patch to ease reading. This could possibly  
> used later
> to ease the logic in cache_select in cache_storage.c, as today its  
> vary logic
> is unneeded IMHO in the mod_disk_cache case. IMHO I cannot open the  
> mod_disk_cache

Agreed - that logic is un-needed for the disk cache.

But everyone does either need logic to filter theinput lines down to  
just those in 'Vary' --OR-- we need to build a two layer approach  
into the API down to the storage modules. E.g. by having open/create  
sometimes called twice - or by having mod_cache owning some of the  
key generation more strictly.

> entity if the cached resource had vary headers and the cached  
> headers do not
> match the ones from the incoming request. I would not have a hdrs  
> file in this
> case since the path to it is based on the hashed values of the  
> input headers
> that vary.
>>

>> +     *  4.20 Except [exerpt]
>> +     *     Comparison of expectation values is case-insensitive  
>> for unquoted
>> +     *     tokens (including the 100-continue token), and is case- 
>> sensitive for
>> +     *     quoted-string expectation-extensions.
>> +     *
>> +     *  XX we are currently concatenating the values. With some  
>> puzzling one could
>> +     *     set a header value which looks like the 'next/ 
>> previous' value - and hence
>> +     *     cause a value which is not the 'real' one. This may be  
>> a security risk.
>> +     *     Ideally we should use a special value which cannot  
>> occur in a header
>> +     *     legally (or is escaped/wacked). Given that (at least)  
>> a Header cannot
>> +     *     contain a space or ':' - may be an idea to insert those.
>> +     */
>
> I currently do not understand your worries here. Could you please  
> explain this
> in more detail?

Right now we simply concatenate values without any 'separator'. So by  
for example playing with the User-Agent - adding/prefixing another  
Vary value - you could perhaps fool us in thinking that another  
header was set - which was not set at all. I.e. with:

	Vary: Content-Language User-Agent

and a value on disk of

	EnMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.4)  
Gecko/20070515 Firefox/2.0.0.4

then the question is did I pass

	GET / HTTP/1.0
	User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv: 
1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
	Accept-Language; en
	Host : foo

or

	GET / HTTP/1.0
	User-Agent: EnMozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv: 
1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
	Foo

or something along those lines. Not sure how bad this is -- but I've  
been bitten by things like this in the past. What I worry about is  
that a clever user can get something out of the cache we did not expect.

Or am I way off here ?

Dw.

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by Ruediger Pluem <rp...@apache.org>.


On 02/10/2008 05:39 PM, Dirk-Willem van Gulik wrote:
> 

> 
> See below for some aloud thinking -- note that this is not yet well 
> thought out (all I did was try to minimize the overlap between 
> disk_cache, memcached, distcache and some commercial thing -- and tried 
> to move as much RFC2616 knowledge into mod_cache.c*).
> 
> -    internal ap_cache_cacheable_hdr to hold all the RFC 2616
>         non cachable headers info as before.
> 
> -    ap_cache_cacheable_hdrs_out introduced as per earlier discussion
>     which knows about error headers and mandatory content setting.
> 
> -    ap_cache_cacheable_hdrs_in introduced - which has an optional
>     argument to further curtail the headers returned by just those
>     in the 'Vary'.
> 
> -    ap_normalize_vary_to_array() introduced. Which can help
>     caching storage modules to create a key across the Vary
>     relevant fields, if any.
> 
>     => given that they all have to create this -- wondering if
>     we should make this part of the cache_info struct already.

Maybe not a bad idea.

> 
> -    Likewise a ap_generate_vary_key() which understands
>     vary if/when needed.
> 
> Below more or less cleans mod_disk_cache completly of having to 
> understand http (or any protocol aspect) - but for 
> ap_generate_vary_key() -- but it is not yet HTTP neutral (i.e. a IMAP 
> protocol module would have to overwrite things it has not yet XS to).
> 
> Thoughts,
> 
> Dw.
> 

Some comments:

1. Some style nitpicks (especially indenting, sometimes typos in the comments) that
    make the patch sometimes hard to read.
2. Don't forget to add a minor bump once you commit.
3. While it may be a worthwhile goal to improve the error handling of mod_disk_cache
    I would love to see this addressed in a separate patch as these pieces of code distract
    while reading the patch and are not strictly connected to your goal of abstracting
    logic out of mod_disk_cache.
4. Looks good in general, some more questions inline

> 
> *: not yet nearly enough - i.e. lacking the whole case (in)sensitive, 
> date, etc stuff.
> 
> Index: cache_util.c
> ===================================================================
> --- cache_util.c    (revision 620288)
> +++ cache_util.c    (working copy)
> @@ -571,10 +573,10 @@
>      return apr_pstrdup(p, hashfile);
>  }
> 
> -/* Create a new table consisting of those elements from an input
> +/* Create a new table consisting of those elements from an
>   * headers table that are allowed to be stored in a cache.
>   */
> -CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_out(apr_pool_t *pool,
> +CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs(apr_pool_t *pool,
>                                                          apr_table_t *t,
>                                                          server_rec *s)
>  {
> @@ -582,12 +584,17 @@
>      char **header;
>      int i;
> 
> +    if (t == NULL) {
> +    return NULL;
> +    };
> +
>      /* Make a copy of the headers, and remove from
>       * the copy any hop-by-hop headers, as defined in Section
>       * 13.5.1 of RFC 2616
>       */
>      apr_table_t *headers_out;
>      headers_out = apr_table_copy(pool, t);
> +
>      apr_table_unset(headers_out, "Connection");
>      apr_table_unset(headers_out, "Keep-Alive");
>      apr_table_unset(headers_out, "Proxy-Authenticate");
> @@ -599,6 +606,7 @@
> 
>      conf = (cache_server_conf *)ap_get_module_config(s->module_config,
>                                                       &cache_module);
> +
>      /* Remove the user defined headers set with CacheIgnoreHeaders.
>       * This may break RFC 2616 compliance on behalf of the administrator.
>       */
> @@ -608,3 +616,170 @@
>      }
>      return headers_out;
>  }
> +
> +/* Create a new table consisting of those elements from an input
> + * headers table that are allowed to be stored in a cache. Optionally
> + * filtered to just the fields passed in the vary_filter array.

Why this? Do you want to save space by not storing the input headers or
only storing those relevant to varying?
 From a first glance this looks like to be a good idea, but it may be better
separated to another patch to ease reading. This could possibly used later
to ease the logic in cache_select in cache_storage.c, as today its vary logic
is unneeded IMHO in the mod_disk_cache case. IMHO I cannot open the mod_disk_cache
entity if the cached resource had vary headers and the cached headers do not
match the ones from the incoming request. I would not have a hdrs file in this
case since the path to it is based on the hashed values of the input headers
that vary.

> + *
> + * @return a table with the valid input headers or NULL if none are 
> relevant.
> + */
> +CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_in(request_rec * r, 
> apr_array_header_t * vary_filter)
> +{
> +    apr_table_t *headers_in, * vary_filtered;
> +        int i;
> +
> +    if (r->headers_in == NULL)
> +        return NULL;
> +
> +        headers_in = ap_cache_cacheable_hdrs(r->pool, r->headers_in, 
> r->server);
> +   
> +    if (!vary_filter)
> +        return apr_is_empty_table(headers_in) ? NULL: headers_in;
> +
> +        /* The vary array is most propably the smallest of the two - su
> +     * only copy just those fields across.
> +     */
> +         vary_filtered = apr_table_make(r->pool, vary_filter->nelts);
> +        for (i = 0; i < vary_filter->nelts; i++) {
> +                const char *key = ((const char**)vary_filter->elts)[i];
> +                const char * val = apr_table_get(headers_in, key); /* 
> XXX: Not case insensitive */
> +                if (val)
> +                        apr_table_add(vary_filtered, key, val);
> +        };
> +
> +    return apr_is_empty_table(vary_filtered) ? NULL : vary_filtered;
> +}
> +
> +/* Create a new table consisting of those elements from an output
> + * headers table that are allowed to be stored in a cache;
> + * ensure there is a content type and capture any errors.
> + */
> +CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_out(request_rec * r)
> +{
> +    apr_table_t *headers_out;
> +
> +    headers_out = ap_cache_cacheable_hdrs(r->pool, r->headers_out,
> +                                                  r->server);
> +
> +        if (!apr_table_get(headers_out, "Content-Type")
> +            && r->content_type) {
> +            apr_table_setn(headers_out, "Content-Type",
> +                           ap_make_content_type(r, r->content_type));
> +        }
> +
> +        headers_out = apr_table_overlay(r->pool, headers_out,
> +                                        r->err_headers_out);
> +
> +    return headers_out;
> +}
> +
> +static int array_alphasort(const void *fn1, const void *fn2)
> +{
> +    return strcmp(*(char**)fn1, *(char**)fn2);
> +}
> +
> +static void vary_tokens_to_array(apr_pool_t *p, const char *data,
> +                            apr_array_header_t *arr)
> +{
> +    char *token;
> +
> +    while ((token = ap_get_list_item(p, &data)) != NULL) {
> +        *((const char **) apr_array_push(arr)) = token;
> +    }
> +
> +    qsort((void *) arr->elts, arr->nelts,
> +         sizeof(char *), array_alphasort);
> +}
> +
> +/* Given a header table; extract any Vary present and return
> + * the values in a predictable way (i.e. Sort it so that
> + * "Vary: A, B" and "Vary: B, A" are stored the same; and take
> + * care of any case sensitive issues. Onb exit the array will
> + * either be NULL (no VARYance) or a sorted array.
> + *
> + * @return vary-array or NULL, if no variance
> + */
> +CACHE_DECLARE(apr_array_header_t*) 
> ap_normalize_vary_to_array(apr_pool_t *p, apr_table_t * headers)
> +{
> +    apr_array_header_t* varray = NULL;
> +    const char *vary;
> +   
> +    if ((headers) && ((vary= apr_table_get(headers, "Vary")) != NULL))
> +    {
> +        varray = apr_array_make(p, 6, sizeof(char*));
> +            vary_tokens_to_array(p, vary, varray);
> +    }
> +   
> +    return varray;
> +}
> +
> +/* Generate a new key based on the normal key (URI) and normalized 
> values from
> + * the Vary array. The vary array is assumed to be in a stable order 
> and format.
> + */
> +CACHE_DECLARE(const char*) ap_generate_vary_key(apr_pool_t *p, 
> apr_table_t * input_headers,
> +                             apr_array_header_t *varray, const char 
> *oldkey)
> +{
> +    struct iovec *iov;
> +    int i, k;
> +    int nvec;
> +    const char *header;
> +    const char **elts;
> +
> +    if (varray == NULL);
> +    return oldkey;
> +
> +    nvec = (varray->nelts * 2) + 1;
> +    iov = apr_palloc(p, sizeof(struct iovec) * nvec);
> +    elts = (const char **) varray->elts;
> +
> +    /* TODO:
> +     *    - Handle multiple-value headers better. (sort them?)
> +     *    - Handle Case in-sensitive Values better.
> +     *        This isn't the end of the world, since it just lowers the 
> cache
> +     *        hit rate, but it would be nice to fix.
> +     *
> +     * The majority are case insenstive if they are values (encoding etc).
> +     * Most of rfc2616 is case insensitive on header contents.
> +     *
> +     * So the better solution may be to identify headers which should be
> +     * treated case-sensitive?
> +     *  HTTP URI's (3.2.3) [host and scheme are insensitive]
> +     *  HTTP method (5.1.1)
> +     *  HTTP-date values (3.3.1)
> +     *  3.7 Media Types [exerpt]
> +     *     The type, subtype, and parameter attribute names are case-
> +     *     insensitive. Parameter values might or might not be 
> case-sensitive,
> +     *     depending on the semantics of the parameter name.
> +     *  4.20 Except [exerpt]
> +     *     Comparison of expectation values is case-insensitive for 
> unquoted
> +     *     tokens (including the 100-continue token), and is 
> case-sensitive for
> +     *     quoted-string expectation-extensions.
> +     *
> +     *  XX we are currently concatenating the values. With some 
> puzzling one could
> +     *     set a header value which looks like the 'next/previous' 
> value - and hence
> +     *     cause a value which is not the 'real' one. This may be a 
> security risk.
> +     *     Ideally we should use a special value which cannot occur in 
> a header
> +     *     legally (or is escaped/wacked). Given that (at least) a 
> Header cannot
> +     *     contain a space or ':' - may be an idea to insert those.
> +     */

I currently do not understand your worries here. Could you please explain this
in more detail?



> -    /* Parse the vary header and dump those fields from the headers_in. */
> -    /* FIXME: Make call to the same thing cache_select calls to crack 
> Vary. */
> -    if (r->headers_in) {
> +    /* Parse the vary header and dump those fields from the headers_in.
> +     * But only do so if needed - i.e. there is a Vary - and then limit
> +     * it to the headers needed to validate the Vary.
> +     */

See my comment above regarding vary filtering.

> +    if (r->headers_in && varray) {
>          apr_table_t *headers_in;
> -
> -        headers_in = ap_cache_cacheable_hdrs_out(r->pool, r->headers_in,
> -                                                 r->server);
> +        headers_in = ap_cache_cacheable_hdrs_in(r, varray);
> +        if (headers_in) {
>          rv = store_table(dobj->hfd, headers_in);
>          if (rv != APR_SUCCESS) {
> +                 apr_file_close(dobj->hfd); /* flush and close */
> +                 apr_file_remove(dobj->tempfile, r->pool);
>              return rv;
>          }
>      }
> +    }

Regards

Rüdiger

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by Dirk-Willem van Gulik <di...@webweaving.org>.

On Feb 10, 2008, at 2:12 PM, Dirk-Willem van Gulik wrote:

> Right now (only) mod_disk_cache is doing the 'right(tm)' thing  
> w.r.t. to Vary - the other caches (mod_memcache as part of the  
> distribution and a handful of memcached, distcache and commercial  
> cache modules) are just acting on the URI (key).
>
> Bringing them in line involves a bit of cut-and-paste from disk-cache.
>
> So I am wondering if I should do the following -- but want to have  
> some feedback of the folks who have been spending the last years on  
> this -- as I may have missed something fundamental:
>
> -	Move sundry like array_alphasort, tokens_to_array up into  
> cache_utils or similar.
>
> -	Perhaps add a extra function vector called 'make_key' -- which can  
> be NULL to the
> 	cache_provider; which understands most of rfc2616; including case  
> (insensitivity). In
> 	short - we'll propably end up with a struct which details the  
> relevant headers, if
> 	they are int, date, case-insensitive or sensitive. Which allows us  
> then to always
> 	do the right thing.
>
> -	When we call store_* already pre-fillout cache_info (or add a  
> param) which does the
> 	right things around checking for Vary. And perhaps even a precooked  
> version of
> 	headers_out/in.
>
> Before I embark on an experiment (without much design/planning) --  
> any thoughts ? Or has someone already done most of this and/or  
> designed it properly ?
>
> Thanks,
>
> Dw

See below for some aloud thinking -- note that this is not yet well  
thought out (all I did was try to minimize the overlap between  
disk_cache, memcached, distcache and some commercial thing -- and  
tried to move as much RFC2616 knowledge into mod_cache.c*).

-	internal ap_cache_cacheable_hdr to hold all the RFC 2616
     	non cachable headers info as before.

-	ap_cache_cacheable_hdrs_out introduced as per earlier discussion
	which knows about error headers and mandatory content setting.

-	ap_cache_cacheable_hdrs_in introduced - which has an optional
	argument to further curtail the headers returned by just those
	in the 'Vary'.

-	ap_normalize_vary_to_array() introduced. Which can help
	caching storage modules to create a key across the Vary
	relevant fields, if any.

	=> given that they all have to create this -- wondering if
	we should make this part of the cache_info struct already.

-	Likewise a ap_generate_vary_key() which understands
	vary if/when needed.

Below more or less cleans mod_disk_cache completly of having to  
understand http (or any protocol aspect) - but for  
ap_generate_vary_key() -- but it is not yet HTTP neutral (i.e. a IMAP  
protocol module would have to overwrite things it has not yet XS to).

Thoughts,

Dw.


*: not yet nearly enough - i.e. lacking the whole case (in)sensitive,  
date, etc stuff.

Index: cache_util.c
===================================================================
--- cache_util.c	(revision 620288)
+++ cache_util.c	(working copy)
@@ -571,10 +573,10 @@
      return apr_pstrdup(p, hashfile);
  }

-/* Create a new table consisting of those elements from an input
+/* Create a new table consisting of those elements from an
   * headers table that are allowed to be stored in a cache.
   */
-CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_out(apr_pool_t  
*pool,
+CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs(apr_pool_t *pool,
                                                          apr_table_t  
*t,
                                                          server_rec *s)
  {
@@ -582,12 +584,17 @@
      char **header;
      int i;

+    if (t == NULL) {
+	return NULL;
+    };
+
      /* Make a copy of the headers, and remove from
       * the copy any hop-by-hop headers, as defined in Section
       * 13.5.1 of RFC 2616
       */
      apr_table_t *headers_out;
      headers_out = apr_table_copy(pool, t);
+
      apr_table_unset(headers_out, "Connection");
      apr_table_unset(headers_out, "Keep-Alive");
      apr_table_unset(headers_out, "Proxy-Authenticate");
@@ -599,6 +606,7 @@

      conf = (cache_server_conf *)ap_get_module_config(s->module_config,
                                                       &cache_module);
+
      /* Remove the user defined headers set with CacheIgnoreHeaders.
       * This may break RFC 2616 compliance on behalf of the  
administrator.
       */
@@ -608,3 +616,170 @@
      }
      return headers_out;
  }
+
+/* Create a new table consisting of those elements from an input
+ * headers table that are allowed to be stored in a cache. Optionally
+ * filtered to just the fields passed in the vary_filter array.
+ *
+ * @return a table with the valid input headers or NULL if none are  
relevant.
+ */
+CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_in(request_rec *  
r, apr_array_header_t * vary_filter)
+{
+	apr_table_t *headers_in, * vary_filtered;
+        int i;
+
+	if (r->headers_in == NULL)
+		return NULL;
+
+        headers_in = ap_cache_cacheable_hdrs(r->pool, r->headers_in,  
r->server);
+	
+	if (!vary_filter)
+		return apr_is_empty_table(headers_in) ? NULL: headers_in;
+
+        /* The vary array is most propably the smallest of the two - su
+	 * only copy just those fields across.
+	 */
+     	vary_filtered = apr_table_make(r->pool, vary_filter->nelts);
+        for (i = 0; i < vary_filter->nelts; i++) {
+                const char *key = ((const char**)vary_filter->elts)[i];
+                const char * val = apr_table_get(headers_in, key); /*  
XXX: Not case insensitive */
+                if (val)
+                        apr_table_add(vary_filtered, key, val);
+        };
+
+	return apr_is_empty_table(vary_filtered) ? NULL : vary_filtered;
+}
+
+/* Create a new table consisting of those elements from an output
+ * headers table that are allowed to be stored in a cache;
+ * ensure there is a content type and capture any errors.
+ */
+CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_out(request_rec *  
r)
+{
+	apr_table_t *headers_out;
+
+	headers_out = ap_cache_cacheable_hdrs(r->pool, r->headers_out,
+                                                  r->server);
+
+        if (!apr_table_get(headers_out, "Content-Type")
+            && r->content_type) {
+            apr_table_setn(headers_out, "Content-Type",
+                           ap_make_content_type(r, r->content_type));
+        }
+
+        headers_out = apr_table_overlay(r->pool, headers_out,
+                                        r->err_headers_out);
+
+	return headers_out;
+}
+
+static int array_alphasort(const void *fn1, const void *fn2)
+{
+    return strcmp(*(char**)fn1, *(char**)fn2);
+}
+
+static void vary_tokens_to_array(apr_pool_t *p, const char *data,
+                            apr_array_header_t *arr)
+{
+    char *token;
+
+    while ((token = ap_get_list_item(p, &data)) != NULL) {
+        *((const char **) apr_array_push(arr)) = token;
+    }
+
+    qsort((void *) arr->elts, arr->nelts,
+         sizeof(char *), array_alphasort);
+}
+
+/* Given a header table; extract any Vary present and return
+ * the values in a predictable way (i.e. Sort it so that
+ * "Vary: A, B" and "Vary: B, A" are stored the same; and take
+ * care of any case sensitive issues. Onb exit the array will
+ * either be NULL (no VARYance) or a sorted array.
+ *
+ * @return vary-array or NULL, if no variance
+ */
+CACHE_DECLARE(apr_array_header_t*)  
ap_normalize_vary_to_array(apr_pool_t *p, apr_table_t * headers)
+{
+	apr_array_header_t* varray = NULL;
+	const char *vary;
+	
+	if ((headers) && ((vary= apr_table_get(headers, "Vary")) != NULL))
+	{
+	    varray = apr_array_make(p, 6, sizeof(char*));
+            vary_tokens_to_array(p, vary, varray);
+	}
+	
+	return varray;
+}
+
+/* Generate a new key based on the normal key (URI) and normalized  
values from
+ * the Vary array. The vary array is assumed to be in a stable order  
and format.
+ */
+CACHE_DECLARE(const char*) ap_generate_vary_key(apr_pool_t *p,  
apr_table_t * input_headers,
+                             apr_array_header_t *varray, const char  
*oldkey)
+{
+    struct iovec *iov;
+    int i, k;
+    int nvec;
+    const char *header;
+    const char **elts;
+
+    if (varray == NULL);
+	return oldkey;
+
+    nvec = (varray->nelts * 2) + 1;
+    iov = apr_palloc(p, sizeof(struct iovec) * nvec);
+    elts = (const char **) varray->elts;
+
+    /* TODO:
+     *    - Handle multiple-value headers better. (sort them?)
+     *    - Handle Case in-sensitive Values better.
+     *        This isn't the end of the world, since it just lowers  
the cache
+     *        hit rate, but it would be nice to fix.
+     *
+     * The majority are case insenstive if they are values (encoding  
etc).
+     * Most of rfc2616 is case insensitive on header contents.
+     *
+     * So the better solution may be to identify headers which should  
be
+     * treated case-sensitive?
+     *  HTTP URI's (3.2.3) [host and scheme are insensitive]
+     *  HTTP method (5.1.1)
+     *  HTTP-date values (3.3.1)
+     *  3.7 Media Types [exerpt]
+     *     The type, subtype, and parameter attribute names are case-
+     *     insensitive. Parameter values might or might not be case- 
sensitive,
+     *     depending on the semantics of the parameter name.
+     *  4.20 Except [exerpt]
+     *     Comparison of expectation values is case-insensitive for  
unquoted
+     *     tokens (including the 100-continue token), and is case- 
sensitive for
+     *     quoted-string expectation-extensions.
+	 *
+	 *  XX we are currently concatenating the values. With some puzzling  
one could
+	 *     set a header value which looks like the 'next/previous' value  
- and hence
+	 *     cause a value which is not the 'real' one. This may be a  
security risk.
+	 *     Ideally we should use a special value which cannot occur in a  
header
+	 *     legally (or is escaped/wacked). Given that (at least) a  
Header cannot
+	 *     contain a space or ':' - may be an idea to insert those.
+     */
+    for(i=0, k=0; i < varray->nelts; i++) {
+		/* Note that we're assuming that the case if the Vary fields matches
+		 * the case of the actual headers.
+		 */
+        header = apr_table_get(input_headers, elts[i]);
+        if (!header) {
+            header = "";
+        }
+        iov[k].iov_base = (char*) elts[i];
+        iov[k].iov_len = strlen(elts[i]);
+        k++;
+        iov[k].iov_base = (char*) header;
+        iov[k].iov_len = strlen(header);
+        k++;
+    }
+    iov[k].iov_base = (char*) oldkey;
+    iov[k].iov_len = strlen(oldkey);
+    k++;
+
+    return apr_pstrcatv(p, iov, k, NULL);
+}
Index: mod_mem_cache.c
===================================================================
--- mod_mem_cache.c	(revision 620288)
+++ mod_mem_cache.c	(working copy)
@@ -605,17 +605,8 @@
      mobj->req_hdrs = deep_table_copy(mobj->pool, r->headers_in);

      /* Precompute how much storage we need to hold the headers */
-    headers_out = ap_cache_cacheable_hdrs_out(r->pool, r->headers_out,
-                                              r->server);
+    headers_out = ap_cache_cacheable_hdrs_out(r);

-    /* If not set in headers_out, set Content-Type */
-    if (!apr_table_get(headers_out, "Content-Type")
-        && r->content_type) {
-        apr_table_setn(headers_out, "Content-Type",
-                       ap_make_content_type(r, r->content_type));
-    }
-
-    headers_out = apr_table_overlay(r->pool, headers_out, r- 
 >err_headers_out);
      mobj->header_out = deep_table_copy(mobj->pool, headers_out);

      /* Init the info struct */
Index: mod_cache.c
===================================================================
--- mod_cache.c	(revision 620288)
+++ mod_cache.c	(working copy)
@@ -752,10 +752,12 @@
           * However, before doing that, we need to first merge in
           * err_headers_out and we also need to strip any hop-by-hop
           * headers that might have snuck in.
           */
          r->headers_out = apr_table_overlay(r->pool, r->headers_out,
                                             r->err_headers_out);
-        r->headers_out = ap_cache_cacheable_hdrs_out(r->pool, r- 
 >headers_out,
+        r->headers_out = ap_cache_cacheable_hdrs(r->pool, r- 
 >headers_out,
                                                       r->server);
          apr_table_clear(r->err_headers_out);

Index: mod_cache.h
===================================================================
--- mod_cache.h	(revision 620288)
+++ mod_cache.h	(working copy)
@@ -288,13 +288,25 @@
                                      const char *key, char **val);
  CACHE_DECLARE(const char *)ap_cache_tokstr(apr_pool_t *p, const char  
*list, const char **str);

-/* Create a new table consisting of those elements from a request_rec's
- * headers_out that are allowed to be stored in a cache
+/* Create a new table consisting of those elements from an
+ * headers table that are allowed to be stored in a cache.
   */
-CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_out(apr_pool_t  
*pool,
+CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs(apr_pool_t *pool,
                                                          apr_table_t  
*t,
                                                          server_rec  
*s);

+/* Create a new table consisting of those elements from an input
+ * headers table that are allowed to be stored in a cache. Optionally
+ * filtered to just the fields in the vary_filter.
+ */
+CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_in(request_rec *  
r, apr_array_header_t * vary_filter);
+
+/* Create a new table consisting of those elements from an output
+ * headers table that are allowed to be stored in a cache;
+ * ensure there is a content type and capture any errors.
+ */
+CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_hdrs_out(request_rec *  
r);
+
  /**
   * cache_storage.c
   */
@@ -316,6 +328,21 @@
  apr_status_t cache_recall_entity_body(cache_handle_t *h, apr_pool_t  
*p, apr_bucket_brigade *bb);
  */

+/* Given a header table; extract any Vary present and return
+ * the values in a predictable way (i.e. Sort it so that
+ * "Vary: A, B" and "Vary: B, A" are stored the same; and take
+ * care of any case sensitive issues. Onb exit the array will
+ * either be NULL (no VARYance) or a sorted array.
+ *
+ * @return vary-array or NULL, if no variance
+ */
+CACHE_DECLARE(apr_array_header_t*)  
ap_normalize_vary_to_array(apr_pool_t *p, apr_table_t * headers);
+
+/* Generate a new key based on the normal key (URI) and normalized  
values from
+ * the Vary array. The vary array is assumed to be in a stable order  
and format.
+ */
+CACHE_DECLARE(const char*) ap_generate_vary_key(apr_pool_t *p,  
apr_table_t * input_headers,
+                             apr_array_header_t *varray, const char  
*oldkey);
  /* hooks */

  APR_DECLARE_OPTIONAL_FN(apr_status_t,
Index: mod_disk_cache.c
===================================================================
--- mod_disk_cache.c	(revision 620288)
+++ mod_disk_cache.c	(working copy)
@@ -102,7 +102,7 @@
       }
  }

-static void mkdir_structure(disk_cache_conf *conf, const char *file,  
apr_pool_t *pool)
+static apr_status_t mkdir_structure(disk_cache_conf *conf, const char  
*file, apr_pool_t *pool)
  {
      apr_status_t rv;
      char *p;
@@ -116,11 +116,12 @@
          rv = apr_dir_make(file,
                            APR_UREAD|APR_UWRITE|APR_UEXECUTE, pool);
          if (rv != APR_SUCCESS && !APR_STATUS_IS_EEXIST(rv)) {
-            /* XXX */
+            return rv;
          }
          *p = '/';
          ++p;
      }
+    return APR_SUCCESS;
  }

  /* htcacheclean may remove directories underneath us.
@@ -141,7 +142,9 @@
              /* 1000 micro-seconds aka 0.001 seconds. */
              apr_sleep(1000);

-            mkdir_structure(conf, dest, pool);
+            rv = mkdir_structure(conf, dest, pool);
+	    if (rv != APR_SUCCESS)
+		continue;

              rv = apr_file_rename(src, dest, pool);
          }
@@ -241,81 +244,6 @@
      return APR_SUCCESS;
  }

-static const char* regen_key(apr_pool_t *p, apr_table_t *headers,
-                             apr_array_header_t *varray, const char  
*oldkey)
-{
-    struct iovec *iov;
-    int i, k;
-    int nvec;
-    const char *header;
-    const char **elts;
-
-    nvec = (varray->nelts * 2) + 1;
-    iov = apr_palloc(p, sizeof(struct iovec) * nvec);
-    elts = (const char **) varray->elts;
-
-    /* TODO:
-     *    - Handle multiple-value headers better. (sort them?)
-     *    - Handle Case in-sensitive Values better.
-     *        This isn't the end of the world, since it just lowers  
the cache
-     *        hit rate, but it would be nice to fix.
-     *
-     * The majority are case insenstive if they are values (encoding  
etc).
-     * Most of rfc2616 is case insensitive on header contents.
-     *
-     * So the better solution may be to identify headers which should  
be
-     * treated case-sensitive?
-     *  HTTP URI's (3.2.3) [host and scheme are insensitive]
-     *  HTTP method (5.1.1)
-     *  HTTP-date values (3.3.1)
-     *  3.7 Media Types [exerpt]
-     *     The type, subtype, and parameter attribute names are case-
-     *     insensitive. Parameter values might or might not be case- 
sensitive,
-     *     depending on the semantics of the parameter name.
-     *  4.20 Except [exerpt]
-     *     Comparison of expectation values is case-insensitive for  
unquoted
-     *     tokens (including the 100-continue token), and is case- 
sensitive for
-     *     quoted-string expectation-extensions.
-     */
-
-    for(i=0, k=0; i < varray->nelts; i++) {
-        header = apr_table_get(headers, elts[i]);
-        if (!header) {
-            header = "";
-        }
-        iov[k].iov_base = (char*) elts[i];
-        iov[k].iov_len = strlen(elts[i]);
-        k++;
-        iov[k].iov_base = (char*) header;
-        iov[k].iov_len = strlen(header);
-        k++;
-    }
-    iov[k].iov_base = (char*) oldkey;
-    iov[k].iov_len = strlen(oldkey);
-    k++;
-
-    return apr_pstrcatv(p, iov, k, NULL);
-}
-
-static int array_alphasort(const void *fn1, const void *fn2)
-{
-    return strcmp(*(char**)fn1, *(char**)fn2);
-}
-
-static void tokens_to_array(apr_pool_t *p, const char *data,
-                            apr_array_header_t *arr)
-{
-    char *token;
-
-    while ((token = ap_get_list_item(p, &data)) != NULL) {
-        *((const char **) apr_array_push(arr)) = token;
-    }
-
-    /* Sort it so that "Vary: A, B" and "Vary: B, A" are stored the  
same. */
-    qsort((void *) arr->elts, arr->nelts,
-         sizeof(char *), array_alphasort);
-}
-
  /*
   * Hook and mod_cache callback functions
   */
@@ -432,7 +360,7 @@
          }
          apr_file_close(dobj->hfd);

-        nkey = regen_key(r->pool, r->headers_in, varray, key);
+        nkey = ap_generate_vary_key(r->pool, r->headers_in, varray,  
key);

          dobj->hashfile = NULL;
          dobj->prefix = dobj->hdrsfile;
@@ -472,7 +400,8 @@
  #endif
      rc = apr_file_open(&dobj->fd, dobj->datafile, flags, 0, r->pool);
      if (rc != APR_SUCCESS) {
-        /* XXX: Log message */
+    	ap_log_error(APLOG_MARK, APLOG_ERR, rc, r->server,
+                 "disk_cache: Cannot open %f",  dobj->datafile);
          return DECLINED;
      }

@@ -484,7 +413,8 @@
      /* Read the bytes to setup the cache_info fields */
      rc = file_cache_recall_mydata(dobj->hfd, info, dobj, r);
      if (rc != APR_SUCCESS) {
-        /* XXX log message */
+    	ap_log_error(APLOG_MARK, APLOG_ERR, rc, r->server,
+                 "disk_cache: Cannot read cache_info from %f",  dobj- 
 >hdrsfile);
          return DECLINED;
      }

@@ -821,6 +751,7 @@
      apr_status_t rv;
      apr_size_t amt;
      disk_cache_object_t *dobj = (disk_cache_object_t*) h->cache_obj- 
 >vobj;
+    apr_array_header_t* varray;

      disk_cache_info_t disk_info;
      struct iovec iov[2];
@@ -828,15 +759,18 @@
      /* This is flaky... we need to manage the cache_info differently  
*/
      h->cache_obj->info = *info;

-    if (r->headers_out) {
-        const char *tmp;
-
-        tmp = apr_table_get(r->headers_out, "Vary");
-
-        if (tmp) {
-            apr_array_header_t* varray;
+    /* XXX it may make sense to make this part of cache_info -- as  
every
+     *     sane module will need to calculate this. Potentially
+     *     allong with a ap_generate_vary_key() value as well.
+     *
+     *     Or alternatively insist on the create_/open_ to already
+     *     prepare for this eventuality.
+     */
+    varray  = ap_normalize_vary_to_array(r->pool, r->headers_out);
+    if (varray) {
              apr_uint32_t format = VARY_FORMAT_VERSION;
-
+	    const char * nkey;
+			
              /* If we were initially opened as a vary format, rollback
               * that internal state for the moment so we can recreate  
the
               * vary format hints in the appropriate directory.
@@ -846,7 +780,12 @@
                  dobj->prefix = NULL;
              }

-            mkdir_structure(conf, dobj->hdrsfile, r->pool);
+            rv = mkdir_structure(conf, dobj->hdrsfile, r->pool);
+            if (rv != APR_SUCCESS) {
+        	ap_log_error(APLOG_MARK, APLOG_WARNING, rv, r->server,
+                     "disk_cache: cannot create path for %s", dobj- 
 >hdrsfile);
+                return rv;
+            }

              rv = apr_file_mktemp(&dobj->tfd, dobj->tempfile,
                                   APR_CREATE | APR_WRITE | APR_BINARY  
| APR_EXCL,
@@ -862,9 +801,6 @@
              amt = sizeof(info->expire);
              apr_file_write(dobj->tfd, &info->expire, &amt);

-            varray = apr_array_make(r->pool, 6, sizeof(char*));
-            tokens_to_array(r->pool, tmp, varray);
-
              store_array(dobj->tfd, varray);

              apr_file_close(dobj->tfd);
@@ -882,15 +818,13 @@
              }

              dobj->tempfile = apr_pstrcat(r->pool, conf->cache_root,  
AP_TEMPFILE, NULL);
-            tmp = regen_key(r->pool, r->headers_in, varray, dobj- 
 >name);
+            nkey = ap_generate_vary_key(r->pool, r->headers_in,  
varray, dobj->name);
              dobj->prefix = dobj->hdrsfile;
              dobj->hashfile = NULL;
-            dobj->datafile = data_file(r->pool, conf, dobj, tmp);
-            dobj->hdrsfile = header_file(r->pool, conf, dobj, tmp);
+            dobj->datafile = data_file(r->pool, conf, dobj, nkey);
+            dobj->hdrsfile = header_file(r->pool, conf, dobj, nkey);
          }
-    }

-
      rv = apr_file_mktemp(&dobj->hfd, dobj->tempfile,
                           APR_CREATE | APR_WRITE | APR_BINARY |
                           APR_BUFFERED | APR_EXCL, r->pool);
@@ -916,41 +850,40 @@

      rv = apr_file_writev(dobj->hfd, (const struct iovec *) &iov, 2,  
&amt);
      if (rv != APR_SUCCESS) {
+        apr_file_close(dobj->hfd); /* flush and close */
+	apr_file_remove(dobj->tempfile, r->pool);
          return rv;
      }

      if (r->headers_out) {
          apr_table_t *headers_out;

-        headers_out = ap_cache_cacheable_hdrs_out(r->pool, r- 
 >headers_out,
-                                                  r->server);
+        headers_out = ap_cache_cacheable_hdrs_out(r);

-        if (!apr_table_get(headers_out, "Content-Type")
-            && r->content_type) {
-            apr_table_setn(headers_out, "Content-Type",
-                           ap_make_content_type(r, r->content_type));
-        }
-
-        headers_out = apr_table_overlay(r->pool, headers_out,
-                                        r->err_headers_out);
          rv = store_table(dobj->hfd, headers_out);
          if (rv != APR_SUCCESS) {
+    	    apr_file_close(dobj->hfd); /* flush and close */
+	    apr_file_remove(dobj->tempfile, r->pool);
              return rv;
          }
      }

-    /* Parse the vary header and dump those fields from the  
headers_in. */
-    /* FIXME: Make call to the same thing cache_select calls to crack  
Vary. */
-    if (r->headers_in) {
+    /* Parse the vary header and dump those fields from the headers_in.
+     * But only do so if needed - i.e. there is a Vary - and then limit
+     * it to the headers needed to validate the Vary.
+     */
+    if (r->headers_in && varray) {
          apr_table_t *headers_in;
-
-        headers_in = ap_cache_cacheable_hdrs_out(r->pool, r- 
 >headers_in,
-                                                 r->server);
+        headers_in = ap_cache_cacheable_hdrs_in(r, varray);
+        if (headers_in) {
          rv = store_table(dobj->hfd, headers_in);
          if (rv != APR_SUCCESS) {
+    		     apr_file_close(dobj->hfd); /* flush and close */
+	    	     apr_file_remove(dobj->tempfile, r->pool);
              return rv;
          }
      }
+    }

      apr_file_close(dobj->hfd); /* flush and close */

@@ -960,8 +893,14 @@
       */
      rv = apr_file_remove(dobj->hdrsfile, r->pool);
      if (rv != APR_SUCCESS) {
-        mkdir_structure(conf, dobj->hdrsfile, r->pool);
+        rv = mkdir_structure(conf, dobj->hdrsfile, r->pool);
+            if (rv != APR_SUCCESS) {
+        	ap_log_error(APLOG_MARK, APLOG_WARNING, rv, r->server,
+                     "disk_cache: cannot create path for %s", dobj- 
 >hdrsfile);
+		apr_file_remove(dobj->tempfile, r->pool);
+                return rv;
      }
+    }

      rv = safe_file_rename(conf, dobj->tempfile, dobj->hdrsfile, r- 
 >pool);
      if (rv != APR_SUCCESS) {

Re: cache - cleaning up mod_memcache and making other caches their live easier

Posted by Graham Leggett <mi...@sharp.fm>.

Dirk-Willem van Gulik wrote:

> Before I embark on an experiment (without much design/planning) -- any 
> thoughts ? Or has someone already done most of this and/or designed it 
> properly ?

mod_disk_cache has certainly received more attention than the other 
modules, and can probably be considered the most "right" of the modules.

In addition, a lot of the work on the cache has been incremental in 
nature, so while it's "right", the code definitely has scope to become 
more efficient through refactoring and more code reuse.

Regards,
Graham
--