You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Brian Pane <bp...@pacbell.net> on 2001/06/22 23:03:46 UTC

directory_walk performance

More fun with gprof...
directory_walk is one of the bigger consumers of user-mode CPU time in
the current 2.0 httpd source, due largely to its calls to 
ap_merge_per_dir_configs.

For a typical configuration that has one <Directory> config for '/' and 
another for
the document root path, it looks like all requests for real files will 
follow the
same pattern:
  * Initialize per_dir_defaults to be the default_lookups for the vhost 
matching
     the request
  * Merge the configs for '/'
  * Merge the configs for the document root path

I.e., the same merge of the / and /document/root configs is happening on 
every
request.

Assuming that this configuration style (with a <Directory> block for the
document root path) is indeed common, would it be feasible to precompute
the merge of the '/' configs with the /document/root configs after 
initialization
to support optimized handling of this case in directory_walk?  In the 
implementation
I'm thinking of, the logic in directory_walk would look something like:
    if (path begins with sconf->ap_document_root) {
          per_dir_defaults = precomputed merged configs for document root;
          scan the rest of the path (after the docroot prefix) for 
possible additional dir matches;
    }
    else {
        use the current algorithm;
    }

Would that sort of optimization make sense, or is it too special-purpose 
(or too
incorrect, even) to be generally useful?

Thanks,
--Brian



Re: [PATCH] Re: directory_walk performance

Posted by Brian Pane <bp...@pacbell.net>.
dean gaudet wrote:

>On Thu, 2 Aug 2001, Brian Pane wrote:
>
>>are there any modules that don't treat the base as const?
>>
>
>such modules would be broken.  base is a const.
>
Good, so it should be safe to pre-merge the per-dir configs.

Does anybody have additional comments on the patch?  I haven't
yet seen much feedback on the code.

Thanks
--Brian




Re: [PATCH] Re: directory_walk performance

Posted by dean gaudet <dg...@arctic.org>.
On Thu, 2 Aug 2001, Brian Pane wrote:

> are there any modules that don't treat the base as const?

such modules would be broken.  base is a const.

-dean


Re: [PATCH] Re: directory_walk performance

Posted by Brian Pane <bp...@pacbell.net>.
William A. Rowe, Jr. wrote:

>From: "Brian Pane" <bp...@pacbell.net>
>Sent: Thursday, August 02, 2001 3:54 AM
>
>
>>William A. Rowe, Jr. wrote:
>>
>>> see my commit to core/request.c ... I've dropped in the new directory_walk
>>>code.
>>>
>>Here's a patch that implements my pre-merge optimization to
>>reduce (and in some cases completely eliminate) the calls to
>>ap_merge_per_dir_configs.  Its impacts on directory_walk are
>>minor; all the real work happens in a new post-config-phase
>>function.
>>
>>I was able to generalize the pre-merge technique to support
>>even wildcard directories (something that my original prototype
>>couldn't handle correctly).  The comments in core.c describe the
>>algorithm.
>>
>
>There is one _huge_, glaring omission (not in your patch).
>
>If we optimize an individual module dir_merge (see the tables in mod_mime)
>and the optimizer comes along and holds a partial merge, that cached merge 
>_may_ become corrupted when we try merging along the complete context in
>the r pool.
>
>This change will require all dir_merge operaions to be told if it is merging
>a final (cached) copy, or an incremental (the request, or part of a cached
>copy that isn't the end-node).  The dir_merge must be certain it doesn't modify
>anything in the working base if that base has come from the config rec or the
>cache.  Contrawise, it _may_ modify the base if the caller will be discarding
>that base.
>
I don't think this is a new problem.  It's never been safe for a
dir_merge to blindly modify the base.  In directory_walk, the very
first call to ap_merge_per_dir_configs gets r->server->lookup_defaults
as its base, and the dir_merge shouldn't be modifying that.  Mod_mime
looks okay in this regard, with its copy-on-write logic; are there any
modules that don't treat the base as const?

--Brian




Re: [PATCH] Re: directory_walk performance

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Brian Pane" <bp...@pacbell.net>
Sent: Thursday, August 02, 2001 3:54 AM


> William A. Rowe, Jr. wrote:
> 
> >  see my commit to core/request.c ... I've dropped in the new directory_walk
> >code.
> 
> Here's a patch that implements my pre-merge optimization to
> reduce (and in some cases completely eliminate) the calls to
> ap_merge_per_dir_configs.  Its impacts on directory_walk are
> minor; all the real work happens in a new post-config-phase
> function.
> 
> I was able to generalize the pre-merge technique to support
> even wildcard directories (something that my original prototype
> couldn't handle correctly).  The comments in core.c describe the
> algorithm.

There is one _huge_, glaring omission (not in your patch).

If we optimize an individual module dir_merge (see the tables in mod_mime)
and the optimizer comes along and holds a partial merge, that cached merge 
_may_ become corrupted when we try merging along the complete context in
the r pool.

This change will require all dir_merge operaions to be told if it is merging
a final (cached) copy, or an incremental (the request, or part of a cached
copy that isn't the end-node).  The dir_merge must be certain it doesn't modify
anything in the working base if that base has come from the config rec or the
cache.  Contrawise, it _may_ modify the base if the caller will be discarding
that base.

Bill


Re: [PATCH] Re: directory_walk performance

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Brian Pane" <bp...@pacbell.net>
Sent: Thursday, August 02, 2001 3:54 AM


> Here's a patch that implements my pre-merge optimization to
> reduce (and in some cases completely eliminate) the calls to
> ap_merge_per_dir_configs.  Its impacts on directory_walk are
> minor; all the real work happens in a new post-config-phase
> function.


Another glitch.  If the source is from the cache, we can't trust that elements
of base or add allocated will still in memory when the cache is called :(  Not
unless we also discard all decendants when we refresh the cache.  This implies
that we need a protected refcount for each of the cached elements, not only by
the request that is relying on the data, but the straight down the chain.

Bill



[PATCH] Re: directory_walk performance

Posted by Brian Pane <bp...@pacbell.net>.
William A. Rowe, Jr. wrote:

>Brian,
>
>  see my commit to core/request.c ... I've dropped in the new directory_walk
>code.
>
>  I'd like us to optimize it for things we _know_ (e.g., it's given as a
><Directory > so skip the stat, unless FollowSymLinks isn't set, and assume
>for the remainder of the request that it _is_ a directory) and then introduce
>your optimizations.  Hopefully both branches in the same source help us keep 
>running while we keep developing.  Focus on the second directory_walk function
>since the first will be deprecated and removed when we've got it right :)
>
>Bill
>

Here's a patch that implements my pre-merge optimization to
reduce (and in some cases completely eliminate) the calls to
ap_merge_per_dir_configs.  Its impacts on directory_walk are
minor; all the real work happens in a new post-config-phase
function.

I was able to generalize the pre-merge technique to support
even wildcard directories (something that my original prototype
couldn't handle correctly).  The comments in core.c describe the
algorithm.

--Brian

Index: include/http_core.h
===================================================================
RCS file: /home/cvspublic/httpd-2.0/include/http_core.h,v
retrieving revision 1.46
diff -u -r1.46 http_core.h
--- include/http_core.h    2001/07/30 18:51:57    1.46
+++ include/http_core.h    2001/08/02 08:39:09
@@ -492,6 +492,14 @@
     char *access_name;
     apr_array_header_t *sec;
     apr_array_header_t *sec_url;
+
+    /* Pre-merged per-dir configs */
+#define NUM_PRE_MERGE_DIRS 6
+    struct {
+    const char *dirname;
+    ap_conf_vector_t *conf;
+    } pre_merged_dir_conf[1 << (NUM_PRE_MERGE_DIRS - 1)];
+
 } core_server_config;
 
 /* for http_config.c */
Index: server/core.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/core.c,v
retrieving revision 1.32
diff -u -r1.32 core.c
--- server/core.c    2001/08/01 19:15:22    1.32
+++ server/core.c    2001/08/02 08:39:11
@@ -3295,6 +3295,166 @@
     ap_set_version(pconf);
 }
 
+
+static int is_possible_predecessor(const char *dir1, const char *dir2)
+{
+    char c1, c2;
+    while ((c1 = *dir1++)) {
+    c2 = *dir2++;
+    if (c1 != c2) {
+        if ((c1 == '*') || (c2 == '*')) {
+        while ((c1 = *dir1) && (c1 != '/'))
+            dir1++;
+        while ((c2 = *dir2) && (c2 != '/'))
+            dir2++;
+        }
+        else {
+        return 0;
+        }
+    }
+    }
+    return 1;
+}
+
+
+/* Pre-merge the per-directory configurations, to avoid the
+ * overhead of doing this for each request (see directory_walk
+ * in request.c).
+ *
+ * Background:
+ *
+ * There are 'n' per-directory configurations.  The directory_walk()
+ * function scans through them from i=0 through i=n-1 and either merges
+ * 'per-dir-config[i]' into its pending request configuration or
+ * leaves that config out (depending on whether the directory name
+ * associated with per-dir-config[i] matches the requested file
+ * or not).
+ *
+ * It's possible to represent the set of merges done for a request
+ * as a vector of bits, V[0] through V[n-1].  V[i] is 1 if
+ * per-dir-config[i] is used for the requests, or 0 if it isn't.
+ * Conceptually, pre_merge_per_dir_configs() is responsible for
+ * pre-building the final configuration associated with each
+ * possible value of this bit vector.  Given this pre-merged
+ * config structure, directory_walk() can just look up the
+ * end result instead of actually doing all of the merging at
+ * request time.
+ *
+ * In the general case, it's not feasible to precompute all
+ * permutations of V, because there are O(2^n) possibilities.
+ * Thus pre_merge_per_dir_configs() computes the pre-merge
+ * for the first NUM_PRE_MERGE_DIRS directories.  It still
+ * uses O(2^m) storage, but m is constrained to a small value.
+ * As an additional space optimization, the function detects
+ * impossible permutations (e.g., if you have <Directory>
+ * blocks for /foo and /bar/*, their configs will never actually
+ * be merged) and doesn't compute the merged configs for these.
+ */
+static void pre_merge_per_dir_configs(apr_pool_t *p, server_rec *s)
+{
+    core_server_config *sconf;
+    apr_array_header_t *sec;
+    int nelts;
+    ap_conf_vector_t **elts;
+    int num_bits;
+    int i;
+    int max;
+    int mask;
+    int threshold;
+    int j;
+
+    sconf = ap_get_module_config(s->module_config, &core_module);
+    sec = sconf->sec;
+    nelts = sec->nelts;
+    elts = (ap_conf_vector_t **)sec->elts;
+
+    num_bits =
+    (sec->nelts < NUM_PRE_MERGE_DIRS) ? sec->nelts : NUM_PRE_MERGE_DIRS;
+    max = (1 << (num_bits - 1));
+
+    sconf->pre_merged_dir_conf[0].conf = s->lookup_defaults;
+    sconf->pre_merged_dir_conf[0].dirname = "";
+    mask = 1;
+    threshold = 2;
+    j = 0;
+    /* i holds the value of the vector 'V' in the
+     * algorithm described in the comments at the top
+     * of this function.  To make the code simpler, the
+     * bits are 'backwards'; the low order bit of i is V[0].
+     * sconf->pre_merged_dir_conf is a vector of 2^NUM_PRE_MERGE_DIRS
+     * elements; for any permutation P of the bits in V,
+     * sconf->pre_merged_dir_conf[P] contains the merged
+     * per-dir configuration corresponding to P.
+     */
+    for (i = 1; i < max; i++) {
+    int parent;
+    if (i == threshold) {
+        mask = threshold;
+        threshold <<= 1;
+        j++;
+    }
+
+    /* The bit vector 'V' can be thought of as a tree;
+     * the configuration for a node can be computed
+     * incrementally by merging with its parent
+     */
+    parent = i & (mask - 1);
+
+    if (sconf->pre_merged_dir_conf[parent].conf == NULL) {
+        /* If the parent node represents an unreachable
+         * state, so does the child, so don't bother
+         * computing it
+         */
+        continue;
+    }
+    if (i & mask) {
+        /* A 1 value in the new high-order bit corresponds to
+         * a permutation in which the j'th per-directory config
+         * is a match against the requested filename.  Thus we
+         * should compute the merge of the j'th config with
+         * the config stored in the parent node.  However,
+         * if the j'th config corresponds to a <Directory>
+         * path that can't possibly match a filename that
+         * matched the parent node, then we needn't waste
+         * time or memory computing it
+         */
+        core_dir_config *core_elt = ap_get_module_config(elts[j],
+                                 &core_module);
+        const char *dirname = core_elt->d;
+        const char *parent_dirname =
+        sconf->pre_merged_dir_conf[parent].dirname;
+        if (is_possible_predecessor(parent_dirname, dirname)) {
+        sconf->pre_merged_dir_conf[i].conf =
+            ap_merge_per_dir_configs(p,
+             sconf->pre_merged_dir_conf[parent].conf, elts[j]);
+        sconf->pre_merged_dir_conf[i].dirname = dirname;
+        }
+    }
+    else {
+        /* A 0 value in the new high-order bit corresponds to
+         * a permutation in which the j'th per-directory config
+         * is not matched.  Thus the merged configuration for
+         * this node is the same as that for its parent, and
+         * we can copy the pointer rather than making another
+         * copy of the merged config
+         */
+        sconf->pre_merged_dir_conf[i] = sconf->pre_merged_dir_conf[parent];
+    }
+    }
+}
+
+static void core_post_config_merge(apr_pool_t *p, apr_pool_t *plog,
+                   apr_pool_t *ptemp, server_rec *s)
+{
+    server_rec *virt;
+
+    for (virt = s->next; virt; virt = virt->next) {
+    pre_merge_per_dir_configs(p, virt);
+    }
+    pre_merge_per_dir_configs(p, s);
+}
+
+
 static void core_open_logs(apr_pool_t *pconf, apr_pool_t *plog, 
apr_pool_t *ptemp, server_rec *s)
 {
     ap_open_logs(s, pconf);
@@ -3339,6 +3499,7 @@
 static void register_hooks(apr_pool_t *p)
 {
     ap_hook_post_config(core_post_config,NULL,NULL,APR_HOOK_REALLY_FIRST);
+    
ap_hook_post_config(core_post_config_merge,NULL,NULL,APR_HOOK_REALLY_LAST);
     
ap_hook_translate_name(ap_core_translate,NULL,NULL,APR_HOOK_REALLY_LAST);
     ap_hook_open_logs(core_open_logs,NULL,NULL,APR_HOOK_MIDDLE);
     ap_hook_handler(default_handler,NULL,NULL,APR_HOOK_REALLY_LAST);
Index: server/request.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/request.c,v
retrieving revision 1.18
diff -u -r1.18 request.c
--- server/request.c    2001/08/01 11:59:55    1.18
+++ server/request.c    2001/08/02 08:39:12
@@ -717,6 +717,7 @@
     int num_sec = sconf->sec->nelts;
     int sec;
     unsigned int seg;
+    unsigned int bitmask;
     int res;
     ap_conf_vector_t *entry_config;
     ap_conf_vector_t *this_conf;
@@ -814,9 +815,12 @@
      * seg keeps track of which segment we've copied.
      * sec keeps track of which section we're on, since sections are
      *     ordered by number of segments. See core_reorder_directories
+     * bitmask keeps track of which per-dir configs we've merged...
+     *     given a list of 'n'
      */
     seg = 1;
     sec = 0;
+    bitmask = 0;
     do {
         int overrides_here;
         core_dir_config *core_dir = ap_get_module_config(per_dir_defaults,
@@ -852,9 +856,23 @@
                 this_conf = entry_config;
 
             if (this_conf) {
-                per_dir_defaults = ap_merge_per_dir_configs(r->pool,
-                                                            
per_dir_defaults,
-                                                            this_conf);
+        if (sec < NUM_PRE_MERGE_DIRS) {
+            bitmask |= (1 << sec);
+            if (sconf->pre_merged_dir_conf[bitmask].conf) {
+            per_dir_defaults =
+                sconf->pre_merged_dir_conf[bitmask].conf;
+            }
+            else {
+            per_dir_defaults =
+                ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+                             this_conf);
+            }
+        }
+        else {
+            per_dir_defaults =
+            ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+                         this_conf);
+        }
                 core_dir = ap_get_module_config(per_dir_defaults,
                                                 &core_module);
             }
@@ -1008,9 +1026,23 @@
 
         if (entry_core->r) {
             if (!ap_regexec(entry_core->r, r->filename, 0, NULL, 
REG_NOTEOL)) {
-                per_dir_defaults = ap_merge_per_dir_configs(r->pool,
-                                                            
per_dir_defaults,
-                                                            entry_config);
+        if (sec < NUM_PRE_MERGE_DIRS) {
+            bitmask |= (1 << sec);
+            if (sconf->pre_merged_dir_conf[bitmask].conf) {
+            per_dir_defaults =
+                sconf->pre_merged_dir_conf[bitmask].conf;
+            }
+            else {
+            per_dir_defaults =
+                ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+                             entry_config);
+            }
+        }
+        else {
+            per_dir_defaults =
+            ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+                         entry_config);
+        }
             }
         }
     }


Re: directory_walk performance

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Brian,

  see my commit to core/request.c ... I've dropped in the new directory_walk
code.

  I'd like us to optimize it for things we _know_ (e.g., it's given as a
<Directory > so skip the stat, unless FollowSymLinks isn't set, and assume
for the remainder of the request that it _is_ a directory) and then introduce
your optimizations.  Hopefully both branches in the same source help us keep 
running while we keep developing.  Focus on the second directory_walk function
since the first will be deprecated and removed when we've got it right :)

Bill
  

----- Original Message ----- 
From: "Brian Pane" <bp...@pacbell.net>
To: "new-httpd" <ne...@apache.org>
Sent: Friday, June 22, 2001 4:03 PM
Subject: directory_walk performance


> More fun with gprof...
> directory_walk is one of the bigger consumers of user-mode CPU time in
> the current 2.0 httpd source, due largely to its calls to 
> ap_merge_per_dir_configs.
> 
> For a typical configuration that has one <Directory> config for '/' and 
> another for
> the document root path, it looks like all requests for real files will 
> follow the
> same pattern:
>   * Initialize per_dir_defaults to be the default_lookups for the vhost 
> matching
>      the request
>   * Merge the configs for '/'
>   * Merge the configs for the document root path
> 
> I.e., the same merge of the / and /document/root configs is happening on 
> every
> request.
> 
> Assuming that this configuration style (with a <Directory> block for the
> document root path) is indeed common, would it be feasible to precompute
> the merge of the '/' configs with the /document/root configs after 
> initialization
> to support optimized handling of this case in directory_walk?  In the 
> implementation
> I'm thinking of, the logic in directory_walk would look something like:
>     if (path begins with sconf->ap_document_root) {
>           per_dir_defaults = precomputed merged configs for document root;
>           scan the rest of the path (after the docroot prefix) for 
> possible additional dir matches;
>     }
>     else {
>         use the current algorithm;
>     }
> 
> Would that sort of optimization make sense, or is it too special-purpose 
> (or too
> incorrect, even) to be generally useful?
> 
> Thanks,
> --Brian
> 
> 
> 


Re: directory_walk performance

Posted by dean gaudet <dg...@arctic.org>.
On Sat, 23 Jun 2001 rbb@covalent.net wrote:

> On Sat, 23 Jun 2001, Bill Stoddard wrote:
>
> > > Just a heads up, because I know he's off-list this weekend.  Will Rowe has
> > > been looking at some of this stuff recently.  He has basically re-worked
> > > directory walk to take advantage of a lot of the stat() calls that have
> > > already been made.
> >
> > A request for a file incurs one stat() call.  What other stat() calls are we talking about?
>
> A request for a single page can incur far more than one stat() call.
> Think sub-requests, and .htacess files.

and PATH_INFO.

-dean


Re: directory_walk performance

Posted by rb...@covalent.net.
On Sat, 23 Jun 2001, Bill Stoddard wrote:

> > Just a heads up, because I know he's off-list this weekend.  Will Rowe has
> > been looking at some of this stuff recently.  He has basically re-worked
> > directory walk to take advantage of a lot of the stat() calls that have
> > already been made.
>
> A request for a file incurs one stat() call.  What other stat() calls are we talking about?

A request for a single page can incur far more than one stat() call.
Think sub-requests, and .htacess files.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------


Re: directory_walk performance

Posted by Bill Stoddard <bi...@wstoddard.com>.
> Just a heads up, because I know he's off-list this weekend.  Will Rowe has
> been looking at some of this stuff recently.  He has basically re-worked
> directory walk to take advantage of a lot of the stat() calls that have
> already been made. 

A request for a file incurs one stat() call.  What other stat() calls are we talking about?

Bill


Re: directory_walk performance

Posted by dean gaudet <dg...@arctic.org>.
that works :)

however you can't quite handle /home/*/public_html that way can you?
ap_core_reorder_directories allows for those simple, single
path-component, wildcards.

-dean

On Sun, 24 Jun 2001, Brian Pane wrote:

> Thanks for the pointer; I'll see if I can apply the 'pre-merging'
> concept on top of Will's version of directory_walk.  In the meantime,
> I think I've figured out how to generalize the idea to cover pre-merging
> of all non-regular-expression-based directories.  Thanks to the
> sorting done by ap_core_reorder_directories, and directory_walk's
> corresponding pattern of handling all the non-regex directory matches
> before any of the regex ones, it's possible to precompute the merged
> config vectors for all the non-regex directories as a final post-config
> step before handling any requests.  E.g., if a server has <Directory>
> blocks for '/' and '/foo' and '/foo/bar', it's guaranteed that any request
> matching '/foo/bar' will have matched '/' and '/foo' also, so we can
> 'pre-inherit' the configs for '/' and '/foo' into the config vector for
> '/foo/bar' at startup rather than doing the merge repeatedly on each
> request.  I prototyped this algorithm for the current directory_walk,
> and it seems to work properly, but I haven't done extensive testing.
>
> --Brian
>
>
> rbb@covalent.net wrote:
>
> >Just a heads up, because I know he's off-list this weekend.  Will Rowe has
> >been looking at some of this stuff recently.  He has basically re-worked
> >directory walk to take advantage of a lot of the stat() calls that have
> >already been made.  Brian, I would suggest sending him e-mail on Monday so
> >that you can take advantage of his work.  He says it's not quite ready for
> >public consumption yet, but I'm sure he'll send it if there is interest.
> >BTW, this idea is great, I would love to see a patch, especially if it can
> >be generalized at all!
> >
> >Ryan
> >
> >On Sat, 23 Jun 2001, dean gaudet wrote:
> >
> >>that sounds good.
> >>
> >>it's just like constant-folding :)
> >>
> >>can you generalise it any?  Alias and mod_userdir can add more constant
> >>factors in the path.
> >>
> >>-dean
> >>
> >>On Fri, 22 Jun 2001, Brian Pane wrote:
> >>
> >>>More fun with gprof...
> >>>directory_walk is one of the bigger consumers of user-mode CPU time in
> >>>the current 2.0 httpd source, due largely to its calls to
> >>>ap_merge_per_dir_configs.
> >>>
> >>>For a typical configuration that has one <Directory> config for '/' and
> >>>another for
> >>>the document root path, it looks like all requests for real files will
> >>>follow the
> >>>same pattern:
> >>>  * Initialize per_dir_defaults to be the default_lookups for the vhost
> >>>matching
> >>>     the request
> >>>  * Merge the configs for '/'
> >>>  * Merge the configs for the document root path
> >>>
> >>>I.e., the same merge of the / and /document/root configs is happening on
> >>>every
> >>>request.
> >>>
> >>>Assuming that this configuration style (with a <Directory> block for the
> >>>document root path) is indeed common, would it be feasible to precompute
> >>>the merge of the '/' configs with the /document/root configs after
> >>>initialization
> >>>to support optimized handling of this case in directory_walk?  In the
> >>>implementation
> >>>I'm thinking of, the logic in directory_walk would look something like:
> >>>    if (path begins with sconf->ap_document_root) {
> >>>          per_dir_defaults = precomputed merged configs for document root;
> >>>          scan the rest of the path (after the docroot prefix) for
> >>>possible additional dir matches;
> >>>    }
> >>>    else {
> >>>        use the current algorithm;
> >>>    }
> >>>
> >>>Would that sort of optimization make sense, or is it too special-purpose
> >>>(or too
> >>>incorrect, even) to be generally useful?
> >>>
> >>>Thanks,
> >>>--Brian
> >>>
> >>>
> >>>
> >>
> >
> >
> >_______________________________________________________________________________
> >Ryan Bloom                        	rbb@apache.org
> >406 29th St.
> >San Francisco, CA 94131
> >-------------------------------------------------------------------------------
> >
> >
>
>
>
>


Re: directory_walk performance

Posted by Brian Pane <bp...@pacbell.net>.
Thanks for the pointer; I'll see if I can apply the 'pre-merging'
concept on top of Will's version of directory_walk.  In the meantime,
I think I've figured out how to generalize the idea to cover pre-merging
of all non-regular-expression-based directories.  Thanks to the
sorting done by ap_core_reorder_directories, and directory_walk's
corresponding pattern of handling all the non-regex directory matches
before any of the regex ones, it's possible to precompute the merged
config vectors for all the non-regex directories as a final post-config
step before handling any requests.  E.g., if a server has <Directory>
blocks for '/' and '/foo' and '/foo/bar', it's guaranteed that any request
matching '/foo/bar' will have matched '/' and '/foo' also, so we can
'pre-inherit' the configs for '/' and '/foo' into the config vector for
'/foo/bar' at startup rather than doing the merge repeatedly on each
request.  I prototyped this algorithm for the current directory_walk,
and it seems to work properly, but I haven't done extensive testing.

--Brian


rbb@covalent.net wrote:

>Just a heads up, because I know he's off-list this weekend.  Will Rowe has
>been looking at some of this stuff recently.  He has basically re-worked
>directory walk to take advantage of a lot of the stat() calls that have
>already been made.  Brian, I would suggest sending him e-mail on Monday so
>that you can take advantage of his work.  He says it's not quite ready for
>public consumption yet, but I'm sure he'll send it if there is interest.
>BTW, this idea is great, I would love to see a patch, especially if it can
>be generalized at all!
>
>Ryan
>
>On Sat, 23 Jun 2001, dean gaudet wrote:
>
>>that sounds good.
>>
>>it's just like constant-folding :)
>>
>>can you generalise it any?  Alias and mod_userdir can add more constant
>>factors in the path.
>>
>>-dean
>>
>>On Fri, 22 Jun 2001, Brian Pane wrote:
>>
>>>More fun with gprof...
>>>directory_walk is one of the bigger consumers of user-mode CPU time in
>>>the current 2.0 httpd source, due largely to its calls to
>>>ap_merge_per_dir_configs.
>>>
>>>For a typical configuration that has one <Directory> config for '/' and
>>>another for
>>>the document root path, it looks like all requests for real files will
>>>follow the
>>>same pattern:
>>>  * Initialize per_dir_defaults to be the default_lookups for the vhost
>>>matching
>>>     the request
>>>  * Merge the configs for '/'
>>>  * Merge the configs for the document root path
>>>
>>>I.e., the same merge of the / and /document/root configs is happening on
>>>every
>>>request.
>>>
>>>Assuming that this configuration style (with a <Directory> block for the
>>>document root path) is indeed common, would it be feasible to precompute
>>>the merge of the '/' configs with the /document/root configs after
>>>initialization
>>>to support optimized handling of this case in directory_walk?  In the
>>>implementation
>>>I'm thinking of, the logic in directory_walk would look something like:
>>>    if (path begins with sconf->ap_document_root) {
>>>          per_dir_defaults = precomputed merged configs for document root;
>>>          scan the rest of the path (after the docroot prefix) for
>>>possible additional dir matches;
>>>    }
>>>    else {
>>>        use the current algorithm;
>>>    }
>>>
>>>Would that sort of optimization make sense, or is it too special-purpose
>>>(or too
>>>incorrect, even) to be generally useful?
>>>
>>>Thanks,
>>>--Brian
>>>
>>>
>>>
>>
>
>
>_______________________________________________________________________________
>Ryan Bloom                        	rbb@apache.org
>406 29th St.
>San Francisco, CA 94131
>-------------------------------------------------------------------------------
>
>




Re: directory_walk performance

Posted by Brian Pane <bp...@pacbell.net>.
William A. Rowe, Jr. wrote:
[...]

>On Sat, 23 Jun 2001, dean gaudet wrote:
>
>>>that sounds good.
>>>
>>>it's just like constant-folding :)
>>>
>>>can you generalise it any?  Alias and mod_userdir can add more constant
>>>factors in the path.
>>>
>
>More complicated than that.  Think rewrite :-)
>
>Bill Stoddard and I chatted about this, our collective instinct is that we need
>to generally cache the intermediate results, and give the operator the choice of
>how quickly to invalidate it.
>
In general, I like this caching strategy.  My approach of pre-merging
the configs for all constant path matches is applicable on servers that
have been tuned to disable .htaccess files, but caching of intermediate
results can yield essentially the same optimization while still supporting
htaccess.

The one major concern I have about the caching of intermediate
results is that the mutex protecting the cache reads/writes will
become a scalability bottleneck on multiprocessors, but that's
not an insurmountable problem.

--Brian



Re: directory_walk performance

Posted by dean gaudet <dg...@arctic.org>.
On Mon, 25 Jun 2001, William A. Rowe, Jr. wrote:

> One consideration, depending on module authorship, is if a module
> author chooses to 'do something' in the directory merge operation,
> relative to the request (uri, query elements, etc.)  Does anyone know
> of such a module?

the merge functions have arguments (pool *, void *, void *) ... there's no
knowledge of the request.  this is pretty much deliberate... since they
need to be invoked during config time when no request exists.

...

the cache really should be of request URI -> merged request info (and more
generaly of the entire request header -> filled in request_rec).  i
started to go down this path when i was implementing the "flow cache".
it's very similar to what modern routers / packet filters do so that they
can avoid doing O(N) work on each packet.  (they cache patterns which
match packets which are allowed, and generally include the routing
information in the cache, so that the decision can be made in special
purpose hardware.  a central CPU only needs to decide what goes in the
cache -- the CPU implements the full O(N) search over the configuration.)

this is one of the many reasons that architectures like TUX make so much
sense.  (you know how enamoured i am of the generality of the apache
config language...)

below is the README from
<http://arctic.org/~dean/apache/2.0/flow-00.tar.gz>.

-dean

HTTP FLOWS

A "flow" is a pattern that maps a request to a response.  At a minimum
the pattern includes a URL.  More complex patterns are possible, such
as patterns involving the various "Accept" headers, a Cookie header,
or an Authorization header.

Pattern matching is very simple -- only exact matches are supported.

Patterns can be broken into three categories (possibly more):

    - object -- mapping URLs to filenames
    - negotiation -- distinguishing multiple mappings for the same URL
    - authorization -- permitting or forbidding a request based on the
	presence or absence of headers

In the interest of efficiency authorization patterns will be
shared between multiple flows.  It is easiest to think of this as an
implementation of the <Directory> containers.  If an entire tree /private
is protected by Basic auth, then there will be an authorization pattern
corresponding to /private.  Conceptually the pattern is essentially
a list of permitted Authorization headers, and request containing an
Authorization header in the list is permitted.

It is not feasible in general to pre-calculate all the flows for
a website.  In fact that's not interesting.  What is interesting is
dynamic calculation of flows.  Let's call a "flow cache" a set of flows
that describe a subset of the responses for a website.  If a flow is
in the cache then we know for certain what to respond with, and we can
immediately send the response without further processing.  If a flow
isn't in the cache then we need to perform more expensive processing,
and while doing that we may add a flow to the cache to handle future
requests matching similar patterns.

Consider this httpd.conf fragment:

    DocumentRoot /www/htdocs

    <Directory /www/htdocs/private>
	BrowserMatch Mozilla/2 go_away
	order allow,deny
	allow from all
	deny from env=go_away
    </Directory>

The directory container blocks access by User-Agent's containing
"Mozilla/2".  (Or something like that, can't be bothered to look up
the syntax.)

Initially the flow cache is empty.

Suppose we get a request for "/".  The flow processor can't find any
match, so it punts.  The full request processing is performed -- this is
pretty much the same as the current Apache API.  During this process we
calculate that the url "/" maps to the file "/www/htdocs/index.html",
and there are no alternatives to negotiate, and no authorization
to restrict access.  So we insert a flow with the pattern "/ ->
/www/htdocs/index.html" into the flow cache.

Actually we wouldn't quite insert that.  If we're running on WIN32
we'd insert a pattern "/" -> file_handle, where file_handle is an open
read-only handle for /www/htdocs/index.html.  This way we could call
TransmitFile directly.  Similarly on unix we'd use mmap().  Or maybe we'd
actually map it to a database object cached in memory... it's essentially
a mapping to a handler and an object.  Also as part of this map we insert
cached content-length, last-modified, and content-type values used to
build the response headers.

Similar when we get requests for "/logo.gif", or "/warez.txt" we can
insert them into the cache.

When a request comes for "/private/" we have to build an authentication
pattern.  An auth pattern is a list of headers and the values they must
match.  Suppose this first request for "/private/" comes from a mozilla/2
client.  We build an auth pattern for <Directory /www/htdocs/private>,
indicating that User-Agent must match one of {} ... must match an empty
list.  i.e. it will never match.  We will also build the "/private/"
-> /www/htdocs/private/index.html mapping, and it will refer to the
<Directory /www/htdocs/private> auth pattern.

If we then get a request for "/private/" from a "Lynx/2.6 libwww-FM/2.14"
client, it won't match in the flow cache.  But we will permit
it during the full request processing... and then we can insert
"Lynx/2.6 libwww-FM/2.14" into the User-Agent list for <Directory
/www/htdocs/private>.  Subsequent requests from that same user agent
to /private/ will match in the flow-cache and be served without full
processing.

We can continue this way, adding urls such as "/private/part.gif",
and "/private/memo.html".  They all share the same auth pattern, which
means they share updates to the list of permitted User-Agents.

Similar techniques can be used to build negotiation patterns.  We take
the simple approach in every case.  Rather than cache complex expressions
(such as the list of all negotiation possibilities for an object, with
q-values and whatever else), we cache the exact set of headers which are
acceptable and match them.  Anything which doesn't match goes through
the full processing (and that will insert a record to help match the
next time).  Similar to auth patterns, some effort should be put into
"coalescing" negotiation patterns.

For example, Navigator sends the same Accept/Accept-Language headers
on almost every request (the exact contents can be modified by the
user...).  We only need to recognize that set of headers, and then
any server negotiation involving those headers can refer to the
common pattern.

It's my feeling that a large percentage of requests fit into the above
mold.  There's some obvious extensions too, such as negative caching --
for example, a pattern denying an IP address could be a map to an error
handler and an error code.

WHY WHY WHY?

Why do we want to do all this?  Speed.  The prototype implementation
here is 40% to 50% faster than apache-nspr... this was with a hardwired
test across loopback, but without keep-alive.

Here's two reasons I can think of for the speed increase:

- Simplistic hand coded parser.  The parser doesn't handle all cases,
  it shouldn't have to.  It bails to the full request parser in any
  case it can't handle.  In fact it's so simple that it doesn't
  even have buffered i/o -- if the request isn't in the first packet
  sent from the client it bails and passes it down to full request
  parsing.

- Fewer threads.  The prototype has two threads.  One accepts
  connections, reads requests, and parses them.  The other handles
  sending responses.  Fewer threads means fewer context switches.
  But more importantly it means less memory required for things such
  as stacks... and that results in less L1/L2 cache impact... which
  can have a dramatic effect on performance.

CAVEATS

The prototype has a pre-seeded flow cache.  In practice there are
locking issues involved with seeding the flow cache.  Using a message
queue will allow us to batch updates and hopefully incur less locking
overhead.

The prototype and the discussion above don't deal with cache coherency.
This is an area I'm still working on.  A rough first approximation is
to flush the entire cache periodically.  There's a bunch of other
options... each pattern has different criteria for coherency.

Obviously doing this will require API changes, and configuration language
changes.  The above example would be extremely difficult to "compile"
into the form required for the flow cache.  But I think we can make a
user-friendly language that is easily compilable into flows on the fly.


Re: directory_walk performance

Posted by "William A. Rowe, Jr." <ad...@rowe-clan.net>.
From: <rb...@covalent.net>
Sent: Saturday, June 23, 2001 1:36 PM


> Just a heads up, because I know he's off-list this weekend.  Will Rowe has
> been looking at some of this stuff recently.  He has basically re-worked
> directory walk to take advantage of a lot of the stat() calls that have
> already been made.

I'm back online [and saner for having walked away from it all to the north woods
for a few days.]  I had meant to keep profiling this bit, but here it is for
initial comment (it's not tested sufficiently to commit it just yet.)

Rather than confuse the issue with an illegible patch (it is), what follows is 
the replacement of directory_walk (and several other pieces folded back in from
what used to be static functions).

More comments and thoughts follow...


> On Sat, 23 Jun 2001, dean gaudet wrote:
> 
> > that sounds good.
> >
> > it's just like constant-folding :)
> >
> > can you generalise it any?  Alias and mod_userdir can add more constant
> > factors in the path.

More complicated than that.  Think rewrite :-)

Bill Stoddard and I chatted about this, our collective instinct is that we need
to generally cache the intermediate results, and give the operator the choice of
how quickly to invalidate it.

This way, we can skip the merge every time the results are equal.

We also can skip testing the link, since we know there 'was' a directory of
/foo/bar/, we can presume it still is there (unless followsymlinks is restricted),
and can throw up if /foo/bar/bash reports "Invalid Directory".

One consideration, depending on module authorship, is if a module author chooses
to 'do something' in the directory merge operation, relative to the request (uri,
query elements, etc.)  Does anyone know of such a module?

> > On Fri, 22 Jun 2001, Brian Pane wrote:
> >
> > > More fun with gprof...
> > > directory_walk is one of the bigger consumers of user-mode CPU time in
> > > the current 2.0 httpd source, due largely to its calls to
> > > ap_merge_per_dir_configs.

And don't dismiss the stats that happened in path_info (backwards, so we take a
very slow boat to China for each non-existant cgi suffix.  Therefore the path
/viewcvs.cgi/module/path/file results in three slow ENOENTs.)  And the correction
of the path name on OS2/Win32/other case insensitive platforms.

This all needs to be rolled into one loop and optimized, as my experimental code
illustrates.

> > > For a typical configuration that has one <Directory> config for '/' and
> > > another for
> > > the document root path, it looks like all requests for real files will
> > > follow the
> > > same pattern:
> > >   * Initialize per_dir_defaults to be the default_lookups for the vhost
> > > matching
> > >      the request
> > >   * Merge the configs for '/'
> > >   * Merge the configs for the document root path
> > >
> > > I.e., the same merge of the / and /document/root configs is happening on
> > > every request.

Correct analysis, but remember symlinks are validated every step of the way.

> > > Assuming that this configuration style (with a <Directory> block for the
> > > document root path) is indeed common, would it be feasible to precompute
> > > the merge of the '/' configs with the /document/root configs after
> > > initialization
> > > to support optimized handling of this case in directory_walk?  In the
> > > implementation
> > > I'm thinking of, the logic in directory_walk would look something like:
> > >     if (path begins with sconf->ap_document_root) {
> > >           per_dir_defaults = precomputed merged configs for document root;
> > >           scan the rest of the path (after the docroot prefix) for
> > > possible additional dir matches;
> > >     }
> > >     else {
> > >         use the current algorithm;
> > >     }
> > >
> > > Would that sort of optimization make sense, or is it too special-purpose
> > > (or too incorrect, even) to be generally useful?

The risk is in the symlinks, we MUST test those as required.  And you need to
expire the cache, since .htaccess can even live in the root (with Options set
appropriately, of course.)

Most operators would want this flushed every 1-5 minutes, to give some sort of
feedback for themselves as they change .htaccess directives (perhaps as low as
every 5 seconds.)  But if your server answers 500 requests per second, even
2500 requests in seconds is a heck of a savings for caching!!!

So here is the uncached proposal to make the directory_walk a little more cross 
purpose, and a eliminate a ton of redundant stats on NT.  The optimizations
just aren't there yet, so I'm not ready to commit.


/// SNIPPED FROM server/request.c:


/*****************************************************************
 *
 * Getting and checking directory configuration.  Also checks the
 * FollowSymlinks and FollowSymOwner stuff, since this is really the
 * only place that can happen (barring a new mid_dir_walk callout).
 *
 * We can't do it as an access_checker module function which gets
 * called with the final per_dir_config, since we could have a directory
 * with FollowSymLinks disabled, which contains a symlink to another
 * with a .htaccess file which turns FollowSymLinks back on --- and
 * access in such a case must be denied.  So, whatever it is that
 * checks FollowSymLinks needs to know the state of the options as
 * they change, all the way down.
 */

AP_DECLARE(int) directory_walk(request_rec *r)
{
    core_server_config *sconf = ap_get_module_config(r->server->module_config,
                                                     &core_module);
    ap_conf_vector_t *per_dir_defaults = r->server->lookup_defaults;
    ap_conf_vector_t **sec_ent = (ap_conf_vector_t **) sconf->sec->elts;
    int num_sec = sconf->sec->nelts;
    int sec;
    unsigned int seg;
    int res;
    ap_conf_vector_t *entry_config;
    ap_conf_vector_t *this_conf;
    core_dir_config *entry_core;
    apr_status_t rv;
    apr_size_t buflen;
    char *seg_name;
    char *delim;

    /*
     * Are we dealing with a file? If not, we simply assume we have a 
     * handler that doesn't require one, but for safety's sake, and so
     * we have something find_types() can get something out of, fake 
     * one. But don't run through the directory entries.
     */

    if (r->filename == NULL) {
        r->filename = apr_pstrdup(r->pool, r->uri);
        r->finfo.filetype = APR_NOFILE;
        r->per_dir_config = per_dir_defaults;

        return OK;
    }

    /*
     * Go down the directory hierarchy.  Where we have to check for symlinks,
     * do so.  Where a .htaccess file has permission to override anything,
     * try to find one.  If either of these things fails, we could poke
     * around, see why, and adjust the lookup_rec accordingly --- this might
     * save us a call to get_path_info (with the attendant stat()s); however,
     * for the moment, that's not worth the trouble.
     *
     * r->path_info tracks the remaining source path.
     * r->filename  tracks the path as we build it.
     * we begin our adventure at the root...
     */
    r->path_info = r->filename;
    if ((rv = apr_filepath_merge(&r->path_info, NULL, r->filename, 
                                 APR_FILEPATH_NOTRELATIVE, r->pool)) 
                  == APR_SUCCESS) {
        char *buf;
        rv = apr_filepath_root(&r->filename, &r->path_info, 
                               APR_FILEPATH_TRUENAME, r->pool);
        buflen = strlen(r->filename) + strlen(r->path_info) + 1;
        buf = apr_palloc(r->pool, buflen);
        strcpy (buf, r->filename);
        r->filename = buf;
    }
    else {
        return HTTP_FORBIDDEN;
    }
    r->finfo.valid = APR_FINFO_TYPE;
    r->finfo.filetype = APR_DIR; /* It's the root, of course it's a dir */

    /* Fake filenames (i.e. proxy:) only match Directory sections.
     */
    if (rv == APR_EBADPATH)
    {
        const char *entry_dir;

        for (sec = 0; sec < num_sec; ++sec) {

            entry_config = sec_ent[sec];
            entry_core = ap_get_module_config(entry_config, &core_module);
            entry_dir = entry_core->d;

            this_conf = NULL;
            if (entry_core->r) {
                if (!ap_regexec(entry_core->r, r->filename, 0, NULL, 0))
                    this_conf = entry_config;
            }
            else if (entry_core->d_is_fnmatch) {
                /* XXX: Gut instinct tells me this could be very, very nasty,
                 * have we thought through what 'faux' resource might be
                 * case senstitive or not?
                 */
                if (!apr_fnmatch(entry_dir, r->filename, 0))
                    this_conf = entry_config;
            }
            else if (!strncmp(r->filename, entry_dir, strlen(entry_dir)))
                this_conf = entry_config;

            if (this_conf)
                per_dir_defaults = ap_merge_per_dir_configs(r->pool,
                                                            per_dir_defaults,
                                                            this_conf);
        }

        r->per_dir_config = per_dir_defaults;

        return OK;
    }

    /*
     * seg keeps track of which segment we've copied.
     * sec keeps track of which section we're on, since sections are
     *     ordered by number of segments. See core_reorder_directories 
     */
    seg = 1; 
    sec = 0;
    do {
        int overrides_here;
        core_dir_config *core_dir = ap_get_module_config(per_dir_defaults,
                                                         &core_module);
        
        /* We have no trailing slash, but we sure would appreciate one...
         */
        if (seg > 1)
            strcat(r->filename, "/");

        /* Begin *this* level by looking for matching <Directory> sections
         * from the server config.
         */
        for (; sec < num_sec; ++sec) {
            char *entry_dir;

            entry_config = sec_ent[sec];
            entry_core = ap_get_module_config(entry_config, &core_module);
            entry_dir = entry_core->d;

            /* No more possible matches? */
            if (entry_core->r || !entry_core->d_is_absolute
                              || entry_core->d_components > seg)
                break;

            this_conf = NULL;
            if (entry_core->d_is_fnmatch) {
                if (!apr_fnmatch(entry_dir, r->filename, FNM_PATHNAME | MATCH_FLAGS)) {
                    this_conf = entry_config;
                }
            }
            else if (!strcmp(r->filename, entry_dir))
                this_conf = entry_config;

            if (this_conf) {
                per_dir_defaults = ap_merge_per_dir_configs(r->pool,
                                                            per_dir_defaults,
                                                            this_conf);
                core_dir = ap_get_module_config(per_dir_defaults,
                                                &core_module);
            }
        }
        overrides_here = core_dir->override;

        /* If .htaccess files are enabled, check for one. */
        if (overrides_here) {
            ap_conf_vector_t *htaccess_conf = NULL;

            res = ap_parse_htaccess(&htaccess_conf, r, overrides_here,
                                    apr_pstrdup(r->pool, r->filename),
                                    sconf->access_name);
            if (res)
                return res;

            if (htaccess_conf) {
                per_dir_defaults = ap_merge_per_dir_configs(r->pool,
           per_dir_defaults,
           htaccess_conf);
           r->per_dir_config = per_dir_defaults;
           }
        }

        /* That temporary trailing slash was useful, now drop it.
         */
        if (seg > 1)
            r->filename[strlen(r->filename) - 1] = '\0';

        /* Time for all good things to come to an end?
         */
        if (!r->path_info || !*r->path_info)
            break;

        /* Now it's time for the next segment... 
         * We will assume the next element is an end node, and fix it up
         * below as necessary...
         */
        
        seg_name = strchr(r->filename, '\0');
        delim = strchr(r->path_info + (*r->path_info == '/' ? 1 : 0), '/');
        if (delim) {
            *delim = '\0';
            strcpy(seg_name, r->path_info);
            r->path_info = delim;
            *delim = '/';
        }
        else {
            strcpy(seg_name, r->path_info);
            r->path_info = strchr(r->path_info, '\0');
        }
        if (*seg_name == '/') ++seg_name;
        
        /* If nothing remained but a '/' string, we are finished
         */
        if (!*seg_name)
            break;

        /* XXX: Optimization required:
         * If...we have allowed symlinks, and
         * if...we find the segment exists in the directory list
         * skip the lstat and dummy up an APR_DIR value for r->finfo
         * this means case sensitive platforms go quite quickly.
         * Case insensitive platforms might be given the wrong path,
         * but if it's not found in the cache, then we know we have
         * something to test (the misspelling is never cached.)
         */

        /* We choose apr_lstat here, rather that apr_stat, so that we
         * capture this path object rather than it's target.  We will
         * replace the info with our target's info below.  We especially
         * want the name of this 'link' object, not the name of it's
         * target, if we are fixing case.
         */
        rv = apr_lstat(&r->finfo, r->filename, APR_FINFO_MIN | APR_FINFO_NAME, r->pool);

        if (APR_STATUS_IS_ENOENT(rv)) {
            /* Nothing?  That could be nice.  But our directory walk is done.
             */
            r->finfo.filetype = APR_NOFILE;
            break;
        }
        else if (APR_STATUS_IS_EACCES(rv)) {
                ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                              "access to %s denied", r->uri);
            return r->status = HTTP_FORBIDDEN;
        }
        else if ((rv != APR_SUCCESS && rv != APR_INCOMPLETE) 
                 || !(r->finfo.valid & APR_FINFO_TYPE)) {
            /* If we hit ENOTDIR, we must have over-optimized, deny 
             * rather than assume not found.
             */
            ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                          "access to %s failed", r->uri);
            return r->status = HTTP_FORBIDDEN;
        }
        else if (r->finfo.filetype != APR_NOFILE
              && r->finfo.filetype != APR_REG
              && r->finfo.filetype != APR_DIR
              && r->finfo.filetype != APR_LNK) {
            ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, 0, r,
                        "object is not a file, directory or symlink: %s",
                        r->filename);
            return r->status = HTTP_FORBIDDEN;
        }

        /* Fix up the path now if we have a name, and they don't agree
         */
        if ((r->finfo.valid & APR_FINFO_NAME) 
            && strcmp(seg_name, r->finfo.name)) {
            strcpy(seg_name, r->finfo.name);
            /* TODO: provide users an option that an internal/external
             * redirect is required here?
             */
        }

        if (r->finfo.filetype == APR_LNK) 
        {
            /* Is this an possibly acceptable symlink?
             */
            apr_finfo_t fi;
            if (!(core_dir->opts & (OPT_SYM_LINKS | OPT_SYM_OWNER))) {
                ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, 0, r,
                            "Symbolic link not allowed: %s", r->filename);
                return r->status = HTTP_FORBIDDEN;
            }

            if ((core_dir->opts & OPT_SYM_OWNER) 
                    && !(r->finfo.valid & APR_FINFO_OWNER)) {
                /* Ok, it wasn't so easy to learn the owner, (at least, it
                 * wasn't free info), so let's ask again... this may hurt
                 * performance-wise, but we need the name more often than
                 * we test symlinks on case-insensitive platforms.
                 */
                if (apr_lstat(&r->finfo, r->filename, APR_FINFO_OWNER, r->pool)
                        != APR_SUCCESS) {
                    ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, 0, r,
                                  "Symbolic link not allowed: %s", r->filename);
                    return r->status = HTTP_FORBIDDEN;
                }
            }

            /* let's find out the real deal...
             */
            rv = apr_stat(&fi, r->filename, APR_FINFO_MIN
                                          | (core_dir->opts & OPT_SYM_OWNER 
                                               ? APR_FINFO_OWNER : 0), r->pool);
            if (rv != APR_SUCCESS) {
                ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                              "Failed to stat symbolic link's target: %s", r->filename);
                return r->status = HTTP_FORBIDDEN;
            }

            if ((core_dir->opts & OPT_SYM_OWNER)
                && (apr_compare_users(fi.user, r->finfo.user) != APR_SUCCESS)) {
                ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, 0, r,
                              "Symbolic link's owner doesn't match target's owner: %s", 
                              r->filename);
                return r->status = HTTP_FORBIDDEN;
            }

            /* Ok, we are done with the link's info, test the real target
             */
            r->finfo = fi;

            if (r->finfo.filetype == APR_REG) {
                /* That was fun, nothing left for us here
                 */
                break;
            }
            else if (r->finfo.filetype != APR_DIR) {
                ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, 0, r,
                              "symlink doesn't point to a file or directory: %s",
                              r->filename);
                return r->status = HTTP_FORBIDDEN;
            }
        }

        ++seg;
    } while (r->finfo.filetype == APR_DIR);

    /*
     * There's two types of IS_SPECIAL sections (see http_core.c), and we've
     * already handled the proxy:-style stuff.  Now we'll deal with the
     * regexes.
     */
    for (; sec < num_sec; ++sec) {

        entry_config = sec_ent[sec];
        entry_core = ap_get_module_config(entry_config, &core_module);

        if (entry_core->r) {
            if (!ap_regexec(entry_core->r, r->filename, 0, NULL, REG_NOTEOL)) {
                per_dir_defaults = ap_merge_per_dir_configs(r->pool,
                                                            per_dir_defaults,
                                                            entry_config);
            }
        }
    }
    r->per_dir_config = per_dir_defaults;


/* It seems this shouldn't be needed anymore.  We translated the symlink above
 x  into a real resource, and should have died up there.  Even if we keep this,
 x  it needs more thought (maybe an r->file_is_symlink) perhaps it should actually
 x  happen in file_walk, so we catch more obscure cases in autoindex sub requests, etc.
 x
 x    * Symlink permissions are determined by the parent.  If the request is
 x    * for a directory then applying the symlink test here would use the
 x    * permissions of the directory as opposed to its parent.  Consider a
 x    * symlink pointing to a dir with a .htaccess disallowing symlinks.  If
 x    * you access /symlink (or /symlink/) you would get a 403 without this
 x    * S_ISDIR test.  But if you accessed /symlink/index.html, for example,
 x    * you would *not* get the 403.
 x
 x   if (r->finfo.filetype != APR_DIR
 x       && (res = check_symlinks(r->filename, ap_allow_options(r), r->pool))) {
 x       ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, 0, r,
 x                   "Symbolic link not allowed: %s", r->filename);
 x       return res;
 x   }
 */

    return OK;  /* 'no excuses' */
}






Re: directory_walk performance

Posted by rb...@covalent.net.
Just a heads up, because I know he's off-list this weekend.  Will Rowe has
been looking at some of this stuff recently.  He has basically re-worked
directory walk to take advantage of a lot of the stat() calls that have
already been made.  Brian, I would suggest sending him e-mail on Monday so
that you can take advantage of his work.  He says it's not quite ready for
public consumption yet, but I'm sure he'll send it if there is interest.
BTW, this idea is great, I would love to see a patch, especially if it can
be generalized at all!

Ryan

On Sat, 23 Jun 2001, dean gaudet wrote:

> that sounds good.
>
> it's just like constant-folding :)
>
> can you generalise it any?  Alias and mod_userdir can add more constant
> factors in the path.
>
> -dean
>
> On Fri, 22 Jun 2001, Brian Pane wrote:
>
> > More fun with gprof...
> > directory_walk is one of the bigger consumers of user-mode CPU time in
> > the current 2.0 httpd source, due largely to its calls to
> > ap_merge_per_dir_configs.
> >
> > For a typical configuration that has one <Directory> config for '/' and
> > another for
> > the document root path, it looks like all requests for real files will
> > follow the
> > same pattern:
> >   * Initialize per_dir_defaults to be the default_lookups for the vhost
> > matching
> >      the request
> >   * Merge the configs for '/'
> >   * Merge the configs for the document root path
> >
> > I.e., the same merge of the / and /document/root configs is happening on
> > every
> > request.
> >
> > Assuming that this configuration style (with a <Directory> block for the
> > document root path) is indeed common, would it be feasible to precompute
> > the merge of the '/' configs with the /document/root configs after
> > initialization
> > to support optimized handling of this case in directory_walk?  In the
> > implementation
> > I'm thinking of, the logic in directory_walk would look something like:
> >     if (path begins with sconf->ap_document_root) {
> >           per_dir_defaults = precomputed merged configs for document root;
> >           scan the rest of the path (after the docroot prefix) for
> > possible additional dir matches;
> >     }
> >     else {
> >         use the current algorithm;
> >     }
> >
> > Would that sort of optimization make sense, or is it too special-purpose
> > (or too
> > incorrect, even) to be generally useful?
> >
> > Thanks,
> > --Brian
> >
> >
> >
>
>


_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------


Re: directory_walk performance

Posted by dean gaudet <dg...@arctic.org>.
that sounds good.

it's just like constant-folding :)

can you generalise it any?  Alias and mod_userdir can add more constant
factors in the path.

-dean

On Fri, 22 Jun 2001, Brian Pane wrote:

> More fun with gprof...
> directory_walk is one of the bigger consumers of user-mode CPU time in
> the current 2.0 httpd source, due largely to its calls to
> ap_merge_per_dir_configs.
>
> For a typical configuration that has one <Directory> config for '/' and
> another for
> the document root path, it looks like all requests for real files will
> follow the
> same pattern:
>   * Initialize per_dir_defaults to be the default_lookups for the vhost
> matching
>      the request
>   * Merge the configs for '/'
>   * Merge the configs for the document root path
>
> I.e., the same merge of the / and /document/root configs is happening on
> every
> request.
>
> Assuming that this configuration style (with a <Directory> block for the
> document root path) is indeed common, would it be feasible to precompute
> the merge of the '/' configs with the /document/root configs after
> initialization
> to support optimized handling of this case in directory_walk?  In the
> implementation
> I'm thinking of, the logic in directory_walk would look something like:
>     if (path begins with sconf->ap_document_root) {
>           per_dir_defaults = precomputed merged configs for document root;
>           scan the rest of the path (after the docroot prefix) for
> possible additional dir matches;
>     }
>     else {
>         use the current algorithm;
>     }
>
> Would that sort of optimization make sense, or is it too special-purpose
> (or too
> incorrect, even) to be generally useful?
>
> Thanks,
> --Brian
>
>
>