You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2001/08/01 19:48:32 UTC
Re: directory_walk performance
Brian,
see my commit to core/request.c ... I've dropped in the new directory_walk
code.
I'd like us to optimize it for things we _know_ (e.g., it's given as a
<Directory > so skip the stat, unless FollowSymLinks isn't set, and assume
for the remainder of the request that it _is_ a directory) and then introduce
your optimizations. Hopefully both branches in the same source help us keep
running while we keep developing. Focus on the second directory_walk function
since the first will be deprecated and removed when we've got it right :)
Bill
----- Original Message -----
From: "Brian Pane" <bp...@pacbell.net>
To: "new-httpd" <ne...@apache.org>
Sent: Friday, June 22, 2001 4:03 PM
Subject: directory_walk performance
> More fun with gprof...
> directory_walk is one of the bigger consumers of user-mode CPU time in
> the current 2.0 httpd source, due largely to its calls to
> ap_merge_per_dir_configs.
>
> For a typical configuration that has one <Directory> config for '/' and
> another for
> the document root path, it looks like all requests for real files will
> follow the
> same pattern:
> * Initialize per_dir_defaults to be the default_lookups for the vhost
> matching
> the request
> * Merge the configs for '/'
> * Merge the configs for the document root path
>
> I.e., the same merge of the / and /document/root configs is happening on
> every
> request.
>
> Assuming that this configuration style (with a <Directory> block for the
> document root path) is indeed common, would it be feasible to precompute
> the merge of the '/' configs with the /document/root configs after
> initialization
> to support optimized handling of this case in directory_walk? In the
> implementation
> I'm thinking of, the logic in directory_walk would look something like:
> if (path begins with sconf->ap_document_root) {
> per_dir_defaults = precomputed merged configs for document root;
> scan the rest of the path (after the docroot prefix) for
> possible additional dir matches;
> }
> else {
> use the current algorithm;
> }
>
> Would that sort of optimization make sense, or is it too special-purpose
> (or too
> incorrect, even) to be generally useful?
>
> Thanks,
> --Brian
>
>
>
Re: [PATCH] Re: directory_walk performance
Posted by Brian Pane <bp...@pacbell.net>.
dean gaudet wrote:
>On Thu, 2 Aug 2001, Brian Pane wrote:
>
>>are there any modules that don't treat the base as const?
>>
>
>such modules would be broken. base is a const.
>
Good, so it should be safe to pre-merge the per-dir configs.
Does anybody have additional comments on the patch? I haven't
yet seen much feedback on the code.
Thanks
--Brian
Re: [PATCH] Re: directory_walk performance
Posted by dean gaudet <dg...@arctic.org>.
On Thu, 2 Aug 2001, Brian Pane wrote:
> are there any modules that don't treat the base as const?
such modules would be broken. base is a const.
-dean
Re: [PATCH] Re: directory_walk performance
Posted by Brian Pane <bp...@pacbell.net>.
William A. Rowe, Jr. wrote:
>From: "Brian Pane" <bp...@pacbell.net>
>Sent: Thursday, August 02, 2001 3:54 AM
>
>
>>William A. Rowe, Jr. wrote:
>>
>>> see my commit to core/request.c ... I've dropped in the new directory_walk
>>>code.
>>>
>>Here's a patch that implements my pre-merge optimization to
>>reduce (and in some cases completely eliminate) the calls to
>>ap_merge_per_dir_configs. Its impacts on directory_walk are
>>minor; all the real work happens in a new post-config-phase
>>function.
>>
>>I was able to generalize the pre-merge technique to support
>>even wildcard directories (something that my original prototype
>>couldn't handle correctly). The comments in core.c describe the
>>algorithm.
>>
>
>There is one _huge_, glaring omission (not in your patch).
>
>If we optimize an individual module dir_merge (see the tables in mod_mime)
>and the optimizer comes along and holds a partial merge, that cached merge
>_may_ become corrupted when we try merging along the complete context in
>the r pool.
>
>This change will require all dir_merge operaions to be told if it is merging
>a final (cached) copy, or an incremental (the request, or part of a cached
>copy that isn't the end-node). The dir_merge must be certain it doesn't modify
>anything in the working base if that base has come from the config rec or the
>cache. Contrawise, it _may_ modify the base if the caller will be discarding
>that base.
>
I don't think this is a new problem. It's never been safe for a
dir_merge to blindly modify the base. In directory_walk, the very
first call to ap_merge_per_dir_configs gets r->server->lookup_defaults
as its base, and the dir_merge shouldn't be modifying that. Mod_mime
looks okay in this regard, with its copy-on-write logic; are there any
modules that don't treat the base as const?
--Brian
Re: [PATCH] Re: directory_walk performance
Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Brian Pane" <bp...@pacbell.net>
Sent: Thursday, August 02, 2001 3:54 AM
> William A. Rowe, Jr. wrote:
>
> > see my commit to core/request.c ... I've dropped in the new directory_walk
> >code.
>
> Here's a patch that implements my pre-merge optimization to
> reduce (and in some cases completely eliminate) the calls to
> ap_merge_per_dir_configs. Its impacts on directory_walk are
> minor; all the real work happens in a new post-config-phase
> function.
>
> I was able to generalize the pre-merge technique to support
> even wildcard directories (something that my original prototype
> couldn't handle correctly). The comments in core.c describe the
> algorithm.
There is one _huge_, glaring omission (not in your patch).
If we optimize an individual module dir_merge (see the tables in mod_mime)
and the optimizer comes along and holds a partial merge, that cached merge
_may_ become corrupted when we try merging along the complete context in
the r pool.
This change will require all dir_merge operaions to be told if it is merging
a final (cached) copy, or an incremental (the request, or part of a cached
copy that isn't the end-node). The dir_merge must be certain it doesn't modify
anything in the working base if that base has come from the config rec or the
cache. Contrawise, it _may_ modify the base if the caller will be discarding
that base.
Bill
Re: [PATCH] Re: directory_walk performance
Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Brian Pane" <bp...@pacbell.net>
Sent: Thursday, August 02, 2001 3:54 AM
> Here's a patch that implements my pre-merge optimization to
> reduce (and in some cases completely eliminate) the calls to
> ap_merge_per_dir_configs. Its impacts on directory_walk are
> minor; all the real work happens in a new post-config-phase
> function.
Another glitch. If the source is from the cache, we can't trust that elements
of base or add allocated will still in memory when the cache is called :( Not
unless we also discard all decendants when we refresh the cache. This implies
that we need a protected refcount for each of the cached elements, not only by
the request that is relying on the data, but the straight down the chain.
Bill
[PATCH] Re: directory_walk performance
Posted by Brian Pane <bp...@pacbell.net>.
William A. Rowe, Jr. wrote:
>Brian,
>
> see my commit to core/request.c ... I've dropped in the new directory_walk
>code.
>
> I'd like us to optimize it for things we _know_ (e.g., it's given as a
><Directory > so skip the stat, unless FollowSymLinks isn't set, and assume
>for the remainder of the request that it _is_ a directory) and then introduce
>your optimizations. Hopefully both branches in the same source help us keep
>running while we keep developing. Focus on the second directory_walk function
>since the first will be deprecated and removed when we've got it right :)
>
>Bill
>
Here's a patch that implements my pre-merge optimization to
reduce (and in some cases completely eliminate) the calls to
ap_merge_per_dir_configs. Its impacts on directory_walk are
minor; all the real work happens in a new post-config-phase
function.
I was able to generalize the pre-merge technique to support
even wildcard directories (something that my original prototype
couldn't handle correctly). The comments in core.c describe the
algorithm.
--Brian
Index: include/http_core.h
===================================================================
RCS file: /home/cvspublic/httpd-2.0/include/http_core.h,v
retrieving revision 1.46
diff -u -r1.46 http_core.h
--- include/http_core.h 2001/07/30 18:51:57 1.46
+++ include/http_core.h 2001/08/02 08:39:09
@@ -492,6 +492,14 @@
char *access_name;
apr_array_header_t *sec;
apr_array_header_t *sec_url;
+
+ /* Pre-merged per-dir configs */
+#define NUM_PRE_MERGE_DIRS 6
+ struct {
+ const char *dirname;
+ ap_conf_vector_t *conf;
+ } pre_merged_dir_conf[1 << (NUM_PRE_MERGE_DIRS - 1)];
+
} core_server_config;
/* for http_config.c */
Index: server/core.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/core.c,v
retrieving revision 1.32
diff -u -r1.32 core.c
--- server/core.c 2001/08/01 19:15:22 1.32
+++ server/core.c 2001/08/02 08:39:11
@@ -3295,6 +3295,166 @@
ap_set_version(pconf);
}
+
+static int is_possible_predecessor(const char *dir1, const char *dir2)
+{
+ char c1, c2;
+ while ((c1 = *dir1++)) {
+ c2 = *dir2++;
+ if (c1 != c2) {
+ if ((c1 == '*') || (c2 == '*')) {
+ while ((c1 = *dir1) && (c1 != '/'))
+ dir1++;
+ while ((c2 = *dir2) && (c2 != '/'))
+ dir2++;
+ }
+ else {
+ return 0;
+ }
+ }
+ }
+ return 1;
+}
+
+
+/* Pre-merge the per-directory configurations, to avoid the
+ * overhead of doing this for each request (see directory_walk
+ * in request.c).
+ *
+ * Background:
+ *
+ * There are 'n' per-directory configurations. The directory_walk()
+ * function scans through them from i=0 through i=n-1 and either merges
+ * 'per-dir-config[i]' into its pending request configuration or
+ * leaves that config out (depending on whether the directory name
+ * associated with per-dir-config[i] matches the requested file
+ * or not).
+ *
+ * It's possible to represent the set of merges done for a request
+ * as a vector of bits, V[0] through V[n-1]. V[i] is 1 if
+ * per-dir-config[i] is used for the requests, or 0 if it isn't.
+ * Conceptually, pre_merge_per_dir_configs() is responsible for
+ * pre-building the final configuration associated with each
+ * possible value of this bit vector. Given this pre-merged
+ * config structure, directory_walk() can just look up the
+ * end result instead of actually doing all of the merging at
+ * request time.
+ *
+ * In the general case, it's not feasible to precompute all
+ * permutations of V, because there are O(2^n) possibilities.
+ * Thus pre_merge_per_dir_configs() computes the pre-merge
+ * for the first NUM_PRE_MERGE_DIRS directories. It still
+ * uses O(2^m) storage, but m is constrained to a small value.
+ * As an additional space optimization, the function detects
+ * impossible permutations (e.g., if you have <Directory>
+ * blocks for /foo and /bar/*, their configs will never actually
+ * be merged) and doesn't compute the merged configs for these.
+ */
+static void pre_merge_per_dir_configs(apr_pool_t *p, server_rec *s)
+{
+ core_server_config *sconf;
+ apr_array_header_t *sec;
+ int nelts;
+ ap_conf_vector_t **elts;
+ int num_bits;
+ int i;
+ int max;
+ int mask;
+ int threshold;
+ int j;
+
+ sconf = ap_get_module_config(s->module_config, &core_module);
+ sec = sconf->sec;
+ nelts = sec->nelts;
+ elts = (ap_conf_vector_t **)sec->elts;
+
+ num_bits =
+ (sec->nelts < NUM_PRE_MERGE_DIRS) ? sec->nelts : NUM_PRE_MERGE_DIRS;
+ max = (1 << (num_bits - 1));
+
+ sconf->pre_merged_dir_conf[0].conf = s->lookup_defaults;
+ sconf->pre_merged_dir_conf[0].dirname = "";
+ mask = 1;
+ threshold = 2;
+ j = 0;
+ /* i holds the value of the vector 'V' in the
+ * algorithm described in the comments at the top
+ * of this function. To make the code simpler, the
+ * bits are 'backwards'; the low order bit of i is V[0].
+ * sconf->pre_merged_dir_conf is a vector of 2^NUM_PRE_MERGE_DIRS
+ * elements; for any permutation P of the bits in V,
+ * sconf->pre_merged_dir_conf[P] contains the merged
+ * per-dir configuration corresponding to P.
+ */
+ for (i = 1; i < max; i++) {
+ int parent;
+ if (i == threshold) {
+ mask = threshold;
+ threshold <<= 1;
+ j++;
+ }
+
+ /* The bit vector 'V' can be thought of as a tree;
+ * the configuration for a node can be computed
+ * incrementally by merging with its parent
+ */
+ parent = i & (mask - 1);
+
+ if (sconf->pre_merged_dir_conf[parent].conf == NULL) {
+ /* If the parent node represents an unreachable
+ * state, so does the child, so don't bother
+ * computing it
+ */
+ continue;
+ }
+ if (i & mask) {
+ /* A 1 value in the new high-order bit corresponds to
+ * a permutation in which the j'th per-directory config
+ * is a match against the requested filename. Thus we
+ * should compute the merge of the j'th config with
+ * the config stored in the parent node. However,
+ * if the j'th config corresponds to a <Directory>
+ * path that can't possibly match a filename that
+ * matched the parent node, then we needn't waste
+ * time or memory computing it
+ */
+ core_dir_config *core_elt = ap_get_module_config(elts[j],
+ &core_module);
+ const char *dirname = core_elt->d;
+ const char *parent_dirname =
+ sconf->pre_merged_dir_conf[parent].dirname;
+ if (is_possible_predecessor(parent_dirname, dirname)) {
+ sconf->pre_merged_dir_conf[i].conf =
+ ap_merge_per_dir_configs(p,
+ sconf->pre_merged_dir_conf[parent].conf, elts[j]);
+ sconf->pre_merged_dir_conf[i].dirname = dirname;
+ }
+ }
+ else {
+ /* A 0 value in the new high-order bit corresponds to
+ * a permutation in which the j'th per-directory config
+ * is not matched. Thus the merged configuration for
+ * this node is the same as that for its parent, and
+ * we can copy the pointer rather than making another
+ * copy of the merged config
+ */
+ sconf->pre_merged_dir_conf[i] = sconf->pre_merged_dir_conf[parent];
+ }
+ }
+}
+
+static void core_post_config_merge(apr_pool_t *p, apr_pool_t *plog,
+ apr_pool_t *ptemp, server_rec *s)
+{
+ server_rec *virt;
+
+ for (virt = s->next; virt; virt = virt->next) {
+ pre_merge_per_dir_configs(p, virt);
+ }
+ pre_merge_per_dir_configs(p, s);
+}
+
+
static void core_open_logs(apr_pool_t *pconf, apr_pool_t *plog,
apr_pool_t *ptemp, server_rec *s)
{
ap_open_logs(s, pconf);
@@ -3339,6 +3499,7 @@
static void register_hooks(apr_pool_t *p)
{
ap_hook_post_config(core_post_config,NULL,NULL,APR_HOOK_REALLY_FIRST);
+
ap_hook_post_config(core_post_config_merge,NULL,NULL,APR_HOOK_REALLY_LAST);
ap_hook_translate_name(ap_core_translate,NULL,NULL,APR_HOOK_REALLY_LAST);
ap_hook_open_logs(core_open_logs,NULL,NULL,APR_HOOK_MIDDLE);
ap_hook_handler(default_handler,NULL,NULL,APR_HOOK_REALLY_LAST);
Index: server/request.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/request.c,v
retrieving revision 1.18
diff -u -r1.18 request.c
--- server/request.c 2001/08/01 11:59:55 1.18
+++ server/request.c 2001/08/02 08:39:12
@@ -717,6 +717,7 @@
int num_sec = sconf->sec->nelts;
int sec;
unsigned int seg;
+ unsigned int bitmask;
int res;
ap_conf_vector_t *entry_config;
ap_conf_vector_t *this_conf;
@@ -814,9 +815,12 @@
* seg keeps track of which segment we've copied.
* sec keeps track of which section we're on, since sections are
* ordered by number of segments. See core_reorder_directories
+ * bitmask keeps track of which per-dir configs we've merged...
+ * given a list of 'n'
*/
seg = 1;
sec = 0;
+ bitmask = 0;
do {
int overrides_here;
core_dir_config *core_dir = ap_get_module_config(per_dir_defaults,
@@ -852,9 +856,23 @@
this_conf = entry_config;
if (this_conf) {
- per_dir_defaults = ap_merge_per_dir_configs(r->pool,
-
per_dir_defaults,
- this_conf);
+ if (sec < NUM_PRE_MERGE_DIRS) {
+ bitmask |= (1 << sec);
+ if (sconf->pre_merged_dir_conf[bitmask].conf) {
+ per_dir_defaults =
+ sconf->pre_merged_dir_conf[bitmask].conf;
+ }
+ else {
+ per_dir_defaults =
+ ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+ this_conf);
+ }
+ }
+ else {
+ per_dir_defaults =
+ ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+ this_conf);
+ }
core_dir = ap_get_module_config(per_dir_defaults,
&core_module);
}
@@ -1008,9 +1026,23 @@
if (entry_core->r) {
if (!ap_regexec(entry_core->r, r->filename, 0, NULL,
REG_NOTEOL)) {
- per_dir_defaults = ap_merge_per_dir_configs(r->pool,
-
per_dir_defaults,
- entry_config);
+ if (sec < NUM_PRE_MERGE_DIRS) {
+ bitmask |= (1 << sec);
+ if (sconf->pre_merged_dir_conf[bitmask].conf) {
+ per_dir_defaults =
+ sconf->pre_merged_dir_conf[bitmask].conf;
+ }
+ else {
+ per_dir_defaults =
+ ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+ entry_config);
+ }
+ }
+ else {
+ per_dir_defaults =
+ ap_merge_per_dir_configs(r->pool, per_dir_defaults,
+ entry_config);
+ }
}
}
}