You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by Joe Schaefer <jo...@sunstarsys.com> on 2003/05/17 03:19:07 UTC

[PATCH] Re: Frankentables

Joe Schaefer <jo...@sunstarsys.com> writes:

[...]

> There's a crude set of table tests in httpd-apreq-2/t/performance.c,
> but basically what they tell me is that
> 
>   1) Getting rid of the temporary table in ap_get_mime_headers_core
>      is a big win.
> 
>   2) For tables whose entries can be distinguished by first letter
>      alone, the current implementation is faster. I haven't benchmarked
>      the addn difference yet, but the penalty on the new code
>      is certainly more substantial there.

I haven't out a figured a way to make my tree implementation faster
(overall) than the current table implementation wrt httpd, but the 
following patch incorporates the above ideas into the current code.

oprofile tells me this version wins by ~10 - 25% when measuring the
time httpd spends in table ops.  To get the full benefit, 
ap_get_mime_headers_core in httpd-2.0/server/protocol.c also 
needs a patch.

The first patch is for httpd-2.0, the other two are for apr_tables.

-- Joe Schaefer

Index: server/protocol.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/protocol.c,v
retrieving revision 1.131
diff -u -r1.131 protocol.c
--- server/protocol.c   15 Apr 2003 22:47:57 -0000      1.131
+++ server/protocol.c   16 May 2003 23:47:33 -0000
@@ -705,10 +705,6 @@
     char *value;
     apr_size_t len;
     int fields_read = 0;
-    apr_table_t *tmp_headers;
-
-    /* We'll use apr_table_overlap later to merge these into r->headers_in. */
-    tmp_headers = apr_table_make(r->pool, 50);

     /*
      * Read header lines until we get the empty separator line, a read error,
@@ -796,7 +792,7 @@
                     ++value;            /* Skip to start of value   */
                 }

-                apr_table_addn(tmp_headers, last_field, value);
+                apr_table_addn(r->headers_in, last_field, value);

                 /* reset the alloc_len so that we'll allocate a new
                  * buffer if we have to do any more folding: we can't
@@ -822,8 +818,7 @@
             last_len = len;
         }
     }
-
-    apr_table_overlap(r->headers_in, tmp_headers, APR_OVERLAP_TABLES_MERGE);
+    apr_table_compress(r->headers_in);
 }

 AP_DECLARE(void) ap_get_mime_headers(request_rec *r)
 



Index: include/apr_tables.h
===================================================================
RCS file: /home/cvspublic/apr/include/apr_tables.h,v
retrieving revision 1.37
diff -u -r1.37 apr_tables.h
--- include/apr_tables.h	5 Mar 2003 21:22:26 -0000	1.37
+++ include/apr_tables.h	17 May 2003 01:02:19 -0000
@@ -224,6 +224,16 @@
 				      const char sep);
 
 /**
+ * Same as apr_array_pstrcat, but takes a (char *) as third argument.
+ * This allows the array elements to be joined on a string instead of
+ * a single character.
+ */
+
+APR_DECLARE(char *)apr_array_pstrjoin(apr_pool_t *p,
+                                      const apr_array_header_t *arr,
+                                      const char *sep);
+
+/**
  * Make a new table
  * @param p The pool to allocate the pool out of
  * @param nelts The number of elements in the initial table.
@@ -242,6 +252,31 @@
                                           const apr_table_t *t);
 
 /**
+ * Get/set method for the table's value copier.
+ * @param t Table.
+ * @param c The new t->copy callback.  c = NULL is ignored;
+ *          a non-NULL value replaces the table's internal copier.
+ * @return The original t->copy callback (prior to any assignment).
+ */
+typedef char *(apr_table_copier_t)(apr_pool_t *p, const char *val);
+
+APR_DECLARE(apr_table_copier_t *) apr_table_copier(apr_table_t *t, 
+                                                   apr_table_copier_t *c);
+
+/**
+ * Get/set method for the table's merger callback.
+ * @param t Table.
+ * @param m The new t->merge callback.  m = NULL is ignored;
+ *          a non-NULL value replaces the table's internal merger.
+ * @return The original t->merge callback (prior to any assignment).
+ */
+typedef char *(apr_table_merger_t)(apr_pool_t *p, 
+                                   const apr_array_header_t *a);
+
+APR_DECLARE(apr_table_merger_t *) apr_table_merger(apr_table_t *t,
+                                                   apr_table_merger_t *m);
+
+/**
  * Delete all of the elements from a table
  * @param t The table to clear
  */
@@ -334,6 +369,25 @@
  */
 APR_DECLARE(void) apr_table_addn(apr_table_t *t, const char *key,
                                  const char *val);
+
+/**
+ * Eliminates redunandant entries with the apr_table_merger_t 
+ * callback associated to t. Use apr_table_merger() to access
+ * and alter this callback.
+ *
+ * @param t Table.
+ */
+APR_DECLARE(void) apr_table_compress(apr_table_t *t);
+
+/**
+ * Append one table to the end of another.
+ * @param t The table to be modified.
+ * @param s The entries from this table are added to "t".
+ * @remark This function overlays the internal indices from "s"
+ * onto "t", so it shoul be faster than iterating over s with apr_table_addn.
+ * In any case, the result should be identical.
+ */
+APR_DECLARE(void) apr_table_cat(apr_table_t *t, const apr_table_t *s);
 
 /**
  * Merge two tables into one new table
Index: tables/apr_tables.c
===================================================================
RCS file: /home/cvspublic/apr/tables/apr_tables.c,v
retrieving revision 1.47
diff -u -r1.47 apr_tables.c
--- tables/apr_tables.c	4 Apr 2003 12:05:53 -0000	1.47
+++ tables/apr_tables.c	17 May 2003 01:02:22 -0000
@@ -249,6 +249,40 @@
     return res;
 }
 
+APR_DECLARE(char *)apr_array_pstrjoin(apr_pool_t *p,
+                                      const apr_array_header_t *arr,
+                                      const char *sep)
+{
+    apr_size_t len;
+    const char *src, **a = (const char **)arr->elts;
+    char *dest, *rv;
+    const int n = arr->nelts;
+    int j;
+
+    if (n == 0)
+        return apr_pcalloc(p, 1);
+
+    for (j=0, len=0; j<n; ++j)
+        if (a[j] != NULL)
+            len += strlen(a[j]);
+
+    rv = apr_palloc(p, len + 1 + strlen(sep)*(n-1));
+
+    /* ~ strcpy: this leaves "dest" & "src" pointing at nul-characters */
+    for (dest = rv, src = a[0]; (*dest = *src); ++src, ++dest)
+        ;
+
+    for (j=1; j<n; ++j) {
+        for (src = sep; (*dest = *src); ++src, ++dest)
+            ;
+        if (a[j] != NULL)
+            for (src = a[j]; (*dest = *src); ++src, ++dest)
+                ;
+    }
+
+    return rv;
+}
+
 /* apr_array_pstrcat generates a new string from the apr_pool_t containing
  * the concatenated sequence of substrings referenced as elements within
  * the array.  The string will be empty if all substrings are empty or null,
@@ -259,55 +293,8 @@
 				     const apr_array_header_t *arr,
 				     const char sep)
 {
-    char *cp, *res, **strpp;
-    apr_size_t len;
-    int i;
-
-    if (arr->nelts <= 0 || arr->elts == NULL) {    /* Empty table? */
-        return (char *) apr_pcalloc(p, 1);
-    }
-
-    /* Pass one --- find length of required string */
-
-    len = 0;
-    for (i = 0, strpp = (char **) arr->elts; ; ++strpp) {
-        if (strpp && *strpp != NULL) {
-            len += strlen(*strpp);
-        }
-        if (++i >= arr->nelts) {
-            break;
-	}
-        if (sep) {
-            ++len;
-	}
-    }
-
-    /* Allocate the required string */
-
-    res = (char *) apr_palloc(p, len + 1);
-    cp = res;
-
-    /* Pass two --- copy the argument strings into the result space */
-
-    for (i = 0, strpp = (char **) arr->elts; ; ++strpp) {
-        if (strpp && *strpp != NULL) {
-            len = strlen(*strpp);
-            memcpy(cp, *strpp, len);
-            cp += len;
-        }
-        if (++i >= arr->nelts) {
-            break;
-	}
-        if (sep) {
-            *cp++ = sep;
-	}
-    }
-
-    *cp = '\0';
-
-    /* Return the result string */
-
-    return res;
+    char s[2] = {sep,0};
+    return apr_array_pstrjoin(p, arr, s);
 }
 
 
@@ -382,9 +369,12 @@
      *     of index_initialized will be zero.  (Check this before
      *     trying to use index_first[i] or index_last[i]!)
      */
+    apr_table_copier_t *copy;
+    apr_table_merger_t *merge;
     apr_uint32_t index_initialized;
     int index_first[TABLE_HASH_SIZE];
     int index_last[TABLE_HASH_SIZE];
+
 };
 
 /*
@@ -413,6 +403,12 @@
     return ((t == NULL) || (t->a.nelts == 0));
 }
 
+static char *merge_values(apr_pool_t *p, const apr_array_header_t *a)
+{
+
+    return apr_array_pstrjoin(p, a, ", ");
+}
+
 APR_DECLARE(apr_table_t *) apr_table_make(apr_pool_t *p, int nelts)
 {
     apr_table_t *t = apr_palloc(p, sizeof(apr_table_t));
@@ -421,6 +417,8 @@
 #ifdef MAKE_TABLE_PROFILE
     t->creator = __builtin_return_address(0);
 #endif
+    t->copy = apr_pstrdup;
+    t->merge = merge_values;
     t->index_initialized = 0;
     return t;
 }
@@ -441,12 +439,36 @@
     make_array_core(&new->a, p, t->a.nalloc, sizeof(apr_table_entry_t), 0);
     memcpy(new->a.elts, t->a.elts, t->a.nelts * sizeof(apr_table_entry_t));
     new->a.nelts = t->a.nelts;
+    new->copy = t->copy;
+    new->merge = t->merge;
     memcpy(new->index_first, t->index_first, sizeof(int) * TABLE_HASH_SIZE);
     memcpy(new->index_last, t->index_last, sizeof(int) * TABLE_HASH_SIZE);
     new->index_initialized = t->index_initialized;
     return new;
 }
 
+APR_DECLARE(apr_table_copier_t *)apr_table_copier(apr_table_t *t, 
+                                                  apr_table_copier_t *c)
+{
+    apr_table_copier_t *original = t->copy;
+
+    if (c != NULL)
+        t->copy = c;
+    return original;
+}
+
+
+APR_DECLARE(apr_table_merger_t *)apr_table_merger(apr_table_t *t, 
+                                                  apr_table_merger_t *m)
+{
+    apr_table_merger_t *original = t->merge;
+
+    if (m != NULL)
+        t->merge = m;
+    return original;
+}
+
+
 static void table_reindex(apr_table_t *t)
 {
     int i;
@@ -464,6 +486,81 @@
     }
 }
 
+static void bury_table_entries(apr_table_t *t, apr_array_header_t *q)
+{
+    apr_table_entry_t *const o = (apr_table_entry_t *)t->a.elts;
+    int j, m, n, *p;
+
+    /* add sentinel */
+    *(int *)apr_array_push(q) = t->a.nelts; 
+    q->nelts--;
+    p = (int *)q->elts;
+
+    /* remove entries O(t->nelts - p[o]) */
+    for (j=1, n=0; n < q->nelts; ++j, ++n)
+        for (m = p[n]+1; m < p[n+1]; ++m)
+            m[o-j] = m[o];
+
+    t->a.nelts -= q->nelts;
+    table_reindex(t);
+}
+
+#define DEFAULT_NELTS 8
+
+APR_DECLARE(void)apr_table_compress(apr_table_t *t)
+{
+    int g[DEFAULT_NELTS];
+    apr_array_header_t ghosts = { t->a.pool, sizeof(int), 0,
+                                  DEFAULT_NELTS, (char *)g };
+    char *a[DEFAULT_NELTS];
+    apr_array_header_t arr = {t->a.pool, sizeof(char *), 0,
+                              DEFAULT_NELTS, (char *)a};
+
+    apr_table_entry_t *const table_start = (apr_table_entry_t *)t->a.elts;
+    apr_table_entry_t *const table_end   = table_start + t->a.nelts;
+    apr_table_entry_t *next_elt;
+
+    /* mark & sweep:
+     * Iterate through the table, marking duplicates by setting their
+     * checksum to -1 (-1 is "greater" than CASE_MASK as unsigned integer 
+     * types).  The process is quadratic in time, but linear in space.
+     * Once the duplicates are marked, the original value is replaced
+     * by merging it with the rest.
+     */
+
+    for (next_elt = table_start; next_elt < table_end; ++next_elt) {
+        const apr_uint32_t checksum = next_elt->key_checksum;
+        const char *const key = next_elt->key;
+
+        apr_table_entry_t *e = next_elt;
+        const apr_table_entry_t *const end = table_start + 
+            t->index_last[TABLE_HASH(next_elt->key)];
+
+        if (checksum == -1) {
+            *(int *)apr_array_push(&ghosts) = next_elt - table_start;
+            continue;
+        }
+
+        arr.nelts = 0;
+        *(const char **)apr_array_push(&arr) = e->val;
+
+        while (++e <= end) {
+            if ((checksum == e->key_checksum) &&
+                !strcasecmp(e->key, key)) {
+                e->key_checksum = -1;
+                *(const char **)apr_array_push(&arr) = e->val;
+            }
+        }
+
+        if (arr.nelts > 1) {
+            next_elt->val = t->merge(t->a.pool, &arr);
+        }
+    }
+
+    if (ghosts.nelts)
+        bury_table_entries(t, &ghosts);
+}
+
 APR_DECLARE(void) apr_table_clear(apr_table_t *t)
 {
     t->a.nelts = 0;
@@ -528,7 +625,7 @@
             int must_reindex = 0;
             apr_table_entry_t *dst_elt = NULL;
 
-            next_elt->val = apr_pstrdup(t->a.pool, val);
+            next_elt->val = t->copy(t->a.pool, val);
 
             /* Remove any other instances of this key */
             for (next_elt++; next_elt <= end_elt; next_elt++) {
@@ -567,7 +664,7 @@
     t->index_last[hash] = t->a.nelts;
     next_elt = (apr_table_entry_t *) table_push(t);
     next_elt->key = apr_pstrdup(t->a.pool, key);
-    next_elt->val = apr_pstrdup(t->a.pool, val);
+    next_elt->val = t->copy(t->a.pool, val);
     next_elt->key_checksum = checksum;
 }
 
@@ -702,8 +799,12 @@
 {
     apr_table_entry_t *next_elt;
     apr_table_entry_t *end_elt;
+    apr_table_entry_t *table_end;
     apr_uint32_t checksum;
     int hash;
+    int a[DEFAULT_NELTS];
+    apr_array_header_t arr = { t->a.pool, sizeof(int), 0,
+                               DEFAULT_NELTS, (char *)a };
 
     COMPUTE_KEY_CHECKSUM(key, checksum);
     hash = TABLE_HASH(key);
@@ -714,14 +815,52 @@
     }
     next_elt = ((apr_table_entry_t *) t->a.elts) + t->index_first[hash];
     end_elt = ((apr_table_entry_t *) t->a.elts) + t->index_last[hash];
+    table_end = ((apr_table_entry_t *) t->a.elts) + t->a.nelts;
 
     for (; next_elt <= end_elt; next_elt++) {
 	if ((checksum == next_elt->key_checksum) &&
             !strcasecmp(next_elt->key, key)) {
+            int must_reindex = 0;
+            apr_table_entry_t *dst_elt = NULL;
+            apr_table_entry_t *match_elt = next_elt;
+
+            *(char **)apr_array_push(&arr) = next_elt->val;
+
+            /* Remove any other instances of this key */
+            for (next_elt++; next_elt <= end_elt; next_elt++) {
+                if ((checksum == next_elt->key_checksum) &&
+                    !strcasecmp(next_elt->key, key)) {
+                    t->a.nelts--;
+                    *(const char **)apr_array_push(&arr) = next_elt->val;
+                    if (!dst_elt) {
+                        dst_elt = next_elt;
+                    }
+                }
+                else if (dst_elt) {
+                    *dst_elt++ = *next_elt;
+                    must_reindex = 1;
+                }
+            }
+
+
+            /* If we've removed anything, shift over the remainder
+             * of the table (note that the previous loop didn't
+             * run to the end of the table, just to the last match
+             * for the index)
+             */
+            if (dst_elt) {
+                for (; next_elt < table_end; next_elt++) {
+                    *dst_elt++ = *next_elt;
+                }
+                must_reindex = 1;
+            }
+            if (must_reindex) {
+                table_reindex(t);
+            }
 
             /* Found an existing entry with the same key, so merge with it */
-	    next_elt->val = apr_pstrcat(t->a.pool, next_elt->val, ", ",
-                                        val, NULL);
+            *(const char **)apr_array_push(&arr) = t->copy(t->a.pool, val);
+            match_elt->val = t->merge(t->a.pool, &arr);
             return;
         }
     }
@@ -730,7 +869,7 @@
     t->index_last[hash] = t->a.nelts;
     next_elt = (apr_table_entry_t *) table_push(t);
     next_elt->key = apr_pstrdup(t->a.pool, key);
-    next_elt->val = apr_pstrdup(t->a.pool, val);
+    next_elt->val = t->copy(t->a.pool, val);
     next_elt->key_checksum = checksum;
 }
 
@@ -739,8 +878,12 @@
 {
     apr_table_entry_t *next_elt;
     apr_table_entry_t *end_elt;
+    apr_table_entry_t *table_end;
     apr_uint32_t checksum;
     int hash;
+    int a[DEFAULT_NELTS];
+    apr_array_header_t arr = { t->a.pool, sizeof(int), 0,
+                               DEFAULT_NELTS, (char *)a };
 
 #ifdef POOL_DEBUG
     {
@@ -764,14 +907,51 @@
     }
     next_elt = ((apr_table_entry_t *) t->a.elts) + t->index_first[hash];;
     end_elt = ((apr_table_entry_t *) t->a.elts) + t->index_last[hash];
-
+    table_end = ((apr_table_entry_t *) t->a.elts) + t->a.nelts;
     for (; next_elt <= end_elt; next_elt++) {
 	if ((checksum == next_elt->key_checksum) &&
             !strcasecmp(next_elt->key, key)) {
+            int must_reindex = 0;
+            apr_table_entry_t *dst_elt = NULL;
+            apr_table_entry_t *match_elt = next_elt;
+
+            *(const char **)apr_array_push(&arr) = next_elt->val;
+
+            /* Remove any other instances of this key */
+            for (next_elt++; next_elt <= end_elt; next_elt++) {
+                if ((checksum == next_elt->key_checksum) &&
+                    !strcasecmp(next_elt->key, key)) {
+                    t->a.nelts--;
+                    *(const char **)apr_array_push(&arr) = next_elt->val;
+                    if (!dst_elt) {
+                        dst_elt = next_elt;
+                    }
+                }
+                else if (dst_elt) {
+                    *dst_elt++ = *next_elt;
+                    must_reindex = 1;
+                }
+            }
+
+
+            /* If we've removed anything, shift over the remainder
+             * of the table (note that the previous loop didn't
+             * run to the end of the table, just to the last match
+             * for the index)
+             */
+            if (dst_elt) {
+                for (; next_elt < table_end; next_elt++) {
+                    *dst_elt++ = *next_elt;
+                }
+                must_reindex = 1;
+            }
+            if (must_reindex) {
+                table_reindex(t);
+            }
 
             /* Found an existing entry with the same key, so merge with it */
-	    next_elt->val = apr_pstrcat(t->a.pool, next_elt->val, ", ",
-                                        val, NULL);
+            *(const char **)apr_array_push(&arr) = val;
+            match_elt->val = t->merge(t->a.pool, &arr);
             return;
         }
     }
@@ -835,13 +1015,41 @@
     elts->key = (char *)key;
     elts->val = (char *)val;
     elts->key_checksum = checksum;
+ }
+
+
+APR_DECLARE(void) apr_table_cat(apr_table_t *t, const apr_table_t *s)
+{
+    const int n = t->a.nelts;
+    register int idx;
+
+    apr_array_cat(&t->a,&s->a);
+
+    if (n == 0) {
+        memcpy(t->index_first,s->index_first,sizeof(int) * TABLE_HASH_SIZE);
+        memcpy(t->index_last, s->index_last, sizeof(int) * TABLE_HASH_SIZE);
+        t->index_initialized = s->index_initialized;
+        return;
+    }
+
+    for (idx = 0; idx < TABLE_HASH_SIZE; ++idx) {
+        if (TABLE_INDEX_IS_INITIALIZED(s, idx)) {
+            t->index_last[idx] = s->index_last[idx] + n;
+            if (!TABLE_INDEX_IS_INITIALIZED(t, idx)) {
+                t->index_first[idx] = s->index_first[idx] + n;
+            }
+        }
+    }
+
+    t->index_initialized |= s->index_initialized;
 }
 
+
 APR_DECLARE(apr_table_t *) apr_table_overlay(apr_pool_t *p,
 					     const apr_table_t *overlay,
 					     const apr_table_t *base)
-{
-    apr_table_t *res;
+{ 
+    apr_table_t *t = apr_palloc(p, sizeof *t);
 
 #ifdef POOL_DEBUG
     /* we don't copy keys and values, so it's necessary that
@@ -860,13 +1068,26 @@
     }
 #endif
 
-    res = apr_palloc(p, sizeof(apr_table_t));
-    /* behave like append_arrays */
-    res->a.pool = p;
-    copy_array_hdr_core(&res->a, &overlay->a);
-    apr_array_cat(&res->a, &base->a);
-    table_reindex(res);
-    return res;
+    t->a.pool = p;
+    t->a.elt_size = sizeof(apr_table_entry_t);
+    t->copy = overlay->copy;
+    t->merge = overlay->merge;
+
+    memcpy(t->index_first,overlay->index_first,TABLE_HASH_SIZE * sizeof(int));
+    memcpy(t->index_last, overlay->index_last, TABLE_HASH_SIZE * sizeof(int));
+
+    t->a.elts = apr_palloc(p, t->a.elt_size *
+                           (overlay->a.nalloc + base->a.nalloc));
+
+    memcpy(t->a.elts, overlay->a.elts, overlay->a.elt_size * 
+                                       overlay->a.nelts);
+
+    t->a.nelts = overlay->a.nelts;
+    t->a.nalloc = overlay->a.nalloc + base->a.nalloc;
+
+    if (base->a.nelts)
+        apr_table_cat(t, base);
+    return t;
 }
 
 /* And now for something completely abstract ...
@@ -996,334 +1217,39 @@
     return vdorv;
 }
 
-/* During apr_table_overlap(), we build an overlap key for
- * each element in the two tables.
- */
-#define RED 0
-#define BLACK 1
-typedef struct overlap_key {
-    /* The table element */
-    apr_table_entry_t *elt;
-
-    /* 0 if from table 'a', 1 if from table 'b' */
-    int level;
-
-    /* Whether to omit this element when building the result table */
-    int skip;
-
-    /* overlap_keys can be assembled into a red-black tree */
-    int black;
-    struct overlap_key *tree_parent;
-    struct overlap_key *tree_left;
-    struct overlap_key *tree_right;
-    int color;
-
-    /* List of multiple values for this key */
-    struct overlap_key *merge_next;
-    struct overlap_key *merge_last;
-} overlap_key;
-
-/* Rotate a subtree of a red-black tree */
-static void rotate_counterclockwise(overlap_key **root,
-                                    overlap_key *rotate_node)
-{
-    overlap_key *child = rotate_node->tree_right;
-    rotate_node->tree_right = child->tree_left;
-    if (rotate_node->tree_right) {
-        rotate_node->tree_right->tree_parent = rotate_node;
-    }
-    child->tree_parent = rotate_node->tree_parent;
-    if (child->tree_parent == NULL) {
-        *root = child;
-    }
-    else {
-        if (rotate_node == rotate_node->tree_parent->tree_left) {
-            rotate_node->tree_parent->tree_left = child;
-        }
-        else {
-            rotate_node->tree_parent->tree_right = child;
-        }
-    }
-    child->tree_left = rotate_node;
-    rotate_node->tree_parent = child;
-}
-
-static void rotate_clockwise(overlap_key **root, overlap_key *rotate_node)
+static char *collapse(apr_pool_t *p, 
+                               const apr_array_header_t *arr)
 {
-    overlap_key *child = rotate_node->tree_left;
-    rotate_node->tree_left = child->tree_right;
-    if (rotate_node->tree_left) {
-        rotate_node->tree_left->tree_parent = rotate_node;
-    }
-    child->tree_parent = rotate_node->tree_parent;
-    if (child->tree_parent == NULL) {
-        *root = child;
-    }
-    else {
-        if (rotate_node == rotate_node->tree_parent->tree_left) {
-            rotate_node->tree_parent->tree_left = child;
-        }
-        else {
-            rotate_node->tree_parent->tree_right = child;
-        }
-    }
-    child->tree_right = rotate_node;
-    rotate_node->tree_parent = child;
+    char **a = (char **)arr->elts;
+    return (arr->nelts == 0) ? NULL : a[arr->nelts - 1];
 }
 
 
-static void overlap_hash(overlap_key *elt,
-                         overlap_key **hash_table, int nhash,
-                         unsigned flags)
+APR_DECLARE(void) apr_table_overlap(apr_table_t *a, 
+                                    const apr_table_t *b,
+                                    const unsigned flags)
 {
-    /* Each bucket in the hash table holds a red-black tree
-     * containing the overlap_keys that hash into that bucket
-     */
-    overlap_key **child = &(hash_table[elt->elt->key_checksum & (nhash - 1)]);
-    overlap_key **root = child;
-    overlap_key *parent = NULL;
-    overlap_key *next;
-
-    /* Look for the element in the tree */
-    while ((next = *child) != NULL) {
-        int direction = strcasecmp(elt->elt->key, next->elt->key);
-        if (direction < 0) {
-            parent = next;
-            child = &(next->tree_left);
-        }
-        else if (direction > 0) {
-            parent = next;
-            child = &(next->tree_right);
-        }
-        else {
-            /* This is the key we're looking for */
-            if (flags == APR_OVERLAP_TABLES_MERGE) {
-                /* Just link this node at the end of the list
-                 * of values for the key.  It doesn't need to
-                 * be linked into the tree, because the node at
-                 * the head of this key's value list is in the
-                 * tree already.
-                 */
-                elt->skip = 1;
-                elt->merge_next = NULL;
-                if (next->merge_last) {
-                    next->merge_last->merge_next = elt;
-                }
-                else {
-                    next->merge_next = elt;
-                }
-                next->merge_last = elt;
-            }
-            else {
-                /* In the "set" case, don't bother storing
-                 * this value in the tree if it's already
-                 * there, except if the previous value was
-                 * from table 'a' (level==0) and this value
-                 * is from table 'b' (level==1)
-                 */
-                if (elt->level > next->level) {
-                    elt->tree_left = next->tree_left;
-                    if (next->tree_left) {
-                        next->tree_left->tree_parent = elt;
-                    }
-                    elt->tree_right = next->tree_right;
-                    if (next->tree_right) {
-                        next->tree_right->tree_parent = elt;
-                    }
-                    elt->tree_parent = next->tree_parent;
-                    elt->color = next->color;
-                    (*child) = elt;
-                    elt->merge_next = NULL;
-                    elt->merge_last = NULL;
-                    elt->skip = 0;
-                    next->skip = 1;
-                }
-                else {
-                    elt->skip = 1;
-                }
-            }
-            return;
-        }
-    }
 
-    /* The element wasn't in the tree, so add it */
-    elt->tree_left = NULL;
-    elt->tree_right = NULL;
-    elt->tree_parent = parent;
-    (*child) = elt;
-    elt->merge_next = NULL;
-    elt->merge_last = NULL;
-    elt->skip = 0;
-    elt->color = RED;
-
-    /* Shuffle the nodes to maintain the red-black tree's balanced
-     * shape property.  (This is what guarantees O(n*log(n)) worst-case
-     * performance for apr_table_overlap().)
-     */
-    next = elt;
-    while ((next->tree_parent) && (next->tree_parent->color == RED)) {
-        /* Note: Root is always black, and red and black nodes
-         * alternate on any path down the tree.  So if we're inside
-         * this block, the grandparent node is non-NULL.
-         */
-        overlap_key *grandparent = next->tree_parent->tree_parent;
-        if (next->tree_parent == grandparent->tree_left) {
-            overlap_key *parent_sibling = grandparent->tree_right;
-            if (parent_sibling && (parent_sibling->color == RED)) {
-                next->tree_parent->color = BLACK;
-                parent_sibling->color = BLACK;
-                grandparent->color = RED;
-                next = grandparent;
-            }
-            else {
-                if (next == next->tree_parent->tree_right) {
-                    next = next->tree_parent;
-                    rotate_counterclockwise(root, next);
-                }
-                next->tree_parent->color = BLACK;
-                next->tree_parent->tree_parent->color = RED;
-                rotate_clockwise(root, next->tree_parent->tree_parent);
-            }
-        }
-        else {
-            overlap_key *parent_sibling = grandparent->tree_left;
-            if (parent_sibling && (parent_sibling->color == RED)) {
-                next->tree_parent->color = BLACK;
-                parent_sibling->color = BLACK;
-                grandparent->color = RED;
-                next = grandparent;
-            }
-            else {
-                if (next == next->tree_parent->tree_left) {
-                    next = next->tree_parent;
-                    rotate_clockwise(root, next);
-                }
-                next->tree_parent->color = BLACK;
-                next->tree_parent->tree_parent->color = RED;
-                rotate_counterclockwise(root, next->tree_parent->tree_parent);
-            }
-        }
-    }
-    (*root)->color = BLACK;
-}
-
-/* Must be a power of 2 */
-#define DEFAULT_HASH_SIZE 16
-
-APR_DECLARE(void) apr_table_overlap(apr_table_t *a, const apr_table_t *b,
-				    unsigned flags)
-{
-    int max_keys;
-    int nkeys;
-    overlap_key *cat_keys; /* concatenation of the keys of a and b */
-    overlap_key **hash_table;
-    int nhash;
-    int i;
-    apr_table_entry_t *elts;
-    apr_table_entry_t *dst_elt;
+    const int m = a->a.nelts;
+    const int n = b->a.nelts;
+    apr_pool_t *p = b->a.pool;
 
-    max_keys = a->a.nelts + b->a.nelts;
-    if (!max_keys) {
-        /* The following logic won't do anything harmful if we keep
-         * going in this situation, but
-         *
-         *    1) certain memory debuggers don't like an alloc size of 0
-         *       so we'd like to avoid that...
-         *    2) this isn't all that rare a call anyway, so it is useful
-         *       to skip the storage allocation and other checks in the
-         *       following logic
-         */
+    if (m + n == 0)
         return;
+
+    /* copy (extend) a using b's pool */
+    if (a->a.pool != p) {
+        make_array_core(&a->a, p, m+n, sizeof(apr_table_entry_t), 0);
     }
-    cat_keys = apr_palloc(b->a.pool, sizeof(overlap_key) * max_keys);
-    nhash = DEFAULT_HASH_SIZE;
-    while (nhash < max_keys) {
-        nhash <<= 1;
-    }
-    hash_table = (overlap_key **)apr_pcalloc(b->a.pool,
-                                             sizeof(overlap_key *) * nhash);
-
-    /* The cat_keys array contains an element for each entry in a,
-     * followed by one for each in b.  While populating this array,
-     * we also use it as:
-     *  1) a hash table, to detect matching keys, and
-     *  2) a linked list of multiple values for a given key (in the
-     *     APR_OVERLAP_TABLES_MERGE case)
-     */
 
-    /* First, the elements of a */
-    nkeys = 0;
-    elts = (apr_table_entry_t *)a->a.elts;
-    for (i = 0; i < a->a.nelts; i++, nkeys++) {
-        cat_keys[nkeys].elt = &(elts[i]);
-        cat_keys[nkeys].level = 0;
-        overlap_hash(&(cat_keys[nkeys]), hash_table, nhash, flags);
-    }
-
-    /* Then the elements of b */
-    elts = (apr_table_entry_t *)b->a.elts;
-    for (i = 0; i < b->a.nelts; i++, nkeys++) {
-        cat_keys[nkeys].elt = &(elts[i]);
-        cat_keys[nkeys].level = 1;
-        overlap_hash(&(cat_keys[nkeys]), hash_table, nhash, flags);
-    }
-
-    /* Copy concatenated list of elements into table a to
-     * form the new table contents, but:
-     *   1) omit the ones marked "skip," and
-     *   2) merge values for the same key if needed
-     */
-    make_array_core(&a->a, b->a.pool, max_keys, sizeof(apr_table_entry_t), 0);
-    nkeys = 0;
-    dst_elt = (apr_table_entry_t *)a->a.elts;
-    for (i = 0; i < max_keys; i++) {
-        if (cat_keys[i].skip) {
-            continue;
-        }
-        if (cat_keys[i].merge_next) {
-            char *new_val;
-            char *val_next;
-            overlap_key *next = cat_keys[i].merge_next;
-            int len = (cat_keys[i].elt->val ?
-                       strlen(cat_keys[i].elt->val) : 0);
-            do {
-                len += 2;
-                if (next->elt->val) {
-                    len += strlen(next->elt->val);
-                }
-                next = next->merge_next;
-            } while (next);
-            len++;
-            new_val = (char *)apr_palloc(b->a.pool, len);
-            val_next = new_val;
-            if (cat_keys[i].elt->val) {
-                strcpy(val_next, cat_keys[i].elt->val);
-                val_next += strlen(cat_keys[i].elt->val);
-            }
-            next = cat_keys[i].merge_next;
-            do {
-                *val_next++ = ',';
-                *val_next++ = ' ';
-                if (next->elt->val) {
-                    strcpy(val_next, next->elt->val);
-                    val_next += strlen(next->elt->val);
-                }
-                next = next->merge_next;
-            } while (next);
-            *val_next = 0;
-            dst_elt->key = cat_keys[i].elt->key;
-            dst_elt->val = new_val;
-            dst_elt->key_checksum = cat_keys[i].elt->key_checksum;
-            dst_elt++;
-        }
-        else {
-            dst_elt->key = cat_keys[i].elt->key;
-            dst_elt->val = cat_keys[i].elt->val;
-            dst_elt->key_checksum = cat_keys[i].elt->key_checksum;
-            dst_elt++;
-        }
+    apr_table_cat(a, b);
+
+    if (flags == APR_OVERLAP_TABLES_SET) {
+        apr_table_merger_t *f = a->merge;
+        a->merge = collapse;
+        apr_table_compress(a);
+        a->merge = f;
     }
-    a->a.nelts = dst_elt - (apr_table_entry_t *)a->a.elts;
-    table_reindex(a);
+    else
+        apr_table_compress(a);
 }
-


Re: call for vote Re: [PATCH] Re: Frankentables

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Bill Stoddard <bi...@wstoddard.com> writes:

[...]

> At about how many headers is the performance benefits of this patch
> lost? 

It's hard to give a brief answer to this question, but I'll 
give it a try.

The patch leverages the checksum & table index code; it will
win whenever those optimizations are working well, and 
it will lose when they are working poorly.

There are essentially two factors that go into this:

  1) the quality of the key_checksum,
  2) the spread between index_first and index_last.

The checksum is good whenever headers are distinguishable
by their first 4 letters, since it eliminates calls to
strcasecmp.  The spread between index_first & index_last
measures the "clumpiness" of the distribution of the 
headers' first letters.

Normally both of these factors are in the patch's favor.
Typically the key_checksum can distinguish most headers, 
and the spread between index_first & index_last for 
each letter is less than 16.  I would expect the patch 
to win in such cases.

For example, the headers range here from "Acommon" to
"Pcommon" and are duplicated once.  Even though the
spread for each letter ("A" .. "P") is 16, which is
atypically large, the  key_checksum can distinguish 
headers and the patch wins:

% perl -wle '$X="A";$args.="-H ".$X++."common:foo=bar " for 1..16; \
  exec qq{ab $args $args -n 50000 -c 50 http://127.0.0.1:8080/foo}'

=> UNPATCHED   : 895 requests/second
   APR PATCHED : 913 requests/second

But the pathological case is when all headers have at
least the first four letters in common. In such a situation,
the above optimizations are worthless, and the patch 
underperforms the current code at about 8 headers 
(because strcasecmp is grinding up the CPU).


-- 
Joe Schaefer


Re: call for vote Re: [PATCH] Re: Frankentables

Posted by Bill Stoddard <bi...@wstoddard.com>.
Brian Pane wrote:

>Thanks.  IMHO, a 4x increase in CPU usage for the worst
>case isn't a problem, as that's a 4x increase in an operation
>that's only a few percent of the total cycles anyway.
>
>But since the 4x degradation in the worst case may be
>controversial, I'd like to have a vote first.  I'm +1 for
>making this change in APR. Can a few other committers weigh in?
>
>Thanks,
>Brian
>

At about how many headers is the performance benefits of this patch lost?

Bill



call for vote Re: [PATCH] Re: Frankentables

Posted by Brian Pane <br...@cnet.com>.
Thanks.  IMHO, a 4x increase in CPU usage for the worst
case isn't a problem, as that's a 4x increase in an operation
that's only a few percent of the total cycles anyway.

But since the 4x degradation in the worst case may be
controversial, I'd like to have a vote first.  I'm +1 for
making this change in APR. Can a few other committers weigh in?

Thanks,
Brian

On Tue, 2003-05-20 at 17:25, Joe Schaefer wrote:
> Brian Pane <br...@cnet.com> writes:
> 
> > Thanks, it works much better with that change.
> > One more question: do you have a way to get "before" and
> > "after" profile data for a request with a ridiculously
> > large number of header fields (e.g., 90 different Cookie
> > or Accept lines that need to get merged)?  A default httpd
> > config will allow up to a 100-line header, which is a large
> > enough 'n' to make me paranoid about O(n^2) DoS vulnerabilities.
> 
> I had to patch ab to allow it to take a large enough 
> set of headers.
> 
> % perl -wle '$xx="aa";$args.="-H Cook".++$xx.":foo=bar " for 1..48; \
>   exec qq{ab $args $args -n 50000 -c 50 http://127.0.0.1:8080/foo}'
> 
> % oprofile -rl /home/joe/apache2/bin/httpd | head
> 
> 
> UNPATCHED:
> vma      samples  %           symbol name             
> 00075fb4 9007     20.0071     __GI___strcasecmp       
> 00009d3c 8553     18.9986     __pthread_internal_tsd_address 
> 0000cf00 2294     5.09563     overlap_hash            
> 0000d13c 1900     4.22044     apr_table_overlap       
> 0808852c 1867     4.14714     ap_rgetline_core        
> 000762d8 1254     2.78549     memcpy                  
> 000177d4 787      1.74815     apr_palloc              
> 0808ddbc 551      1.22393     core_input_filter       
> 00075b10 482      1.07066     __memchr                
> 
> 
> PATCHED:
> vma      samples  %           symbol name             
> 00075fb4 35642    35.7866     __GI___strcasecmp       
> 00009d3c 35549    35.6932     __pthread_internal_tsd_address 
> 0000c41c 3134     3.14671     apr_table_compress      
> 000762d8 1454     1.4599      memcpy                  
> 0808852c 1453     1.45889     ap_rgetline_core        
> 0000c010 1450     1.45588     apr_array_pstrjoin      
> 00017bb8 811      0.81429     apr_palloc              
> 00006214 621      0.623519    apr_brigade_split_line  
> 0808ddbc 578      0.580345    core_input_filter       
> 
> So it looks like the worst case scenario is 
> worsened by a factor of 4.


APR_BINARY?

Posted by "Marc M. Adkins" <mm...@Doorways.org>.
I'm using the 2.0.45 source for reference...

Should the APR_BINARY flag be used to handle newline differences (prevalent
on Windows)?  'Cause it doesn't appear to be doing anything on that
platform.  As near as I can tell, all files on Windows are binary mode.

APR_BINARY is described in apr_file_open() documentation as having nothing
to do with UNIX.  A search through the source shows that it is _only_ used
in the UNIX source files (where O_BINARY is found to be defined).

I ask because I just spent some time figuring out that there is no newline
conversion in the apr_file_ functions on Windows.  Had to write code to
handle it.  No big deal, but somewhat unexpected.

mma


Re: [PATCH] Re: Frankentables

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Brian Pane <br...@cnet.com> writes:

> Thanks, it works much better with that change.
> One more question: do you have a way to get "before" and
> "after" profile data for a request with a ridiculously
> large number of header fields (e.g., 90 different Cookie
> or Accept lines that need to get merged)?  A default httpd
> config will allow up to a 100-line header, which is a large
> enough 'n' to make me paranoid about O(n^2) DoS vulnerabilities.

I had to patch ab to allow it to take a large enough 
set of headers.

% perl -wle '$xx="aa";$args.="-H Cook".++$xx.":foo=bar " for 1..48; \
  exec qq{ab $args $args -n 50000 -c 50 http://127.0.0.1:8080/foo}'

% oprofile -rl /home/joe/apache2/bin/httpd | head


UNPATCHED:
vma      samples  %           symbol name             
00075fb4 9007     20.0071     __GI___strcasecmp       
00009d3c 8553     18.9986     __pthread_internal_tsd_address 
0000cf00 2294     5.09563     overlap_hash            
0000d13c 1900     4.22044     apr_table_overlap       
0808852c 1867     4.14714     ap_rgetline_core        
000762d8 1254     2.78549     memcpy                  
000177d4 787      1.74815     apr_palloc              
0808ddbc 551      1.22393     core_input_filter       
00075b10 482      1.07066     __memchr                


PATCHED:
vma      samples  %           symbol name             
00075fb4 35642    35.7866     __GI___strcasecmp       
00009d3c 35549    35.6932     __pthread_internal_tsd_address 
0000c41c 3134     3.14671     apr_table_compress      
000762d8 1454     1.4599      memcpy                  
0808852c 1453     1.45889     ap_rgetline_core        
0000c010 1450     1.45588     apr_array_pstrjoin      
00017bb8 811      0.81429     apr_palloc              
00006214 621      0.623519    apr_brigade_split_line  
0808ddbc 578      0.580345    core_input_filter       

So it looks like the worst case scenario is 
worsened by a factor of 4.

-- 
Joe Schaefer


Re: [PATCH] Re: Frankentables

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Brian Pane <br...@cnet.com> writes:

> Thanks, it works much better with that change.
> One more question: do you have a way to get "before" and
> "after" profile data for a request with a ridiculously
> large number of header fields (e.g., 90 different Cookie
> or Accept lines that need to get merged)?  A default httpd
> config will allow up to a 100-line header, which is a large
> enough 'n' to make me paranoid about O(n^2) DoS vulnerabilities.

Sure thing, Brian.  I'll post the oprofile data as soon as 
I have it (probably tomorrow).

-- 
Joe Schaefer


Re: [PATCH] Re: Frankentables

Posted by Brian Pane <br...@cnet.com>.
Thanks, it works much better with that change.
One more question: do you have a way to get "before" and
"after" profile data for a request with a ridiculously
large number of header fields (e.g., 90 different Cookie
or Accept lines that need to get merged)?  A default httpd
config will allow up to a 100-line header, which is a large
enough 'n' to make me paranoid about O(n^2) DoS vulnerabilities.

Thanks,
Brian

Joe Schaefer wrote:

>Brian Pane <br...@cnet.com> writes:
>
>  
>
>>I just applied the patch (both the APR and httpd parts) and ran
>>the httpd-test regression tests.  Something is broken:
>>    
>>
>
>[...]
>
>  
>
>>There were a lot of segfaults, too.  It looks like the index
>>within the table is getting corrupted.  Here's an example:
>>    
>>
>
>Yup- looks like I botched index_initialized in apr_table_overlay.
>Try adding 
>
>    t->index_initialized = overlay->index_initialized;
>
>to apr_tables.c line 1076; all tests pass for me now.
>
>Very sorry about that.  I'll gladly add a test for apr_table_overlay 
>to the apr table tests in a followup patch.
>
>  
>



Re: [PATCH] Re: Frankentables

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Brian Pane <br...@cnet.com> writes:

> I just applied the patch (both the APR and httpd parts) and ran
> the httpd-test regression tests.  Something is broken:

[...]

> There were a lot of segfaults, too.  It looks like the index
> within the table is getting corrupted.  Here's an example:

Yup- looks like I botched index_initialized in apr_table_overlay.
Try adding 

    t->index_initialized = overlay->index_initialized;

to apr_tables.c line 1076; all tests pass for me now.

Very sorry about that.  I'll gladly add a test for apr_table_overlay 
to the apr table tests in a followup patch.

-- 
Joe Schaefer


Re: [PATCH] Re: Frankentables

Posted by Brian Pane <br...@cnet.com>.
I just applied the patch (both the APR and httpd parts) and ran
the httpd-test regression tests.  Something is broken:

Failed Test             Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
apache/acceptpathinfo.t               36    2   5.56%  13 26
apache/limits.t                       10    1  10.00%  5
modules/alias.t                       62    1   1.61%  1
modules/negotiation.t                 98    8   8.16%  1 17 40 49 63 86
94 98
modules/setenvif.t                   111   47  42.34%  1-3 7-9 14-15
19-21 25-
                                                       27 31-33 37-39
43-45 50-
                                                       51 55-57 61-63
68-69 73-
                                                       75 79 81 86-87 91
93 98-
                                                       102

There were a lot of segfaults, too.  It looks like the index
within the table is getting corrupted.  Here's an example:

(gdb) bt
#0  apr_table_setn (t=0x818a090, key=0x80b86ab "ETag",
    val=0x8182af0 "\"518034-1f-b5d91b00\"") at apr_tables.c:692
#1  0x0807b0ca in ap_set_etag (r=0x8188540) at http_protocol.c:2806
#2  0x0809cb2f in default_handler (r=0x8188540) at core.c:3340
#3  0x0808bace in ap_run_handler (r=0x8188540) at config.c:195
#4  0x0808bfe6 in ap_invoke_handler (r=0x8188540) at config.c:401
#5  0x0807bc2b in ap_process_request (r=0x8188540) at http_request.c:288
#6  0x08077e41 in ap_process_http_connection (c=0x817e5f0) at
http_core.c:292
#7  0x0809494e in ap_run_process_connection (c=0x817e5f0) at
connection.c:85
#8  0x0808a61a in child_main (child_num_arg=135831996) at prefork.c:652
#9  0x0808a7d0 in make_child (s=0x81122d0, slot=0) at prefork.c:747
#10 0x0808a86f in startup_children (number_to_start=1) at prefork.c:765
#11 0x0808af61 in ap_mpm_run (_pconf=0x0, plog=0x810cd20, s=0x81122d0)
    at prefork.c:982
#12 0x0808fcb3 in main (argc=7, argv=0xbfffd854) at main.c:671
#13 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6
(gdb) p next_elt
$6 = (apr_table_entry_t *) 0x6860b8f0
(gdb) p (apr_table_entry_t *) t->a.elts
$7 = (struct apr_table_entry_t *) 0x818a1b0
(gdb) p hash
$8 = 5
(gdb) p t->index_first[hash]
$9 = 134611440
(gdb)



Re: [PATCH] Re: Frankentables

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Brian Pane <br...@cnet.com> writes:

[...]

> Thanks for posting the oprofile data.  The results 
> look good.  I haven't had time to read through the
> code in detail yet, but I'll do that this afternoon.

Thanks alot, Brian!

> Based on a first glance at the code, the only thing
> that worries me is the comment about apr_table_compress()
> taking quadratic time.  I think it's possible to use a
> mergesort-based algorithm to do the compress in
> n*log(n) time and linear space, though.

It'll likely be slower for small-sized tables, though,
just like the red-black tree implementation. IIRC httpd has
a cutoff for the maximum number of input headers, so this 
"worst-case-quadratic-time" shouldn't present a DOS-able
issue for httpd.

In any case, it's no problem to keep the complex
algorithms here for larger tables; but apr_tables_get()
is still going to be O(n) in such cases.  Unfortunately,
any sort of conditionals in apr_tables_get tends to kill
its httpd (many small tables) performance.

Maybe the documentation should just recommend apr_hash in
cases where the number of elements is more than what is
typical (a few dozen?) for httpd.

-- 
Joe Schaefer


Re: [PATCH] Re: Frankentables

Posted by Brian Pane <br...@cnet.com>.
On Sat, 2003-05-17 at 21:53, Joe Schaefer wrote:
> Joe Schaefer <jo...@sunstarsys.com> writes:
> 
> [...]
> 
> > oprofile tells me this version wins by ~10 - 25% when measuring the
> > time httpd spends in table ops.  To get the full benefit, 
> > ap_get_mime_headers_core in httpd-2.0/server/protocol.c also 
> > needs a patch.
> 
> FWIW here's the oprofile data.  The sample data was generated by
> running 3 passes of (here "/foo" is a zero-byte file)

Thanks for posting the oprofile data.  The results 
look good.  I haven't had time to read through the
code in detail yet, but I'll do that this afternoon.

Based on a first glance at the code, the only thing
that worries me is the comment about apr_table_compress()
taking quadratic time.  I think it's possible to use a
mergesort-based algorithm to do the compress in
n*log(n) time and linear space, though.

Brian



Re: [PATCH] Re: Frankentables

Posted by Sascha Schumann <sa...@schumann.cx>.
On Sun, 18 May 2003, Joe Schaefer wrote:

> Joe Schaefer <jo...@sunstarsys.com> writes:
>
> [...]
>
> > oprofile tells me this version wins by ~10 - 25% when measuring the
> > time httpd spends in table ops.  To get the full benefit,
> > ap_get_mime_headers_core in httpd-2.0/server/protocol.c also
> > needs a patch.
>
> FWIW here's the oprofile data.  The sample data was generated by
> running 3 passes of (here "/foo" is a zero-byte file)

    Have you looked at valgrind/kcachegrind?  It usually gives
    you more interesting profiling data, because it generates
    complete call graphs instead of just measuring the time spent
    in a single function.

    http://developr.kde.org/~sewardj/
    http://kcachegrind.sourceforge.net/

    - Sascha

Re: [PATCH] Re: Frankentables

Posted by Joe Schaefer <jo...@sunstarsys.com>.
Joe Schaefer <jo...@sunstarsys.com> writes:

[...]

> oprofile tells me this version wins by ~10 - 25% when measuring the
> time httpd spends in table ops.  To get the full benefit, 
> ap_get_mime_headers_core in httpd-2.0/server/protocol.c also 
> needs a patch.

FWIW here's the oprofile data.  The sample data was generated by
running 3 passes of (here "/foo" is a zero-byte file)

  % ab -H "Accept-Language: en" -H "Accept-Encoding: gzip"        \
  -H "Accept-Charset: *" -H "Referer: http://localhost/"          \
  -H "Cookie: foo=bar" -n 50000 -c 50 http://127.0.0.1:8080/foo

for each of 3 cases: 

  1) current, 
  2) just apr patched, and 
  3) both apr + httpd patched

The results were obtained via:

  % op_time -rl /home/joe/apache2/bin/httpd | perl -nal             \
  -e 'if (/table|overlap/) { $ops += $F[1]; $pct += $F[2]; print }' \
  -e 'END { printf "\nTOTAL:\t %d \t  %.6f %%\n", $ops, $pct }'


--------------------------------------------------


CURRENT:

0000c1cc 627      1.33444     apr_table_get           ...
0000cf00 348      0.740646    overlap_hash            ...
0000c4dc 334      0.71085     apr_table_setn          ...
0000d13c 302      0.642745    apr_table_overlap       ...
0000c050 206      0.438428    apr_table_make          ...
0000cbbc 183      0.389478    apr_table_addn          ...
0000c9c8 162      0.344784    apr_table_mergen        ...
0000c144 158      0.33627     table_reindex           ...
0000c6e4 107      0.227727    apr_table_unset         ...
0000ccd0 89       0.189418    apr_table_vdo           ...
0000c02c 78       0.166007    apr_table_elts          ...
0000cca8 44       0.0936449   apr_table_do            ...
0000c034 18       0.0383093   apr_is_empty_table      ...

TOTAL:   2656     5.652747 %


--------------------------------------------------


APR PATCHED, HTTPD UNPATCHED:

0000c65c 518      1.16785     apr_table_get           ...
0000c41c 282      0.63578     apr_table_compress      ...
0000c968 267      0.601961    apr_table_setn          ...
0000c194 235      0.529816    apr_table_make          ...
0000d3a4 224      0.505016    apr_table_addn          ...
0000d430 176      0.396799    apr_table_cat           ...
0000d008 174      0.392289    apr_table_mergen        ...
0000d7d4 103      0.232217    apr_table_overlap       ...
0000d614 103      0.232217    apr_table_vdo           ...
0000d5ec 80       0.180363    apr_table_do            ...
0000c144 72       0.162327    apr_table_elts          ...
0000cb70 51       0.114981    apr_table_unset         ...
0000c14c 26       0.058618    apr_is_empty_table      ...

TOTAL:	 2311 	  5.210234 %


--------------------------------------------------


BOTH APR & HTTPD PATCHED:

0000c65c 508      1.12834     apr_table_get           ...
0000c968 234      0.519746    apr_table_setn          ...
0000c194 218      0.484208    apr_table_make          ...
0000d430 202      0.44867     apr_table_cat           ...
0000c41c 188      0.417574    apr_table_compress      ...
0000d3a4 187      0.415352    apr_table_addn          ...
0000d008 145      0.322065    apr_table_mergen        ...
0000d614 95       0.211008    apr_table_vdo           ...
0000d7d4 83       0.184354    apr_table_overlap       ...
0000d5ec 55       0.122162    apr_table_do            ...
0000c144 52       0.115499    apr_table_elts          ...
0000cb70 44       0.09773     apr_table_unset         ...
0000c14c 25       0.0555284   apr_is_empty_table      ...

TOTAL:	 2036 	  4.522236 %

--------------------------------------------------

Is anyone interested in applying the patch?  If so,
soon afterwards I'd like submit another patch to this patch,
which includes unit tests for the new functions,
fixes the current testtable.c (since the test arguments
are all incorrectly ordered), and reflects the correct
behavior for APR_OVERLAP_TABLES_SET.

-- 
Joe Schaefer