You are viewing a plain text version of this content. The canonical link for it is here.
Posted to bugs@httpd.apache.org by bu...@apache.org on 2011/03/12 23:54:54 UTC

DO NOT REPLY [Bug 50920] New: using word wise comparison to improve string comparison

https://issues.apache.org/bugzilla/show_bug.cgi?id=50920

           Summary: using word wise comparison to improve string
                    comparison
           Product: Apache httpd-2
           Version: 2.2.17
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: All
        AssignedTo: bugs@httpd.apache.org
        ReportedBy: songlinhai0543@gmail.com


There are many code fragments which compare two strings or one string with a
constant characters in httpd. These comparisons continue when two characters
are equal to each other, and stop when two characters are not equal. The whole
comparison process can be optimized by word-wise comparison. 

For example:

/httpd-2.2.17/modules/http/http_filters.c:592
        while (*b == '0') {

    /* Skip leading zeros */
    while (*b == '0') {
        ++b;
    }


can be optimized as:

while( *(unsigned int *)b == 0x30303030 ) {
   b += 4;
}

while( *b == '0')
{
   b ++;
}

in my unit test, if b has more than 4 '0' from the beginning, the patch will be
faster. And the optimization result will be more and more clear, when string b
has more '0' from the beginning. 

There are also other places which can be optimized in a similar way, and I list
them as follows:


/httpd-2.2.17/modules/generators/mod_autoindex.c:2169
   while (title_endp > title_name && *title_endp == '/') {
        *title_endp-- = '\0';
    }
==============================================================================
/httpd-2.2.17/modules/http/http_filters.c:592

/* Skip leading zeros */
    while (*b == '0') {
        ++b;
    }

===============================================================================
/httpd-2.2.17/modules/mappers/mod_alias.c:309
do {
                ++aliasp;
            } while (*aliasp == '/');

===============================================================================
/httpd-2.2.17/modules/mappers/mod_alias.c:312
  do {
                ++urip;
            } while (*urip == '/');
===============================================================================
/httpd-2.2.17/server/config.c:367

/* MIME type arguments */
            while (p2 > handler && p2[-1] == ' ')
                --p2; /* strip trailing spaces */

==============================================================================
/httpd-2.2.17/server/util_script.c:294
                while (lu && uri[lu-1] == '/') lu--;
==============================================================================
/httpd-2.2.17/server/protocol.c:507
 while ((uri[0] == '/') && (uri[1] == '/')) {
        ++uri ;
    }

==============================================================================
/httpd-2.2.17/server/core.c:3519

 while (*path == '/') {
            ++path;
        }

==============================================================================
/httpd-2.2.17/server/core.c:3542

while (*path == '/') {
            ++path;
        }

==============================================================================
/httpd-2.2.17/server/util.c:516

 do {
                ++s;
            } while (*s == '/');

=============================================================================
/httpd-2.2.17/srclib/apr/file_io/unix/filepath.c:73
do {
            ++(*inpath);
        } while (**inpath == '/');

=============================================================================
/httpd-2.2.17/srclib/apr/file_io/unix/filepath.c:164

 while (addpath[0] == '/')
            ++addpath;

=============================================================================
/httpd-2.2.17/srclib/apr/strings/apr_fnmatch.c:88

/* Collapse multiple stars. */
        while (c == '*') {
        c = *++pattern;
        }

==============================================================================
/httpd-2.2.17/srclib/apr/strings/apr_snprintf.c:187
   for (i = ndigit - 1; i > 0 && p1[i] == '0'; i--)
        ndigit--;

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 50920] using word wise comparison to improve string comparison

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=50920

--- Comment #2 from Linhai Song <so...@gmail.com> 2011-03-13 00:58:50 EST ---
No optimizing compiler is able to perform this type of optimization. We need to
write word-wise comparison codes by ourselves. 

I may make some misleading claims in my original bug report.
Actually, based on my unit testing, the suggested implementation almost always
runs faster.
When the length is smaller than 4 bytes, there is negligible performance
difference.
When the length is larger than 4 bytes, the version I suggested runs 
much faster.
For example, when the trailing blank length is 10 bytes, the version I
suggested only takes about 3/4 the time the original version takes. And when
the trailing blank length is 20, the version I suggested only takes about HALF
the time the original version takes.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 50920] using word wise comparison to improve string comparison

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=50920

--- Comment #4 from Nick Kew <ni...@webthing.com> 2011-03-13 04:50:02 EDT ---
(In reply to comment #2)
> the trailing blank length is 20, the version I suggested only takes about HALF
> the time the original version takes.

Can't see a trailing blanks example in there.

Your first example, leading zeros, is typical: the likelihood of there being
any, let alone many, is low.  If we were expecting to skip 20 leading zeros,
maybe we'd have used strspn.

If you want to make a case for a patch, I'd suggest all of:
(a) A usage case where there's a likelihood of long strings in real life.
(b) A macro that doesn't detract from readability.
(c) Address portability issues like sizeof and endianness.

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 50920] using word wise comparison to improve string comparison

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=50920

--- Comment #3 from William A. Rowe Jr. <wr...@apache.org> 2011-03-13 01:08:48 EST ---
Linhai,

let me caution you that the httpd project isn't going to be receptive to
per-platform optimizations that result in less readable code.  In fact, 2.0
was a milestone in moving these hundreds of #ifdef exceptions out of the
project
and giving them away to a portability layer.

Per-platform optimizations are welcomed - at the Apache APR (portable runtime)
project, which is happy to deal with OS, CPU or driver discrepancies.  I'd
encourage you to continue your research, and propose optimizations to that
project (there is a dev@apr.apache.org mailing list, and a separate APR bug
category in bugzilla).

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 50920] using word wise comparison to improve string comparison

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=50920

Nick Kew <ni...@webthing.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #1 from Nick Kew <ni...@webthing.com> 2011-03-12 19:55:23 EST ---
(a) Isn't this the kind of thing best left to an optimising compiler?
(b) Since the size of int is not fixed, your patch breaks portability.

Don't you think readability should trump outguessing-the-compiler here?

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 50920] using word wise comparison to improve string comparison

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=50920

Eric Covener <co...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org