You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Jim Jagielski <ji...@jaguNET.com> on 2015/11/24 16:21:55 UTC

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

For these types of paths, where we use a switch/case against
the 1st char of the token, the real reason why we used that
impl was to avoid the (expected) expensive string comparison.

This is really no longer an issue. Sure, we could still use
this "fast path" short-circuit, but even that is non-optimal.
For the below, what we really should be testing is:

	if (!ap_casecmpstr(token+1, "ublic")) {

for example.

Sooooo the question is: Do we remove these silly fast-paths
that no longer make sense[1] or optimize for the "we know the
1st char" case ala the above?


1. eg:

        while (token) {
                if (!ap_casecmpstrn(token, "no-cache", 8)) {
                    if (token[8] == '=') {
                        cc->no_cache_header = 1;
                    }
                    else if (!token[8]) {
                        cc->no_cache = 1;
                    }
                }
                else if (!ap_casecmpstr(token, "no-store")) {
                    cc->no_store = 1;
                }
                else if (!ap_casecmpstr(token, "no-transform")) {
                    cc->no_transform = 1;
                }
                else if (!ap_casecmpstrn(token, "max-age", 7)) {
                    if (token[7] == '='
                            && !apr_strtoff(&offt, token + 8, &endp, 10)
                            && endp > token + 8 && !*endp) {
                        cc->max_age = 1;
                        cc->max_age_value = offt;
                    }
                }
                else if (!ap_casecmpstr(token, "must-revalidate")) {
                    cc->must_revalidate = 1;
                }
                else if (!ap_casecmpstrn(token, "max-stale", 9)) {
                    if (token[9] == '='
                            && !apr_strtoff(&offt, token + 10, &endp, 10)
                            && endp > token + 10 && !*endp) {
                        cc->max_stale = 1;
                        cc->max_stale_value = offt;
                    }
                    else if (!token[9]) {
                        cc->max_stale = 1;
                        cc->max_stale_value = -1;
                    }
                }
                else if (!ap_casecmpstrn(token, "min-fresh", 9)) {
                    if (token[9] == '='
                            && !apr_strtoff(&offt, token + 10, &endp, 10)
                            && endp > token + 10 && !*endp) {
                        cc->min_fresh = 1;
                        cc->min_fresh_value = offt;
                    }
                }

	...

> On Nov 23, 2015, at 3:05 PM, ylavic@apache.org wrote:
> 
> Author: ylavic
> Date: Mon Nov 23 20:05:16 2015
> New Revision: 1715938
> 
> URL: http://svn.apache.org/viewvc?rev=1715938&view=rev
> Log:
> Follow up to r1715876: fix typo.
> 
> Modified:
>    httpd/httpd/trunk/modules/cache/cache_util.c
> 
> Modified: httpd/httpd/trunk/modules/cache/cache_util.c
> URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/cache/cache_util.c?rev=1715938&r1=1715937&r2=1715938&view=diff
> ==============================================================================
> --- httpd/httpd/trunk/modules/cache/cache_util.c (original)
> +++ httpd/httpd/trunk/modules/cache/cache_util.c Mon Nov 23 20:05:16 2015
> @@ -1074,7 +1074,8 @@ int ap_cache_control(request_rec *r, cac
>                 }
>                 break;
>             }
> -            case 'p': {
> +            case 'p':
> +            case 'P': {
>                 if (!ap_casecmpstr(token, "public")) {
>                     cc->public = 1;
>                 }
> 
>

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Tue, Nov 24, 2015 at 4:49 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
>
> Does it make sense to expose the ap_tolower_ascii() function?

+1, possibly an inline one or a macro...

> We could call it ap_tolower_plaintext()

I prefer ascii().

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by William A Rowe Jr <wr...@rowe-clan.net>.

On Tue, Nov 24, 2015 at 9:28 AM, Eric Covener <co...@gmail.com> wrote:

> On Tue, Nov 24, 2015 at 10:21 AM, Jim Jagielski <ji...@jagunet.com> wrote:
> > For these types of paths, where we use a switch/case against
> > the 1st char of the token, the real reason why we used that
> > impl was to avoid the (expected) expensive string comparison.
>
> Maybe naively, I thought we were saving some potential function
> call overhead here rather than some strcasecmp overhead.
>

In part, but also in this specific case;

        while (token) {
***
A switch here short circuits three function calls;
                if (!ap_casecmpstrn(token, "no-cache", 8)) {
                    if (token[8] == '=') {
                        cc->no_cache_header = 1;
                    }
                    else if (!token[8]) {
                        cc->no_cache = 1;
                    }
                }
                else if (!ap_casecmpstr(token, "no-store")) {
                    cc->no_store = 1;
                }
                else if (!ap_casecmpstr(token, "no-transform")) {
                    cc->no_transform = 1;
                }
***
A switch here short circuits four function calls;
                else if (!ap_casecmpstrn(token, "max-age", 7)) {
                    if (token[7] == '='
                            && !apr_strtoff(&offt, token + 8, &endp, 10)
                            && endp > token + 8 && !*endp) {
                        cc->max_age = 1;
                        cc->max_age_value = offt;
                    }
                }
                else if (!ap_casecmpstr(token, "must-revalidate")) {
                    cc->must_revalidate = 1;
                }
                else if (!ap_casecmpstrn(token, "max-stale", 9)) {
                    if (token[9] == '='
                            && !apr_strtoff(&offt, token + 10, &endp, 10)
                            && endp > token + 10 && !*endp) {
                        cc->max_stale = 1;
                        cc->max_stale_value = offt;
                    }
                    else if (!token[9]) {
                        cc->max_stale = 1;
                        cc->max_stale_value = -1;
                    }
                }
                else if (!ap_casecmpstrn(token, "min-fresh", 9)) {
                    if (token[9] == '='
                            && !apr_strtoff(&offt, token + 10, &endp, 10)
                            && endp > token + 10 && !*endp) {
                        cc->min_fresh = 1;
                        cc->min_fresh_value = offt;
                    }
                }
***

Plus the overhead if each function call itself, it is still sensible to
narrow out
several groups of tokens (could even use if s[0]=='n' s[1]=='o' tests)

Does it make sense to expose the ap_tolower_ascii() function?  We could
call it ap_tolower_plaintext() or something else to distinguish that it
isn't
lowercasing extended characters?  Or is the case 'N': case 'n': syntax
easier to follow?

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Jim Jagielski <ji...@jaguNET.com>.

> On Nov 24, 2015, at 10:28 AM, Eric Covener <co...@gmail.com> wrote:
> 
> On Tue, Nov 24, 2015 at 10:21 AM, Jim Jagielski <ji...@jagunet.com> wrote:
>> For these types of paths, where we use a switch/case against
>> the 1st char of the token, the real reason why we used that
>> impl was to avoid the (expected) expensive string comparison.
> 
> Maybe naively, I thought we were saving some potential function
> call overhead here rather than some strcasecmp overhead.

Yeah... you are likely right. In which case, we should adjust
ala the suggested

	if (!ap_casecmpstr(token+1, "ublic")) {

implementation. Yeah, that's ugly, I agree.

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Tue, Nov 24, 2015 at 4:28 PM, Eric Covener <co...@gmail.com> wrote:
> On Tue, Nov 24, 2015 at 10:21 AM, Jim Jagielski <ji...@jagunet.com> wrote:
>> For these types of paths, where we use a switch/case against
>> the 1st char of the token, the real reason why we used that
>> impl was to avoid the (expected) expensive string comparison.
>
> Maybe naively, I thought we were saving some potential function
> call overhead here rather than some strcasecmp overhead.

I agree, the switch'ed code is likely to be more efficient because of
that (unless we inline casecmpstr[n], but that's not a good idea
IMHO).

I'll modify Chritophe test a bit (more) to check that...

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Eric Covener <co...@gmail.com>.

On Tue, Nov 24, 2015 at 10:21 AM, Jim Jagielski <ji...@jagunet.com> wrote:
> For these types of paths, where we use a switch/case against
> the 1st char of the token, the real reason why we used that
> impl was to avoid the (expected) expensive string comparison.

Maybe naively, I thought we were saving some potential function
call overhead here rather than some strcasecmp overhead.

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Graham Leggett <mi...@sharp.fm>.

On 24 Nov 2015, at 6:36 PM, Yann Ylavic <yl...@gmail.com> wrote:

> Sure, but my point is that the worst case is likely depend on the
> application,

It will depend on the application, yes.

> eg:
> 
>    case 'm':
>    case 'M':
>    if (!strncmp(token, "max-age", 7)
>        || !ap_casecmpstrn(token, "max-age", 7)) {
>        ...
>    }
>    else if (!strncmp(token, "max-stale", 9)
>             || !ap_casecmpstrn(token, "max-stale", 9)) {
>        ...
>    }
>    else if (!strncmp(token, "min-fresh", 9)
>             || !ap_casecmpstrn(token, "min-fresh", 9)) {
>        ...
>    }
>    else if (!strcmp(token, "max-revalidate")
>             || !ap_casecmpstr(token, "must-revalidate")) {

Oops - max-revalidate != must-revalidate

>        ...
>    }
>    else if ...
> 
> is going to be costly when matched against "must-revalidate", or worse
> "my-token”.

In that case make it cheaper for those cases.

Have the length handy to check for a minimum-sane-length, then do a switch on the 4th character.

> We could use all str[n]cmp() first, but still it's a lot of
> comparisons, and now duplicated code too.

The duplicated code is not a worry, the worry is to ensure the most common cases take the fastest path.

Regards,
Graham
—

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Tue, Nov 24, 2015 at 5:18 PM, Graham Leggett <mi...@sharp.fm> wrote:
> On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
>
>> Not sure:
>>    if (!strcmp(h, "max-age")
>>        || ap_cmpcasestr(h, "max-age"))
>> is likely to be a bit faster than a single ap_cmpcasestr() when it
>> matches, but much slower when it does not.
>
> Yep, that’s the point.
>
> The vast majority of comparisons are lowercase for tokens like this. Might as well test that fast path first before testing the worst case scenario.

Sure, but my point is that the worst case is likely depend on the
application, eg:

    case 'm':
    case 'M':
    if (!strncmp(token, "max-age", 7)
        || !ap_casecmpstrn(token, "max-age", 7)) {
        ...
    }
    else if (!strncmp(token, "max-stale", 9)
             || !ap_casecmpstrn(token, "max-stale", 9)) {
        ...
    }
    else if (!strncmp(token, "min-fresh", 9)
             || !ap_casecmpstrn(token, "min-fresh", 9)) {
        ...
    }
    else if (!strcmp(token, "max-revalidate")
             || !ap_casecmpstr(token, "must-revalidate")) {
        ...
    }
    else if ...

is going to be costly when matched against "must-revalidate", or worse
"my-token".

We could use all str[n]cmp() first, but still it's a lot of
comparisons, and now duplicated code too.

Regards,
Yann.

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Jim Jagielski <ji...@jaguNET.com>.

> On Nov 24, 2015, at 1:11 PM, Yann Ylavic <yl...@gmail.com> wrote:
> 
> On Tue, Nov 24, 2015 at 7:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>> 
>>> On Nov 24, 2015, at 11:18 AM, Graham Leggett <mi...@sharp.fm> wrote:
>>> 
>>> On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>> 
>>>> Not sure:
>>>>  if (!strcmp(h, "max-age")
>>>>      || ap_cmpcasestr(h, "max-age"))
>>>> is likely to be a bit faster than a single ap_cmpcasestr() when it
>>>> matches, but much slower when it does not.
>>> 
>>> Yep, that’s the point.
>>> 
>>> The vast majority of comparisons are lowercase for tokens like this. Might as well test that fast path first before testing the worst case scenario.
>>> 
>> 
>> Is there REALLY that much of a diff between
>> 
>>        if (ucharmap[*ps1] != ucharmap[*ps2]) {
>> 
>> and
>> 
>>        if (*ps1 != *ps2) {
>> 
>> to muddle up the code like that though??
> 
> The test from the other thread including str[n]cmp() says:
> 
> $ for i in `seq 0 3`; do
>    ./newtest $i 150000000 \
>        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
>        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
>        0
> done
> - str[n]casecmp (nb=150000000, len=0)
> time = 8.435895804 : res = 0
> - ap_casecmpstr[n] (nb=150000000, len=0)
> time = 8.207019751 : res = 0
> - ap_casecmpstr[n] w/ index (nb=150000000, len=0)
> time = 4.429462481 : res = 0
> - str[n]cmp (nb=150000000, len=0)
> time = 1.923039981 : res = 0
> 
> So strcmp() is really fast, since it probably advances word per word...

Yeah, likely just memcmp()

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Tue, Nov 24, 2015 at 7:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>
>> On Nov 24, 2015, at 11:18 AM, Graham Leggett <mi...@sharp.fm> wrote:
>>
>> On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>
>>> Not sure:
>>>   if (!strcmp(h, "max-age")
>>>       || ap_cmpcasestr(h, "max-age"))
>>> is likely to be a bit faster than a single ap_cmpcasestr() when it
>>> matches, but much slower when it does not.
>>
>> Yep, that’s the point.
>>
>> The vast majority of comparisons are lowercase for tokens like this. Might as well test that fast path first before testing the worst case scenario.
>>
>
> Is there REALLY that much of a diff between
>
>         if (ucharmap[*ps1] != ucharmap[*ps2]) {
>
> and
>
>         if (*ps1 != *ps2) {
>
> to muddle up the code like that though??

The test from the other thread including str[n]cmp() says:

$ for i in `seq 0 3`; do
    ./newtest $i 150000000 \
        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
        xcxcxcxcxcxcxcxcxcxcwwwwwwwwwwaaaaaaaaaa \
        0
done
- str[n]casecmp (nb=150000000, len=0)
time = 8.435895804 : res = 0
- ap_casecmpstr[n] (nb=150000000, len=0)
time = 8.207019751 : res = 0
- ap_casecmpstr[n] w/ index (nb=150000000, len=0)
time = 4.429462481 : res = 0
- str[n]cmp (nb=150000000, len=0)
time = 1.923039981 : res = 0

So strcmp() is really fast, since it probably advances word per word...

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Tue, Nov 24, 2015 at 7:29 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
>
> I'm wondering the other way around.  Even in Yann's latest exercise,
> simply testing
>
>    if ((*ps1 == *ps2) || (ucharmap[*ps1] != ucharmap[*ps2])) {
>
> (or in Yann's code, use the const int lookups, considering that they
> should be optimized out by the compiler if the first pattern matches).

*blink*
Actually the "indexed" function in my test does not include the
translation, just realized that while wanted to include your proposed
change.

-       const int c1 = ps1[i];
-       const int c2 = ps2[i];
+       const int c1 = ucharmap[ps1[i]];
+       const int c2 = ucharmap[ps2[i]];

So now it's just a tiny faster than Jim's, not worth the commit :) but
for -Os maybe...

BTW, I'll test your change with the original version.

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by William A Rowe Jr <wr...@rowe-clan.net>.

On Tue, Nov 24, 2015 at 12:03 PM, Jim Jagielski <ji...@jagunet.com> wrote:

>
> > On Nov 24, 2015, at 11:18 AM, Graham Leggett <mi...@sharp.fm> wrote:
> >
> > On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
> >
> >> Not sure:
> >>   if (!strcmp(h, "max-age")
> >>       || ap_cmpcasestr(h, "max-age"))
> >> is likely to be a bit faster than a single ap_cmpcasestr() when it
> >> matches, but much slower when it does not.
> >
> > Yep, that’s the point.
> >
> > The vast majority of comparisons are lowercase for tokens like this.
> Might as well test that fast path first before testing the worst case
> scenario.
> >
>
> Is there REALLY that much of a diff between
>
>         if (ucharmap[*ps1] != ucharmap[*ps2]) {
>
> and
>
>         if (*ps1 != *ps2) {
>
> to muddle up the code like that though??
>

I'm wondering the other way around.  Even in Yann's latest exercise,
simply testing

   if ((*ps1 == *ps2) || (ucharmap[*ps1] != ucharmap[*ps2])) {

(or in Yann's code, use the const int lookups, considering that they
should be optimized out by the compiler if the first pattern matches).

Really we are expecting one of two things in strcmp_token(), we will
usually have an all-samecase (e.g. "GET" or "upgrade") and the
exceptions will largely be proper-case (e.g. "Upgrade").  So doing
the first char case-insensitively always seems smart and then
falling back on casematch until we don't case match.

By the time we've coded all that up, I wonder what the performance
is when simply checking equality and then the lookup match on the
character-by-character basis, for mixed vs Mixed vs MIXED.

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Jim Jagielski <ji...@jaguNET.com>.

> On Nov 24, 2015, at 11:18 AM, Graham Leggett <mi...@sharp.fm> wrote:
> 
> On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
> 
>> Not sure:
>>   if (!strcmp(h, "max-age")
>>       || ap_cmpcasestr(h, "max-age"))
>> is likely to be a bit faster than a single ap_cmpcasestr() when it
>> matches, but much slower when it does not.
> 
> Yep, that’s the point.
> 
> The vast majority of comparisons are lowercase for tokens like this. Might as well test that fast path first before testing the worst case scenario.
> 

Is there REALLY that much of a diff between

	if (ucharmap[*ps1] != ucharmap[*ps2]) {

and

	if (*ps1 != *ps2) {

to muddle up the code like that though??

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Jim Jagielski <ji...@jaguNET.com>.

+1 (concept)

> On Nov 24, 2015, at 12:40 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> 
> On Tue, Nov 24, 2015 at 10:18 AM, Graham Leggett <mi...@sharp.fm> wrote:
> On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
> 
> > Not sure:
> >    if (!strcmp(h, "max-age")
> >        || ap_cmpcasestr(h, "max-age"))
> > is likely to be a bit faster than a single ap_cmpcasestr() when it
> > matches, but much slower when it does not.
> 
> Yep, that’s the point.
> 
> The vast majority of comparisons are lowercase for tokens like this. Might as well test that fast path first before testing the worst case scenario.
> 
> Hmmm, it's our implementation, why increase the pain?  Consider the following 'optimization' - fallthrough from identity to case insensitivity with one wasted equality test...
> 
> AP_DECLARE(int) ap_casecmpstr(const char *s1, const char *s2)
> {
>     const unsigned char *ps1 = (const unsigned char *) s1;
>     const unsigned char *ps2 = (const unsigned char *) s2;
> 
>     while (*ps1 == *ps2) {
>         if (*ps1++ == '\0') {
>             return (0);
>         }
>         ps2++;
>     }
>     while (ucharmap[*ps1] == ucharmap[*ps2]) {
>         if (*ps1++ == '\0') {
>             return (0);
>         }
>         ps2++;
>     }
>     return (ucharmap[*ps1] - ucharmap[*ps2]);
> }
> 
> 
> AP_DECLARE(int) ap_casecmpstrn(const char *s1, const char *s2, apr_size_t n)
> {
>     const unsigned char *ps1 = (const unsigned char *) s1;
>     const unsigned char *ps2 = (const unsigned char *) s2;
>     while (n) {
>         if (*ps1 != *ps2) {
>             break;
>         }
>         if (*ps1++ == '\0') {
>             return (0);
>         }
>         ps2++;
>         n--;
>     }
>     while (n--) {
>         if (ucharmap[*ps1] != ucharmap[*ps2]) {
>             return (ucharmap[*ps1] - ucharmap[*ps2]);
>         }
>         if (*ps1++ == '\0') {
>             break;
>         }
>         ps2++;
>     }
>     return (0);
> }
>      
>

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by William A Rowe Jr <wr...@rowe-clan.net>.

On Tue, Nov 24, 2015 at 10:18 AM, Graham Leggett <mi...@sharp.fm> wrote:

> On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:
>
> > Not sure:
> >    if (!strcmp(h, "max-age")
> >        || ap_cmpcasestr(h, "max-age"))
> > is likely to be a bit faster than a single ap_cmpcasestr() when it
> > matches, but much slower when it does not.
>
> Yep, that’s the point.
>
> The vast majority of comparisons are lowercase for tokens like this. Might
> as well test that fast path first before testing the worst case scenario.
>

Hmmm, it's our implementation, why increase the pain?  Consider the
following 'optimization' - fallthrough from identity to case insensitivity
with one wasted equality test...

AP_DECLARE(int) ap_casecmpstr(const char *s1, const char *s2)
{
    const unsigned char *ps1 = (const unsigned char *) s1;
    const unsigned char *ps2 = (const unsigned char *) s2;

    while (*ps1 == *ps2) {
        if (*ps1++ == '\0') {
            return (0);
        }
        ps2++;
    }
    while (ucharmap[*ps1] == ucharmap[*ps2]) {
        if (*ps1++ == '\0') {
            return (0);
        }
        ps2++;
    }
    return (ucharmap[*ps1] - ucharmap[*ps2]);
}


AP_DECLARE(int) ap_casecmpstrn(const char *s1, const char *s2, apr_size_t n)
{
    const unsigned char *ps1 = (const unsigned char *) s1;
    const unsigned char *ps2 = (const unsigned char *) s2;
    while (n) {
        if (*ps1 != *ps2) {
            break;
        }
        if (*ps1++ == '\0') {
            return (0);
        }
        ps2++;
        n--;
    }
    while (n--) {
        if (ucharmap[*ps1] != ucharmap[*ps2]) {
            return (ucharmap[*ps1] - ucharmap[*ps2]);
        }
        if (*ps1++ == '\0') {
            break;
        }
        ps2++;
    }
    return (0);
}

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Graham Leggett <mi...@sharp.fm>.

On 24 Nov 2015, at 6:15 PM, Yann Ylavic <yl...@gmail.com> wrote:

> Not sure:
>    if (!strcmp(h, "max-age")
>        || ap_cmpcasestr(h, "max-age"))
> is likely to be a bit faster than a single ap_cmpcasestr() when it
> matches, but much slower when it does not.

Yep, that’s the point.

The vast majority of comparisons are lowercase for tokens like this. Might as well test that fast path first before testing the worst case scenario.

Regards,
Graham
—

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Yann Ylavic <yl...@gmail.com>.

On Tue, Nov 24, 2015 at 5:09 PM, Graham Leggett <mi...@sharp.fm> wrote:
>
> A further optimisation - in many cases the fast path is really a case sensitive string comparison, given that the vast majority of the time the case being used is the case quoted in the spec.
>
> In other words, while mAx-AgE is valid, it is almost always max-age.
>
> Does it make sense to do a case sensitive comparison first, and then fall back to case insensitive?

Not sure:
    if (!strcmp(h, "max-age")
        || ap_cmpcasestr(h, "max-age"))
is likely to be a bit faster than a single ap_cmpcasestr() when it
matches, but much slower when it does not.

Regards,
Yann.

Re: svn commit: r1715938 - /httpd/httpd/trunk/modules/cache/cache_util.c

Posted by Graham Leggett <mi...@sharp.fm>.

On 24 Nov 2015, at 5:21 PM, Jim Jagielski <ji...@jaguNET.com> wrote:

> For these types of paths, where we use a switch/case against
> the 1st char of the token, the real reason why we used that
> impl was to avoid the (expected) expensive string comparison.
> 
> This is really no longer an issue. Sure, we could still use
> this "fast path" short-circuit, but even that is non-optimal.
> For the below, what we really should be testing is:
> 
> 	if (!ap_casecmpstr(token+1, "ublic")) {
> 
> for example.
> 
> Sooooo the question is: Do we remove these silly fast-paths
> that no longer make sense[1] or optimize for the "we know the
> 1st char" case ala the above?

A further optimisation - in many cases the fast path is really a case sensitive string comparison, given that the vast majority of the time the case being used is the case quoted in the spec.

In other words, while mAx-AgE is valid, it is almost always max-age.

Does it make sense to do a case sensitive comparison first, and then fall back to case insensitive?

Regards,
Graham
—