You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Martin Kraemer <ma...@apache.org> on 2007/08/30 12:55:40 UTC

[PATCH] proxy/ajp_header.c: Fix header detection

Hi.

While looking at ajp_header.c, I realized that its method of parsing
the header line tokens is flakey: it uses memcmp() to check, e.g.,
whether the header token is "Accept-Charset:", by uppercasing the
token name (-> "ACCEPT-CHARSET"), then compares the initial "ACCEPT-"
prefix, and then tests:
	if (memcmp(p, "CHARSET", 7) == 0) return SC_ACCEPT_CHARSET;
but does not verify that the end of the token has been reached.

Thus, a header
  Accept-CharsetXXX-Blah: utf-8
would be mistaken for an "Accept-Charset: utf-8".

Same goes for a couple of other header names.
The patch replaces the memcmp by a strcmp to check for the trailing
NIL character, too.

Also, IMO it is better to replace memcmp by strncasecmp in the test
-        if (memcmp(stringname, "Content-Type", 12) == 0) {
+        if (strncasecmp(stringname, "Content-Type", 12) == 0) {

WDYT?

  Martin
-- 
<Ma...@Fujitsu-Siemens.com>        |     Fujitsu Siemens
http://www.fujitsu-siemens.com/imprint.html | 81730  Munich,  Germany

Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Aug 31, 2007, at 5:11 AM, Martin Kraemer wrote:

>
> Please go for "obvious" algorithms, or simply automate them (as in
> the example macro above) rather than "coding in assembler code" for
> efficiency, dropping even the slightest trace of explanation what
> the code is intended to do, and leaving unmaintainable code behind  
> you.
>
> Because the strcmp() is so obvious, and does exactly what it's
> supposed to do in this case, namely compare two strings over their
> complete length, I went against the choice of "using memcmp for
> efficiency" to save myself and many others from long nights of
> head-scratching and -bashing to find out where some off-by-one
> error was lingering deep in the AJP code.
>

+++1 :)

If this was in some deeply tight loop, then *maybe* the
optimization might be worth it (as Martin says, a macro
wrapper would be best due to the size of the const string
is known by the compiler and this counting those chars
is not something the coder needs to worry about), but
readability counts for an awful lot :)

Of course, to be honest, lookup_builtin_method() suffers
from the same "unreadability" but at least it is documented ;)


Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by Martin Kraemer <ma...@apache.org>.
On Thu, Aug 30, 2007 at 06:24:55PM +0200, Rainer Jung wrote:
> >The patch replaces the memcmp by a strcmp to check for the trailing
> >NIL character, too.
> 
> For mod_jk the problem you found here is the same. Thanks for finding it!
> 
> We finally applied a slightly different patch, by keeping the memcmp, 
> but simply incrementing the number of bytes to compare by one. This 
> should work for mod_proxy also.
> 
[...]
> Our variant of the patch is at
> 
> http://marc.info/?l=tomcat-dev&m=118849057126771&w=2

On Thu, Aug 30, 2007 at 07:33:55PM +0200, jean-frederic clere wrote:
> 
> +1 mod_jk fixed it by additing one to each length, that is probably more
> efficent, no?

I had considered incrementing the byte count by one, too, yet I
went for

* readability
  A memcmp(xxx,"some arbitrary string", 22) is not really
  obviously a comparison of the string including its trailing NIL
  byte) -- I would have at least wrapped such a comparison into a
  commented macro like
    /* We use sizeof() to include comparison of the trailing NIL character
     * of the literal string, and use memcmp() instead of the
     * more obvious strcmp(cmpstr, literalconst) for efficiency reasons.
     */
    #define FAST_LITSTRING_STREQ(cmpstr, literalconst) \
            (0==memcmp(cmpstr, literalconst, sizeof(literalconst)))
  and then used
    if (FAST_LITSTRING_STREQ(p, "LANGUAGE")) {..}
  in place of
    if (memcmp(p, "LANGUAGE", 9) == 0) {..}
  which is, in spite of its absolute correctness, illegible.

* maintainability
  Few readers of the code will actually count the characters of all
  the bytes of the string "LANGUAGE" manually, to see that it has 8,
  in the line
    if (memcmp(p, "LANGUAGE", 9) == 0) {..}
  And because it is so non-obvious, a code maintainer adding code
  will add another if() with the incorrect byte count.

* efficiency
  Well, strcmp() is optimized on most platforms too, but I confess
  that, by definition, it has to waste some more machine
  instructions than memcpy(). However, maintainability and
  readability are higher values to me than utmost speed at the
  cost of maintainability.

Please go for "obvious" algorithms, or simply automate them (as in
the example macro above) rather than "coding in assembler code" for
efficiency, dropping even the slightest trace of explanation what
the code is intended to do, and leaving unmaintainable code behind you.

Because the strcmp() is so obvious, and does exactly what it's
supposed to do in this case, namely compare two strings over their
complete length, I went against the choice of "using memcmp for
efficiency" to save myself and many others from long nights of
head-scratching and -bashing to find out where some off-by-one
error was lingering deep in the AJP code.

Just my $.02,

   Martin
-- 
<Ma...@Fujitsu-Siemens.com>        |     Fujitsu Siemens
http://www.fujitsu-siemens.com/imprint.html | 81730  Munich,  Germany

Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by jean-frederic clere <jf...@gmail.com>.
Martin Kraemer wrote:
> Hi.
> 
> While looking at ajp_header.c, I realized that its method of parsing
> the header line tokens is flakey: it uses memcmp() to check, e.g.,
> whether the header token is "Accept-Charset:", by uppercasing the
> token name (-> "ACCEPT-CHARSET"), then compares the initial "ACCEPT-"
> prefix, and then tests:
> 	if (memcmp(p, "CHARSET", 7) == 0) return SC_ACCEPT_CHARSET;
> but does not verify that the end of the token has been reached.
> 
> Thus, a header
>   Accept-CharsetXXX-Blah: utf-8
> would be mistaken for an "Accept-Charset: utf-8".
> 
> Same goes for a couple of other header names.
> The patch replaces the memcmp by a strcmp to check for the trailing
> NIL character, too.
> 
> Also, IMO it is better to replace memcmp by strncasecmp in the test
> -        if (memcmp(stringname, "Content-Type", 12) == 0) {
> +        if (strncasecmp(stringname, "Content-Type", 12) == 0) {
> 
> WDYT?

+1 mod_jk fixed it by additing one to each length, that is probably more
efficent, no?

Cheers

Jean-Frederic

> 
>   Martin
> 


Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by Rainer Jung <ra...@kippdata.de>.
Hi Martin,

Martin Kraemer wrote:
> Hi.
> 
> While looking at ajp_header.c, I realized that its method of parsing
> the header line tokens is flakey: it uses memcmp() to check, e.g.,
> whether the header token is "Accept-Charset:", by uppercasing the
> token name (-> "ACCEPT-CHARSET"), then compares the initial "ACCEPT-"
> prefix, and then tests:
> 	if (memcmp(p, "CHARSET", 7) == 0) return SC_ACCEPT_CHARSET;
> but does not verify that the end of the token has been reached.
> 
> Thus, a header
>   Accept-CharsetXXX-Blah: utf-8
> would be mistaken for an "Accept-Charset: utf-8".
> 
> Same goes for a couple of other header names.
> The patch replaces the memcmp by a strcmp to check for the trailing
> NIL character, too.

For mod_jk the problem you found here is the same. Thanks for finding it!

We finally applied a slightly different patch, by keeping the memcmp, 
but simply incrementing the number of bytes to compare by one. This 
should work for mod_proxy also.

Why is it OK?

- the variable header name is inside an array of length 16, which is big 
enough to hold the longest string we want to compare to

- the variable header names are \0-terminated

- the string constants we compare to are always \0-terminated

- so increasing the number of bytes to do memcmp() on by one will 
correctly include a compare on the terminating \0.

Our variant of the patch is at

http://marc.info/?l=tomcat-dev&m=118849057126771&w=2

Regards,

Rainer

Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by Jess Holle <je...@ptc.com>.
Martin Kraemer wrote:
> On Thu, Aug 30, 2007 at 04:45:38PM +0200, Rainer Jung wrote:
>   
>> I committed Martins patch to mod_jk a couple of minutes ago.
>> Thanks Martin!
>>
>> The Content-Type part of the patch didn't apply to mod_jk though.
>>     
> ...
>   
>>>> -        if (memcmp(stringname, "Content-Type", 12) == 0) {
>>>> +        if (strncasecmp(stringname, "Content-Type", 12) == 0) {
>>>>         
>
> That is good, because it was wrong... Of course we need the
> normal strcasecmp(stringname, "Content-Type"), not the one limited
> to 12 chars (think of "Content-TypeXYZ").
>
> Already committed to trunk.
>   
Backporting to 2.2.x?

--
Jess Holle


Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by Martin Kraemer <ma...@apache.org>.
On Thu, Aug 30, 2007 at 04:45:38PM +0200, Rainer Jung wrote:
> I committed Martins patch to mod_jk a couple of minutes ago.
> Thanks Martin!
> 
> The Content-Type part of the patch didn't apply to mod_jk though.
...
> >>-        if (memcmp(stringname, "Content-Type", 12) == 0) {
> >>+        if (strncasecmp(stringname, "Content-Type", 12) == 0) {

That is good, because it was wrong... Of course we need the
normal strcasecmp(stringname, "Content-Type"), not the one limited
to 12 chars (think of "Content-TypeXYZ").

Already committed to trunk.

  Martin
-- 
<Ma...@Fujitsu-Siemens.com>        |     Fujitsu Siemens
http://www.fujitsu-siemens.com/imprint.html | 81730  Munich,  Germany

Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by Rainer Jung <ra...@kippdata.de>.
I committed Martins patch to mod_jk a couple of minutes ago.
Thanks Martin!

The Content-Type part of the patch didn't apply to mod_jk though.

Regards,

Rainer

Jim Jagielski wrote:
> Yeah, all this is being fixed in the mod_jk code as
> well...
> 
> On Aug 30, 2007, at 6:55 AM, Martin Kraemer wrote:
> 
>> Hi.
>>
>> While looking at ajp_header.c, I realized that its method of parsing
>> the header line tokens is flakey: it uses memcmp() to check, e.g.,
>> whether the header token is "Accept-Charset:", by uppercasing the
>> token name (-> "ACCEPT-CHARSET"), then compares the initial "ACCEPT-"
>> prefix, and then tests:
>>     if (memcmp(p, "CHARSET", 7) == 0) return SC_ACCEPT_CHARSET;
>> but does not verify that the end of the token has been reached.
>>
>> Thus, a header
>>   Accept-CharsetXXX-Blah: utf-8
>> would be mistaken for an "Accept-Charset: utf-8".
>>
>> Same goes for a couple of other header names.
>> The patch replaces the memcmp by a strcmp to check for the trailing
>> NIL character, too.
>>
>> Also, IMO it is better to replace memcmp by strncasecmp in the test
>> -        if (memcmp(stringname, "Content-Type", 12) == 0) {
>> +        if (strncasecmp(stringname, "Content-Type", 12) == 0) {
>>
>> WDYT?
>>
>>   Martin
>> -- 
>> <Ma...@Fujitsu-Siemens.com>        |     Fujitsu Siemens
>> http://www.fujitsu-siemens.com/imprint.html | 81730  Munich,  Germany
>> <ajp_header.c.diff>

Re: [PATCH] proxy/ajp_header.c: Fix header detection

Posted by Jim Jagielski <ji...@jaguNET.com>.
Yeah, all this is being fixed in the mod_jk code as
well...

On Aug 30, 2007, at 6:55 AM, Martin Kraemer wrote:

> Hi.
>
> While looking at ajp_header.c, I realized that its method of parsing
> the header line tokens is flakey: it uses memcmp() to check, e.g.,
> whether the header token is "Accept-Charset:", by uppercasing the
> token name (-> "ACCEPT-CHARSET"), then compares the initial "ACCEPT-"
> prefix, and then tests:
> 	if (memcmp(p, "CHARSET", 7) == 0) return SC_ACCEPT_CHARSET;
> but does not verify that the end of the token has been reached.
>
> Thus, a header
>   Accept-CharsetXXX-Blah: utf-8
> would be mistaken for an "Accept-Charset: utf-8".
>
> Same goes for a couple of other header names.
> The patch replaces the memcmp by a strcmp to check for the trailing
> NIL character, too.
>
> Also, IMO it is better to replace memcmp by strncasecmp in the test
> -        if (memcmp(stringname, "Content-Type", 12) == 0) {
> +        if (strncasecmp(stringname, "Content-Type", 12) == 0) {
>
> WDYT?
>
>   Martin
> -- 
> <Ma...@Fujitsu-Siemens.com>        |     Fujitsu Siemens
> http://www.fujitsu-siemens.com/imprint.html | 81730  Munich,  Germany
> <ajp_header.c.diff>