Posted to dev@subversion.apache.org by Paul Burba <pa...@softlanding.com> on 2006/02/15 20:22:32 UTC

apr_strnatcasecmp vs. strcasecmp

Hi All,

A quick question: Is there some reason we use (with one exception in 
config.c) strcasecmp() instead of apr_strnatcasecmp()?

On OS400 V5R4, strcasecmp() is one of a handful of functions that still
expect EBCDIC strings.  apr_strnatcasecmp() works correctly on UTF-8
strings. 

I'd like to submit a patch, nominally for the OS400/EBCDIC port*, that 
uses the latter exclusively.  I just wanted to make sure I wasn't missing 
something. 
 
Paul B.

*http://svn.haxx.se/dev/archive-2006-02/0519.shtml


_____________________________________________________________________________
Scanned for SoftLanding Systems, Inc. and SoftLanding Europe Plc by IBM Email Security Management Services powered by MessageLabs. 
_____________________________________________________________________________

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

[PATCH] apr_strnatcasecmp vs. strcasecmp

Posted by Julian Foad <ju...@btopenworld.com>.
Julian Foad wrote:
> 
> Yes: in every call site, we just want to know whether two strings are 
> equal (ignoring case).  APR doesn't appear to document what a "natural 
> order comparison" means, but looking at the implementation it appears to 
> mean something like leading spaces are ignored and a run of digits is 
> compared as a number rather than as characters.  Without the semantics 
> being documented I would not want to use apr_strnat[case]cmp().
> 
> I think our one use of apr_strnatcasecmp is probably a mistake, or at 
> least not a particularly good idea.

[[[
Stop using apr_strnatcasecmp().

* subversion/libsvn_subr/config.c
   Replace what is currently our only use of apr_strnatcasecmp() with
   strcasecmp() because the latter is more standard and the former is
   poorly defined and gives us no advantage here.
]]]

Index: subversion/libsvn_subr/config.c
===================================================================
--- subversion/libsvn_subr/config.c     (revision 18525)
+++ subversion/libsvn_subr/config.c     (working copy)
@@ -384,7 +384,7 @@
                           cfg->tmp_key->len);
        /* NOTE: ConfigParser's sections are case sensitive. */
        if (opt == NULL
-          && apr_strnatcasecmp(section, SVN_CONFIG__DEFAULT_SECTION) != 0)
+          && strcasecmp(section, SVN_CONFIG__DEFAULT_SECTION) != 0)
          /* Options which aren't found in the requested section are
             also sought after in the default section. */
          opt = find_option(cfg, SVN_CONFIG__DEFAULT_SECTION, option, &sec);

Any objections to me making this change?

- Julian

Re: apr_strnatcasecmp vs. strcasecmp

Posted by Julian Foad <ju...@btopenworld.com>.
Paul Burba wrote:
> 
> A quick question: Is there some reason we use (with one exception in 
> config.c) strcasecmp() instead of apr_strnatcasecmp()?

Yes: in every call site, we just want to know whether two strings are equal 
(ignoring case).  APR doesn't appear to document what a "natural order 
comparison" means, but looking at the implementation it appears to mean 
something like leading spaces are ignored and a run of digits is compared as a 
number rather than as characters.  Without the semantics being documented I 
would not want to use apr_strnat[case]cmp().

I think our one use of apr_strnatcasecmp is probably a mistake, or at least not 
a particularly good idea.
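
To make the concern concrete, here is a minimal sketch of a comparison with
the two behaviours described above (leading spaces skipped, digit runs
compared by value).  It is not APR's implementation: toy_natcasecmp() and
digit_run() are hypothetical names, and the treatment of digit runs that
start with '0' (compared as plain characters) is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the run of decimal digits at S. */
static size_t digit_run(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Hypothetical sketch, NOT APR's code: a case-insensitive comparison that
   skips leading spaces and orders runs of digits by numeric value. */
static int toy_natcasecmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* Leading spaces are ignored. */
    while (*a == ' ')
        a++;
    while (*b == ' ')
        b++;

    while (*a != '\0' && *b != '\0') {
        int ca, cb;

        /* Two digit runs without a leading zero: the longer run is the
           larger number.  Runs of equal length fall through, because
           comparing them character by character gives numeric order. */
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)
            && *a != '0' && *b != '0') {
            size_t la = digit_run(a), lb = digit_run(b);
            if (la != lb)
                return la < lb ? -1 : 1;
        }

        /* Everything else is compared ignoring case. */
        ca = tolower((unsigned char)*a);
        cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        a++;
        b++;
    }

    if (*a != '\0')
        return 1;
    if (*b != '\0')
        return -1;
    return 0;
}
```

With this sketch, toy_natcasecmp(" Default", "DEFAULT") returns 0, although
strcasecmp() would report those two strings as different.  That is the kind
of surprise an equality test can get from semantics that are not documented.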

- Julian

Re: apr_strnatcasecmp vs. strcasecmp

Posted by Julian Foad <ju...@btopenworld.com>.
David James wrote:
> 
> I see that httpd recently switched from 'apr_strnatcasecmp' to
> 'strcasecmp'. Would this introduce problems on OS400 V5R4?
> 
> See http://svn.apache.org/viewcvs?rev=230708&view=rev

The log message says:

   End apr_natstrcasecmp, backporting part of r216111 (there may remain
   other apr_natstrcasecmp abuse in other files).

   Still unsure if apr_strnatcasecmp of the hostname is deliberate to
   handle leading 0's in ip addresses, so leaving that single use case.

I see from the implementation that apr_strnatcasecmp does not ignore leading
zeros: it treats them as significant, as if the digits were the fractional
part of a number, which is effectively the same as a character-by-character
comparison.  Therefore that is not a reason for using it to compare IP
addresses for equality.

I see in today's snapshot that such a comparison is still in 
httpd/modules/proxy/proxy_util.c.

If someone here is in contact with the HTTPD developers, you might like to pass 
on this message to them.

- Julian

Re: apr_strnatcasecmp vs. strcasecmp

Posted by Mark Phippard <ma...@softlanding.com>.
james82@gmail.com wrote on 02/15/2006 04:34:10 PM:

> On 2/15/06, Paul Burba <pa...@softlanding.com> wrote:
> > A quick question: Is there some reason we use (with one exception in
> > config.c) strcasecmp() instead of apr_strnatcasecmp()?
> >
> > On OS400 V5R4, strcasecmp() is one of a handful of functions that still
> > expect EBCDIC strings.  apr_strnatcasecmp() works correctly on UTF-8
> > strings.
> 
> I see that httpd recently switched from 'apr_strnatcasecmp' to
> 'strcasecmp'. Would this introduce problems on OS400 V5R4?
> 
> See http://svn.apache.org/viewcvs?rev=230708&view=rev

That appears to just be one file in one module which OS/400 does not 
provide.  If httpd uses this function, I imagine the Apache team at IBM 
would have to replace it with some function that could compare UTF-8 
strings, or convert the strings to EBCDIC before comparing them?  We could 
do something like that as well, but we figured that since there is an APR 
function that appears to work correctly we should ask if Subversion could 
just always use that function.  Just one less place where there is 
conditional code.
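
For illustration, a comparison that works on UTF-8 bytes regardless of the
platform's native character set might look like the sketch below.  The names
ascii_fold() and utf8_ascii_caseless_equal() are hypothetical, not existing
Subversion or APR functions, and the sketch assumes that only the ASCII
letters A-Z need case folding.

```c
#include <assert.h>
#include <stddef.h>

/* Fold an ASCII upper-case byte (0x41..0x5A) to lower case.  Numeric
   values are used instead of character literals so that the result does
   not depend on the compiler's execution character set (e.g. EBCDIC). */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 0x41 && c <= 0x5A) ? (unsigned char)(c + 0x20) : c;
}

/* Hypothetical helper: return 1 if the NUL-terminated byte strings A and B
   are equal ignoring ASCII case, else 0.  Bytes of multi-byte UTF-8
   sequences are all >= 0x80, so they are compared exactly. */
static int utf8_ascii_caseless_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    for (;; a++, b++) {
        unsigned char ca = ascii_fold((unsigned char)*a);
        unsigned char cb = ascii_fold((unsigned char)*b);

        if (ca != cb)
            return 0;
        if (ca == 0)
            return 1;   /* both strings ended at the same position */
    }
}
```

A helper of this shape answers only "equal or not", which is all the call
sites need, and it avoids both locale-dependent folding and natural-order
rules such as skipping leading spaces.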

Is there something wrong/different about the APR function?

Thanks

Mark


Re: apr_strnatcasecmp vs. strcasecmp

Posted by David James <dj...@collab.net>.
On 2/15/06, Paul Burba <pa...@softlanding.com> wrote:
> A quick question: Is there some reason we use (with one exception in
> config.c) strcasecmp() instead of apr_strnatcasecmp()?
>
> On OS400 V5R4, strcasecmp() is one of a handful of functions that still
> expect EBCDIC strings.  apr_strnatcasecmp() works correctly on UTF-8
> strings.

I see that httpd recently switched from 'apr_strnatcasecmp' to
'strcasecmp'. Would this introduce problems on OS400 V5R4?

See http://svn.apache.org/viewcvs?rev=230708&view=rev

Cheers,

David


--
David James -- http://www.cs.toronto.edu/~james
