Posted to dev@subversion.apache.org by Lieven Govaerts <lg...@mobsol.be> on 2006/03/07 23:50:26 UTC

Hostnames should be case-insensitive - trying to solve issue 2475

Hi, 

I started working on issue 2475. The problem reported in that issue is that
hostnames in URIs should be case-insensitive (as per RFC 3986).

The problem surfaces when two URLs are compared. E.g. when a file is copied
from one folder to another, a case-sensitive check is done to see whether
source and target folders are from the same repository.

Now, I've checked the code for places where this type of test is done, and I
found some 30 instances of strcmp & strncmp where URLs are involved. Most of
them in libsvn_client and libsvn_wc, some in libsvn_ra* and libsvn_repos.

I see some possible options to solve this:

Option 1. Lowercase the hostnames in all the URLs at the entry points in the
code, and don't touch the case-sensitive checks. Maybe the easiest solution,
but it makes error messages etc. less clear.

Option 2. Replace all the strcmp and strncmp calls (only those where URLs
are compared) with their case-insensitive equivalents.

Option 3. Create one new function url_compare() where a correct,
standards-compliant URL comparison is implemented, use that function in all
places where strcmp/strncmp is used now. What would be a good place to
define such a function?
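
To make option 3 concrete, here is a minimal sketch of such a comparison in
plain C. The name url_compare() and its 0/1 return convention are only
assumptions for illustration, not existing Subversion API. It treats the
scheme and the host[:port] part as case-insensitive and everything else
(user name, path, query) as case-sensitive, and it does no other
normalization such as percent-encoding or default ports.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not Subversion API): return 0 if URLs A and B are
   equal when scheme and host[:port] are compared case-insensitively and
   all other parts case-sensitively; return 1 otherwise. */
int url_compare(const char *a, const char *b)
{
  const char *sep_a, *sep_b;
  size_t auth_len, user_len = 0, i;

  assert(a != NULL && b != NULL);

  sep_a = strstr(a, "://");
  sep_b = strstr(b, "://");
  if (sep_a == NULL || sep_b == NULL)
    return strcmp(a, b) != 0;        /* no scheme: plain comparison */
  if (sep_a - a != sep_b - b)
    return 1;                        /* schemes differ in length */

  /* The scheme is case-insensitive. */
  for (i = 0; i < (size_t)(sep_a - a); i++)
    if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
      return 1;

  /* The authority runs up to the first '/', '?' or '#'. */
  a = sep_a + 3;
  b = sep_b + 3;
  auth_len = strcspn(a, "/?#");
  if (auth_len != strcspn(b, "/?#"))
    return 1;

  /* Optional user info ends at the last '@' and is case-sensitive. */
  for (i = auth_len; i > 0; i--)
    if (a[i - 1] == '@')
      {
        user_len = i;
        break;
      }
  if (strncmp(a, b, user_len) != 0)
    return 1;

  /* host[:port] is case-insensitive. */
  for (i = user_len; i < auth_len; i++)
    if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
      return 1;

  /* Path, query and fragment are case-sensitive. */
  return strcmp(a + auth_len, b + auth_len) != 0;
}
```

With this sketch, "http://SVN.Example.com/repos" and
"http://svn.example.com/repos" compare equal, while two URLs that differ
only in the case of the path or of the user name do not.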

I prefer option 3 myself. Maybe there are other and better options? Does
anybody have any thoughts on the subject?

thanks for your input,

Lieven.






---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: Hostnames should be case-insensitive - trying to solve issue 2475 [+PATCH]

Posted by Lieven Govaerts <lg...@mobsol.be>.
Attached a new patch that addresses your questions. If you have any remarks,
please let me know.

> -----Original Message-----
> From: Branko Čibej [mailto:brane@xbc.nu]
> Sent: Sunday, 12 March 2006 4:15
...
>
> Is there any reason not to use apr_uri_[un]parse to
> canonicalize the URL?

Not really, I thought I'd reinvent the wheel :) Fixed in this patch.

> > A consequence of solving the issue like this is that the URI in the
> > entries file should now be stored with a lowercase hostname.
...
> >  
> It should be fairly easy to canonicalize the URLs while the
> entries file is being parsed. It's a memory-only operation,
> so performance is not an issue. I see no reason to bump the
> WC version because of such a change; after all, the WC format
> remains backward-compatible, and if we happen to rewrite the
> URL in the entries file (which we will, sooner or later),
> older clients can still use that working copy.
> -- Brane
> 

I added code to canonicalize the URL when it is read from the entries file,
to remain backwards compatible. I found it difficult to add an extra test
for this behaviour though. How is backwards compatibility for svn tested
anyway?

The patch is tested on Windows over svn, http and local access.

Lieven.

Re: Hostnames should be case-insensitive - trying to solve issue 2475 [+PATCH]

Posted by Branko Čibej <br...@xbc.nu>.
Lieven Govaerts wrote:
> Branko, 
>
> attached is a patch for this issue. The patch is implemented by converting
> the hostname to lowercase in path.c/svn_path_canonicalize. This should make
> sure that all internal handling of URIs is done with a lowercase hostname.
>
> Is that what you suggested?
>   
Yes, something like that.

Is there any reason not to use apr_uri_[un]parse to canonicalize the URL?

> A consequence of solving the issue like this is that the URI in the entries
> file should now be stored with a lowercase hostname. Working copies created
> with a pre-this-patch svn client with uppercase hostnames in their entries
> files are not supported anymore.
>
> I could add a temporary solution to automatically convert hostnames to
> lowercase in those entries. The other (and in my opinion better) option is
> to include this patch in a release that bumps the version of the entries
> file.
>   
It should be fairly easy to canonicalize the URLs while the entries file 
is being parsed. It's a memory-only operation, so performance is not an 
issue. I see no reason to bump the WC version because of such a change; 
after all, the WC format remains backward-compatible, and if we happen 
to rewrite the URL in the entries file (which we will, sooner or later), 
older clients can still use that working copy.

-- Brane



RE: Hostnames should be case-insensitive - trying to solve issue 2475 [+PATCH]

Posted by Lieven Govaerts <lg...@mobsol.be>.
Branko, 

attached is a patch for this issue. The patch is implemented by converting
the hostname to lowercase in path.c/svn_path_canonicalize. This should make
sure that all internal handling of URIs is done with a lowercase hostname.

Is that what you suggested?

A consequence of solving the issue like this is that the URI in the entries
file should now be stored with a lowercase hostname. Working copies created
with a pre-this-patch svn client with uppercase hostnames in their entries
files are not supported anymore.

I could add a temporary solution to automatically convert hostnames to
lowercase in those entries. The other (and in my opinion better) option is
to include this patch in a release that bumps the version of the entries
file.

Are there other options?

regards,

Lieven. 

> -----Original Message-----
> From: Branko Čibej [mailto:brane@xbc.nu] 
> Sent: Wednesday, 8 March 2006 2:07
> To: Lieven Govaerts
> Cc: dev@subversion.tigris.org
> Subject: Re: Hostnames should be case-insensitive - trying to solve issue 2475
> 
> Lieven Govaerts wrote:
> > Hi,
> >
> > I started working on issue 2475. The problem reported in that issue is
> > that hostnames in URIs should be case-insensitive (as per RFC 3986).
> >
> > The problem surfaces when two URLs are compared. E.g. when a file is
> > copied from one folder to another, a case-sensitive check is done to
> > see whether source and target folders are from the same repository.
> >
> > Now, I've checked the code for places where this type of test is done,
> > and I found some 30 instances of strcmp & strncmp where URLs are
> > involved. Most of them in libsvn_client and libsvn_wc, some in
> > libsvn_ra* and libsvn_repos.
> >
> > I see some possible options to solve this:
> >
> > Option 1. Make lowercase of all the hostnames in all the URLs on entry
> > point in the code, don't touch the case-sensitive checks. Maybe the
> > easiest solution, but not very clear in error messages etc.
> >
> > Option 2. Replace all the strcmp and strncmp calls (only those where
> > URLs are compared) with their case-insensitive equivalents.
> >
> That would be wrong; the path part of the URL is *not*
> case-insensitive.
>
> > Option 3. Create one new function url_compare() where a correct,
> > standards-compliant URL comparison is implemented, use that function
> > in all places where strcmp/strncmp is used now. What would be a good
> > place to define such a function?
> >
> > I prefer option 3 myself. Maybe there are other and better options?
> > Anybody any thoughts on the subject?
> >
> Option 4. Follow the conventions we already use for paths:
> i.e., the API expects all path parameters in canonical form.
> Define the canonical form for URLs so that the hostname part
> must be lowercase (but only the host name, not the optional
> embedded user name). Document that, then perform the URL
> canonicalisation in the client code, as we do with file names --
> *not* in the library code.
>
> -- Brane
> 

Re: Hostnames should be case-insensitive - trying to solve issue 2475

Posted by Branko Čibej <br...@xbc.nu>.
Lieven Govaerts wrote:
> Hi, 
>
> I started working on issue 2475. The problem reported in that issue is that
> hostnames in URIs should be case-insensitive ( as per RFC3986 ).
>
> The problem surfaces when two URLs are compared. E.g. when a file is copied
> from one folder to another, a case-sensitive check is done to see whether
> source and target folders are from the same repository.
>
> Now, I've checked the code for places where this type of test is done, and I
> found some 30 instances of strcmp & strncmp where URLs are involved. Most of
> them in libsvn_client and libsvn_wc, some in libsvn_ra* and libsvn_repos.
>
> I see some possible options to solve this:
>
> Option 1. Make lowercase of all the hostnames in all the URLs on entry point
> in the code, don't touch the case-sensitive checks. Maybe the easiest
> solution, but not very clear in error messages etc.
>
> Option 2. Replace all the strcmp and strncmp calls ( only those where URLs
> are compared ) with their case-insensitive equivalents. 
>   
That would be wrong; the path part of the URL is *not* case-insensitive.
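
A small illustration of the false positive, as a sketch only: the helper
equal_ignoring_case() below is a made-up stand-in for strcasecmp()/stricmp()
(which are not part of standard C), and svn.example.com is a placeholder
host. Two URLs that name different files on a case-sensitive server would be
reported as the same.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for a whole-string case-insensitive comparison.
   Returns 1 if A and B are equal ignoring case, 0 otherwise. */
int equal_ignoring_case(const char *a, const char *b)
{
  assert(a != NULL && b != NULL);

  while (*a != '\0' && *b != '\0')
    {
      if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
        return 0;
      a++;
      b++;
    }

  /* Equal only if both strings ended at the same point. */
  return *a == *b;
}
```

Called on "http://svn.example.com/repos/README" and
"http://svn.example.com/repos/readme", this returns 1 even though strcmp()
correctly reports a difference in the path.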

> Option 3. Create one new function url_compare() where a correct,
> standards-compliant URL comparison is implemented, use that function in all
> places where strcmp/strncmp is used now. What would be a good place to
> define such a function?
>
> I prefer option 3 myself. Maybe there are other and better options? Anybody
> any thoughts on the subject?
>   
Option 4. Follow the conventions we already use for paths: i.e., the API 
expects all path parameters in canonical form. Define the canonical form 
for URLs so that the hostname part must be lowercase (but only the host 
name, not the optional embedded user name). Document that, then perform 
the URL canonicalisation in the client code, as we do with file names -- 
*not* in the library code.
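
As a rough sketch of that canonical form, assuming a simple
scheme://[user@]host[:port]/path layout: the helper below lowercases only
the host part in place and leaves the user name and the path alone. The
name url_lowercase_host() is made up for illustration; this is not the code
from any patch and not Subversion or APR API.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not Subversion API): lowercase the host[:port]
   part of URL in place.  The scheme, the optional user name, the path
   and everything after it are left untouched.  Strings without "://"
   are not modified. */
void url_lowercase_host(char *url)
{
  char *p, *end, *at;

  assert(url != NULL);

  p = strstr(url, "://");
  if (p == NULL)
    return;
  p += 3;

  /* The authority ends at the first '/', '?' or '#'. */
  end = p + strcspn(p, "/?#");

  /* Skip the optional user info, which ends at the last '@'. */
  for (at = end; at > p; at--)
    if (at[-1] == '@')
      {
        p = at;
        break;
      }

  for (; p < end; p++)
    *p = (char)tolower((unsigned char)*p);
}
```

For example, "http://Joe@SVN.Example.COM:8080/Repos/Trunk" becomes
"http://Joe@svn.example.com:8080/Repos/Trunk". Once every URL passes
through such a step on the way in, the existing strcmp/strncmp checks can
stay as they are.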

-- Brane

