You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Karl Fogel <kf...@newton.ch.collab.net> on 2002/08/01 15:16:02 UTC

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

philip@tigris.org writes:
> --- trunk/subversion/libsvn_wc/lock.c	(original)
> +++ trunk/subversion/libsvn_wc/lock.c	Wed Jul 31 09:51:20 2002
> @@ -162,7 +162,9 @@
>  }
>  
>  /* Allocate from POOL, intialise and return an access baton. TYPE and PATH
> -   are used to initialise the baton. */
> +   are used to initialise the baton.  PATH is assumed to be canonicalized,
> +   paths need to be canonicalized so that path matching and parent-child
> +   identification work. */
>  static svn_wc_adm_access_t *
>  adm_access_alloc (enum svn_wc__adm_access_type type,
>                    const char *path,
> @@ -172,9 +174,7 @@
>    lock->type = type;
>    lock->set = NULL;
>    lock->lock_exists = FALSE;
> -  /* ### Some places lock with a path that is not canonical, we need
> -     ### cannonical paths for reliable parent-child determination */
> -  lock->path = svn_path_canonicalize_nts (apr_pstrdup (pool, path), pool);
> +  lock->path = path;
>    lock->pool = pool;
>  
>    apr_pool_cleanup_register (lock->pool, lock, pool_cleanup,
> @@ -213,11 +213,14 @@
>                   svn_boolean_t tree_lock,
>                   apr_pool_t *pool)
>  {
> +  /* Sigh. There are non-canonical paths coming in here. */
> +  const char *canonical = svn_path_canonicalize_nts (apr_pstrdup (pool, path),
> +                                                     pool);

I'm not familiar enough with the context, so maybe this isn't a
problem, but: is it a bug that there are non-canonical paths coming
in?  The public libsvn_wc interfaces should only be receiving
canonical paths, at least.  If it decanonicalizes internally for some
reason, then the above is understandable.  But if it's getting
non-canonical paths as input, then there's a larger bug somewhere...

Just wondering.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by cm...@collab.net.
Justin Erenkrantz <je...@apache.org> writes:

> On Thu, Aug 01, 2002 at 03:19:45PM -0500, Karl Fogel wrote:
> > cmpilato@collab.net writes:
> > > I think that canonicalization should fall at the same line as
> > > case-correctness and UTF-8 conversion and URI-encoding ... it is the
> > > job of the client binary to supply such to libsvn_client (and
> > > libsvn_whatever as the case may be).
> > 
> > Yup.  All libsvn_foo layers assume that incoming paths are "fully
> > canonicalized".  Meaning UTF-8, case-canonicalized, path-sep
> > canonicalized, and any other sort of canonicalization you can think of
> > :-). 
> 
> The problem I saw when I looked at this before is that even though
> canonical paths often entered libsvn_wc, the canonicalization was
> lost at various stages *internally* in libsvn_wc.

If this is still the case, then that internal lossage is in itself a
bug, and should definitely be addressed.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:
> The last problem I saw was when running 'svn ci .'
> 
> svn_path_condense_targets calls svn_path_get_absolute passing "." and
> gets back a path with a trailing slash, something like "/home/pm/wc/".
> Now if the command is 'svn ci /home/pm/wc/', or 'svn ci /home/pm/wc',
> then svn_path_get_absolute returns "/home/pm/wc" without a slash.
> 
> So should a working copy directory path include a trailing slash?

Hmmm.  Euuuuww.

Could we just change the definition of canonicalized path to mean
"without trailing slash if directory", and change the code
(condense_targets and friends) accordingly?  Does anything depend on
this?

> The locking code uses the path as a hash key.  So if the trailing
> slash is optional the code needs to ensure that setting and getting
> will work both with and without the trailing slash.

Gotcha.  Sorry for the mess you're inheriting here...

-K

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by cm...@collab.net.
Philip Martin <ph...@codematters.co.uk> writes:

> The last problem I saw was when running 'svn ci .'
> 
> svn_path_condense_targets calls svn_path_get_absolute passing "." and
> gets back a path with a trailing slash, something like "/home/pm/wc/".
> Now if the command is 'svn ci /home/pm/wc/', or 'svn ci /home/pm/wc',
> then svn_path_get_absolute returns "/home/pm/wc" without a slash.

Sounds to me like svn_path_get_absolute needs to canonicalize its
output.  I see that it canonicalizes the input before calling
apr_filepath_merge, but no attempt is made to ensure that the output
is all good.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Philip Martin <ph...@codematters.co.uk>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> Justin Erenkrantz <je...@apache.org> writes:
> > The problem I saw when I looked at this before is that even though
> > canonical paths often entered libsvn_wc, the canonicalization was
> > lost at various stages *internally* in libsvn_wc.
> 
> That's okay, as long as libsvn_wc knows what it's doing.
> 
> That what I was essentially asking Philip in my original mail: are
> they noncanonical coming in (bad), or decanonicalized by choice
> internally (fine, as long as the relevant code is adjusted correctly).

The last problem I saw was when running 'svn ci .'

svn_path_condense_targets calls svn_path_get_absolute passing "." and
gets back a path with a trailing slash, something like "/home/pm/wc/".
Now if the command is 'svn ci /home/pm/wc/', or 'svn ci /home/pm/wc',
then svn_path_get_absolute returns "/home/pm/wc" without a slash.

So should a working copy directory path include a trailing slash?

The locking code uses the path as a hash key.  So if the trailing
slash is optional the code needs to ensure that setting and getting
will work both with and without the trailing slash.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Justin Erenkrantz <je...@apache.org>.
On Thu, Aug 01, 2002 at 03:46:58PM -0500, Karl Fogel wrote:
> Justin Erenkrantz <je...@apache.org> writes:
> > The problem I saw when I looked at this before is that even though
> > canonical paths often entered libsvn_wc, the canonicalization was
> > lost at various stages *internally* in libsvn_wc.
> 
> That's okay, as long as libsvn_wc knows what it's doing.
> 
> That what I was essentially asking Philip in my original mail: are
> they noncanonical coming in (bad), or decanonicalized by choice
> internally (fine, as long as the relevant code is adjusted correctly).

I think it is a mixture of both.  Non-canonical names come in and
then regardless it gets decanonicalized internally.  Since the baton
code has to have a consistent naming scheme (esp. considering
caching), it must re-canonicalize.  

I remember running smack into this before and getting very
frustrated.  -- justin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by cm...@collab.net.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> Justin Erenkrantz <je...@apache.org> writes:
> > The problem I saw when I looked at this before is that even though
> > canonical paths often entered libsvn_wc, the canonicalization was
> > lost at various stages *internally* in libsvn_wc.
> 
> That's okay, as long as libsvn_wc knows what it's doing.
> 
> That what I was essentially asking Philip in my original mail: are
> they noncanonical coming in (bad), or decanonicalized by choice
> internally (fine, as long as the relevant code is adjusted correctly).

Actually, yeah, I should clarify my last comments on this to state
that so long as the path canonicalization inside libsvn_wc isn't passed
back outside libsvn_wc, that's fine.  It's where non-canonicalized
paths cross library boundaries that would concern me.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Justin Erenkrantz <je...@apache.org> writes:
> The problem I saw when I looked at this before is that even though
> canonical paths often entered libsvn_wc, the canonicalization was
> lost at various stages *internally* in libsvn_wc.

That's okay, as long as libsvn_wc knows what it's doing.

That what I was essentially asking Philip in my original mail: are
they noncanonical coming in (bad), or decanonicalized by choice
internally (fine, as long as the relevant code is adjusted correctly).

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Justin Erenkrantz <je...@apache.org>.
On Thu, Aug 01, 2002 at 03:19:45PM -0500, Karl Fogel wrote:
> cmpilato@collab.net writes:
> > I think that canonicalization should fall at the same line as
> > case-correctness and UTF-8 conversion and URI-encoding ... it is the
> > job of the client binary to supply such to libsvn_client (and
> > libsvn_whatever as the case may be).
> 
> Yup.  All libsvn_foo layers assume that incoming paths are "fully
> canonicalized".  Meaning UTF-8, case-canonicalized, path-sep
> canonicalized, and any other sort of canonicalization you can think of
> :-). 

The problem I saw when I looked at this before is that even though
canonical paths often entered libsvn_wc, the canonicalization was
lost at various stages *internally* in libsvn_wc.

I don't know if it is still the case, but once I get a chance to
play with the entries cache, I'll be able to see if it has been
fixed since I last looked at it.  -- justin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
cmpilato@collab.net writes:
> I think that canonicalization should fall at the same line as
> case-correctness and UTF-8 conversion and URI-encoding ... it is the
> job of the client binary to supply such to libsvn_client (and
> libsvn_whatever as the case may be).

Yup.  All libsvn_foo layers assume that incoming paths are "fully
canonicalized".  Meaning UTF-8, case-canonicalized, path-sep
canonicalized, and any other sort of canonicalization you can think of
:-). 

This needs to be better documented, I completely agree.

-K

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Philip Martin <ph...@codematters.co.uk>.
Greg Stein <gs...@lyra.org> writes:

> > > OK, so libsvn_wc can assume its paths are canonical, what about other
> > > layers?  Where should the canonicalization occur, in the application?
> > > In libsvn_client?  At the libsvn_wc interface?
> > 
> > I think that canonicalization should fall at the same line as
> > case-correctness and UTF-8 conversion and URI-encoding ... it is the
> > job of the client binary to supply such to libsvn_client (and
> > libsvn_whatever as the case may be).
> 
> IMO, the canonical form for a directory should include a trailing slash. Or
> at least we should generate URLs like that. If not, then Apache is going to
> send us a redirection and we'll have to re-issue a request (suck).

It may well make sense for URLs sent over ra_dav, but I'm not sure
that it makes sense for the client and wc interfaces. It requires us
to know whether things are files or directories.

svn_client_add(... const char *path ...)
svn_wc_add(... const char *path ...)
svn_client_commit(... const char *path ...)

Will the caller of these be required to know whether path is a file or
a directory?  What about scheduled for deletion items that are not
physically present?

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Aug 01, 2002 at 02:36:36PM -0500, cmpilato@collab.net wrote:
>...
> > OK, so libsvn_wc can assume its paths are canonical, what about other
> > layers?  Where should the canonicalization occur, in the application?
> > In libsvn_client?  At the libsvn_wc interface?
> 
> I think that canonicalization should fall at the same line as
> case-correctness and UTF-8 conversion and URI-encoding ... it is the
> job of the client binary to supply such to libsvn_client (and
> libsvn_whatever as the case may be).

IMO, the canonical form for a directory should include a trailing slash. Or
at least we should generate URLs like that. If not, then Apache is going to
send us a redirection and we'll have to re-issue a request (suck).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by cm...@collab.net.
Philip Martin <ph...@codematters.co.uk> writes:

> Karl Fogel <kf...@newton.ch.collab.net> writes:
> 
> > > When I started the access baton stuff, and started storing/comparing
> > > paths I immediately had problems like this.  Most of the libsvn_wc
> > > code doesn't appear to care whether paths are canonical, so my guess
> > > is that non-canonical paths get in.
> > 
> > So if "canonical" means "has a trailing slash" (which I'm not sure it
> 
> The opposite.  svn_path_canonicalize simply removes a trailing slash
> if there is one.
> 
> > does, but that's a different question), then the real fix is to fix
> > the callers of libsvn_wc, not to start patching up libsvn_wc to accept
> > less-than-canonical patches...
> > 
> > If you want to detect them and *error*, that's fine.  But I think
> > silently papering over them is a mistake.  We need to discipline the
> > callers.
> 
> OK, so libsvn_wc can assume its paths are canonical, what about other
> layers?  Where should the canonicalization occur, in the application?
> In libsvn_client?  At the libsvn_wc interface?

I think that canonicalization should fall at the same line as
case-correctness and UTF-8 conversion and URI-encoding ... it is the
job of the client binary to supply such to libsvn_client (and
libsvn_whatever as the case may be).

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Philip Martin <ph...@codematters.co.uk>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> > When I started the access baton stuff, and started storing/comparing
> > paths I immediately had problems like this.  Most of the libsvn_wc
> > code doesn't appear to care whether paths are canonical, so my guess
> > is that non-canonical paths get in.
> 
> So if "canonical" means "has a trailing slash" (which I'm not sure it

The opposite.  svn_path_canonicalize simply removes a trailing slash
if there is one.

> does, but that's a different question), then the real fix is to fix
> the callers of libsvn_wc, not to start patching up libsvn_wc to accept
> less-than-canonical patches...
> 
> If you want to detect them and *error*, that's fine.  But I think
> silently papering over them is a mistake.  We need to discipline the
> callers.

OK, so libsvn_wc can assume its paths are canonical, what about other
layers?  Where should the canonicalization occur, in the application?
In libsvn_client?  At the libsvn_wc interface?

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:
> In this case the failure was "svn ci" with no explicit target, the
> locking code is given a path with a trailing slash.
> 
> When I started the access baton stuff, and started storing/comparing
> paths I immediately had problems like this.  Most of the libsvn_wc
> code doesn't appear to care whether paths are canonical, so my guess
> is that non-canonical paths get in.

So if "canonical" means "has a trailing slash" (which I'm not sure it
does, but that's a different question), then the real fix is to fix
the callers of libsvn_wc, not to start patching up libsvn_wc to accept
less-than-canonical patches...

If you want to detect them and *error*, that's fine.  But I think
silently papering over them is a mistake.  We need to discipline the
callers.

-K

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: rev 2821 - trunk/subversion/libsvn_wc trunk/subversion/clients/cmdline trunk/subversion/tests/clients/cmdline

Posted by Philip Martin <ph...@codematters.co.uk>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> >  {
> > +  /* Sigh. There are non-canonical paths coming in here. */
> > +  const char *canonical = svn_path_canonicalize_nts (apr_pstrdup (pool, path),
> > +                                                     pool);
> 
> I'm not familiar enough with the context, so maybe this isn't a
> problem, but: is it a bug that there are non-canonical paths coming
> in?  The public libsvn_wc interfaces should only be receiving
> canonical paths, at least.  If it decanonicalizes internally for some
> reason, then the above is understandable.  But if it's getting
> non-canonical paths as input, then there's a larger bug somewhere...

In this case the failure was "svn ci" with no explicit target, the
locking code is given a path with a trailing slash.

When I started the access baton stuff, and started storing/comparing
paths I immediately had problems like this.  Most of the libsvn_wc
code doesn't appear to care whether paths are canonical, so my guess
is that non-canonical paths get in.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org