You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Daniel Shahaf <d....@daniel.shahaf.name> on 2010/11/11 21:27:57 UTC

Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

set_lock() subversion/libsvn_fs_fs/lock.c uses svn_dirent_* to interact
with this_path, which is initialized with an svn_lock_t->path.

Should that function be using svn_relpath_* or svn_uri_* instead?

Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
On Fri, 2010-11-12, Julian Foad wrote:
> Julian Foad wrote:
> > Peter Samuelson wrote:
> > > [Julian Foad]
> > > > Proposal for switching to a dedicated "fs path" API:
> > > > 
> > > > Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> > > > existing path APIs.  It must assert that the fspath arguments and fspath
> > > > return values begin with '/'.  Initially it should map directly to
> > > > svn_uri_*, so that it can be seen to be a direct replacement.
> 
> I've just scanned through libsvn_fs*, libsvn_repos and libsvn_ra*, and
> the only svn_uri_* functions used here on non-URI paths appear to be:
> 
>   _join()
>   _is_child()
>   _basename()
> 
> So that makes for a nice small start-up requirement for this new API.
> There are probably others used elsewhere in our codebase, of course.

I'll send a patch that adds these APIs, for review and comments.

- Julian



Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
On Fri, 2010-11-12, Julian Foad wrote:
> Julian Foad wrote:
> > Peter Samuelson wrote:
> > > [Julian Foad]
> > > > Proposal for switching to a dedicated "fs path" API:
> > > > 
> > > > Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> > > > existing path APIs.  It must assert that the fspath arguments and fspath
> > > > return values begin with '/'.  Initially it should map directly to
> > > > svn_uri_*, so that it can be seen to be a direct replacement.
> 
> I've just scanned through libsvn_fs*, libsvn_repos and libsvn_ra*, and
> the only svn_uri_* functions used here on non-URI paths appear to be:
> 
>   _join()
>   _is_child()
>   _basename()
> 
> So that makes for a nice small start-up requirement for this new API.
> There are probably others used elsewhere in our codebase, of course.

I'll send a patch that adds these APIs, for review and comments.

- Julian


Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
Julian Foad wrote:
> Peter Samuelson wrote:
> > [Julian Foad]
> > > Proposal for switching to a dedicated "fs path" API:
> > > 
> > > Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> > > existing path APIs.  It must assert that the fspath arguments and fspath
> > > return values begin with '/'.  Initially it should map directly to
> > > svn_uri_*, so that it can be seen to be a direct replacement.

I've just scanned through libsvn_fs*, libsvn_repos and libsvn_ra*, and
the only svn_uri_* functions used here on non-URI paths appear to be:

  _join()
  _is_child()
  _basename()

So that makes for a nice small start-up requirement for this new API.
There are probably others used elsewhere in our codebase, of course.

- Julian


> > So do these or do these not involve doing URI escaping?  %20 for space,
> > etc.
> 
> No escaping involved here - these are not URLs and these are not
> relative-URLs and so do not have any escaping inside them.  When we make
> a URL from a base URL and a repo-relpath, *then* we will escape it.
> 
> I believe, but haven't checked recently, that the present set of
> svn_uri_* functions doesn't touch or care about percent-escaping when
> handling these paths.  Because we have forced them to handle this case,
> they don't assert that their arguments are canonical.  After we make
> this transition, then the svn_uri_* functions should be able (from a
> correctness point of view) to assert that their 'URI' inputs and outputs
> are canonical.
> 
> 
> >   By calling them 'fs paths', the assumption is that these are in
> > UTF-8, and there is no marshalling / escaping to even think about,
> > except when converting it to or from a URI or dirent.  Is that the
> > intent?
> 
> Yes.
> 
> 
> > > Bikeshed: svn_fspath__* or svn_fs__path_* or something else?
> > 
> > svn_fspath__ seems better to me.  svn_fs__path_ sounds like this is all
> > internal to libsvn_fs, whereas I understand you want this in
> > libsvn_subr for layers to use that are _not_ specifically talking about
> > svn_fs internals.
> 
> Correct, so yes, I agree.
> 
> - Julian
> 
> 


Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
Julian Foad wrote:
> Peter Samuelson wrote:
> > [Julian Foad]
> > > Proposal for switching to a dedicated "fs path" API:
> > > 
> > > Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> > > existing path APIs.  It must assert that the fspath arguments and fspath
> > > return values begin with '/'.  Initially it should map directly to
> > > svn_uri_*, so that it can be seen to be a direct replacement.

I've just scanned through libsvn_fs*, libsvn_repos and libsvn_ra*, and
the only svn_uri_* functions used here on non-URI paths appear to be:

  _join()
  _is_child()
  _basename()

So that makes for a nice small start-up requirement for this new API.
There are probably others used elsewhere in our codebase, of course.

- Julian


> > So do these or do these not involve doing URI escaping?  %20 for space,
> > etc.
> 
> No escaping involved here - these are not URLs and these are not
> relative-URLs and so do not have any escaping inside them.  When we make
> a URL from a base URL and a repo-relpath, *then* we will escape it.
> 
> I believe, but haven't checked recently, that the present set of
> svn_uri_* functions doesn't touch or care about percent-escaping when
> handling these paths.  Because we have forced them to handle this case,
> they don't assert that their arguments are canonical.  After we make
> this transition, then the svn_uri_* functions should be able (from a
> correctness point of view) to assert that their 'URI' inputs and outputs
> are canonical.
> 
> 
> >   By calling them 'fs paths', the assumption is that these are in
> > UTF-8, and there is no marshalling / escaping to even think about,
> > except when converting it to or from a URI or dirent.  Is that the
> > intent?
> 
> Yes.
> 
> 
> > > Bikeshed: svn_fspath__* or svn_fs__path_* or something else?
> > 
> > svn_fspath__ seems better to me.  svn_fs__path_ sounds like this is all
> > internal to libsvn_fs, whereas I understand you want this in
> > libsvn_subr for layers to use that are _not_ specifically talking about
> > svn_fs internals.
> 
> Correct, so yes, I agree.
> 
> - Julian
> 
> 



Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
Peter Samuelson wrote:
> [Julian Foad]
> > Proposal for switching to a dedicated "fs path" API:
> > 
> > Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> > existing path APIs.  It must assert that the fspath arguments and fspath
> > return values begin with '/'.  Initially it should map directly to
> > svn_uri_*, so that it can be seen to be a direct replacement.
> 
> So do these or do these not involve doing URI escaping?  %20 for space,
> etc.

No escaping involved here - these are not URLs and these are not
relative-URLs and so do not have any escaping inside them.  When we make
a URL from a base URL and a repo-relpath, *then* we will escape it.

I believe, but haven't checked recently, that the present set of
svn_uri_* functions doesn't touch or care about percent-escaping when
handling these paths.  Because we have forced them to handle this case,
they don't assert that their arguments are canonical.  After we make
this transition, then the svn_uri_* functions should be able (from a
correctness point of view) to assert that their 'URI' inputs and outputs
are canonical.


>   By calling them 'fs paths', the assumption is that these are in
> UTF-8, and there is no marshalling / escaping to even think about,
> except when converting it to or from a URI or dirent.  Is that the
> intent?

Yes.


> > Bikeshed: svn_fspath__* or svn_fs__path_* or something else?
> 
> svn_fspath__ seems better to me.  svn_fs__path_ sounds like this is all
> internal to libsvn_fs, whereas I understand you want this in
> libsvn_subr for layers to use that are _not_ specifically talking about
> svn_fs internals.

Correct, so yes, I agree.

- Julian



Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
Peter Samuelson wrote:
> [Julian Foad]
> > Proposal for switching to a dedicated "fs path" API:
> > 
> > Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> > existing path APIs.  It must assert that the fspath arguments and fspath
> > return values begin with '/'.  Initially it should map directly to
> > svn_uri_*, so that it can be seen to be a direct replacement.
> 
> So do these or do these not involve doing URI escaping?  %20 for space,
> etc.

No escaping involved here - these are not URLs and these are not
relative-URLs and so do not have any escaping inside them.  When we make
a URL from a base URL and a repo-relpath, *then* we will escape it.

I believe, but haven't checked recently, that the present set of
svn_uri_* functions doesn't touch or care about percent-escaping when
handling these paths.  Because we have forced them to handle this case,
they don't assert that their arguments are canonical.  After we make
this transition, then the svn_uri_* functions should be able (from a
correctness point of view) to assert that their 'URI' inputs and outputs
are canonical.


>   By calling them 'fs paths', the assumption is that these are in
> UTF-8, and there is no marshalling / escaping to even think about,
> except when converting it to or from a URI or dirent.  Is that the
> intent?

Yes.


> > Bikeshed: svn_fspath__* or svn_fs__path_* or something else?
> 
> svn_fspath__ seems better to me.  svn_fs__path_ sounds like this is all
> internal to libsvn_fs, whereas I understand you want this in
> libsvn_subr for layers to use that are _not_ specifically talking about
> svn_fs internals.

Correct, so yes, I agree.

- Julian


Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Peter Samuelson <pe...@p12n.org>.
[Julian Foad]
> Proposal for switching to a dedicated "fs path" API:
> 
> Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> existing path APIs.  It must assert that the fspath arguments and fspath
> return values begin with '/'.  Initially it should map directly to
> svn_uri_*, so that it can be seen to be a direct replacement.

So do these or do these not involve doing URI escaping?  %20 for space,
etc.  By calling them 'fs paths', the assumption is that these are in
UTF-8, and there is no marshalling / escaping to even think about,
except when converting it to or from a URI or dirent.  Is that the
intent?

> Bikeshed: svn_fspath__* or svn_fs__path_* or something else?

svn_fspath__ seems better to me.  svn_fs__path_ sounds like this is all
internal to libsvn_fs, whereas I understand you want this in
libsvn_subr for layers to use that are _not_ specifically talking about
svn_fs internals.
-- 
Peter Samuelson | org-tld!p12n!peter | http://p12n.org/

Re: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Peter Samuelson <pe...@p12n.org>.
[Julian Foad]
> Proposal for switching to a dedicated "fs path" API:
> 
> Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
> existing path APIs.  It must assert that the fspath arguments and fspath
> return values begin with '/'.  Initially it should map directly to
> svn_uri_*, so that it can be seen to be a direct replacement.

So do these or do these not involve doing URI escaping?  %20 for space,
etc.  By calling them 'fs paths', the assumption is that these are in
UTF-8, and there is no marshalling / escaping to even think about,
except when converting it to or from a URI or dirent.  Is that the
intent?

> Bikeshed: svn_fspath__* or svn_fs__path_* or something else?

svn_fspath__ seems better to me.  svn_fs__path_ sounds like this is all
internal to libsvn_fs, whereas I understand you want this in
libsvn_subr for layers to use that are _not_ specifically talking about
svn_fs internals.
-- 
Peter Samuelson | org-tld!p12n!peter | http://p12n.org/

RE: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
On Fri, 2010-11-12, Bert Huijben wrote:
> Is it a path in the form '/trunk/some/path' then svn_relpath_X will assert
> on an invalid relative path.
> 
> svn_uri_* is currently the only non-deprecated set of path functions that
> allows this deprecated format.
> 
> @julianf: This is why svn_uri handles urls and other paths differently.

Bert, thanks.

I think it's pretty important that we at least do something to stop
spreading the inappropriate use of svn_uri_* for paths that start with
'/'.

See my proposal below.


> We currently have:
> * dirent: local paths (architecture specific)
> * relpath: Only paths in the form 'some' or 'some/subdir' are valid. '/qq'
> and 'some/./dir' are not.
> * uri: All other paths.
> 
> Uri should actually be '<schema>://<schema-base>/<path>', but when we
> started working on these apis we deprecated svn_path and had only dirent and
> uri functions to replace them.
> 
> I think we should:
> * either introduce a svn_fspath_*() set of functions for '/style/paths'.
> * or deprecate the entire '/style/path' notation in our internal handling.
> (We can replace them with relpaths by just removing the initial '/')

Switching to a dedicated API for these '/style/paths' seems to be the
'safer' option.  I can see no potential problems with it.  The steps to
get there are of the purely mechanical kind, so the implementation and
testing effort is not completely unpredictable.  It leads us closer to
being able to change the path style in future, because the correct
subset of svn_uri_* calls would already have been identified.

Changing to the relpath style also looks attractive.  It is a less
mechanical change: as well as the svn_uri_* calls there will be other
places where a leading '/' is coded for.  After this change, the
distinction between a root-relative path and a path relative to
something else would be maintained purely by human communication (such
as the naming of variables) rather than by the leading '/' on the data.
It would be easier to confuse them, a kind of error that is harder to
debug.  That is normal practice and not a problem in itself, just a
slightly greater risk in the conversion effort.  And this kind of change
is less easily reversible, if for any reason we were to change our
minds.

In either case, we have to find all the places where we're currently
calling svn_uri_* on repository paths, and change them.

I recommend we change the API now and don't change the path format now.
We can still change the path format later.


> (An temporary third option would be to introduce a svn_fspath__ internal set
> of functions, to keep svn_uri_ clean)

Proposal for switching to a dedicated "fs path" API:

Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
existing path APIs.  It must assert that the fspath arguments and fspath
return values begin with '/'.  Initially it should map directly to
svn_uri_*, so that it can be seen to be a direct replacement.

Then any new code can start using the new API straight away.

Step 2:  Switch over all the existing svn_uri_* calls that handle
fs-paths to use the new API, one at a time or in bunches.

Step 3:  Re-implement the new API to use svn_relpath_* while stripping
and re-adding the leading '/'.  Optionally, write dedicated
implementations in cases where the wrapper approach seems expensive.
For extra confidence, log the result of every call and check that the
log produced by a test suite run is the same before and after the
change.

Step 4:  In svn_uri_*, assert that the URI does *not* begin with '/'.
(Should we go further and assert that it begins with a scheme?)


Bikeshed: svn_fspath__* or svn_fs__path_* or something else?

- Julian



RE: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Julian Foad <ju...@wandisco.com>.
On Fri, 2010-11-12, Bert Huijben wrote:
> Is it a path in the form '/trunk/some/path' then svn_relpath_X will assert
> on an invalid relative path.
> 
> svn_uri_* is currently the only non-deprecated set of path functions that
> allows this deprecated format.
> 
> @julianf: This is why svn_uri handles urls and other paths differently.

Bert, thanks.

I think it's pretty important that we at least do something to stop
spreading the inappropriate use of svn_uri_* for paths that start with
'/'.

See my proposal below.


> We currently have:
> * dirent: local paths (architecture specific)
> * relpath: Only paths in the form 'some' or 'some/subdir' are valid. '/qq'
> and 'some/./dir' are not.
> * uri: All other paths.
> 
> Uri should actually be '<schema>://<schema-base>/<path>', but when we
> started working on these apis we deprecated svn_path and had only dirent and
> uri functions to replace them.
> 
> I think we should:
> * either introduce a svn_fspath_*() set of functions for '/style/paths'.
> * or deprecate the entire '/style/path' notation in our internal handling.
> (We can replace them with relpaths by just removing the initial '/')

Switching to a dedicated API for these '/style/paths' seems to be the
'safer' option.  I can see no potential problems with it.  The steps to
get there are of the purely mechanical kind, so the implementation and
testing effort is not completely unpredictable.  It leads us closer to
being able to change the path style in future, because the correct
subset of svn_uri_* calls would already have been identified.

Changing to the relpath style also looks attractive.  It is a less
mechanical change: as well as the svn_uri_* calls there will be other
places where a leading '/' is coded for.  After this change, the
distinction between a root-relative path and a path relative to
something else would be maintained purely by human communication (such
as the naming of variables) rather than by the leading '/' on the data.
It would be easier to confuse them, a kind of error that is harder to
debug.  That is normal practice and not a problem in itself, just a
slightly greater risk in the conversion effort.  And this kind of change
is less easily reversible, if for any reason we were to change our
minds.

In either case, we have to find all the places where we're currently
calling svn_uri_* on repository paths, and change them.

I recommend we change the API now and don't change the path format now.
We can still change the path format later.


> (An temporary third option would be to introduce a svn_fspath__ internal set
> of functions, to keep svn_uri_ clean)

Proposal for switching to a dedicated "fs path" API:

Step 1:  Introduce a new API for FS paths, as a thin layer that maps to
existing path APIs.  It must assert that the fspath arguments and fspath
return values begin with '/'.  Initially it should map directly to
svn_uri_*, so that it can be seen to be a direct replacement.

Then any new code can start using the new API straight away.

Step 2:  Switch over all the existing svn_uri_* calls that handle
fs-paths to use the new API, one at a time or in bunches.

Step 3:  Re-implement the new API to use svn_relpath_* while stripping
and re-adding the leading '/'.  Optionally, write dedicated
implementations in cases where the wrapper approach seems expensive.
For extra confidence, log the result of every call and check that the
log produced by a test suite run is the same before and after the
change.

Step 4:  In svn_uri_*, assert that the URI does *not* begin with '/'.
(Should we go further and assert that it begins with a scheme?)


Bikeshed: svn_fspath__* or svn_fs__path_* or something else?

- Julian


RE: Which family of path/dirent/uri/relpath functions to use for svn_lock_t->path?

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Daniel Shahaf [mailto:d.s@daniel.shahaf.name]
> Sent: donderdag 11 november 2010 22:28
> To: dev@subversion.apache.org
> Subject: Which family of path/dirent/uri/relpath functions to use for
> svn_lock_t->path?
> 
> set_lock() subversion/libsvn_fs_fs/lock.c uses svn_dirent_* to interact
> with this_path, which is initialized with an svn_lock_t->path.
> 
> Should that function be using svn_relpath_* or svn_uri_* instead?

That depends on the path kind stored in this value.

Is it a path in the form '/trunk/some/path' then svn_relpath_X will assert
on an invalid relative path.

svn_uri_* is currently the only non-deprecated set of path functions that
allows this deprecated format.

@julianf: This is why svn_uri handles urls and other paths differently.


We currently have:
* dirent: local paths (architecture specific)
* relpath: Only paths in the form 'some' or 'some/subdir' are valid. '/qq'
and 'some/./dir' are not.
* uri: All other paths.

Uri should actually be '<schema>://<schema-base>/<path>', but when we
started working on these apis we deprecated svn_path and had only dirent and
uri functions to replace them.

I think we should:
* either introduce a svn_fspath_*() set of functions for '/style/paths'.
* or deprecate the entire '/style/path' notation in our internal handling.
(We can replace them with relpaths by just removing the initial '/')

(An temporary third option would be to introduce a svn_fspath__ internal set
of functions, to keep svn_uri_ clean)

	Bert