You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2009/10/23 13:56:27 UTC

Review of svn_dirent_uri.h

Here's a partial review of svn_dirent_uri.h, just me reading through it
and raising lots of questions.

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2410648

RE: Review of svn_dirent_uri.h

Posted by Bert Huijben <rh...@sharpsvn.net>.
> -----Original Message-----
> From: Julian Foad [mailto:julian.foad@wandisco.com]
> Sent: vrijdag 23 oktober 2009 15:56
> To: dev@subversion.tigris.org
> Subject: Review of svn_dirent_uri.h
> 
> Here's a partial review of svn_dirent_uri.h, just me reading through it
> and raising lots of questions.

I'll add a few notes of myself, because in most cases I was the last one who
edited the apis and documentation. (I don't have the time to dive into this
in the next few days / before subconf)

> Revise the dirent-uri API.
> 
> * subversion/include/svn_dirent_uri.h
>   (svn_relpath_internal_style, svn_relpath_local_style): Delete, because
>     the caller must know the argument is a dirent so
svn_dirent_XXX_style()
>     should be used instead.
>   (...): Suggestions, questions, clarifications, tweaks.
> 
> Index: subversion/include/svn_dirent_uri.h
> ===================================================================
> --- subversion/include/svn_dirent_uri.h	(revision 40199)
> +++ subversion/include/svn_dirent_uri.h	(working copy)
> @@ -27,18 +27,25 @@
>   *
>   *  - a dirent is a path on (local) disc or a UNC path (Windows) in
either
>   *    relative or absolute format
> + *    ### Either clarify what a "relative" dirent is relative to and
whether
> + *    ### it can be "relative to current drive" (/foo) or to CWD on a
drive
> + *    ### (A:foo) on Windows;
> + *    ### or define that a dirent is always absolute.
>   *    Examples: "/foo/bar", "X:/temp", "//server/share" and on Windows
"A:/"
>   *    But not: "http://server"
>   *
>   *  - a URI is an absolute path that starts with a '/' or a schema
definition.
> + *    A URI is stored in URI-encoded form.

This is not enforced at this time. The URI is encoded before using it used
as a URL. The uri apis never encode or decode themselves. This could be a
design decision driven by the idea that svn_uri_*() is a drop in replacement
for svn_path_*(), as it was around the creation of the 1.6.X branch.

Currently svn_uri_*() is used for urls (Encoded) and repository root based
paths like "/trunk/file" in the libsvn_fs_* libraries. Enforcing encoding
would require rewriting the fs layers to use proper relpaths, or introducing
a fourth kind of path.

Requiring uris to be always encoded wouldn't solve our encoding problems, as
double encoding is about the same security problem as not encoding at all.


>   *    Examples: "/", "/foo", "http://server",
"svn+ssh://user@host:123/file"
> - *    But not: "file", "dir/file", "A:/dir"
> + *    But not: "file", "dir/file", "A:/dir", "file:///Spaces in name"

This last one is valid, and making it invalid will most likely break several
tests.
(Most test where designed when svn_uri_*() was just a new variant of
svn_path_*())

As far as I know we currently use "/dir with spaces/file" and "file:///dir
with/spaces/file" as uris, only encoding where needed. (Via
svn_path_url_add_component2).

>   *    ### Currently the URI implementation still allows relpaths as valid
>   *    uris, but this will change soon.
>   *
>   *  - a relative path (relpath) is an unrooted path that can be joined to
>   *    any other relative path, uri or dirent. A relative path is never
> - *    rooted/prefixed by a '/'.
> + *    rooted/prefixed by a '/'. A relpath is stored in UTF-8 encoding.

All paths used by svn_uri*(), svn_dirent*() and svn_relpath*() functions are
encoded as UTF-8.

> + *    A relpath may be obtained from a dirent and then joined to a URI,
or
> + *    obtained from a URI and then joined to a dirent.
>   *    Examples: "file", "dir/file", "dir/subdir/../file"
>   *    But not: "/file", "http://server/file"
>   *
> @@ -110,37 +117,32 @@
>  svn_dirent_local_style(const char *dirent,
>                         apr_pool_t *pool);
>  
> -/** Convert @a relpath from the local style to the canonical internal
style.
> - *
> - * @since New in 1.7.
> - */
> -const char *
> -svn_relpath_internal_style(const char *relpath,
> -                           apr_pool_t *pool);

On Windows relative paths use the "dir\subdir\file" style, while the
internal style uses "dir/subdir/file".

>  
> -/** Convert @a relpath from the canonical internal style to the local
style.
> - *
> - * @since New in 1.7.
> +/* ### All of the "join" functions: Should have one version that
> + * specifically adds a relpath, and throws an error on attempt to add
> + * an absolute dirent/uri. Can have another version that adds either
> + * relative or absolute parts, if really needed at this level.
>   */
> -const char *
> -svn_relpath_local_style(const char *relpath,
> -                        apr_pool_t *pool);
> -

Same here.. Relative paths use platform specific directory separator.

>  
> -/** Join a base dirent (@a base) with a component (@a component),
allocated in
> - * @a pool.
> +/** Join a base dirent (@a base) with a second dirent (@a component).
> + *
> + * ### Rename "component" to something that doesn't imply it's a single
> + *     component unless it must be so.

This was most likely copied from the svn_path_*() api.

>   *
>   * If either @a base or @a component is the empty string, then the other
>   * argument will be copied and returned.  If both are the empty string
then
>   * empty string is returned.
>   *
>   * If the @a component is an absolute dirent, then it is copied and
returned.
> - * The platform specific rules for joining paths are used to join the
components.
> + * Otherwise, the platform-specific rules for joining paths are used to
join
> + * the components.
>   *
>   * This function is NOT appropriate for native (local) file
>   * dirents. Only for "internal" canonicalized dirents, since it uses '/'
>   * for the separator.
>   *
> + * Allocate the result in @a pool.
> + *
>   * @since New in 1.6.
>   */
>  char *
> @@ -386,6 +388,7 @@
>  svn_dirent_is_absolute(const char *dirent);
>  
>  /** Return TRUE if @a uri is considered absolute or is a URL.
> + * ### A URI is defined as "absolute" so how can it be relative?

This is a leftover from a few weeks ago when relative paths where still
valid for svn_uri functions. (And they still are not turned into not
canonical, because there is still some unfixed test failure)
>   *
>   * @since New in 1.7.
>   */
> @@ -401,6 +404,10 @@
>   * Note that on Windows '/' and 'X:' are roots, but paths starting with
this
>   * root are not absolute.
>   *
> + * ### What does "root" mean in this context? What property does "X:"
have
> + *     on Windows that "X:foo" does not? (Think about when the CWD on
drive
> + *     X is "\dir".)
> + *

I don't know how to write a good summary here, but I agree that we should
extend this.

>   * @since New in 1.5.
>   */
>  svn_boolean_t
> @@ -425,7 +432,7 @@
>   * separator characters, and possibly other semantically inoperative
>   * transformations.
>   *
> - * Convert the server name of UNC paths lowercase and drive letters to
> + * Convert the server name of UNC paths to lower case and drive letters
to
>   * upper case on Windows.
>   *
>   * The returned dirent may be statically allocated or allocated from @a
pool.
> @@ -445,15 +452,13 @@
>   * separator characters, and possibly other semantically inoperative
>   * transformations.
>   *
> - * This functions supports relpaths.
> - *
>   * The returned relpath may be statically allocated or allocated from @a
>   * pool.
>   *
>   * @since New in 1.7.
>   */
>  const char *
> -svn_relpath_canonicalize(const char *uri,
> +svn_relpath_canonicalize(const char *relpath,
>                           apr_pool_t *pool);
>  
>  
> @@ -465,8 +470,6 @@
>   * separator characters, and possibly other semantically inoperative
>   * transformations.
>   *
> - * This functions supports URLs.
> - *
>   * The returned uri may be statically allocated or allocated from @a
pool.
>   *
>   * @since New in 1.7.
> @@ -537,6 +540,7 @@
>   * with the same path but different protocols may point at completely
>   * different resources), and (b) share a common ancestor in their path
>   * component, i.e. 'protocol://' is not a sufficient ancestor.
> + * ### And what about 'protocol://server.com/' ?
>   *
>   * @since New in 1.7.
>   */
> @@ -611,8 +615,8 @@
>   * @since New in 1.6.
>   */
>  svn_boolean_t
> -svn_dirent_is_ancestor(const char *path1,
> -                       const char *path2);
> +svn_dirent_is_ancestor(const char *dirent1,
> +                       const char *dirent2);
>  
>  /** Return TRUE if @a relpath1 is an ancestor of @a relpath2 or the
relpaths
>   * are equal and FALSE otherwise.
> @@ -637,9 +641,9 @@
>                      const char *uri2);
>  
>  
> -/** Returns the relative path part of @a dirent2 that is below @a
dirent1,
> - * or just "" iif @a dirent1 is equal to @a dirent2. If @a dirent2 is not
> - * below @a path1, return @a dirent2 completely.
> +/** Return, as a relative dirent, the part of @a dirent2 that is below @a
> + * dirent1, or just "" iif @a dirent1 is equal to @a dirent2. If @a
dirent2
> + * is not below @a path1, return @a dirent2 completely.
>   *
>   * This function assumes @a dirent1 and @a dirent2 are both absolute or
>   * relative in the same way.
> @@ -663,6 +667,7 @@
>  /** Returns the relative path part of @a uri2 that is below @a uri1, or
just
>   * "" iif @a uri1 is equal to @a uri2. If @a uri2 is not below @a uri1,
>   * return @a uri2.
> + * ### Does it return a relpath or a URI? (It matters because the
encoding differs.)

It returns a relpath. "" is not a valid relpath.

The encoding currently doesn't matter, as the svn_uri_ functions don't
handle encodings.


>   *
>   * This function assumes @a uri1 and @a uri2 are both absolute or
relative
>   * in the same way.
>
> 
> - Julian
> 
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageI
> d=2410648

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2410768

Re: Review of svn_dirent_uri.h

Posted by Greg Stein <gs...@gmail.com>.
It was for that specific reason that the "uri" functions were witheld from
1.6. I was thinking that we would probably want to rename them all to
svn_url_*

On Oct 23, 2009 10:16 AM, "Julian Foad" <ju...@btopenworld.com> wrote:

On Fri, 2009-10-23 at 14:56 +0100, Julian Foad wrote: > Here's a partial
review of svn_dirent_uri.h,...
And how often in Subversion do we want to deal with a URI that is not a
URL? Wouldn't it be clearer if we had URL-specific functions?

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2410655

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2410685

Re: Review of svn_dirent_uri.h

Posted by Julian Foad <ju...@btopenworld.com>.
On Fri, 2009-10-23 at 14:56 +0100, Julian Foad wrote:
> Here's a partial review of svn_dirent_uri.h, just me reading through it
> and raising lots of questions.

And how often in Subversion do we want to deal with a URI that is not a
URL? Wouldn't it be clearer if we had URL-specific functions?

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2410655