You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2014/03/27 14:17:01 UTC

Review of lock-many API [was: svn commit: r1577280 [1/3] ...]

Philip Martin wrote:
> Julian Foad writes:
>>>  URL: http://svn.apache.org/r1577280
>>>  * subversion/include/svn_fs.h
>>>    (svn_fs_lock_target_t, svn_fs_lock_result_t,
>>>     svn_fs_lock2, svn_fs_unlock2): new.
>>> 
>>>  * subversion/include/svn_repos.h
>>>    (svn_repos_fs_lock2, svn_repos_fs_unlock2): new.
>> 
>>  Do we intend to deprecate the old versions? If not, then it would be
>>  better to name the new functions something like ...lock_many() instead
>>  of ...lock2().
> 
> I'm unsure.  In some ways it is convenient to have a single path
> function as it avoids the need to construct hashes and the error
> handling is simpler and less prone to leak.  On the other hand when
> svn_client_copy4 introduced multi-path copy source the single path
> version was deprecated.

Thoughts...

A low level API such as libsvn_fs should aim to be clean and efficient, more so than to provide functions that are convenient for the casual user. (A high level API such as libsvn_client or the bindings is a more appropriate place to provide convenience functions.) We do have a number of APIs already, in several libraries, where the caller is expected to pass in a list of targets even if they only want to operate on one. Maybe in general the thing to aim for is to make it easy for a caller to do so.

We could provide a singleton constructor for the hash-of-svn_fs_lock_target_t argument. Should we? Probably not at the libsvn_fs level; maybe at a high level.

I started reading further through the new code. Here are some more review comments.

The 'comment' parameter to svn_fs_lock2 should be a member of the 'target' struct, since it is inherently a per-target attribute. Of course there are common scenarios where a client wants to lock a bunch of paths all with the same comment, but the benefits of a lock-many API should also be made available to clients which supply different comments. This applies all the way up the call stack. However, I note that from the RA layer upwards we have had lock-many APIs since v1.2 which take a single comment for all the paths, so changing to per-target comments throughout the stack is perhaps beyond the scope of this issue. We could still do it at the FS and repos layers now in preparation; I don't see that it would add significant overhead in run-time or in maintenance.

In principle, the 'is_dav_comment' and 'expiration_date' parameters should similarly be per target, but that makes less sense in practice as they're only used for generic DAV clients via mod_dav_svn. As an alternative, we might consider dropping them from this API and keeping the original single-lock API (not deprecated) to cater for such locks. RA-local and svnserve and 'svnadmin lock' don't support expiration and always pass is_dav_comment=FALSE. Only mod_dav_svn supplies these two options, and it currently only uses the old one-lock-at-a-time API anyway.

We should provide a destructor for the hash-of-svn_fs_lock_result_t results, which contains errors that need to be cleared. svnserve's lock_many() already contains such code twice. It's simple code -- around 5 lines -- but if we're demanding that callers ensure they do it, it's nice to provide a ready-made way for them to do it. This would also help in future-proofing against us revising the API in later releases.

The semantics relating to returning one error per path and an overall error needs to be fully documented and/or changed. For example, svn_repos_fs_lock() assumes that svn_repos_fs_lock2() has set *results to a valid hash if it returns any overall error, while svn_fs_lock() assumes that svn_fs_lock2() WON'T have put any error in *results if it returns an error overall. These look at least surprising.

And some more suggestions and comments in the form of a diff:

Index: subversion/include/svn_fs.h
===================================================================
--- subversion/include/svn_fs.h    (revision 1582241)
+++ subversion/include/svn_fs.h    (working copy)
@@ -2545,7 +2545,7 @@
  *
  * @a fs must have a username associated with it (see
  * #svn_fs_access_t), else return #SVN_ERR_FS_NO_USER.  Set the
- * 'owner' field in the new lock to the fs username.
+ * 'owner' field in each new lock to the fs username.
  *
  * @a comment is optional: it's either an xml-escapable UTF8 string
  * which describes the lock, or it is @c NULL.
@@ -2566,9 +2566,10 @@
  * that existing lock.  If current_rev is a valid revnum, then do an
  * out-of-dateness check.  If the revnum is less than the
  * last-changed-revision of the path (or if the path doesn't exist in
- * HEAD), return * #SVN_ERR_FS_OUT_OF_DATE.
+ * HEAD), yield an #SVN_ERR_FS_OUT_OF_DATE error for this path.
  *
- * If a path is already locked, then return #SVN_ERR_FS_PATH_ALREADY_LOCKED,
+ * If a path is already locked, then yield an
+ * #SVN_ERR_FS_PATH_ALREADY_LOCKED error for this path,
  * unless @a steal_lock is TRUE, in which case "steal" the existing
  * lock, even if the FS access-context's username does not match the
  * current lock's owner: delete the existing lock on the path, and
@@ -2582,7 +2583,7 @@
  * <tt>svn_fs_lock_result_t *</tt>.  The error associated with each
  * path is returned as #svn_fs_lock_result_t->err.  The caller must
  * ensure that all such errors are handled to avoid leaks.  The lock
- * associated with each path is returned as #svn_fs_lock_result_t->lock,
+ * associated with each path is returned as #svn_fs_lock_result_t->lock;
  * this will be @c NULL if no lock was created.
  *
  * Allocate @a *results in @a result_pool. Use @a scratch_pool for
@@ -2637,17 +2638,19 @@
  * the results in @a *results.
  *
  * The paths to be unlocked are passed as <tt>const char *</tt> keys
- * of the @a targets hash with <tt>svn_fs_lock_target_t *</tt> values.
- * #svn_fs_lock_target_t->token provides the token to be unlocked for
- * each path. If the the token doesn't point to a lock, return
- * #SVN_ERR_FS_BAD_LOCK_TOKEN.  If the token points to an expired
- * lock, return #SVN_ERR_FS_LOCK_EXPIRED.  If @a fs has no username
- * associated with it, return #SVN_ERR_FS_NO_USER unless @a break_lock
- * is specified.
+ * of the @a targets hash with the corresponding lock tokens as the
+ * <tt>const char *</tt> values. If the the token doesn't point to a
+ * lock, yield an #SVN_ERR_FS_BAD_LOCK_TOKEN error for this path.  If
+ * the token points to an expired lock, yield an #SVN_ERR_FS_LOCK_EXPIRED
+ * error for this path.
+ * 
+ * If @a fs has no username associated with it, return #SVN_ERR_FS_NO_USER
+ * unless @a break_lock is specified.
+ *   ### Or: "yield an #SVN_ERR_FS_NO_USER error for this path unless..."?
  *
  * If the token points to a lock, but the username of @a fs's access
- * context doesn't match the lock's owner, return
- * #SVN_ERR_FS_LOCK_OWNER_MISMATCH.  If @a break_lock is TRUE, however, don't
+ * context doesn't match the lock's owner, yield an
+ * #SVN_ERR_FS_LOCK_OWNER_MISMATCH error for this path.  If @a break_lock is TRUE, however, don't
  * return error;  allow the lock to be "broken" in any case.  In the latter
  * case, the token shall be @c NULL.
  *
Index: subversion/include/svn_repos.h
===================================================================
--- subversion/include/svn_repos.h    (revision 1582241)
+++ subversion/include/svn_repos.h    (working copy)
@@ -2187,21 +2187,20 @@
 /** Like svn_fs_lock2(), but invoke the @a repos's pre- and
  * post-lock hooks before and after the locking action.
  *
- * The pre-lock is run for every path in @a targets. Those entries in
- * @a targets for which the pre-lock is successful are passed to
- * svn_fs_lock2 and the post-lock is run for those that are
+ * The pre-lock hook is run for every path in @a targets. Those
+ * targets for which the pre-lock hook is successful are passed to
+ * svn_fs_lock2() and the post-lock hook is run for those that are
  * successfully locked.
  *
- * @a results contains the result of running the pre-lock and
- * svn_fs_lock2 if the pre-lock was successful.  If an error occurs
- * when running the post-lock hook the error is returned wrapped with
- * SVN_ERR_REPOS_POST_LOCK_HOOK_FAILED.  If the caller sees this
- * error, it knows that the some locks succeeded.  In all cases the
- * caller must handle all errors in @a results to avoid leaks.
+ * @a *results contains the result for each path.  If an error occurs
+ * when running the post-lock hook, the error is wrapped with
+ * #SVN_ERR_REPOS_POST_LOCK_HOOK_FAILED.  If the caller sees this
+ * error, it knows that this lock succeeded.  In all cases the caller
+ * must handle or clear all errors in @a *results to avoid leaks.
  *
  * The pre-lock hook may cause a different token to be used for the
- * lock, instead of @a token; see the pre-lock-hook documentation for
- * more.
+ * lock, instead of the token supplied here; see the pre-lock-hook
+ * documentation for more.
  *
  * Allocate @a *results in @a result_pool. Use @a scratch_pool for
  * temporary allocations.
@@ -2236,21 +2235,19 @@
                   apr_pool_t *pool);


-/** Like svn_fs_unlock(), but invoke the @a repos's pre- and
+/** Like svn_fs_unlock2(), but invoke the @a repos's pre- and
  * post-unlock hooks before and after the unlocking action.
  *
  * The pre-unlock hook is run for every path in @a targets. Those
- * entries in @a targets for which the pre-unlock is successful are
- * passed to svn_fs_unlock2 and the post-unlock is run for those that
- * are successfully unlocked.
- *
- * @a results contains the result of of running the pre-unlock and
- * svn_fs_unlock2 if the pre-unlock was successful. If an error occurs
- * when running the post-unlock hook, return the original error
- * wrapped with SVN_ERR_REPOS_POST_UNLOCK_HOOK_FAILED.  If the caller
- * sees this error, it knows that some unlocks succeeded.  In all
- * cases the caller must handle all error in @a results to avoid
- * leaks.
+ * targets for which the pre-unlock hook is successful are passed to
+ * svn_fs_unlock2() and the post-unlock hook is run for those that are
+ * successfully unlocked.
+ *
+ * @a *results contains the result for each path. If an error occurs
+ * when running the post-unlock hook, the error is wrapped with
+ * #SVN_ERR_REPOS_POST_UNLOCK_HOOK_FAILED.  If the caller sees this
+ * error, it knows that this unlock succeeded.  In all cases the caller
+ * must handle or clear all errors in @a *results to avoid leaks.
  *
  * Allocate @a *results in @a result_pool. Use @a scratch_pool for
  * temporary allocations.
@@ -2265,7 +2262,7 @@
                      apr_pool_t *result_pool,
                      apr_pool_t *scratch_pool);

-/* Similar to svn_repos_fs_unlock2 but only unlocks a single path.
+/* Similar to svn_repos_fs_unlock2() but only unlocks a single path.
  *
  * @since New in 1.2.
  */
Index: subversion/libsvn_repos/fs-wrap.c
===================================================================
--- subversion/libsvn_repos/fs-wrap.c    (revision 1582225)
+++ subversion/libsvn_repos/fs-wrap.c    (working copy)
@@ -570,6 +570,10 @@ svn_repos_fs_lock2(apr_hash_t **results,
                   svn__apr_hash_index_val(hi));

   /* If there are locks and an error should we return or run the post-lock? */
+  /* [JAF] Is that a question to reviewers? It depends. What errors can
+           svn_fs_lock2 return, and do any of them justify not running the
+           post-lock hook? Can it even return an error and also create
+           locks? Its doc string needs to say. */
   if (err)
     return svn_error_trace(err);

Index: subversion/svnserve/serve.c
===================================================================
--- subversion/svnserve/serve.c    (revision 1582225)
+++ subversion/svnserve/serve.c    (working copy)
@@ -2733,7 +2733,8 @@ static svn_error_t *lock_many(svn_ra_svn
      an error. */
   SVN_ERR(must_have_access(conn, pool, b, svn_authz_write, NULL, TRUE));

-  /* Loop through the lock requests. */
+  /* Loop through the lock requests
+     ### and do what? */
   for (i = 0; i < path_revs->nelts; ++i)
     {
       const char *path, *full_path;
@@ -2759,6 +2760,8 @@ static svn_error_t *lock_many(svn_ra_svn
       target->current_rev = current_rev;

       /* We could check for duplicate paths and reject the request? */
+      /* ### [JAF] Is that a question to reviewers? We could, but I
+         don't think it's useful to do so. */
       svn_hash_sets(targets, full_path, target);
     }

@@ -2767,6 +2770,11 @@ static svn_error_t *lock_many(svn_ra_svn

   /* From here on we need to make sure any errors in authz_results, or
      results, are cleared before returning from this function. */
+
+  /* Check authz access for each target, because ...
+     ### Why? We don't want svn_repos_fs_lock2() to do this for us ...?
+         Or, we want to log such errors and it's easier to do so before
+         than afterwards? */
   for (hi = apr_hash_first(pool, targets); hi; hi = apr_hash_next(hi))
     {
       const char *full_path = svn__apr_hash_index_key(hi);
@@ -2791,7 +2799,8 @@ static svn_error_t *lock_many(svn_ra_svn
                            0, /* No expiration time. */
                            steal_lock, pool, subpool);

-  /* The client expects results in the same order as paths were supplied. */
+  /* Send the results. The client expects results in the same order as
+     paths were supplied. */
   for (i = 0; i < path_revs->nelts; ++i)
     {
       const char *path, *full_path;
@@ -2816,6 +2825,16 @@ static svn_error_t *lock_many(svn_ra_svn
         result = svn_hash_gets(authz_results, full_path);
       if (!result)
         /* No result?  Should we return some sort of placeholder error? */
+        /* ### [JAF] Is that a question to reviewers? Certainly something
+           has gone wrong. Maybe svn_repos_fs_lock2 returned an error
+           overall, having processed none or only some paths -- is it
+           allowed to do so? Breaking here would signal to the client that
+           something went wrong, because they'll see a too-short response
+           list, but would also potentially hide information about further
+           targets that were processed, some of which perhaps were locked
+           successfully. (Note: we're scanning them here in a different
+           order from the order in which svn_repos_fs_lock2() processed
+           them.) */
         break;

       if (result->err)

- Julian

Re: Review of lock-many API [was: svn commit: r1577280 [1/3] ...]

Posted by Julian Foad <ju...@btopenworld.com>.
I (Julian Foad) wrote:
> In principle, the 'is_dav_comment' and 'expiration_date' 
> parameters should similarly be per target,

Not "should", but rather "could be argued, at some pedantic level".

> but that makes less sense in practice 
> as they're only used for generic DAV clients via mod_dav_svn. As an 
> alternative, we might consider dropping them from this API [...]

[...]
> The semantics relating to returning one error per path and an overall error 
> needs to be fully documented and/or changed. For example, svn_repos_fs_lock() 
> assumes that svn_repos_fs_lock2() has set *results to a valid hash if it returns 
> any overall error, while svn_fs_lock() assumes that svn_fs_lock2() WON'T 
> have put any error in *results if it returns an error overall. These look at 
> least surprising.

In fact, I strongly feel we should not return error objects within the results structure. A per-target result is not, in some abstract sense, an error -- the caller didn't try to do something it is forbidden to try, and the server didn't encounter an unexpected condition or anything broken in its execution. A result such as "the lock you're trying to remove is not present" is merely one of the possible and expected outcomes.

The problem existed before this lock-many work. Wrapping the existing design in a multi-target wrapper just exacerbates the issue.

So, rather, we should return flags or other information that specifically states the result for that target. (If a higher level caller wants to consider it an error and construct an error object, that's it's business. For backwards compatibility that will obviously have to be done in some places.)

A different, practical objection to error objects constructed on the server is they are not localized to the client's human language.

- Julian


Re: Review of lock-many API

Posted by Julian Foad <ju...@btopenworld.com>.
Hi Philip.

Philip Martin wrote:

> Julian Foad writes:
>>  Index: subversion/svnserve/serve.c
>>  ===================================================================

>>  -  /* The client expects results in the same order as paths were supplied. */
>>  +  /* Send the results. The client expects results in the same order as
>>  +     paths were supplied. */
>>     for (i = 0; i < path_revs->nelts; ++i)
>>       {
>>         const char *path, *full_path;
>>  @@ -2816,6 +2825,16 @@ static svn_error_t *lock_many(svn_ra_svn
>>           result = svn_hash_gets(authz_results, full_path);
>>         if (!result)
>>           /* No result?  Should we return some sort of placeholder error? */
>>  +        /* ### [JAF] Is that a question to reviewers? Certainly something
>>  +           has gone wrong. Maybe svn_repos_fs_lock2 returned an error
>>  +           overall, having processed none or only some paths -- is it
>>  +           allowed to do so?
> 
> It could run out of disk space half way through.

AFAICT, if I'm reading this right, then a valid reason for not finding a result here is if svn_repos_fs_lock_many returned an error overall, having locked some but not all of the paths. Doesn't really matter why.

Breaking here seems wrong. We're processing the targets in this loop in a different order from the hash order that svn_repos_fs_lock_many() was using to lock them. Therefore we are potentially stopping before we have given responses to all the paths that were successfully locked. We would stop the response list at this point, and terminate it with a successful "done" keyword, but the response list would have too few elements. This seems arbitrary and broken.

I can think of two alternatives. We could just not generate a response list in the case where svn_repos_fs_lock_many returns an error, but simply report the error. That's effectively no worse than the current behaviour. Or we could insert an error response or some sort of place-holder entry in the list for this target, and then continue to the next target, to give a full response.

Does that make sense and would either of those options work?

- Julian


>>                          Breaking here would signal to the client that
>>  +           something went wrong, because they'll see a too-short response
>>  +           list, but would also potentially hide information about further
>>  +           targets that were processed, some of which perhaps were locked
>>  +           successfully. (Note: we're scanning them here in a different
>>  +           order from the order in which svn_repos_fs_lock2() processed
>>  +           them.) */
>>           break;

Re: Review of lock-many API

Posted by Julian Foad <ju...@btopenworld.com>.
Philip Martin wrote:
> Julian Foad <ju...@btopenworld.com> writes:
>> Imagine the caller is an svn GUI that has an offline mode in which the
>> user can queue up lock requests, and the software will send them to
>> the server when it's back on line. Does the GUI's back end need to run
>> through the queued lock requests and re-group them into batches such
>> that all the lock requests in a batch have the same comment?
>> 
>> Or imagine we want to teach 'svnsync' to synchronize locks: it reads
>> all the locks in the source repo and writes them to the target
>> repo. Do we want to force this software to re-group the locks into
>> batches or write them one at a time? Wouldn't it be better if it can
>> just say "Here is a bunch of locks; please write them"?
>> 
>> The principle that makes sense to me is, if the semantics of the data
>> objects being written is one comment per lock, then the API should
>> match that, and not restrict the caller unnecessarily.
> 
> Those examples rely on a new RA API that we don't have.  Making the FS
> API support more than the RA API might future-proof the FS API but it
> might also be a dead end that is never used.  We don't know exactly how
> lock replication will work and whether it will require changes to the
> storage backend, but if it does require backend changes then any FS API
> we create today may not be sufficient.
> 
> Lock usernames are another problem. At present the username is passed
> via an svn_fs_access_t and stored in each lock but there is no mechanism
> to pass per-lock usernames and no mechanism to modify usernames on
> locks.  Should we pass per-lock usernames with the rest of the lock
> data, or should we introduce per-lock usernames in svn_fs_access_t?  The
> first is contrary to all other FS APIs, the second is a totally new API
> that nobody yet needs.  I don't know the answer and since we don't need
> this feature yet I prefer not to make a decision at this time.

Ack. Okay.

> I have made svn_fs_lock_target_t opaque in the public API which should
> make it easier to add new fields in future.

OK.

Thanks.

- Julian

Re: Review of lock-many API

Posted by Philip Martin <ph...@wandisco.com>.
Julian Foad <ju...@btopenworld.com> writes:

> Imagine the caller is an svn GUI that has an offline mode in which the
> user can queue up lock requests, and the software will send them to
> the server when it's back on line. Does the GUI's back end need to run
> through the queued lock requests and re-group them into batches such
> that all the lock requests in a batch have the same comment?
>
> Or imagine we want to teach 'svnsync' to synchronize locks: it reads
> all the locks in the source repo and writes them to the target
> repo. Do we want to force this software to re-group the locks into
> batches or write them one at a time? Wouldn't it be better if it can
> just say "Here is a bunch of locks; please write them"?
>
> The principle that makes sense to me is, if the semantics of the data
> objects being written is one comment per lock, then the API should
> match that, and not restrict the caller unnecessarily.

Those examples rely on a new RA API that we don't have.  Making the FS
API support more than the RA API might future-proof the FS API but it
might also be a dead end that is never used.  We don't know exactly how
lock replication will work and whether it will require changes to the
storage backend, but if it does require backend changes then any FS API
we create today may not be sufficient.

Lock usernames are another problem. At present the username is passed
via an svn_fs_access_t and stored in each lock but there is no mechanism
to pass per-lock usernames and no mechanism to modify usernames on
locks.  Should we pass per-lock usernames with the rest of the lock
data, or should we introduce per-lock usernames in svn_fs_access_t?  The
first is contrary to all other FS APIs, the second is a totally new API
that nobody yet needs.  I don't know the answer and since we don't need
this feature yet I prefer not to make a decision at this time.

I have made svn_fs_lock_target_t opaque in the public API which should
make it easier to add new fields in future.

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*

Re: Review of lock-many API

Posted by Julian Foad <ju...@btopenworld.com>.
Hi Philip. Thanks for the responses.

(I see you've changed the API implementation to a callback pattern now.)

Philip Martin wrote:
> Julian Foad <ju...@btopenworld.com> writes:
>> The 'comment' parameter to svn_fs_lock2 should be a member of the
>> 'target' struct, since it is inherently a per-target attribute. Of
>> course there are common scenarios where a client wants to lock a bunch
>> of paths all with the same comment, but the benefits of a lock-many
>> API should also be made available to clients which supply different
>> comments. This applies all the way up the call stack. However, I note
>> that from the RA layer upwards we have had lock-many APIs since v1.2
>> which take a single comment for all the paths, so changing to
>> per-target comments throughout the stack is perhaps beyond the scope
>> of this issue. We could still do it at the FS and repos layers now in
>> preparation; I don't see that it would add significant overhead in
>> run-time or in maintenance.
> 
> I think per-lock comments in a multi-path lock are unneccesary.  I don't
> think anybody locking dozens/hundreds of files will bother defining
> different comments.  Anyone who wants two or three locks with distinct
> comments can call the function multiple times.

For a high level client API, I agree with you. But the same argument doesn't make sense for the low level APIs. Here, it's not a matter of what *people* are likely to do, but another layer of *software*.

Imagine the caller is an svn GUI that has an offline mode in which the user can queue up lock requests, and the software will send them to the server when it's back on line. Does the GUI's back end need to run through the queued lock requests and re-group them into batches such that all the lock requests in a batch have the same comment?

Or imagine we want to teach 'svnsync' to synchronize locks: it reads all the locks in the source repo and writes them to the target repo. Do we want to force this software to re-group the locks into batches or write them one at a time? Wouldn't it be better if it can just say "Here is a bunch of locks; please write them"?

The principle that makes sense to me is, if the semantics of the data objects being written is one comment per lock, then the API should match that, and not restrict the caller unnecessarily.

>> In principle, the 'is_dav_comment' and 'expiration_date' parameters
>> should similarly be per target, but that makes less sense in practice
>> as they're only used for generic DAV clients via mod_dav_svn. As an
>> alternative, we might consider dropping them from this API and keeping
>> the original single-lock API (not deprecated) to cater for such
>> locks. RA-local and svnserve and 'svnadmin lock' don't support
>> expiration and always pass is_dav_comment=FALSE. Only mod_dav_svn
>> supplies these two options, and it currently only uses the old
>> one-lock-at-a-time API anyway.
> 
> Eventually we will define a new request and mod_dav_svn will start
> calling the multiple path version.  It will still handle the current
> LOCK request but may use the multi-path function to do it.

Eventually we will define a new request and mod_dav_svn MAY start calling the multiple path version of the API, but it will never need to set multiple locks at once, because mod_dav_svn talks to DAV via the dav_hooks_locks vtable which only deals with one path at a time. Look at the calls to svn_repos_fs_lock() in mod_dav_svn/lock.c. (A slight confusion is that the vtable methods support multiple locks per path; but that's irrelevant as our implementation doesn't.)

So it could still just as efficiently use the old single-path version of the API, if we support that, couldn't it?



[...]
>>  Index: subversion/svnserve/serve.c
>>  ===================================================================
>>  --- subversion/svnserve/serve.c    (revision 1582225)
>>  +++ subversion/svnserve/serve.c    (working copy)
>>  @@ -2733,7 +2733,8 @@ static svn_error_t *lock_many(svn_ra_svn
>>        an error. */
>>     SVN_ERR(must_have_access(conn, pool, b, svn_authz_write, NULL, TRUE));
>> 
>>  -  /* Loop through the lock requests. */
>>  +  /* Loop through the lock requests
>>  +     ### and do what? */
> 
> I just copied that comment from the old code. I probably should have
> removed it.

Well, I find it very useful to have a comment at the top of a non-trivial block of code summarizing what it's supposed to be doing. How about

    /* Parse PATH_REVS into TARGETS. */

here?

>>     for (i = 0; i < path_revs->nelts; ++i)
>>       {
>>         const char *path, *full_path;
>>  @@ -2759,6 +2760,8 @@ static svn_error_t *lock_many(svn_ra_svn
>>         target->current_rev = current_rev;
>> 
>>         /* We could check for duplicate paths and reject the request? */
>>  +      /* ### [JAF] Is that a question to reviewers? We could, but I
>>  +         don't think it's useful to do so. */
> 
> I brought up the handling of canonical paths on the dev list, it wasn't
> conclusive from my point of view so I just implemented something.  Given
> that FS generally accepts non-canonical paths it is hard to know what to
> do.

Agreed. Let's just update the comment here to say we decided not to?

Please could you mark comments with '###' or similar when they are questions to reviewers about the code or design being unfinished or questionable, etc.

- Julian

Re: Review of lock-many API

Posted by Philip Martin <ph...@wandisco.com>.
Julian Foad <ju...@btopenworld.com> writes:

> Philip Martin wrote:
>> Julian Foad writes:
>>>>  URL: http://svn.apache.org/r1577280
>>>>  * subversion/include/svn_fs.h
>>>>    (svn_fs_lock_target_t, svn_fs_lock_result_t,
>>>>     svn_fs_lock2, svn_fs_unlock2): new.
>>>> 
>>>>  * subversion/include/svn_repos.h
>>>>    (svn_repos_fs_lock2, svn_repos_fs_unlock2): new.
>>> 
>>>  Do we intend to deprecate the old versions? If not, then it would be
>>>  better to name the new functions something like ...lock_many() instead
>>>  of ...lock2().
>> 
>> I'm unsure.  In some ways it is convenient to have a single path
>> function as it avoids the need to construct hashes and the error
>> handling is simpler and less prone to leak.  On the other hand when
>> svn_client_copy4 introduced multi-path copy source the single path
>> version was deprecated.
>
> Thoughts...
>
> A low level API such as libsvn_fs should aim to be clean and
> efficient, more so than to provide functions that are convenient for
> the casual user. (A high level API such as libsvn_client or the
> bindings is a more appropriate place to provide convenience
> functions.) We do have a number of APIs already, in several libraries,
> where the caller is expected to pass in a list of targets even if they
> only want to operate on one. Maybe in general the thing to aim for is
> to make it easy for a caller to do so.
>
> We could provide a singleton constructor for the
> hash-of-svn_fs_lock_target_t argument. Should we? Probably not at the
> libsvn_fs level; maybe at a high level.

I'l look at doing that.

> I started reading further through the new code. Here are some more
> review comments.
>
> The 'comment' parameter to svn_fs_lock2 should be a member of the
> 'target' struct, since it is inherently a per-target attribute. Of
> course there are common scenarios where a client wants to lock a bunch
> of paths all with the same comment, but the benefits of a lock-many
> API should also be made available to clients which supply different
> comments. This applies all the way up the call stack. However, I note
> that from the RA layer upwards we have had lock-many APIs since v1.2
> which take a single comment for all the paths, so changing to
> per-target comments throughout the stack is perhaps beyond the scope
> of this issue. We could still do it at the FS and repos layers now in
> preparation; I don't see that it would add significant overhead in
> run-time or in maintenance.

I think per-lock comments in a multi-path lock are unneccesary.  I don't
think anybody locking dozens/hundreds of files will bother defining
different comments.  Anyone who wants two or three locks with distinct
comments can call the function multiple times.

> In principle, the 'is_dav_comment' and 'expiration_date' parameters
> should similarly be per target, but that makes less sense in practice
> as they're only used for generic DAV clients via mod_dav_svn. As an
> alternative, we might consider dropping them from this API and keeping
> the original single-lock API (not deprecated) to cater for such
> locks. RA-local and svnserve and 'svnadmin lock' don't support
> expiration and always pass is_dav_comment=FALSE. Only mod_dav_svn
> supplies these two options, and it currently only uses the old
> one-lock-at-a-time API anyway.

Eventually we will define a new request and mod_dav_svn will start
calling the multiple path version.  It will still handle the current
LOCK request but may use the multi-path function to do it.

> We should provide a destructor for the hash-of-svn_fs_lock_result_t
> results, which contains errors that need to be cleared. svnserve's
> lock_many() already contains such code twice. It's simple code --
> around 5 lines -- but if we're demanding that callers ensure they do
> it, it's nice to provide a ready-made way for them to do it. This
> would also help in future-proofing against us revising the API in
> later releases.

I have a look at that.

> The semantics relating to returning one error per path and an overall
> error needs to be fully documented and/or changed. For example,
> svn_repos_fs_lock() assumes that svn_repos_fs_lock2() has set *results
> to a valid hash if it returns any overall error, while svn_fs_lock()
> assumes that svn_fs_lock2() WON'T have put any error in *results if it
> returns an error overall. These look at least surprising.
>

That's an error in libsvn_fs, libsvn_repos does it right.  Fixed in
r1582845.

> +  /* [JAF] Is that a question to reviewers? It depends. What errors can
> +           svn_fs_lock2 return, and do any of them justify not running the
> +           post-lock hook? Can it even return an error and also create
> +           locks? Its doc string needs to say. */
>    if (err)
>      return svn_error_trace(err);

It's almost always possible for fs functions to return arbitrary errors
due to filesystem permissions, lack of disk space etc.  This could
happen part way through creating locks.  I'm not sure whether we should
attempt to run the post-hook.


> Index: subversion/svnserve/serve.c
> ===================================================================
> --- subversion/svnserve/serve.c    (revision 1582225)
> +++ subversion/svnserve/serve.c    (working copy)
> @@ -2733,7 +2733,8 @@ static svn_error_t *lock_many(svn_ra_svn
>       an error. */
>    SVN_ERR(must_have_access(conn, pool, b, svn_authz_write, NULL, TRUE));
>
> -  /* Loop through the lock requests. */
> +  /* Loop through the lock requests
> +     ### and do what? */

I just copied that comment from the old code. I probably should have
removed it.

>    for (i = 0; i < path_revs->nelts; ++i)
>      {
>        const char *path, *full_path;
> @@ -2759,6 +2760,8 @@ static svn_error_t *lock_many(svn_ra_svn
>        target->current_rev = current_rev;
>
>        /* We could check for duplicate paths and reject the request? */
> +      /* ### [JAF] Is that a question to reviewers? We could, but I
> +         don't think it's useful to do so. */

I brought up the handling of canonical paths on the dev list, it wasn't
conclusive from my point of view so I just implemented something.  Given
that FS generally accepts non-canonical paths it is hard to know what to
do.

>        svn_hash_sets(targets, full_path, target);
>      }
>
> @@ -2767,6 +2770,11 @@ static svn_error_t *lock_many(svn_ra_svn
>
>    /* From here on we need to make sure any errors in authz_results, or
>       results, are cleared before returning from this function. */
> +
> +  /* Check authz access for each target, because ...
> +     ### Why? We don't want svn_repos_fs_lock2() to do this for us ...?
> +         Or, we want to log such errors and it's easier to do so before
> +         than afterwards? */

That's the way servers/repos functions work.

>    for (hi = apr_hash_first(pool, targets); hi; hi = apr_hash_next(hi))
>      {
>        const char *full_path = svn__apr_hash_index_key(hi);
> @@ -2791,7 +2799,8 @@ static svn_error_t *lock_many(svn_ra_svn
>                             0, /* No expiration time. */
>                             steal_lock, pool, subpool);
>
> -  /* The client expects results in the same order as paths were supplied. */
> +  /* Send the results. The client expects results in the same order as
> +     paths were supplied. */
>    for (i = 0; i < path_revs->nelts; ++i)
>      {
>        const char *path, *full_path;
> @@ -2816,6 +2825,16 @@ static svn_error_t *lock_many(svn_ra_svn
>          result = svn_hash_gets(authz_results, full_path);
>        if (!result)
>          /* No result?  Should we return some sort of placeholder error? */
> +        /* ### [JAF] Is that a question to reviewers? Certainly something
> +           has gone wrong. Maybe svn_repos_fs_lock2 returned an error
> +           overall, having processed none or only some paths -- is it
> +           allowed to do so?

It could run out of disk space half way through.

> Breaking here would signal to the client that
> +           something went wrong, because they'll see a too-short response
> +           list, but would also potentially hide information about further
> +           targets that were processed, some of which perhaps were locked
> +           successfully. (Note: we're scanning them here in a different
> +           order from the order in which svn_repos_fs_lock2() processed
> +           them.) */
>          break;
>
>        if (result->err)


-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*