You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Patrick Steinhardt <pa...@elegosoft.com> on 2016/11/09 11:54:08 UTC

[PATCH v3] Reject checkouts to existing directory

Hi,

this is version three of my patch regarding checkouts to existing
directories. Thanks for the feedback on the previous two patches.

This patch requires my currently in-flight patch
`svn_client_list4: accept `NULL patterns'. If it will not go into
trunk, I'll convert `verify_checkout_target` to use an empty
array instead of passing `NULL`.

The new version fixes an issue where I did not check correctly
whether the target directory is empty (that is I previously used
`svn_path_is_empty`, where I now correctly use
`svn_io_dir_empty`). Furthermore, I now check if the target is a
working copy and, if so, compare UUID and relative paths of the
wc and remote repository. These changes allowed me to drop all
previously required '--force' flags in our test suite, indicating
the changes now match more closely with existing use cases.

I stopped short of adding a new '--force-obstructed-checkout'
flag, which might be used instead of '--force' if the new checks
reject the checkout. After reading '--force's doc string, it
should actually do exactly what the new flag would be doing.
Furthermore, I no hope that it's not required that frequently.

Regards
Patrick

[[[
checkout_cmd: refuse obstructed checkouts

When a new checkout is done where the target dirctory already
exists, subversion will usually create a lot of tree conflicts
which are intimidating, especially to new users. This behavior
stems from release 1.8, where we started accepting checkouts to
existing directories without any safety-checks.

This patch changes semantics in that it introduces new safety
checks so that the user does not accidentally shoots himself into
the foot. We now only allow checkouts if one of the following
conditions holds true:

- the target directory does not exist
- the target directory is empty
- the target directory is a repository and has the same UUID and
  relative path as the repository that is to be checked out
- the repository to check out is empty
- the --force flag is given

The main use case solved by the above conditions is for
converting existing directories into a repository when the
repository is newly created as well as resuming checkouts.

* subversion/svn/checkout-cmd.c:
  (listing_cb): New callback to check whether the remote
                repository is empty.
  (verify_checkout_target): New function to check whether the
                            target checkout directory is a valid
                            one.
  (svn_cl__checkout): Now calls `verify_checkout_target` if no
                      --force flag is specified.
* subversion/tests/cmdline/checkout_tests.py:
  (checkout_with_obstructions): Replace old test and now assert
                                that subversion refuses to
                                checkout to non-empty dirs.
]]]
-- 
Patrick Steinhardt, Entwickler

elego Software Solutions GmbH, http://www.elego.de
Gebäude 12 (BIG), Gustav-Meyer-Allee 25, 13355 Berlin, Germany

Sitz der Gesellschaft: Berlin, USt-IdNr.: DE 163214194
Handelsregister: Amtsgericht Charlottenburg HRB 77719
Geschäftsführer: Olaf Wagner

Re: [PATCH v3] Reject checkouts to existing directory

Posted by Patrick Steinhardt <pa...@elegosoft.com>.
On Wed, Nov 09, 2016 at 05:09:17PM +0000, Daniel Shahaf wrote:
> Patrick Steinhardt wrote on Wed, Nov 09, 2016 at 12:54:08 +0100:
> > [[[
> > checkout_cmd: refuse obstructed checkouts
> 
> The first paragraph of the log message should be a complete sentence.
> (Capitalize when appropriate, use a fullstop, and there's no length
> limit.)
> 
> Nitpick: 'checkout_cmd' isn't a thing; I'd probably have written «'svn
> checkout':» as the up-to-the-colon part.

Will reword this to "svn checkout: Refuse checking out to
obstructed target directories." when re-rolling this patch. Will
await more feedback first, though.

Regards
Patrick
-- 
Patrick Steinhardt, Entwickler

elego Software Solutions GmbH, http://www.elego.de
Gebäude 12 (BIG), Gustav-Meyer-Allee 25, 13355 Berlin, Germany

Sitz der Gesellschaft: Berlin, USt-IdNr.: DE 163214194
Handelsregister: Amtsgericht Charlottenburg HRB 77719
Geschäftsführer: Olaf Wagner

Re: [PATCH v3] Reject checkouts to existing directory

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Patrick Steinhardt wrote on Wed, Nov 09, 2016 at 12:54:08 +0100:
> [[[
> checkout_cmd: refuse obstructed checkouts

The first paragraph of the log message should be a complete sentence.
(Capitalize when appropriate, use a fullstop, and there's no length
limit.)

Nitpick: 'checkout_cmd' isn't a thing; I'd probably have written �'svn
checkout':� as the up-to-the-colon part.

Cheers,

Daniel

Re: [PATCH v3] Reject checkouts to existing directory

Posted by Julian Foad <ju...@foad.me.uk>.
Patrick Steinhardt wrote:
> short ping. Anybody got time to review this?

As somebody "watching from the sidelines" a.k.a. "lurking", I just want 
to say thank you for developing this enhancement, and for driving it by 
giving a gentle ping like this when needed. I particularly appreciate 
how you carefully explained what you have changed in this patch compared 
to the previous version.

- Julian

Re: [PATCH v3] Reject checkouts to existing directory

Posted by Patrick Steinhardt <ps...@pks.im>.
Hi,

short ping. Anybody got time to review this?

Regards
Patrick

On Wed, Nov 09, 2016 at 12:54:08PM +0100, Patrick Steinhardt wrote:
> Hi,
> 
> this is version three of my patch regarding checkouts to existing
> directories. Thanks for the feedback on the previous two patches.
> 
> This patch requires my currently in-flight patch
> `svn_client_list4: accept `NULL patterns'. If it will not go into
> trunk, I'll convert `verify_checkout_target` to use an empty
> array instead of passing `NULL`.
> 
> The new version fixes an issue where I did not check correctly
> whether the target directory is empty (that is I previously used
> `svn_path_is_empty`, where I now correctly use
> `svn_io_dir_empty`). Furthermore, I now check if the target is a
> working copy and, if so, compare UUID and relative paths of the
> wc and remote repository. These changes allowed me to drop all
> previously required '--force' flags in our test suite, indicating
> the changes now match more closely with existing use cases.
> 
> I stopped short of adding a new '--force-obstructed-checkout'
> flag, which might be used instead of '--force' if the new checks
> reject the checkout. After reading '--force's doc string, it
> should actually do exactly what the new flag would be doing.
> Furthermore, I no hope that it's not required that frequently.
> 
> Regards
> Patrick
> 
> [[[
> checkout_cmd: refuse obstructed checkouts
> 
> When a new checkout is done where the target dirctory already
> exists, subversion will usually create a lot of tree conflicts
> which are intimidating, especially to new users. This behavior
> stems from release 1.8, where we started accepting checkouts to
> existing directories without any safety-checks.
> 
> This patch changes semantics in that it introduces new safety
> checks so that the user does not accidentally shoots himself into
> the foot. We now only allow checkouts if one of the following
> conditions holds true:
> 
> - the target directory does not exist
> - the target directory is empty
> - the target directory is a repository and has the same UUID and
>   relative path as the repository that is to be checked out
> - the repository to check out is empty
> - the --force flag is given
> 
> The main use case solved by the above conditions is for
> converting existing directories into a repository when the
> repository is newly created as well as resuming checkouts.
> 
> * subversion/svn/checkout-cmd.c:
>   (listing_cb): New callback to check whether the remote
>                 repository is empty.
>   (verify_checkout_target): New function to check whether the
>                             target checkout directory is a valid
>                             one.
>   (svn_cl__checkout): Now calls `verify_checkout_target` if no
>                       --force flag is specified.
> * subversion/tests/cmdline/checkout_tests.py:
>   (checkout_with_obstructions): Replace old test and now assert
>                                 that subversion refuses to
>                                 checkout to non-empty dirs.
> ]]]
> -- 
> Patrick Steinhardt, Entwickler
> 
> elego Software Solutions GmbH, http://www.elego.de
> Gebäude 12 (BIG), Gustav-Meyer-Allee 25, 13355 Berlin, Germany
> 
> Sitz der Gesellschaft: Berlin, USt-IdNr.: DE 163214194
> Handelsregister: Amtsgericht Charlottenburg HRB 77719
> Geschäftsführer: Olaf Wagner

> diff --git a/subversion/svn/checkout-cmd.c b/subversion/svn/checkout-cmd.c
> index 56fd02b..b282248 100644
> --- a/subversion/svn/checkout-cmd.c
> +++ b/subversion/svn/checkout-cmd.c
> @@ -61,6 +61,144 @@
>    Is this the same as CVS?  Does it matter if it is not?
>  */
>  
> +static svn_error_t *
> +listing_cb(void *baton,
> +           const char *path,
> +           const svn_dirent_t *dirent,
> +           const svn_lock_t *lock,
> +           const char *abs_path,
> +           const char *external_parent_url,
> +           const char *external_target,
> +           apr_pool_t *scratch_pool)
> +{
> +  size_t *count = (size_t *) baton;
> +
> +  (*count)++;
> +
> +  return SVN_NO_ERROR;
> +}
> +
> +/* Check if the target directory which should be checked out to
> + * is a valid target directory. A target directory is valid if we
> + * are sure that a checkout to the directory will not create any
> + * unexpected situations for the user. This is the case if one of
> + * the following criteria is true:
> + *
> + * - the target directory does not exist
> + * - the target directory is empty
> + * - the target directory is the same repository with the same
> + *   relative URL as the one that is to be checked out (e.g. we
> + *   resume a checkout)
> + * - the repository that is to be checked out is empty
> + */
> +static svn_error_t *
> +verify_checkout_target(const char *repo_url,
> +                       const char *target_dir,
> +                       const svn_opt_revision_t *peg_revision,
> +                       const svn_opt_revision_t *revision,
> +                       svn_client_ctx_t *ctx,
> +                       apr_pool_t *pool)
> +{
> +  svn_node_kind_t kind;
> +  apr_pool_t *scratch_pool;
> +  const char *absolute_path;
> +  size_t count = 0;
> +  svn_wc_status3_t *status;
> +  svn_error_t *error;
> +  svn_boolean_t empty;
> +
> +  scratch_pool = svn_pool_create(pool);
> +
> +  SVN_ERR(svn_dirent_get_absolute(&absolute_path,
> +                                  target_dir,
> +                                  pool));
> +
> +  SVN_ERR(svn_io_check_path(absolute_path,
> +                            &kind,
> +                            pool));
> +
> +  /* If the directory does not exist yet, it will be created
> +   * anew and is thus considered valid. */
> +  if (kind == svn_node_none)
> +    return SVN_NO_ERROR;
> +
> +  /* Checking out to a file or symlink cannot work. */
> +  if (kind != svn_node_dir)
> +    return svn_error_createf(
> +        SVN_ERR_ILLEGAL_TARGET,
> +        NULL,
> +        _("Rejecting checkout to existing %s '%s'"),
> +        svn_node_kind_to_word(kind),
> +        absolute_path);
> +
> +  /* Checking out to a non-empty directory will create
> +   * tree-conflicts, which is usually not what the user wants. */
> +  SVN_ERR(svn_io_dir_empty(&empty, absolute_path, scratch_pool));
> +  if (empty)
> +    return SVN_NO_ERROR;
> +
> +  error = svn_wc_status3(&status,
> +                         ctx->wc_ctx,
> +                         absolute_path,
> +                         pool,
> +                         scratch_pool);
> +
> +  /* If the remote repository UUID and working copy UUID are the
> +   * same and if the relative paths inside the repository match,
> +   * we assume that the command is a resumed checkout. */
> +  if (error == SVN_NO_ERROR)
> +    {
> +      const char *repo_relpath, *repo_root, *repo_uuid;
> +
> +      SVN_ERR(svn_client_get_repos_root(&repo_root,
> +                                        &repo_uuid,
> +                                        repo_url,
> +                                        ctx,
> +                                        pool,
> +                                        scratch_pool));
> +
> +      repo_relpath = svn_uri_skip_ancestor(repo_root,
> +                                           repo_url,
> +                                           scratch_pool);
> +
> +      if (repo_relpath
> +          && !strcmp(status->repos_uuid, repo_uuid)
> +          && !strcmp(status->repos_relpath, repo_relpath))
> +        return SVN_NO_ERROR;
> +    }
> +  else
> +    {
> +      svn_error_clear(error);
> +    }
> +
> +  SVN_ERR(svn_client_list4(repo_url,
> +                           peg_revision,
> +                           revision,
> +                           NULL,
> +                           svn_depth_immediates,
> +                           0,
> +                           FALSE,
> +                           FALSE,
> +                           listing_cb,
> +                           &count,
> +                           ctx,
> +                           pool));
> +
> +  /* If the remote respotiory has more than one file (that is, it
> +   * is not an empty root directory only), we refuse to check out
> +   * into a non-empty target directory. Otherwise Subversion
> +   * would create tree conflicts. */
> +  if (count > 1)
> +    return svn_error_createf(
> +      SVN_ERR_ILLEGAL_TARGET,
> +      NULL,
> +      _("Rejecting checkout of non-empty repository into non-empty directory '%s'"),
> +      absolute_path);
> +
> +  svn_pool_clear(scratch_pool);
> +
> +  return SVN_NO_ERROR;
> +}
>  
>  /* This implements the `svn_opt_subcommand_t' interface. */
>  svn_error_t *
> @@ -165,6 +303,16 @@ svn_cl__checkout(apr_getopt_t *os,
>            revision.kind = svn_opt_revision_head;
>        }
>  
> +      if (! opt_state->force)
> +        {
> +          SVN_ERR(verify_checkout_target(true_url,
> +                                         target_dir,
> +                                         &peg_revision,
> +                                         &revision,
> +                                         ctx,
> +                                         subpool));
> +        }
> +
>        SVN_ERR(svn_client_checkout3
>                (NULL, true_url, target_dir,
>                 &peg_revision,
> diff --git a/subversion/tests/cmdline/checkout_tests.py b/subversion/tests/cmdline/checkout_tests.py
> index 49165e7..93415b9 100755
> --- a/subversion/tests/cmdline/checkout_tests.py
> +++ b/subversion/tests/cmdline/checkout_tests.py
> @@ -158,94 +158,20 @@ def make_local_tree(sbox, mod_files=False, add_unversioned=False):
>  #----------------------------------------------------------------------
>  
>  def checkout_with_obstructions(sbox):
> -  """co with obstructions conflicts without --force"""
> +  """obstructed co without --force"""
>  
>    make_local_tree(sbox, False, False)
>  
> -  #svntest.factory.make(sbox,
> -  #       """# Checkout with unversioned obstructions lying around.
> -  #          svn co url wc_dir
> -  #          svn status""")
> -  #svntest.factory.make(sbox,
> -  #       """# Now see to it that we can recover from the obstructions.
> -  #          rm -rf A iota
> -  #          svn up""")
> -  #exit(0)
> -
>    wc_dir = sbox.wc_dir
>    url = sbox.repo_url
>  
>    # Checkout with unversioned obstructions causes tree conflicts.
>    # svn co url wc_dir
> -  expected_output = svntest.wc.State(wc_dir, {
> -    'iota'              : Item(status='  ', treeconflict='C'),
> -    'A'                 : Item(status='  ', treeconflict='C'),
> -    # And the updates below the tree conflict
> -    'A/D'               : Item(status='  ', treeconflict='A'),
> -    'A/D/gamma'         : Item(status='  ', treeconflict='A'),
> -    'A/D/G'             : Item(status='  ', treeconflict='A'),
> -    'A/D/G/rho'         : Item(status='  ', treeconflict='A'),
> -    'A/D/G/pi'          : Item(status='  ', treeconflict='A'),
> -    'A/D/G/tau'         : Item(status='  ', treeconflict='A'),
> -    'A/D/H'             : Item(status='  ', treeconflict='A'),
> -    'A/D/H/chi'         : Item(status='  ', treeconflict='A'),
> -    'A/D/H/omega'       : Item(status='  ', treeconflict='A'),
> -    'A/D/H/psi'         : Item(status='  ', treeconflict='A'),
> -    'A/B'               : Item(status='  ', treeconflict='A'),
> -    'A/B/E'             : Item(status='  ', treeconflict='A'),
> -    'A/B/E/beta'        : Item(status='  ', treeconflict='A'),
> -    'A/B/E/alpha'       : Item(status='  ', treeconflict='A'),
> -    'A/B/F'             : Item(status='  ', treeconflict='A'),
> -    'A/B/lambda'        : Item(status='  ', treeconflict='A'),
> -    'A/C'               : Item(status='  ', treeconflict='A'),
> -    'A/mu'              : Item(status='  ', treeconflict='A'),
> -  })
> +  expected_err = ".*Rejecting checkout of non-empty repository into non-empty directory.*"
>  
>    expected_disk = svntest.main.greek_state.copy()
> -  expected_disk.remove('A/B', 'A/B/E', 'A/B/E/beta', 'A/B/E/alpha', 'A/B/F',
> -    'A/B/lambda', 'A/D', 'A/D/G', 'A/D/G/rho', 'A/D/G/pi', 'A/D/G/tau',
> -    'A/D/H', 'A/D/H/psi', 'A/D/H/omega', 'A/D/H/chi', 'A/D/gamma', 'A/C')
> -
> -  actions.run_and_verify_checkout(url, wc_dir, expected_output,
> -                                  expected_disk)
> -
> -  # svn status
> -  expected_status = actions.get_virginal_state(wc_dir, 1)
> -  # A and iota are tree conflicted and obstructed
> -  expected_status.tweak('A', 'iota', status='D ', wc_rev=1,
> -                        treeconflict='C')
> -
> -  expected_status.tweak('A/D', 'A/D/G', 'A/D/G/rho', 'A/D/G/pi', 'A/D/G/tau',
> -    'A/D/H', 'A/D/H/chi', 'A/D/H/omega', 'A/D/H/psi', 'A/D/gamma', 'A/B',
> -    'A/B/E', 'A/B/E/beta', 'A/B/E/alpha', 'A/B/F', 'A/B/lambda', 'A/C',
> -    status='D ')
> -  # A/mu exists on disk, but is deleted
> -  expected_status.tweak('A/mu', status='D ')
> -
> -  actions.run_and_verify_unquiet_status(wc_dir, expected_status)
> -
> -
> -  # Now see to it that we can recover from the obstructions.
> -  # rm -rf A iota
> -  svntest.main.safe_rmtree( os.path.join(wc_dir, 'A') )
> -  os.remove( os.path.join(wc_dir, 'iota') )
> -
> -
> -  svntest.main.run_svn(None, 'revert', '-R', os.path.join(wc_dir, 'A'),
> -                       os.path.join(wc_dir, 'iota'))
> -
> -  # svn up
> -  expected_output = svntest.wc.State(wc_dir, {
> -  })
> -
> -  expected_disk = svntest.main.greek_state.copy()
> -
> -  expected_status = actions.get_virginal_state(wc_dir, 1)
> -
> -  actions.run_and_verify_update(wc_dir, expected_output, expected_disk,
> -                                expected_status,)
> -
>  
> +  actions.run_and_verify_svn(None, expected_err, "co", url, wc_dir)
>  
>  #----------------------------------------------------------------------
>  




Re: [PATCH v3] Reject checkouts to existing directory

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Patrick Steinhardt wrote on Mon, Nov 21, 2016 at 11:57:01 +0100:
> On Sat, Nov 19, 2016 at 05:53:26AM +0000, Daniel Shahaf wrote:
> > Patrick Steinhardt wrote on Wed, Nov 09, 2016 at 12:54:08 +0100:
> > > +  if (kind != svn_node_dir)
> > > +    return svn_error_createf(
> > > +        SVN_ERR_ILLEGAL_TARGET,
> > > +        NULL,
> > > +        _("Rejecting checkout to existing %s '%s'"),
> > > +        svn_node_kind_to_word(kind),
> > 
> > Would using 'kind' through %s cause a problem for translators?
> > It would result in a nominative English noun being used in a dative
> > local-language context.
> 
> Is there any recommendation what to use instead here?

I think best practice in translations is to repeat the full string:

    "Rejecting checkout to existing file '%s'"
    "Rejecting checkout to existing directory '%s'"

> > > +  svn_pool_clear(scratch_pool);
> > > +
> > 
> > This pool usage is not idiomatic.
> > 
> > Does this function (verify_checkout_target()) return a value to its
> > caller?  If it does, then let its signature have two pools.  If it does
> > not, then let it take a single 'apr_pool_t *scratch_pool' argument, use
> > that pool for both pool arguments of callees, and leave that pool for
> > its (verify_checkout_target()'s) caller to clear.  (In this case,
> > probably at apr_terminate().)
> 
> No, `verify_checkout_target()` does not return a value, it only
> indicates an error if the target is not valid. The reason why I
> created the scratch pool was that I have to pass both a result
> and scratch pool to `svn_client_get_repos_root`, where I wasn't
> exactly sure which semantics one may expect when passing the same
> pool as both result and scratch pool to a callee.
> 
> So if I understand it correctly I can expect
> `svn_client_get_repos_root` to behave correct in that case, so I
> can pass a scratch pool twice?

Yes, you can, and it'll work.

The rule is that a 'scratch_pool' argument may be cleared by the
*caller* at any time after the call.  Callees don't clear pools passed
to them by callers.

> Thanks for your review.

You're welcome.

Cheers,

Daniel

Re: [PATCH v3] Reject checkouts to existing directory

Posted by Patrick Steinhardt <ps...@pks.im>.
On Sat, Nov 19, 2016 at 05:53:26AM +0000, Daniel Shahaf wrote:
> Patrick Steinhardt wrote on Wed, Nov 09, 2016 at 12:54:08 +0100:
> > +++ b/subversion/svn/checkout-cmd.c
> > @@ -61,6 +61,144 @@
> >    Is this the same as CVS?  Does it matter if it is not?
> >  */
> >  
> > +static svn_error_t *
> > +listing_cb(void *baton,
> > +           const char *path,
> > +           const svn_dirent_t *dirent,
> > +           const svn_lock_t *lock,
> > +           const char *abs_path,
> > +           const char *external_parent_url,
> > +           const char *external_target,
> > +           apr_pool_t *scratch_pool)
> > +{
> > +  size_t *count = (size_t *) baton;
> > +
> > +  (*count)++;
> > +
> 
> Would it make sense, at this point, to raise an error
> (SVN_ERR_CEASE_INVOCATION or SVN_ERR_ITER_BREAK) if count>1, so as to
> not iterate the remaining dirents?

Yes, it does make sense indeed.

> 
> > +  return SVN_NO_ERROR;
> > +}
> > +
> > +/* Check if the target directory which should be checked out to
> > + * is a valid target directory. A target directory is valid if we
> > + * are sure that a checkout to the directory will not create any
> > + * unexpected situations for the user. This is the case if one of
> > + * the following criteria is true:
> > + *
> > + * - the target directory does not exist
> > + * - the target directory is empty
> > + * - the target directory is the same repository with the same
> > + *   relative URL as the one that is to be checked out (e.g. we
> > + *   resume a checkout)
> > + * - the repository that is to be checked out is empty
> > + */
> > +static svn_error_t *
> > +verify_checkout_target(const char *repo_url,
> > +                       const char *target_dir,
> > +                       const svn_opt_revision_t *peg_revision,
> > +                       const svn_opt_revision_t *revision,
> > +                       svn_client_ctx_t *ctx,
> > +                       apr_pool_t *pool)
> > +{
> > +  svn_node_kind_t kind;
> > +  apr_pool_t *scratch_pool;
> > +  const char *absolute_path;
> > +  size_t count = 0;
> > +  svn_wc_status3_t *status;
> > +  svn_error_t *error;
> 
> Variables of type 'svn_error_t *' are conventionally named 'err'.

Okay.

> > +  svn_boolean_t empty;
> > +
> > +  scratch_pool = svn_pool_create(pool);
> > +
> > +  SVN_ERR(svn_dirent_get_absolute(&absolute_path,
> > +                                  target_dir,
> > +                                  pool));
> > +
> > +  SVN_ERR(svn_io_check_path(absolute_path,
> > +                            &kind,
> > +                            pool));
> > +
> > +  /* If the directory does not exist yet, it will be created
> > +   * anew and is thus considered valid. */
> > +  if (kind == svn_node_none)
> > +    return SVN_NO_ERROR;
> > +
> > +  /* Checking out to a file or symlink cannot work. */
> > +  if (kind != svn_node_dir)
> > +    return svn_error_createf(
> > +        SVN_ERR_ILLEGAL_TARGET,
> > +        NULL,
> > +        _("Rejecting checkout to existing %s '%s'"),
> > +        svn_node_kind_to_word(kind),
> 
> Would using 'kind' through %s cause a problem for translators?
> It would result in a nominative English noun being used in a dative
> local-language context.

Is there any recommendation what to use instead here?

> > +        absolute_path);
> 
> Paths in error messages should be converted to local style
> (with svn_dirent_local_style()).

Okay.

> > +  error = svn_wc_status3(&status,
> > +                         ctx->wc_ctx,
> > +                         absolute_path,
> > +                         pool,
> > +                         scratch_pool);
> > +
> > +  /* If the remote repository UUID and working copy UUID are the
> > +   * same and if the relative paths inside the repository match,
> > +   * we assume that the command is a resumed checkout. */
> > +  if (error == SVN_NO_ERROR)
> > +    {
> > +      if (repo_relpath
> > +          && !strcmp(status->repos_uuid, repo_uuid)
> > +          && !strcmp(status->repos_relpath, repo_relpath))
> > +        return SVN_NO_ERROR;
> > +    }
> > +  else
> > +    {
> > +      svn_error_clear(error);
> 
> Shouldn't you check for a specific error code here?  I.e.,
> 
>    if (err == SVN_NO_ERROR)
>        ⋮
>    else if (err && err->apr_err == SVN_ERR_WC_NOT_WORKING_COPY)
>        svn_error_clear();
>    else
>        SVN_ERR(err)
> 
> (possibly a few more error codes in the second branch)?
> 
> To catch the case that ABSOLUTE_PATH denotes a wedged working copy,
> for example.

Yeah, I was reasoning about the same thing here while I wrote
this. It probably makes sense to do.

> > +  SVN_ERR(svn_client_list4(repo_url,
> > +                           pool));
> > +
> > +  /* If the remote respotiory has more than one file (that is, it
> 
> Typo for "repository".
> 
> > +   * is not an empty root directory only), we refuse to check out
> > +   * into a non-empty target directory. Otherwise Subversion
> > +   * would create tree conflicts. */
> > +  if (count > 1)
> > +    return svn_error_createf(
> > +      SVN_ERR_ILLEGAL_TARGET,
> > +      NULL,
> > +      _("Rejecting checkout of non-empty repository into non-empty directory '%s'"),
> 
> I think it would be more accurate to s/repository/repository URL/
> or s/repository/repository directory/.
> 
> > +      absolute_path);
> 
> Again svn_dirent_local_style().
> 
> > +
> > +  svn_pool_clear(scratch_pool);
> > +
> 
> This pool usage is not idiomatic.
> 
> Does this function (verify_checkout_target()) return a value to its
> caller?  If it does, then let its signature have two pools.  If it does
> not, then let it take a single 'apr_pool_t *scratch_pool' argument, use
> that pool for both pool arguments of callees, and leave that pool for
> its (verify_checkout_target()'s) caller to clear.  (In this case,
> probably at apr_terminate().)

No, `verify_checkout_target()` does not return a value, it only
indicates an error if the target is not valid. The reason why I
created the scratch pool was that I have to pass both a result
and scratch pool to `svn_client_get_repos_root`, where I wasn't
exactly sure which semantics one may expect when passing the same
pool as both result and scratch pool to a callee.

So if I understand it correctly I can expect
`svn_client_get_repos_root` to behave correct in that case, so I
can pass a scratch pool twice?

> The rationale for this is (quoting HACKING):
> 
>     Functions should not create/destroy pools for their operation; they
>     should use a pool provided by the caller. Again, the caller knows
>     more about how the function will be used, how often, how many times,
>     etc. thus, it should be in charge of the function's memory usage.
> 
> >
> 
> As to the actual business logic, it seems okay to me (meaning I'm +0),
> however, I'll leave it for others to review it in detail.
> 
> Thanks for the detailed comments and log message.
> 
> Cheers,
> 
> Daniel
> 
> P.S. Two more minor points: One, listing_cb() might grow a "This
> implements the `svn_client_list_func2_t' interface." docstring.  Two,
> I found the variable name "absolute_path" somewhat opaque; I would
> suggest a name based on the semantics, rather than the data type, e.g.,
> "target" or "candidate_target_directory".

Thanks for your review.

Regards
Patrick

Re: [PATCH v3] Reject checkouts to existing directory

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Patrick Steinhardt wrote on Wed, Nov 09, 2016 at 12:54:08 +0100:
> +++ b/subversion/svn/checkout-cmd.c
> @@ -61,6 +61,144 @@
>    Is this the same as CVS?  Does it matter if it is not?
>  */
>  
> +static svn_error_t *
> +listing_cb(void *baton,
> +           const char *path,
> +           const svn_dirent_t *dirent,
> +           const svn_lock_t *lock,
> +           const char *abs_path,
> +           const char *external_parent_url,
> +           const char *external_target,
> +           apr_pool_t *scratch_pool)
> +{
> +  size_t *count = (size_t *) baton;
> +
> +  (*count)++;
> +

Would it make sense, at this point, to raise an error
(SVN_ERR_CEASE_INVOCATION or SVN_ERR_ITER_BREAK) if count>1, so as to
not iterate the remaining dirents?

> +  return SVN_NO_ERROR;
> +}
> +
> +/* Check if the target directory which should be checked out to
> + * is a valid target directory. A target directory is valid if we
> + * are sure that a checkout to the directory will not create any
> + * unexpected situations for the user. This is the case if one of
> + * the following criteria is true:
> + *
> + * - the target directory does not exist
> + * - the target directory is empty
> + * - the target directory is the same repository with the same
> + *   relative URL as the one that is to be checked out (e.g. we
> + *   resume a checkout)
> + * - the repository that is to be checked out is empty
> + */
> +static svn_error_t *
> +verify_checkout_target(const char *repo_url,
> +                       const char *target_dir,
> +                       const svn_opt_revision_t *peg_revision,
> +                       const svn_opt_revision_t *revision,
> +                       svn_client_ctx_t *ctx,
> +                       apr_pool_t *pool)
> +{
> +  svn_node_kind_t kind;
> +  apr_pool_t *scratch_pool;
> +  const char *absolute_path;
> +  size_t count = 0;
> +  svn_wc_status3_t *status;
> +  svn_error_t *error;

Variables of type 'svn_error_t *' are conventionally named 'err'.

> +  svn_boolean_t empty;
> +
> +  scratch_pool = svn_pool_create(pool);
> +
> +  SVN_ERR(svn_dirent_get_absolute(&absolute_path,
> +                                  target_dir,
> +                                  pool));
> +
> +  SVN_ERR(svn_io_check_path(absolute_path,
> +                            &kind,
> +                            pool));
> +
> +  /* If the directory does not exist yet, it will be created
> +   * anew and is thus considered valid. */
> +  if (kind == svn_node_none)
> +    return SVN_NO_ERROR;
> +
> +  /* Checking out to a file or symlink cannot work. */
> +  if (kind != svn_node_dir)
> +    return svn_error_createf(
> +        SVN_ERR_ILLEGAL_TARGET,
> +        NULL,
> +        _("Rejecting checkout to existing %s '%s'"),
> +        svn_node_kind_to_word(kind),

Would using 'kind' through %s cause a problem for translators?
It would result in a nominative English noun being used in a dative
local-language context.

> +        absolute_path);

Paths in error messages should be converted to local style
(with svn_dirent_local_style()).

> +  error = svn_wc_status3(&status,
> +                         ctx->wc_ctx,
> +                         absolute_path,
> +                         pool,
> +                         scratch_pool);
> +
> +  /* If the remote repository UUID and working copy UUID are the
> +   * same and if the relative paths inside the repository match,
> +   * we assume that the command is a resumed checkout. */
> +  if (error == SVN_NO_ERROR)
> +    {
> +      if (repo_relpath
> +          && !strcmp(status->repos_uuid, repo_uuid)
> +          && !strcmp(status->repos_relpath, repo_relpath))
> +        return SVN_NO_ERROR;
> +    }
> +  else
> +    {
> +      svn_error_clear(error);

Shouldn't you check for a specific error code here?  I.e.,

   if (err == SVN_NO_ERROR)
       \u22ee
   else if (err && err->apr_err == SVN_ERR_WC_NOT_WORKING_COPY)
       svn_error_clear();
   else
       SVN_ERR(err)

(possibly a few more error codes in the second branch)?

To catch the case that ABSOLUTE_PATH denotes a wedged working copy,
for example.

> +  SVN_ERR(svn_client_list4(repo_url,
> +                           pool));
> +
> +  /* If the remote respotiory has more than one file (that is, it

Typo for "repository".

> +   * is not an empty root directory only), we refuse to check out
> +   * into a non-empty target directory. Otherwise Subversion
> +   * would create tree conflicts. */
> +  if (count > 1)
> +    return svn_error_createf(
> +      SVN_ERR_ILLEGAL_TARGET,
> +      NULL,
> +      _("Rejecting checkout of non-empty repository into non-empty directory '%s'"),

I think it would be more accurate to s/repository/repository URL/
or s/repository/repository directory/.

> +      absolute_path);

Again svn_dirent_local_style().

> +
> +  svn_pool_clear(scratch_pool);
> +

This pool usage is not idiomatic.

Does this function (verify_checkout_target()) return a value to its
caller?  If it does, then let its signature have two pools.  If it does
not, then let it take a single 'apr_pool_t *scratch_pool' argument, use
that pool for both pool arguments of callees, and leave that pool for
its (verify_checkout_target()'s) caller to clear.  (In this case,
probably at apr_terminate().)

The rationale for this is (quoting HACKING):

    Functions should not create/destroy pools for their operation; they
    should use a pool provided by the caller. Again, the caller knows
    more about how the function will be used, how often, how many times,
    etc. thus, it should be in charge of the function's memory usage.

>

As to the actual business logic, it seems okay to me (meaning I'm +0),
however, I'll leave it for others to review it in detail.

Thanks for the detailed comments and log message.

Cheers,

Daniel

P.S. Two more minor points: One, listing_cb() might grow a "This
implements the `svn_client_list_func2_t' interface." docstring.  Two,
I found the variable name "absolute_path" somewhat opaque; I would
suggest a name based on the semantics, rather than the data type, e.g.,
"target" or "candidate_target_directory".