You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Chia-liang Kao <cl...@clkao.org> on 2003/06/21 04:02:04 UTC

"svn commit" performance

Hi,

recently I noticed the svn commit performs poorly before
bringing up the editor. 

if i do a "svn co file:///repo/trunk svn.co", edit a file
in a deep directory, "svn diff" gives me the result immediately,
while it takes 3 or 4 seconds for "svn commit" to bring up the
editor. this is still acceptable.

but if i do "svn co file:///repo svn.co.full", edit the same file
in under trunk/'s deep direcotry. "svn diff trunk" still gives me
the result immediately. but now "svn commit trunk" takes about 7
seconds to fire the editor. manually applying 6301:6302,6312:6313
for 0.24.2 takes 10 seconds even.

I think maybe it's because the the commit code needs to traverse
to parent directory to find the root of wc, but other operation
should be the same as "svn diff".

it seems the number of tags made the difference, since i didn't 
notice this lag when i had no tags. but i think "svn commit trunk"
should be irrevelent to tags/ ?

there are ~2000 files and around 10MB in trunk, ~50 tags, 5 branches.

Cheers,
CLK

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by kf...@collab.net.
Chia-liang Kao <cl...@clkao.org> writes:
> I am thinking about the semantic having a repository checked out
> to a subdirectory of another repository's wc, since i made a
> patch for #695 whcih does some checks for the parents of the
> target path of checkout.
> 
> # svn co http://server1/path/project1 /tmp/project1
> # svn co http://server2/path/project2 /tmp/project1/lib/project2
> 
> maybe we could add svn:externals if we find that project2
> is being checked out somewhere inside project1's wc?

Ick, no :-).  svn:externals is confusing enough already, without it
being automatically applied behind the user's back.

I'm not saying I have a better idea, just that issue #695 is
Post-1.0...

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
I am thinking about the semantic having a repository checked out
to a subdirectory of another repository's wc, since i made a
patch for #695 whcih does some checks for the parents of the
target path of checkout.

# svn co http://server1/path/project1 /tmp/project1
# svn co http://server2/path/project2 /tmp/project1/lib/project2

maybe we could add svn:externals if we find that project2
is being checked out somewhere inside project1's wc?

then have a svn_wc_condense_targets that does not condense 
subdirectories that are not from the sam repo.

anyway, maybe the commit tree lock should be fixed first.

Cheers,
CLK

On Sat, Jun 21, 2003 at 12:10:05PM -0500, cmpilato@collab.net wrote:
> >    'svn commit foo/bar foo/bar/zig/zag'
> > 
> > where foo/bar/zig/zag is a child of foo/bar, but comes from a
> > different working copy.  I believe that in the past Subversion
> > silently dropped the second argument and only committed foo/bar.
> 
> Ah.  Well, I can tell you why *that* happens.  Look no further than
> svn_path_condense_targets(), which is going to drop foo/bar/zig/zag as
> redundant in light of foo/bar and recursivity.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by cm...@collab.net.
Greg Stein <gs...@lyra.org> writes:

> Sounds like we have to stop the "parse path/url arguments without context"
> stuff that's been going on. Seems like we run into a bug every now and then
> where we've been trying to handle some arguments without truly knowing what
> the heck they represent.
> 
> Assuming some level of agreement, then the next question is: how to get the
> right amount of context into the path condensing? Before each path can be
> condensed, it would almost seem there is a question: can I combine >these<
> two paths together? Some kind of a callback would have to say "woah! no,
> those are separate WCs"
> 
> Of course, if our committing code traversed mixed-repos WCs, then we could
> go ahead and allow the pathes to be condensed without worry.

Actually, my thinking led the opposite direction -- let's just stop
altogether removing redundancy up front from the set of input paths.
If we instead teach our tool to quickly recognize redundant targets as
we are processing them deep in the code, I think we'll be in much
better shape.  

We can still do things to optimize this.  For example, for commits
(and other operations where the order of targets doesn't matter) we
can sort the targets in a depth-first order.  This way the largest
portions of the WCs get locked up front, which makes redundant input
in those working copies easy to spot (we'll be able to get the
adm_access baton from an existing set).

Anyway, just spouting off ideas with no code to back them yet.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Greg Stein <gs...@lyra.org>.
On Sat, Jun 21, 2003 at 12:10:05PM -0500, cmpilato@collab.net wrote:
> Philip Martin <ph...@codematters.co.uk> writes:
> > I added a the regression test commit_multiple_wc.  It's basically
> > 
> >    'svn commit foo/bar foo/bar/zig/zag'
> > 
> > where foo/bar/zig/zag is a child of foo/bar, but comes from a
> > different working copy.  I believe that in the past Subversion
> > silently dropped the second argument and only committed foo/bar.
> 
> Ah.  Well, I can tell you why *that* happens.  Look no further than
> svn_path_condense_targets(), which is going to drop foo/bar/zig/zag as
> redundant in light of foo/bar and recursivity.
> 
> Hm.  So your solution did the job, but at the cost of desired
> functionality.
> 
> I will think on these things.

Sounds like we have to stop the "parse path/url arguments without context"
stuff that's been going on. Seems like we run into a bug every now and then
where we've been trying to handle some arguments without truly knowing what
the heck they represent.

Assuming some level of agreement, then the next question is: how to get the
right amount of context into the path condensing? Before each path can be
condensed, it would almost seem there is a question: can I combine >these<
two paths together? Some kind of a callback would have to say "woah! no,
those are separate WCs"

Of course, if our committing code traversed mixed-repos WCs, then we could
go ahead and allow the pathes to be condensed without worry.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by cm...@collab.net.
Philip Martin <ph...@codematters.co.uk> writes:

> cmpilato@collab.net writes:
> 
> > Hm.  So your solution did the job, but at the cost of desired
> > functionality.
> 
> It was a stop-gap which was very simple to implement.  As I recall,
> when the problem came up we agreed that an error was better than
> simply ignoring one of the targets.

You know, I realize only now how cold my last mail probably read.  Not
intended -- I'm not criticizing the change you made.  Just didn't
understand it.  I know you don't code without purpose.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Philip Martin <ph...@codematters.co.uk>.
cmpilato@collab.net writes:

> Hm.  So your solution did the job, but at the cost of desired
> functionality.

It was a stop-gap which was very simple to implement.  As I recall,
when the problem came up we agreed that an error was better than
simply ignoring one of the targets.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by cm...@collab.net.
Philip Martin <ph...@codematters.co.uk> writes:

> I added a the regression test commit_multiple_wc.  It's basically
> 
>    'svn commit foo/bar foo/bar/zig/zag'
> 
> where foo/bar/zig/zag is a child of foo/bar, but comes from a
> different working copy.  I believe that in the past Subversion
> silently dropped the second argument and only committed foo/bar.

Ah.  Well, I can tell you why *that* happens.  Look no further than
svn_path_condense_targets(), which is going to drop foo/bar/zig/zag as
redundant in light of foo/bar and recursivity.

Hm.  So your solution did the job, but at the cost of desired
functionality.

I will think on these things.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Philip Martin <ph...@codematters.co.uk>.
cmpilato@collab.net writes:

>   +  /* One day we might support committing from multiple working copies, but
>   +     we don't yet.  This check ensures that we don't silently commit a
>   +     subset of the targets */
>   +  for (i = 0; i < targets->nelts; ++i)
>   +    {
>   +      svn_wc_adm_access_t *adm_access;
>   +      const char *target;
>   +          SVN_ERR (svn_path_get_absolute (&target,
>   +                                          ((const char **)targets->elts)[i],
>   +                                          pool));
>   +      SVN_ERR_W (svn_wc_adm_probe_retrieve (&adm_access, base_dir_access,
>   +                                            target, pool),
>   +                 "Are all the targets part of the same working copy?");
>   +    }
>   +
>
> Using my new locking method, I don't get this kind of sanity check for
> free any more.  But then, I don't know what prompted him to make this
> change, since it's perfectly okay for 'svn commit' to commit from
> multiple working copies (by my design, in fact).  The commit code is
> driven off of URLs, not on-disk paths, so perhaps Philip can page back
> through the annals of time and tell me what was bugging him when he
> added this code?

I added a the regression test commit_multiple_wc.  It's basically

   'svn commit foo/bar foo/bar/zig/zag'

where foo/bar/zig/zag is a child of foo/bar, but comes from a
different working copy.  I believe that in the past Subversion
silently dropped the second argument and only committed foo/bar.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
On Fri, Jul 11, 2003 at 12:46:06AM +0100, Philip Martin wrote:
> > So is it safe for an access baton just to ignore the already locked
> > descendant items for the current tree locking?
> 
> Don't ignore them, but use them if they are already locked.  In most
> places the wc code is written so that it "knows" whether a directory
> is locked or not, and errors occur if it attempts to lock something
> twice or if it attempts use a lock that does not exist.  I think this
> is the correct behaviour in general, but for this particular function
> I propose to relax it.

in this case i think the do_open function needs to have another flag
for accumulation. also need to have new access type saying this
directory is tree-locked.

so if /foo/bar/baz is locked (access type write_lock), accumulative tree
locking of /foo/bar will go thru baz so it also locks /foo/bar/baz/deep/dir.

and if /foo/bar/baz is tree locked, accumulative tree locking of /foo/bar
will save the traversal time for locking /foo/bar/baz and locks other items.

If this is the desired behaviour, I will file an issue and start working
on this.

Cheers,
CLK

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Philip Martin <ph...@codematters.co.uk>.
Chia-liang Kao <cl...@clkao.org> writes:

> So is it safe for an access baton just to ignore the already locked
> descendant items for the current tree locking?

Don't ignore them, but use them if they are already locked.  In most
places the wc code is written so that it "knows" whether a directory
is locked or not, and errors occur if it attempts to lock something
twice or if it attempts use a lock that does not exist.  I think this
is the correct behaviour in general, but for this particular function
I propose to relax it.

> this would assume that
> locks are shared within an access batons, which i'm not yet sure what
> the design was.

There is a one-to-one relation between access batons and locks (for
write access anyway).

> But It shouldn't be hard if it's the case. Otherwise
> there should be some other mechanism resolving this? 

I don't understand the question.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
So is it safe for an access baton just to ignore the already locked
descendant items for the current tree locking? this would assume that
locks are shared within an access batons, which i'm not yet sure what
the design was. But It shouldn't be hard if it's the case. Otherwise
there should be some other mechanism resolving this? 

And could we have the fix in before fixing the api discussed here?
(after the patch got some general cleanups of course)

The commit performance problem is really annoying for running vcp 
conversion from cvs to svn.

On Thu, Jul 10, 2003 at 11:51:44PM +0100, Philip Martin wrote:
> I feel that having a requirement to sort the paths first means the
> that the svn_wc_adm_open_descendant API is broken, the user has to
> know the entire history of the access baton, which conflicts with the
> "open if not already open" behaviour of this function.  It really
> would be better to get it to do the right thing, possibly by
> refactoring the tree_lock stuff from do_open.  I suspect that doing
> this would also get rid of the closing and re-opening of the basedir
> problem.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Philip Martin <ph...@codematters.co.uk>.
Chia-liang Kao <cl...@clkao.org> writes:

> the problem (and the sort) here is that if foo/bar/baz is locked first,
> subsequent tree locking of foo/bar would fail since baz is already locked.

Ah, yes that would be a problem. In fact I mentioned the problem of
already locked but non-recursive baton in my last mail but I didn't
realize how serious it was, it really does need to be fixed.

I feel that having a requirement to sort the paths first means the
that the svn_wc_adm_open_descendant API is broken, the user has to
know the entire history of the access baton, which conflicts with the
"open if not already open" behaviour of this function.  It really
would be better to get it to do the right thing, possibly by
refactoring the tree_lock stuff from do_open.  I suspect that doing
this would also get rid of the closing and re-opening of the basedir
problem.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
On Thu, Jul 10, 2003 at 04:13:54PM +0100, Philip Martin wrote:
> > +      SVN_ERR (svn_wc_adm_probe_try (&parent_access, associated,
> > +                                     parent, FALSE, FALSE, pool));
> 
> No, don't use the probe functions, just use the information in the
> entries, call apr_hash_get to see if access batons exist, and call
> do_open to open any missing access batons.

the problem (and the sort) here is that if foo/bar/baz is locked first,
subsequent tree locking of foo/bar would fail since baz is already locked.
or am i'm missing something in the wc api?

and thanks for reviewing! i'll fix the rest later.

Cheers,
CLK

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Sander Roobol <ph...@wanadoo.nl>.
On Wed, Jul 23, 2003 at 01:07:06PM -0400, Paul Lussier wrote:
> 
> Hi,
> 
> What's the status of this patch?  Are you off re-inventing it, or 
> should it be made an issue?

No response. Filed as issue #1452:
  http://subversion.tigris.org/issues/show_bug.cgi?id=1452

Sander

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Philip Martin <ph...@codematters.co.uk>.
Chia-liang Kao <cl...@clkao.org> writes:

> Index: libsvn_client/commit.c
> ===================================================================
> --- libsvn_client/commit.c	(revision 6432)
> +++ libsvn_client/commit.c	(working copy)
> @@ -39,6 +39,7 @@
>  #include "svn_io.h"
>  #include "svn_md5.h"
>  #include "svn_time.h"
> +#include "svn_sorts.h"
>  
>  #include "client.h"
>  
> @@ -812,7 +813,7 @@
>    apr_array_header_t *rel_targets;
>    apr_hash_t *committables, *tempfiles = NULL;
>    svn_wc_adm_access_t *base_dir_access;
> -  apr_array_header_t *commit_items;
> +  apr_array_header_t *commit_items, *lock_targets;
>    svn_error_t *cmt_err = SVN_NO_ERROR, *unlock_err = SVN_NO_ERROR;
>    svn_error_t *bump_err = SVN_NO_ERROR, *cleanup_err = SVN_NO_ERROR;
>    svn_boolean_t commit_in_progress = FALSE;
> @@ -862,22 +863,55 @@
>          }
>      }
>  
> -  SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir, TRUE, TRUE,
> +  /* Extract the path of targets */
> +  lock_targets = apr_array_make (pool, targets->nelts, 
> +				 sizeof (const char *));
> +
> +  for (i = 0; i < targets->nelts; ++i)
> +    {
> +      const char *target;
> +      svn_node_kind_t kind;
> +
> +      SVN_ERR (svn_path_get_absolute (&target,
> +                                      ((const char **)targets->elts)[i],
> +                                      pool));
> +
> +      SVN_ERR (svn_io_check_path (target, &kind, pool));

That's not the way I'd do it.  I would arrange for
svn_wc_adm_open_descendant to handle file targets.

> +      if (kind != svn_node_dir)
> +        target = svn_path_dirname (target, pool);
> +
> +      *((const char **) apr_array_push (lock_targets)) = target;
> +    }
> +
> +  SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir, TRUE, FALSE,
>                              pool));
>  
> +  /* Sort the targets so parents will be locked first, descendants will
> +     retrieve from parents. Also if base_dir is one of the lock_targets,
> +     re-lock it with treelock */

I don't think it is necessary to lock the parents first,
svn_wc_adm_open_descendant should handle targets in any order, so
there is no need to sort the targets.

The second sentence is in the wrong place, you don't do that until
later on.  However it would be better to do the check *before* opening
the base_dir, then you could open it recursively if required and there
would be no need to close and re-open it later.

> +
> +  qsort (lock_targets->elts, lock_targets->nelts, lock_targets->elt_size, 
> +         svn_sort_compare_paths);
> +
>    /* One day we might support committing from multiple working copies, but
>       we don't yet.  This check ensures that we don't silently commit a
>       subset of the targets */
> -  for (i = 0; i < targets->nelts; ++i)
> +
> +  for (i = 0; i < lock_targets->nelts; ++i)
>      {
> +      const char *target = APR_ARRAY_IDX (lock_targets, i, const char *);
>        svn_wc_adm_access_t *adm_access;
> -      const char *target;
> -          SVN_ERR (svn_path_get_absolute (&target,
> -                                          ((const char **)targets->elts)[i],
> -                                          pool));
> -      SVN_ERR_W (svn_wc_adm_probe_retrieve (&adm_access, base_dir_access,
> -                                            target, pool),
> -                 "Are all the targets part of the same working copy?");
> +
> +      if (strcmp (target, base_dir) == 0)
> +        {
> +          svn_wc_adm_close (base_dir_access);
> +          SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir,
> +                                    TRUE, TRUE, pool));
> +        }
> +      else
> +	SVN_ERR_W (svn_wc_adm_open_descendant (&adm_access, base_dir_access,
> +                                               target, TRUE, TRUE, pool),
> +                  "Are all the targets part of the same working copy?");
>      }
>  
>    /* Crawl the working copy for commit items. */
> Index: include/svn_wc.h
> ===================================================================
> --- include/svn_wc.h	(revision 6432)
> +++ include/svn_wc.h	(working copy)
> @@ -127,6 +127,20 @@
>                                      svn_boolean_t tree_lock,
>                                      apr_pool_t *pool);
>  
> +/** Checks if @a path is in the same working copy as the adm_access
> + * @associated. Acts like @c svn_wc_adm_open if true, otherwise
> + * returns SVN_ERR_ENTRY_NOT_FOUND.
> + */
> +
> +svn_error_t *
> +svn_wc_adm_open_descendant (svn_wc_adm_access_t **adm_access,
> +                            svn_wc_adm_access_t *associated,
> +                            const char *path, 
> +                            svn_boolean_t write_lock,
> +                            svn_boolean_t tree_lock,
> +                            apr_pool_t *pool);
> +
> +
>  /** Return, in @a *adm_access, a pointer to an existing access baton associated
>   * with @a path.  @a path must be a directory that is locked as part of the 
>   * set containing the @a associated access baton.
> Index: libsvn_wc/lock.c
> ===================================================================
> --- libsvn_wc/lock.c	(revision 6432)
> +++ libsvn_wc/lock.c	(working copy)
> @@ -439,9 +439,55 @@
>  {
>    return do_open (adm_access, NULL, path, TRUE, FALSE, TRUE, pool);
>  }
> -     
>  
>  svn_error_t *
> +svn_wc_adm_open_descendant (svn_wc_adm_access_t **adm_access,
> +                            svn_wc_adm_access_t *associated,
> +                            const char *path, 
> +                            svn_boolean_t write_lock,
> +                            svn_boolean_t tree_lock,
> +                            apr_pool_t *pool)
> +{
> +  const char *parent, *relative;
> +  apr_array_header_t *paths;
> +  int i;
> +
> +  parent = apr_pstrdup(pool, associated->path);

Why is this duplicated?

> +  relative = svn_path_is_child (parent, path, pool);
> +
> +  if (relative == NULL)
> +    return svn_error_createf (SVN_ERR_ENTRY_NOT_FOUND, NULL,

I don't think this is really the correct error code.

> +                              "%s is not descendant of %s", path, parent);
> +
> +  paths = svn_path_decompose (relative, pool);
> +
> +  for (i = 0; i < paths->nelts; ++i)
> +    {
> +      const char *component = APR_ARRAY_IDX(paths, i, const char *);
> +      svn_wc_adm_access_t *parent_access;
> +      apr_hash_t *entries;
> +
> +      char *child = svn_path_join (parent, component, pool);
> +
> +      SVN_ERR (svn_wc_adm_probe_try (&parent_access, associated,
> +                                     parent, FALSE, FALSE, pool));

No, don't use the probe functions, just use the information in the
entries, call apr_hash_get to see if access batons exist, and call
do_open to open any missing access batons.

Look at my original algorithm again

  Get P the path for adm_access
  Get R the relative path of descendant and P
  Split R into components.
  For each component C in R
    Get the access baton for P
    If no access baton then error
    Get the entry for C in P
    If no entry for C then error
    If C is not a directory return
    Construct new P = P/C
    Get access baton for P
    If no access baton 
      Open new access baton for P
      If that fails then error

Note the step "If C is not a directory return" this will allow the
descendant path to be a file.  At this point, before the algorithm
returns, it should check that C is the last component of R, if not it
should return an error.

> +      SVN_ERR (svn_wc_entries_read (&entries, parent_access, FALSE, pool));
> +      if (apr_hash_get (entries, component, APR_HASH_KEY_STRING) == NULL)
> +        return svn_error_createf (SVN_ERR_ENTRY_NOT_FOUND, NULL,
> +       	                          "%s is not descendant of %s", path, parent);
> +
> +      parent = child;
> +    }
> +
> +  SVN_ERR (svn_wc_adm_probe_try (adm_access, associated, path,
> +       	                         write_lock, tree_lock, pool));
> +
> +  assert ((*adm_access)->type == svn_wc__adm_access_write_lock);

No, don't restrict this to write locks.  The technique, open the
parent non-recursively and then open selected children, may well come
to be used by read-only operations.

You are correct, there is a check missing from my algorithm: if the
step "Get access baton for P" gets an access baton and write_lock is
TRUE then check that the type is svn_wc__adm_access_write_lock.  This
should be an ordinary run-time check that returns an error, not an
assert.

I think there is (at least) one other problem.  If the access baton
for the descendant path is already open but it is not a full tree
lock, and the caller requested a tree lock, then there needs to be
some code to handle that.  For a first attempt at a solution we could
probably get away with not handling this case, but if we did handle it
then the logic in svn_client_commit to compare the base_dir with the
targets would not be needed.  Perhaps the if (tree_lock) stuff in
do_open needs to be factored into a separate function.

> +
> +  return SVN_NO_ERROR;
> +}

Ideally, svn_wc_adm_open_descendant would close any batons that it has
opened if it returns an error.  Thus the access baton set either gets
all the required batons, or doesn't change.  Look at the way do_open
uses a temporary hash when doing a tree lock.  Once again, we could
probably get away with a first attempt at a solution that ignores this
problem.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] "svn commit" performance

Posted by Paul Lussier <pl...@lanminds.com>.
Hi,

What's the status of this patch?  Are you off re-inventing it, or 
should it be made an issue?

Just checking.



-- 

Seeya,
Paul
--
Key fingerprint = 1660 FECC 5D21 D286 F853  E808 BB07 9239 53F1 28EE

	It may look like I'm just sitting here doing nothing,
   but I'm really actively waiting for all my problems to go away.

	 If you're not having fun, you're not doing it right!



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

[PATCH] "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
ok, so here's the patch. (I've run the basic and commit regression tests)

On Wed, Jul 09, 2003 at 04:24:34AM +0800, Chia-liang Kao wrote:
> ah, this looks cleaner than what I was doing with entry url
> and path checking. I'll probably svn revert my stuff. 

Treelock indivudual targets instead of the base dir in commit. This
fixes the case when you issue "svn commit dir", every directory under
cwd is treelocked. this makes it very slow when you have lots of things
outside dir.

* libsvn_subr/sorts.c include/svn_sort.h
  (svn_sort_compare_paths): New function.

* libsvn_client/commit.c
  (svn_client_commit): Do not treelock base_dir.  Iterate over the
    sorted targets and use svn_wc_adm_open_descendent to tree lock
    them.

* include/svn_wc.h
* subversion/libsvn_wc/lock.c
  (svn_wc_adm_open_descendent): New function.

Index: libsvn_subr/sorts.c
===================================================================
--- libsvn_subr/sorts.c	(revision 6432)
+++ libsvn_subr/sorts.c	(working copy)
@@ -83,7 +83,16 @@
   return a_rev < b_rev ? 1 : -1;
 }
 
+int
+svn_sort_compare_paths (const void *a, const void *b)
+{
+  const char *astr = *(const char **)a;
+  const char *bstr = *(const char **)b;
 
+  return svn_path_compare_paths (astr, bstr);
+}
+
+
 #ifndef apr_hash_sort_keys
 
 /* see svn_sorts.h for documentation */
Index: include/svn_sorts.h
===================================================================
--- include/svn_sorts.h	(revision 6432)
+++ include/svn_sorts.h	(working copy)
@@ -76,7 +76,12 @@
  */
 int svn_sort_compare_revisions (const void *a, const void *b);
 
+/** Helper function for sorting @c apr_array_header_t with paths
+ * into alphabetical order.
+ */
+int svn_sort_compare_paths (const void *a, const void *b);
 
+
 #ifndef apr_hash_sorted_keys
 /** Sort @a ht according to its keys, return an @c apr_array_header_t
  * containing @c svn_item_t structures holding those keys and values
Index: libsvn_client/commit.c
===================================================================
--- libsvn_client/commit.c	(revision 6432)
+++ libsvn_client/commit.c	(working copy)
@@ -39,6 +39,7 @@
 #include "svn_io.h"
 #include "svn_md5.h"
 #include "svn_time.h"
+#include "svn_sorts.h"
 
 #include "client.h"
 
@@ -812,7 +813,7 @@
   apr_array_header_t *rel_targets;
   apr_hash_t *committables, *tempfiles = NULL;
   svn_wc_adm_access_t *base_dir_access;
-  apr_array_header_t *commit_items;
+  apr_array_header_t *commit_items, *lock_targets;
   svn_error_t *cmt_err = SVN_NO_ERROR, *unlock_err = SVN_NO_ERROR;
   svn_error_t *bump_err = SVN_NO_ERROR, *cleanup_err = SVN_NO_ERROR;
   svn_boolean_t commit_in_progress = FALSE;
@@ -862,22 +863,55 @@
         }
     }
 
-  SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir, TRUE, TRUE,
+  /* Extract the path of targets */
+  lock_targets = apr_array_make (pool, targets->nelts, 
+				 sizeof (const char *));
+
+  for (i = 0; i < targets->nelts; ++i)
+    {
+      const char *target;
+      svn_node_kind_t kind;
+
+      SVN_ERR (svn_path_get_absolute (&target,
+                                      ((const char **)targets->elts)[i],
+                                      pool));
+
+      SVN_ERR (svn_io_check_path (target, &kind, pool));
+      if (kind != svn_node_dir)
+        target = svn_path_dirname (target, pool);
+
+      *((const char **) apr_array_push (lock_targets)) = target;
+    }
+
+  SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir, TRUE, FALSE,
                             pool));
 
+  /* Sort the targets so parents will be locked first, descendants will
+     retrieve from parents. Also if base_dir is one of the lock_targets,
+     re-lock it with treelock */
+
+  qsort (lock_targets->elts, lock_targets->nelts, lock_targets->elt_size, 
+         svn_sort_compare_paths);
+
   /* One day we might support committing from multiple working copies, but
      we don't yet.  This check ensures that we don't silently commit a
      subset of the targets */
-  for (i = 0; i < targets->nelts; ++i)
+
+  for (i = 0; i < lock_targets->nelts; ++i)
     {
+      const char *target = APR_ARRAY_IDX (lock_targets, i, const char *);
       svn_wc_adm_access_t *adm_access;
-      const char *target;
-          SVN_ERR (svn_path_get_absolute (&target,
-                                          ((const char **)targets->elts)[i],
-                                          pool));
-      SVN_ERR_W (svn_wc_adm_probe_retrieve (&adm_access, base_dir_access,
-                                            target, pool),
-                 "Are all the targets part of the same working copy?");
+
+      if (strcmp (target, base_dir) == 0)
+        {
+          svn_wc_adm_close (base_dir_access);
+          SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir,
+                                    TRUE, TRUE, pool));
+        }
+      else
+	SVN_ERR_W (svn_wc_adm_open_descendant (&adm_access, base_dir_access,
+                                               target, TRUE, TRUE, pool),
+                  "Are all the targets part of the same working copy?");
     }
 
   /* Crawl the working copy for commit items. */
Index: include/svn_wc.h
===================================================================
--- include/svn_wc.h	(revision 6432)
+++ include/svn_wc.h	(working copy)
@@ -127,6 +127,20 @@
                                     svn_boolean_t tree_lock,
                                     apr_pool_t *pool);
 
+/** Checks if @a path is in the same working copy as the adm_access
+ * @associated. Acts like @c svn_wc_adm_open if true, otherwise
+ * returns SVN_ERR_ENTRY_NOT_FOUND.
+ */
+
+svn_error_t *
+svn_wc_adm_open_descendant (svn_wc_adm_access_t **adm_access,
+                            svn_wc_adm_access_t *associated,
+                            const char *path, 
+                            svn_boolean_t write_lock,
+                            svn_boolean_t tree_lock,
+                            apr_pool_t *pool);
+
+
 /** Return, in @a *adm_access, a pointer to an existing access baton associated
  * with @a path.  @a path must be a directory that is locked as part of the 
  * set containing the @a associated access baton.
Index: libsvn_wc/lock.c
===================================================================
--- libsvn_wc/lock.c	(revision 6432)
+++ libsvn_wc/lock.c	(working copy)
@@ -439,9 +439,55 @@
 {
   return do_open (adm_access, NULL, path, TRUE, FALSE, TRUE, pool);
 }
-     
 
 svn_error_t *
+svn_wc_adm_open_descendant (svn_wc_adm_access_t **adm_access,
+                            svn_wc_adm_access_t *associated,
+                            const char *path, 
+                            svn_boolean_t write_lock,
+                            svn_boolean_t tree_lock,
+                            apr_pool_t *pool)
+{
+  const char *parent, *relative;
+  apr_array_header_t *paths;
+  int i;
+
+  parent = apr_pstrdup(pool, associated->path);
+  relative = svn_path_is_child (parent, path, pool);
+
+  if (relative == NULL)
+    return svn_error_createf (SVN_ERR_ENTRY_NOT_FOUND, NULL,
+                              "%s is not descendant of %s", path, parent);
+
+  paths = svn_path_decompose (relative, pool);
+
+  for (i = 0; i < paths->nelts; ++i)
+    {
+      const char *component = APR_ARRAY_IDX(paths, i, const char *);
+      svn_wc_adm_access_t *parent_access;
+      apr_hash_t *entries;
+
+      char *child = svn_path_join (parent, component, pool);
+
+      SVN_ERR (svn_wc_adm_probe_try (&parent_access, associated,
+                                     parent, FALSE, FALSE, pool));
+      SVN_ERR (svn_wc_entries_read (&entries, parent_access, FALSE, pool));
+      if (apr_hash_get (entries, component, APR_HASH_KEY_STRING) == NULL)
+        return svn_error_createf (SVN_ERR_ENTRY_NOT_FOUND, NULL,
+       	                          "%s is not descendant of %s", path, parent);
+
+      parent = child;
+    }
+
+  SVN_ERR (svn_wc_adm_probe_try (adm_access, associated, path,
+       	                         write_lock, tree_lock, pool));
+
+  assert ((*adm_access)->type == svn_wc__adm_access_write_lock);
+
+  return SVN_NO_ERROR;
+}
+
+svn_error_t *
 svn_wc_adm_probe_open (svn_wc_adm_access_t **adm_access,
                        svn_wc_adm_access_t *associated,
                        const char *path,

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
On Tue, Jul 08, 2003 at 09:00:02PM +0100, Philip Martin wrote:
> > so what we need to do is lock the grandparent non-recursively and 
> > each target recursively, also checks if all target are from the same
> > repository.
> 
> What's a "grandparent", how does it relate to a target?  Is it the
> common ancestor?

I menat the base_dir in svn_client_commit that is being treelocked, referred
as "grandfather directory" in the code. the treelock is what I am trying avoid.

> Hmm, it appears you only want to support a single working copy.  In
> that case I would avoid the probe functions and simply use the wc
> entries to identify directories, and simply generate some error if a
> directory is obstructed or missing.
> 
> How about a function like
> 
>   svn_error_t *
>   svn_wc_adm_open_descendant (svn_wc_adm_access_t *adm_access,
>                               const char *descendant, 
>                               svn_boolean_t write_lock,
>                               apr_pool_t *pool);

ah, this looks cleaner than what I was doing with entry url
and path checking. I'll probably svn revert my stuff. 

It'd be cool if you could do this easily since I'm still getting familiar 
with the api. :)

Cheers,
CLK

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Philip Martin <ph...@codematters.co.uk>.
Chia-liang Kao <cl...@clkao.org> writes:

> Ok, now I'm looking at this.
>
> so what we need to do is lock the grandparent non-recursively and 
> each target recursively, also checks if all target are from the same
> repository.

What's a "grandparent", how does it relate to a target?  Is it the
common ancestor?

> right now i'm using svn_wc_admin_probe_try to tree lock the target
> associated with grandparent access. and by checking if the url of the
> targets' entry we could tell if they are of the same wc.

I'm not clear what you are trying to achieve.  Do you want to support
commits from multiple working copies?  I believe the commit code was
originally intended to support commits from multiple working copies.
If you are going to rework this it may be simpler to go directly to
multiple working copy commits (although since those haven't worked for
some time I don't know what problems may arise).

> but I'm wondering if svn_wc_admin_probe_try should just fail when
> the new path and association are not of the same wc.

Hmm, it appears you only want to support a single working copy.  In
that case I would avoid the probe functions and simply use the wc
entries to identify directories, and simply generate some error if a
directory is obstructed or missing.

How about a function like

  svn_error_t *
  svn_wc_adm_open_descendant (svn_wc_adm_access_t *adm_access,
                              const char *descendant, 
                              svn_boolean_t write_lock,
                              apr_pool_t *pool);

  Get P the path for adm_access
  Get R the relative path of descendant and P
  Split R into components.
  For each component C in R
    Get the access baton for P
    If no access baton then error
    Get the entry for C in P
    If no entry for C then error
    If C is not a directory return
    Construct new P = P/C
    Get access baton for P
    If no access baton 
      Open new access baton for P
      If that fails then error

So given

   svn ci foo/xxx/yyy/alpha foo/xxx/zzz/beta foo/aaa/bbb/gamma

First open foo/ non-recursively, then svn_wc_adm_open_descendant each
of the targets.

This will fail if foo/ is not a working copy and foo/xxx/ and foo/aaa/
are two different working copies, but that's not a regression as the
current code will also fail in this case.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by cm...@collab.net.
Chia-liang Kao <cl...@clkao.org> writes:

> Ok, now I'm looking at this.
> 
> so what we need to do is lock the grandparent non-recursively and 
> each target recursively, also checks if all target are from the same
> repository.

This is what my patch did.  But, like you, I ran into issue with the
locking code not discriminating about all the various working copies
that were being associated together.

That said, I don't think the fault necessarily lies there.  Subversion
will eventually have to allow operations on multiple working copies,
so we don't want to slam the door shut on that.  

I think that the better solution here is to:

  - do the locking as you (and I) have already implemented, and 

  - no longer do redundancy removal during the target condensation
    step -- instead simply detect that a target that's already been
    processed has, well, already been processed

This will cause the commit to fail today as it should, but then
automatically start working later when we turn on multi-repos
commits.  It clears up the locking issue.  And finally, it doesn't
cost much more in terms of target traversal.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
Ok, now I'm looking at this.

so what we need to do is lock the grandparent non-recursively and 
each target recursively, also checks if all target are from the same
repository.

right now i'm using svn_wc_admin_probe_try to tree lock the target
associated with grandparent access. and by checking if the url of the
targets' entry we could tell if they are of the same wc.

but I'm wondering if svn_wc_admin_probe_try should just fail when
the new path and association are not of the same wc.

since i'm just start exploring the wc land, I'd like to know what's
the better way for doing this.

Cheers,
CLK

On Mon, Jul 07, 2003 at 01:18:11PM -0500, cmpilato@collab.net wrote:
> Chia-liang Kao <cl...@clkao.org> writes:
> 
> > Hi, so what's the status of your patch? would you like to post it for
> > testing and reviewing?
> 
> I hit a wall.  Well, let's just be honest -- because I didn't
> understand *before* making the patch the reasons for Philip's change,
> I simply coded the wrong thing.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by cm...@collab.net.
Chia-liang Kao <cl...@clkao.org> writes:

> Hi, so what's the status of your patch? would you like to post it for
> testing and reviewing?

I hit a wall.  Well, let's just be honest -- because I didn't
understand *before* making the patch the reasons for Philip's change,
I simply coded the wrong thing.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
Hi, so what's the status of your patch? would you like to post it for
testing and reviewing?

Cheers,
CLK

On Sat, Jun 21, 2003 at 01:09:18AM -0500, cmpilato@collab.net wrote:
> > > For reasons that I honestly don't understand, 'svn commit foo' does a
> > > full recursive lock on foo's parent directory.  I've noticed this
> > > behavior for some time now, but haven't had the chance to investigate.
> 
> Oh, it's wrong behavior -- no question there.  And now I have a patch
> in progress which locks the parent dir *non-recursively*, and then
> locks each of the targets below that recusively (or not, based on
> whether or not the commit was given the --non-recursive (-N) flag).
> This seems to work as planned, except that in revision 3594, loooooong
> ago, Philip Martin added this code:

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by cm...@collab.net.
Chia-liang Kao <cl...@clkao.org> writes:

> On Sat, Jun 21, 2003 at 12:26:04AM -0500, cmpilato@collab.net wrote:
> > And I'm gonna guess that what you saw was 'lock' files being written
> > under tags/*, yes?
> 
> Exactly.
> 
> > For reasons that I honestly don't understand, 'svn commit foo' does a
> > full recursive lock on foo's parent directory.  I've noticed this
> > behavior for some time now, but haven't had the chance to investigate.
> 
> If this is indeed the wrong behavior(at least i believed so), I'll
> have a look some time next week unless someone picks it up.

Oh, it's wrong behavior -- no question there.  And now I have a patch
in progress which locks the parent dir *non-recursively*, and then
locks each of the targets below that recusively (or not, based on
whether or not the commit was given the --non-recursive (-N) flag).
This seems to work as planned, except that in revision 3594, loooooong
ago, Philip Martin added this code:

  Index: commit.c
  ===================================================================
  --- commit.c	(revision 3593)
  +++ commit.c	(revision 3594)
  @@ -755,6 +755,21 @@
     SVN_ERR (svn_wc_adm_open (&base_dir_access, NULL, base_dir, TRUE, TRUE,
                               pool));
   
  +  /* One day we might support committing from multiple working copies, but
  +     we don't yet.  This check ensures that we don't silently commit a
  +     subset of the targets */
  +  for (i = 0; i < targets->nelts; ++i)
  +    {
  +      svn_wc_adm_access_t *adm_access;
  +      const char *target;
  +          SVN_ERR (svn_path_get_absolute (&target,
  +                                          ((const char **)targets->elts)[i],
  +                                          pool));
  +      SVN_ERR_W (svn_wc_adm_probe_retrieve (&adm_access, base_dir_access,
  +                                            target, pool),
  +                 "Are all the targets part of the same working copy?");
  +    }
  +

Using my new locking method, I don't get this kind of sanity check for
free any more.  But then, I don't know what prompted him to make this
change, since it's perfectly okay for 'svn commit' to commit from
multiple working copies (by my design, in fact).  The commit code is
driven off of URLs, not on-disk paths, so perhaps Philip can page back
through the annals of time and tell me what was bugging him when he
added this code?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
On Sat, Jun 21, 2003 at 12:26:04AM -0500, cmpilato@collab.net wrote:
> And I'm gonna guess that what you saw was 'lock' files being written
> under tags/*, yes?

Exactly.

> For reasons that I honestly don't understand, 'svn commit foo' does a
> full recursive lock on foo's parent directory.  I've noticed this
> behavior for some time now, but haven't had the chance to investigate.

If this is indeed the wrong behavior(at least i believed so), I'll
have a look some time next week unless someone picks it up.

Cheers,
CLK

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Greg Stein <gs...@lyra.org>.
On Sat, Jun 21, 2003 at 12:26:04AM -0500, cmpilato@collab.net wrote:
> Chia-liang Kao <cl...@clkao.org> writes:
> 
> > I jus did a ktrace check, "svn commit trunk" is checking .svn stuff
> > under tags/*
> 
> And I'm gonna guess that what you saw was 'lock' files being written
> under tags/*, yes?
> 
> For reasons that I honestly don't understand, 'svn commit foo' does a
> full recursive lock on foo's parent directory.  I've noticed this
> behavior for some time now, but haven't had the chance to investigate.

Sounds like a variant of issue 1245:

    http://subversion.tigris.org/issues/show_bug.cgi?id=1245


Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by cm...@collab.net.
Chia-liang Kao <cl...@clkao.org> writes:

> I jus did a ktrace check, "svn commit trunk" is checking .svn stuff
> under tags/*

And I'm gonna guess that what you saw was 'lock' files being written
under tags/*, yes?

For reasons that I honestly don't understand, 'svn commit foo' does a
full recursive lock on foo's parent directory.  I've noticed this
behavior for some time now, but haven't had the chance to investigate.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: "svn commit" performance

Posted by Chia-liang Kao <cl...@clkao.org>.
I jus did a ktrace check, "svn commit trunk" is checking .svn stuff
under tags/*

On Sat, Jun 21, 2003 at 12:02:04PM +0800, Chia-liang Kao wrote:
> it seems the number of tags made the difference, since i didn't 
> notice this lag when i had no tags. but i think "svn commit trunk"
> should be irrevelent to tags/ ?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org