Posted to dev@subversion.apache.org by Ben Collins-Sussman <su...@red-bean.com> on 2009/02/11 15:51:31 UTC

how does our commit-finalization race logic *really* work?

A googlecode user gave me this bash one-liner today:

$ for a in 0 1 2 3 ; do for b in 0 1 2 3 4 5 6 7 8 9 ; do svn mkdir -m "" http://host/repos/$a$b & done & done

It simply spawns 40 simultaneous processes, each attempting to do
'svn mkdir' on a unique URL.

When I run this against either a googlecode repository, or even a
stock mod_dav_svn + fsfs repository, somewhere between 5 and 15 jobs
succeed in their commits.  All the others return this error:

subversion/libsvn_ra_neon/commit.c:492: (apr_err=160024)
svn: File or directory '.' is out of date; try updating
subversion/libsvn_ra_neon/util.c:723: (apr_err=160024)
svn: version resource newer than txn (restart the commit)
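
To tally the outcomes rather than eyeball them, the same experiment can
be driven from a small script.  This is only a sketch: the helper names
(make_urls, svn_mkdir, race) are made up for illustration, and it
assumes an svn client on PATH, a repository the caller may write to, and
that exit status 0 means the commit succeeded.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def make_urls(base):
    """Build the same 40 targets as the shell loop: base/00 .. base/39."""
    return [f"{base}/{a}{b}" for a in range(4) for b in range(10)]


def svn_mkdir(url):
    """Run one 'svn mkdir' with an empty log message; True on exit status 0."""
    result = subprocess.run(
        ["svn", "mkdir", "-m", "", url],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def race(urls, commit=svn_mkdir):
    """Run one commit per URL concurrently and return (succeeded, failed)."""
    # One worker thread per URL, so no commit waits for another to finish.
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        outcomes = list(pool.map(commit, urls))
    succeeded = sum(1 for ok in outcomes if ok)
    return succeeded, len(outcomes) - succeeded
```

Calling race(make_urls("http://host/repos")) would repeat the experiment
against the placeholder URL used above; the commit parameter exists only
so the tally logic can be exercised without a server.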

I'm sort of bewildered here, because this is not at all what I would
expect.  If you look at tree.c:svn_fs_base__commit_txn(), we have
kfogel's famous "while (1729)" loop which attempts to infinitely
re-merge the pending transaction against ever-newer HEAD revisions.
Because these 40 simultaneous commits are *all mergeable* with each
other, I'd expect every one to eventually succeed.  Maybe they'd all
block on each other awkwardly at first, but the traffic jam should
slowly clear up and they should all complete in some random order.

So what's causing this behavior?  Why am I not seeing the ideal behavior?
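
The behavior I'd expect can be sketched as a toy model.  This is not
the code in svn_fs_base__commit_txn(); ToyRepo, commit_mkdir and
run_race are hypothetical names, and "merging" is reduced to checking
that the new path does not already exist.  The point is only that a
stale-but-mergeable transaction re-merges and tries again instead of
failing.

```python
import threading


class Conflict(Exception):
    """Raised when a transaction genuinely cannot be merged with HEAD."""


class ToyRepo:
    def __init__(self):
        self.head = 0
        self.paths = set()
        self._lock = threading.Lock()

    def snapshot(self):
        """Return (HEAD revision, existing paths) as seen at one instant."""
        with self._lock:
            return self.head, frozenset(self.paths)

    def try_finalize(self, merged_against, new_path):
        """Commit only if HEAD has not moved since the merge was done."""
        with self._lock:
            if self.head != merged_against:
                return None  # lost the race; the caller must re-merge
            self.paths.add(new_path)
            self.head += 1
            return self.head


def commit_mkdir(repo, new_path):
    """Re-merge against ever-newer HEADs until the commit lands."""
    attempts = 0
    while True:
        attempts += 1
        youngest, existing = repo.snapshot()
        if new_path in existing:
            raise Conflict(new_path)  # a real conflict, not mere staleness
        new_rev = repo.try_finalize(youngest, new_path)
        if new_rev is not None:
            return new_rev, attempts


def run_race(count=40):
    """Commit `count` distinct directories from concurrent threads."""
    repo = ToyRepo()
    results = []

    def worker(i):
        results.append(commit_mkdir(repo, f"/{i:02d}"))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return repo, results
```

In this model every one of the 40 commits eventually succeeds, in some
arbitrary order, because none of them conflicts with any other; only a
second mkdir of an existing path raises Conflict.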

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1139464

RE: how does our commit-finalization race logic *really* work?

Posted by Bert Huijben <rh...@sharpsvn.net>.
> -----Original Message-----
> From: sussman@gmail.com [mailto:sussman@gmail.com] On Behalf Of Ben
> Collins-Sussman
> Sent: Wednesday, 11 February 2009 17:25
> To: C. Michael Pilato
> Cc: dev@subversion.tigris.org
> Subject: Re: how does our commit-finalization race logic *really* work?
> 
> On Wed, Feb 11, 2009 at 10:08 AM, C. Michael Pilato
> <cm...@collab.net> wrote:
> 
> >
> > How, if at all, does the behavior differ if you do this over svnserve or
> > ra-local?
> 
> Both svn:// and file:// succeed perfectly.  All 40 commits go through.
> 
> Good call.  So something is fubar in either ra_neon or mod_dav_svn...
> somehow our HTTP layer is being way more, erm, restrictive than
> libsvn_fs needs it to be.  Should we investigate further?

Just guessing...

Look out for issue #3119 as you go..
"File '...' already exists" when it obviously doesn't, during commit


	Bert

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1139602

Re: how does our commit-finalization race logic *really* work?

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On Wed, Feb 11, 2009 at 10:28 AM, Bert Huijben <rh...@sharpsvn.net> wrote:

> Just guessing...
>
> Look out for issue #3119 as you go..
> "File '...' already exists" when it obviously doesn't, during commit
>

Interesting... on googlecode, our libsvn_fs_bigtable backend produces
a slightly different set of failed-commit errors:

   svn: Reference to non-existent node '..' in filesystem 'blah'

I wonder if it's related to issue #3119.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1139824

Re: how does our commit-finalization race logic *really* work?

Posted by km...@rockwellcollins.com.
sussman@gmail.com wrote on 02/11/2009 10:24:53 AM:

> On Wed, Feb 11, 2009 at 10:08 AM, C. Michael Pilato <cm...@collab.net> wrote:
> 
> >
> > How, if at all, does the behavior differ if you do this over svnserve or
> > ra-local?
> 
> Both svn:// and file:// succeed perfectly.  All 40 commits go through.
> 
> Good call.  So something is fubar in either ra_neon or mod_dav_svn...
> somehow our HTTP layer is being way more, erm, restrictive than
> libsvn_fs needs it to be.  Should we investigate further?

If it matters, I've had numerous users complain about this
situation: they have modified a completely unrelated
part of the tree, but still need to "update" before they can commit.
This can be a little painful on a heavily used repository.
I.e., you can't update and commit fast enough before someone
else has committed, so you just need to keep manually
updating and committing until it works...

Kevin R.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1139760

Re: how does our commit-finalization race logic *really* work?

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On Wed, Feb 11, 2009 at 10:08 AM, C. Michael Pilato <cm...@collab.net> wrote:

>
> How, if at all, does the behavior differ if you do this over svnserve or
> ra-local?

Both svn:// and file:// succeed perfectly.  All 40 commits go through.

Good call.  So something is fubar in either ra_neon or mod_dav_svn...
somehow our HTTP layer is being way more, erm, restrictive than
libsvn_fs needs it to be.  Should we investigate further?

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1139582

Re: how does our commit-finalization race logic *really* work?

Posted by "C. Michael Pilato" <cm...@collab.net>.
Ben Collins-Sussman wrote:
> A googlecode user gave me this bash one-liner today:
> 
> $ for a in 0 1 2 3 ; do for b in 0 1 2 3 4 5 6 7 8 9 ; do svn mkdir -m "" http://host/repos/$a$b & done & done
> 
> It simply spawns 40 simultaneous processes, each attempting to do
> 'svn mkdir' on a unique URL.
> 
> When I run this against either a googlecode repository, or even a
> stock mod_dav_svn + fsfs repository, somewhere between 5 and 15 jobs
> succeed in their commits.  All the others return error:
> 
> subversion/libsvn_ra_neon/commit.c:492: (apr_err=160024)
> svn: File or directory '.' is out of date; try updating
> subversion/libsvn_ra_neon/util.c:723: (apr_err=160024)
> svn: version resource newer than txn (restart the commit)
> 
> I'm sort of bewildered here, because this is not at all what I would
> expect.  If you look at tree.c:svn_fs_base__commit_txn(), we have
> kfogel's famous "while (1729)" loop which attempts to infinitely
> re-merge the pending transaction against ever-newer HEAD revisions.
> Because these 40 simultaneous commits are *all mergeable* with each
> other, I'd expect every one to eventually succeed.  Maybe they'd all
> block on each other awkwardly at first, but the traffic jam should
> slowly clear up and they should all complete in some random order.
> 
> So what's causing this behavior?  Why am I not seeing the ideal behavior?

How, if at all, does the behavior differ if you do this over svnserve or
ra-local?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1139523

Re: how does our commit-finalization race logic *really* work?

Posted by Carsten Koch <Ca...@web.de>.
On Wed, 2009-02-11 at 09:51 -0600, Ben Collins-Sussman wrote:
...
> somewhere between 5 and 15 jobs
> succeed in their commits.  All the others return error:
> 
> subversion/libsvn_ra_neon/commit.c:492: (apr_err=160024)
> svn: File or directory '.' is out of date; try updating
> subversion/libsvn_ra_neon/util.c:723: (apr_err=160024)
> svn: version resource newer than txn (restart the commit)

I am seeing the exact same thing on an "svn remove"
in a busy repository here.

I retried the remove manually later.
So far (I have seen the error twice), that has always succeeded.

I have now changed the Python code that does the "svn remove"
to retry 10 times.

We'll see how that goes.
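
Roughly, the retry looks like the sketch below.  It is not my actual
script: run_svn_with_retry is a name made up here, the one-second pause
is arbitrary, and spotting the failure by looking for "out of date" in
stderr is an assumption that depends on the client's message text and
locale.

```python
import subprocess
import time

# Assumed marker, taken from the error text quoted above.
OUT_OF_DATE_MARKER = "out of date"


def run_svn_with_retry(args, attempts=10, delay=1.0,
                       run=subprocess.run, sleep=time.sleep):
    """Run 'svn <args>', retrying only when the failure looks like staleness."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last = None
    for attempt in range(1, attempts + 1):
        last = run(["svn", *args], capture_output=True, text=True)
        if last.returncode == 0:
            return last
        if OUT_OF_DATE_MARKER not in last.stderr:
            break  # some other failure; retrying is unlikely to help
        if attempt < attempts:
            sleep(delay)  # give competing commits a moment to finish
    raise RuntimeError(f"svn {' '.join(args)} failed: {last.stderr.strip()}")
```

A call such as run_svn_with_retry(["remove", "-m", "cleanup", url]),
with url standing in for the real target, gives up after ten
out-of-date failures or on the first error of any other kind.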

IMHO, it should not be the user's responsibility to retry here.

Cheers,
Carsten.

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2420546