You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@subversion.apache.org by Daniel Shahaf <da...@elego.de> on 2012/03/19 13:24:33 UTC

#4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Daniel Shahaf wrote on Sun, Mar 18, 2012 at 16:28:21 +0200:
> [ cc += dev@. summary for dev@: investigating issue #4129: predecessor
> count of rN is not incremented by one wrt that of r(N-1); see 
> http://subversion.tigris.org/issues/show_bug.cgi?id=4129 ]

Okay, count me happy :-)  I can reproduce this:

[[[
$SVN ps svn:mergeinfo '/branch/A:2' wc1/trunk/A
$SVN ps svn:mergeinfo '/branch/iota:2' wc2/trunk/iota
$SVN mkdir wc1/foo
$SVN mkdir wc2/bar
$svn ci -mm wc1 & $svn ci -mm wc2 & wait
]]]

Output:
[[[
Adding         wc1/foo
Sending        wc1/trunk/A
Adding         wc2/bar
Sending        wc2/trunk/iota

Committed revision 2.
subversion/svn/commit-cmd.c:183: (apr_err=160004)
subversion/libsvn_client/commit.c:876: (apr_err=160004)
subversion/libsvn_client/commit.c:876: (apr_err=160004)
svn: E160004: Commit failed (details follow):
subversion/libsvn_ra_serf/commit.c:2216: (apr_err=160004)
subversion/libsvn_ra_serf/commit.c:2216: (apr_err=160004)
subversion/libsvn_ra_serf/commit.c:2216: (apr_err=160004)
subversion/libsvn_ra_serf/util.c:774: (apr_err=160004)
subversion/libsvn_ra_serf/util.c:2087: (apr_err=160004)
subversion/libsvn_ra_serf/util.c:2087: (apr_err=160004)
subversion/libsvn_ra_serf/util.c:920: (apr_err=160004)
svn: E160004: predecessor count for the root node-revision is wrong: found (2+1 != 2), committing r3
]]]

Error log:
[[[
[Mon Mar 19 14:19:41.388413 2012] [dav:error] [pid 17156:tid 3064073072] [client ::1:40969] Could not MERGE resource "/t/r1/!svn/txn/1-2" into "/t/r1".  [409, #0]
[Mon Mar 19 14:19:41.388497 2012] [dav:error] [pid 17156:tid 3064073072] [client ::1:40969] An error occurred while committing the transaction.  [409, #160004]
[Mon Mar 19 14:19:41.388518 2012] [dav:error] [pid 17156:tid 3064073072] [client ::1:40969] predecessor count for the root node-revision is wrong: found (2+1 != 2), committing r3  [409, #160004]
[Mon Mar 19 14:19:42.024052 2012] [authz_core:debug] [pid 17156:tid 3055680368] mod_authz_core.c(783): [client ::1:40971] AH01626: authorization result of Require all granted: granted
[Mon Mar 19 14:19:42.024114 2012] [authz_core:debug] [pid 17156:tid 3055680368] mod_authz_core.c(783): [client ::1:40971] AH01626: authorization result of <RequireAny>: granted
[Mon Mar 19 14:19:42.024781 2012] [dav:error] [pid 17156:tid 3055680368] [client ::1:40971] Could not fetch resource information.  [404, #0]
[Mon Mar 19 14:19:42.024820 2012] [dav:error] [pid 17156:tid 3055680368] [client ::1:40971] Named transaction doesn't exist.  [404, #0]
]]]

Using another property, or omitting either propset, is enough to cause
the bug not to trigger.


The bug reproduced both with and without the following patch:
[[[
--- subversion/libsvn_fs_fs/tree.c	(revision 1301511)
+++ subversion/libsvn_fs_fs/tree.c	(working copy)
@@ -1 +1,2 @@
+#include <unistd.h>
 /* tree.c : tree-like filesystem, built on DAG filesystem
@@ -1701 +1702,2 @@ svn_fs_fs__commit_txn(const char **conflict_p,
+      sleep(3);
       err = svn_fs_fs__commit(new_rev, fs, txn, iterpool);
@@ -1729,3 +1731,4 @@ svn_fs_fs__commit_txn(const char **conflict_p,
  cleanup:
 
+  sleep(3);
   svn_fs_fs__reset_txn_caches(fs);
]]]


The bug reproduced with either "ServerLimit 1" or "ThreadLimit 1" in
httpd.conf.  (That forced both commits to be served by the same process
(resp., by different processes).)  I use httpd 2.4.1 with event MPM.


Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
Philip Martin wrote on Mon, Mar 19, 2012 at 18:31:41 +0000:
> Daniel Shahaf <da...@elego.de> writes:
> 
> > C. Michael Pilato wrote on Mon, Mar 19, 2012 at 13:57:51 -0400:
> >> Is this problem specific to the FSFS backend?
> >
> > No.
> >
> > % ../runpytest svnadmin mergeinfo_race --fs-type bdb
> > 2012-03-19 20:21:44 [WARNING] CWD: /home/daniel/src/svn/t1/subversion/tests/cmdline
> > 2012-03-19 20:21:44 [WARNING] EXCEPTION: Failure: one or both commits failed
> > XFAIL: svnadmin_tests.py 29: concurrent mergeinfo commits invalidate pred-count
> 
> I think that's a failure of the regression test, after the XFAIL the
> repository looks like:
> 

Right as usual, implemented your fix in r1302591.

Relation to mergeinfo-count corruption Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
(Just changing the subject so mergeinfo gurus spot this thread too.
tldr: #4129 also explains a bug whereby FSFS minfo-cnt values were set
to the value read from uninitialized memory (and which might therefore
have been smaller than the correct value).)

Philip Martin wrote on Fri, Mar 23, 2012 at 19:58:29 +0000:
> The commit fails because the count: is wrong, but if I disable that
> check to allow the commit to complete then I see a bogus minfo-cnt such
> as:
> 
> id: 0.0.r7/322
> type: dir
> pred: 0.0.r4/180
> count: 5
> text: 7 198 111 0 242bff40060f22bcf85959dcf552851a
> cpath: /
> copyroot: 0 /
> minfo-cnt: 43
> 
> Using the current trunk with the issue 4129 fix applied this no longer
> happens: I don't get the valgrind warnings and the minfo-cnt is correct.
> 
> -- 
> uberSVN: Apache Subversion Made Easy
> http://www.uberSVN.com

Relation to mergeinfo-count corruption Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
(Just changing the subject so mergeinfo gurus spot this thread too.
tldr: #4129 also explains a bug whereby FSFS minfo-cnt values were set
to the value read from uninitialized memory (and which might therefore
have been smaller than the correct value).)

Philip Martin wrote on Fri, Mar 23, 2012 at 19:58:29 +0000:
> The commit fails because the count: is wrong, but if I disable that
> check to allow the commit to complete then I see a bogus minfo-cnt such
> as:
> 
> id: 0.0.r7/322
> type: dir
> pred: 0.0.r4/180
> count: 5
> text: 7 198 111 0 242bff40060f22bcf85959dcf552851a
> cpath: /
> copyroot: 0 /
> minfo-cnt: 43
> 
> Using the current trunk with the issue 4129 fix applied this no longer
> happens: I don't get the valgrind warnings and the minfo-cnt is correct.
> 
> -- 
> uberSVN: Apache Subversion Made Easy
> http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
Philip Martin wrote on Mon, Mar 19, 2012 at 18:31:41 +0000:
> Daniel Shahaf <da...@elego.de> writes:
> 
> > C. Michael Pilato wrote on Mon, Mar 19, 2012 at 13:57:51 -0400:
> >> Is this problem specific to the FSFS backend?
> >
> > No.
> >
> > % ../runpytest svnadmin mergeinfo_race --fs-type bdb
> > 2012-03-19 20:21:44 [WARNING] CWD: /home/daniel/src/svn/t1/subversion/tests/cmdline
> > 2012-03-19 20:21:44 [WARNING] EXCEPTION: Failure: one or both commits failed
> > XFAIL: svnadmin_tests.py 29: concurrent mergeinfo commits invalidate pred-count
> 
> I think that's a failure of the regression test, after the XFAIL the
> repository looks like:
> 

Right as usual, implemented your fix in r1302591.

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
Philip Martin wrote on Mon, Mar 19, 2012 at 18:45:37 +0000:
> Philip Martin <ph...@wandisco.com> writes:
> 
> > If I use the debugger to manually set target->node_revision to NULL
> > inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
> > I'm not exactly sure how all the FSFS caching layers are supposed to
> > interact.  Is tree.c:update_ancestry supposed to update the in-memory
> > predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
> > change the predecessor count?  Should target->node_revision be set to
> > NULL soemwehere?  Something else?
> 
> Moving update_ancestry from tree.c to dag.c is one way to fix the
> problem.  Daniel also suggested removing the node_revision member of
> dag_node_t altogether and relying on new 1.7 caching to give us the
> performance.  I suppose we would still need a patch like this for 1.6.
> 

++1

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
Philip Martin wrote on Mon, Mar 19, 2012 at 18:45:37 +0000:
> Philip Martin <ph...@wandisco.com> writes:
> 
> > If I use the debugger to manually set target->node_revision to NULL
> > inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
> > I'm not exactly sure how all the FSFS caching layers are supposed to
> > interact.  Is tree.c:update_ancestry supposed to update the in-memory
> > predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
> > change the predecessor count?  Should target->node_revision be set to
> > NULL soemwehere?  Something else?
> 
> Moving update_ancestry from tree.c to dag.c is one way to fix the
> problem.  Daniel also suggested removing the node_revision member of
> dag_node_t altogether and relying on new 1.7 caching to give us the
> performance.  I suppose we would still need a patch like this for 1.6.
> 

++1

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
Philip Martin wrote on Mon, Mar 19, 2012 at 17:25:22 +0000:
> Philip Martin <ph...@wandisco.com> writes:
> 
> > I can reproduce ove ra_local:
> >
> > svnadmin create repo
> > svn mkdir -mm file://`pwd`/repo/A
> > svn mkdir -mm file://`pwd`/repo/B
> > svn co file://`pwd`/repo wc1
> > svn co file://`pwd`/repo wc2
> > svn ps svn:mergeinfo /P:2 wc1/A
> > svn ps svn:mergeinfo /Q:2 wc2/B
> > svn mkdir wc1/X
> > svn mkdir wc2/Y
> > svn ci -mm wc1 & svn ci -mm wc2 & wait
> >
> > Gives:
> >
> > Sending        wc1/A
> > Adding         wc1/X
> > Sending        wc2/B
> > Adding         wc2/Y
> >
> > Committed revision 3.
> > svn: E160004: Commit failed (details follow):
> > svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4
> 
> This is the problem code in libsvn_fs_fs/tree.c:merge
> 
>   SVN_ERR(svn_fs_fs__dag_get_predecessor_count(&pred_count, source, pool));
>   SVN_ERR(update_ancestry(fs, source_id, target_id, target_path,
>                           pred_count, pool));
> 
>   if (svn_fs_fs__fs_supports_mergeinfo(fs))
>     SVN_ERR(svn_fs_fs__dag_increment_mergeinfo_count(target,
>                                                      mergeinfo_increment,
>                                                      pool));
> 
> 
> target is dag_node_t* which is opaque outside dag.c and
> target->node_revision->predecessor_count is 3 when we reach the above
> code.
> 
> update_ancestry rewrites the file repo/db/transactions/2-2.txn/node.0.0
> with the correctly updated value "count: 4" but nothing updates
> target->node_revision->predecessor_count in memory.
> 
> svn_fs_fs__dag_increment_mergeinfo_count then rewrites 
> repo/db/transactions/2-2.txn/node.0.0
> pulling the old value of target->node_revision->predecessor_count from
> memory and putting "count: 3" back in the file.
> 
> If I use the debugger to manually set target->node_revision to NULL
> inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
> I'm not exactly sure how all the FSFS caching layers are supposed to
> interact.  Is tree.c:update_ancestry supposed to update the in-memory
> predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
> change the predecessor count?  Should target->node_revision be set to
> NULL soemwehere?  Something else?

Just a note that another option here is for merge() to re-fetch
'dag_node_t *target' via dag.h APIs, to get another struct with a fresh
(actually: not yet populated) cache ->noderev member.

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <da...@elego.de> writes:

> C. Michael Pilato wrote on Mon, Mar 19, 2012 at 13:57:51 -0400:
>> Is this problem specific to the FSFS backend?
>
> No.
>
> % ../runpytest svnadmin mergeinfo_race --fs-type bdb
> 2012-03-19 20:21:44 [WARNING] CWD: /home/daniel/src/svn/t1/subversion/tests/cmdline
> 2012-03-19 20:21:44 [WARNING] EXCEPTION: Failure: one or both commits failed
> XFAIL: svnadmin_tests.py 29: concurrent mergeinfo commits invalidate pred-count

I think that's a failure of the regression test, after the XFAIL the
repository looks like:

$ svn log -vqr3:2 file://`pwd`/svn-test-work/repositories/svnadmin_tests-29/
------------------------------------------------------------------------
r3 | jrandom | 2012-03-19 18:28:53 +0000 (Mon, 19 Mar 2012)
Changed paths:
   M /A
   A /d1
------------------------------------------------------------------------
r2 | jrandom | 2012-03-19 18:28:53 +0000 (Mon, 19 Mar 2012)
Changed paths:
   A /d2
   M /iota
------------------------------------------------------------------------

Also the test still XFAILs for FSFS with my FSFS patch to fix the
problem.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <da...@elego.de> writes:

> C. Michael Pilato wrote on Mon, Mar 19, 2012 at 13:57:51 -0400:
>> Is this problem specific to the FSFS backend?
>
> No.
>
> % ../runpytest svnadmin mergeinfo_race --fs-type bdb
> 2012-03-19 20:21:44 [WARNING] CWD: /home/daniel/src/svn/t1/subversion/tests/cmdline
> 2012-03-19 20:21:44 [WARNING] EXCEPTION: Failure: one or both commits failed
> XFAIL: svnadmin_tests.py 29: concurrent mergeinfo commits invalidate pred-count

I think that's a failure of the regression test, after the XFAIL the
repository looks like:

$ svn log -vqr3:2 file://`pwd`/svn-test-work/repositories/svnadmin_tests-29/
------------------------------------------------------------------------
r3 | jrandom | 2012-03-19 18:28:53 +0000 (Mon, 19 Mar 2012)
Changed paths:
   M /A
   A /d1
------------------------------------------------------------------------
r2 | jrandom | 2012-03-19 18:28:53 +0000 (Mon, 19 Mar 2012)
Changed paths:
   A /d2
   M /iota
------------------------------------------------------------------------

Also the test still XFAILs for FSFS with my FSFS patch to fix the
problem.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
C. Michael Pilato wrote on Mon, Mar 19, 2012 at 13:57:51 -0400:
> Is this problem specific to the FSFS backend?

No.

% ../runpytest svnadmin mergeinfo_race --fs-type bdb
2012-03-19 20:21:44 [WARNING] CWD: /home/daniel/src/svn/t1/subversion/tests/cmdline
2012-03-19 20:21:44 [WARNING] EXCEPTION: Failure: one or both commits failed
XFAIL: svnadmin_tests.py 29: concurrent mergeinfo commits invalidate pred-count

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
C. Michael Pilato wrote on Mon, Mar 19, 2012 at 13:57:51 -0400:
> Is this problem specific to the FSFS backend?

No.

% ../runpytest svnadmin mergeinfo_race --fs-type bdb
2012-03-19 20:21:44 [WARNING] CWD: /home/daniel/src/svn/t1/subversion/tests/cmdline
2012-03-19 20:21:44 [WARNING] EXCEPTION: Failure: one or both commits failed
XFAIL: svnadmin_tests.py 29: concurrent mergeinfo commits invalidate pred-count

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Daniel Shahaf <da...@elego.de>.
Philip Martin wrote on Mon, Mar 19, 2012 at 17:25:22 +0000:
> Philip Martin <ph...@wandisco.com> writes:
> 
> > I can reproduce ove ra_local:
> >
> > svnadmin create repo
> > svn mkdir -mm file://`pwd`/repo/A
> > svn mkdir -mm file://`pwd`/repo/B
> > svn co file://`pwd`/repo wc1
> > svn co file://`pwd`/repo wc2
> > svn ps svn:mergeinfo /P:2 wc1/A
> > svn ps svn:mergeinfo /Q:2 wc2/B
> > svn mkdir wc1/X
> > svn mkdir wc2/Y
> > svn ci -mm wc1 & svn ci -mm wc2 & wait
> >
> > Gives:
> >
> > Sending        wc1/A
> > Adding         wc1/X
> > Sending        wc2/B
> > Adding         wc2/Y
> >
> > Committed revision 3.
> > svn: E160004: Commit failed (details follow):
> > svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4
> 
> This is the problem code in libsvn_fs_fs/tree.c:merge
> 
>   SVN_ERR(svn_fs_fs__dag_get_predecessor_count(&pred_count, source, pool));
>   SVN_ERR(update_ancestry(fs, source_id, target_id, target_path,
>                           pred_count, pool));
> 
>   if (svn_fs_fs__fs_supports_mergeinfo(fs))
>     SVN_ERR(svn_fs_fs__dag_increment_mergeinfo_count(target,
>                                                      mergeinfo_increment,
>                                                      pool));
> 
> 
> target is dag_node_t* which is opaque outside dag.c and
> target->node_revision->predecessor_count is 3 when we reach the above
> code.
> 
> update_ancestry rewrites the file repo/db/transactions/2-2.txn/node.0.0
> with the correctly updated value "count: 4" but nothing updates
> target->node_revision->predecessor_count in memory.
> 
> svn_fs_fs__dag_increment_mergeinfo_count then rewrites 
> repo/db/transactions/2-2.txn/node.0.0
> pulling the old value of target->node_revision->predecessor_count from
> memory and putting "count: 3" back in the file.
> 
> If I use the debugger to manually set target->node_revision to NULL
> inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
> I'm not exactly sure how all the FSFS caching layers are supposed to
> interact.  Is tree.c:update_ancestry supposed to update the in-memory
> predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
> change the predecessor count?  Should target->node_revision be set to
> NULL soemwehere?  Something else?

Just a note that another option here is for merge() to re-fetch
'dag_node_t *target' via dag.h APIs, to get another struct with a fresh
(actually: not yet populated) cache ->noderev member.

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/19/2012 02:24 PM, Philip Martin wrote:
> "C. Michael Pilato" <cm...@collab.net> writes:
>> Is this problem specific to the FSFS backend?
> 
> Yes, I think it is.
> 
> For BDB the dag_node_t type in dag.c doesn't have a node_revision
> member.  When update_ancestry does svn_fs_bdb__put_node_revision it
> writes to the database and subsequent svn_fs_bdb__get_node_revision will
> see the updated value.
> 
> For FSFS the svn_fs_fs__put_node_revision call writes to the
> transactions subdir but the in-memory node_revision doesn't get changed.

Gotcha.  Thanks for the response.

I remember when the BDB code had an in-memory node-revision structure and
tried to maintain that using the trail construct.  I remember also the day I
purged the code of that mess. :-)

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/19/2012 02:24 PM, Philip Martin wrote:
> "C. Michael Pilato" <cm...@collab.net> writes:
>> Is this problem specific to the FSFS backend?
> 
> Yes, I think it is.
> 
> For BDB the dag_node_t type in dag.c doesn't have a node_revision
> member.  When update_ancestry does svn_fs_bdb__put_node_revision it
> writes to the database and subsequent svn_fs_bdb__get_node_revision will
> see the updated value.
> 
> For FSFS the svn_fs_fs__put_node_revision call writes to the
> transactions subdir but the in-memory node_revision doesn't get changed.

Gotcha.  Thanks for the response.

I remember when the BDB code had an in-memory node-revision structure and
tried to maintain that using the trail construct.  I remember also the day I
purged the code of that mess. :-)

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
"C. Michael Pilato" <cm...@collab.net> writes:

> On 03/19/2012 01:25 PM, Philip Martin wrote:
>> Philip Martin <ph...@wandisco.com> writes:
>> 
>>> I can reproduce ove ra_local:
>>>
>>> svnadmin create repo
>>> svn mkdir -mm file://`pwd`/repo/A
>>> svn mkdir -mm file://`pwd`/repo/B
>>> svn co file://`pwd`/repo wc1
>>> svn co file://`pwd`/repo wc2
>>> svn ps svn:mergeinfo /P:2 wc1/A
>>> svn ps svn:mergeinfo /Q:2 wc2/B
>>> svn mkdir wc1/X
>>> svn mkdir wc2/Y
>>> svn ci -mm wc1 & svn ci -mm wc2 & wait
>>>
>>> Gives:
>>>
>>> Sending        wc1/A
>>> Adding         wc1/X
>>> Sending        wc2/B
>>> Adding         wc2/Y
>>>
>>> Committed revision 3.
>>> svn: E160004: Commit failed (details follow):
>>> svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4
>
> Is this problem specific to the FSFS backend?

Yes, I think it is.

For BDB the dag_node_t type in dag.c doesn't have a node_revision
member.  When update_ancestry does svn_fs_bdb__put_node_revision it
writes to the database and subsequent svn_fs_bdb__get_node_revision will
see the updated value.

For FSFS the svn_fs_fs__put_node_revision call writes to the
transactions subdir but the in-memory node_revision doesn't get changed.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
"C. Michael Pilato" <cm...@collab.net> writes:

> On 03/19/2012 01:25 PM, Philip Martin wrote:
>> Philip Martin <ph...@wandisco.com> writes:
>> 
>>> I can reproduce ove ra_local:
>>>
>>> svnadmin create repo
>>> svn mkdir -mm file://`pwd`/repo/A
>>> svn mkdir -mm file://`pwd`/repo/B
>>> svn co file://`pwd`/repo wc1
>>> svn co file://`pwd`/repo wc2
>>> svn ps svn:mergeinfo /P:2 wc1/A
>>> svn ps svn:mergeinfo /Q:2 wc2/B
>>> svn mkdir wc1/X
>>> svn mkdir wc2/Y
>>> svn ci -mm wc1 & svn ci -mm wc2 & wait
>>>
>>> Gives:
>>>
>>> Sending        wc1/A
>>> Adding         wc1/X
>>> Sending        wc2/B
>>> Adding         wc2/Y
>>>
>>> Committed revision 3.
>>> svn: E160004: Commit failed (details follow):
>>> svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4
>
> Is this problem specific to the FSFS backend?

Yes, I think it is.

For BDB the dag_node_t type in dag.c doesn't have a node_revision
member.  When update_ancestry does svn_fs_bdb__put_node_revision it
writes to the database and subsequent svn_fs_bdb__get_node_revision will
see the updated value.

For FSFS the svn_fs_fs__put_node_revision call writes to the
transactions subdir but the in-memory node_revision doesn't get changed.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/19/2012 01:25 PM, Philip Martin wrote:
> Philip Martin <ph...@wandisco.com> writes:
> 
>> I can reproduce ove ra_local:
>>
>> svnadmin create repo
>> svn mkdir -mm file://`pwd`/repo/A
>> svn mkdir -mm file://`pwd`/repo/B
>> svn co file://`pwd`/repo wc1
>> svn co file://`pwd`/repo wc2
>> svn ps svn:mergeinfo /P:2 wc1/A
>> svn ps svn:mergeinfo /Q:2 wc2/B
>> svn mkdir wc1/X
>> svn mkdir wc2/Y
>> svn ci -mm wc1 & svn ci -mm wc2 & wait
>>
>> Gives:
>>
>> Sending        wc1/A
>> Adding         wc1/X
>> Sending        wc2/B
>> Adding         wc2/Y
>>
>> Committed revision 3.
>> svn: E160004: Commit failed (details follow):
>> svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4

Is this problem specific to the FSFS backend?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 03/19/2012 01:25 PM, Philip Martin wrote:
> Philip Martin <ph...@wandisco.com> writes:
> 
>> I can reproduce ove ra_local:
>>
>> svnadmin create repo
>> svn mkdir -mm file://`pwd`/repo/A
>> svn mkdir -mm file://`pwd`/repo/B
>> svn co file://`pwd`/repo wc1
>> svn co file://`pwd`/repo wc2
>> svn ps svn:mergeinfo /P:2 wc1/A
>> svn ps svn:mergeinfo /Q:2 wc2/B
>> svn mkdir wc1/X
>> svn mkdir wc2/Y
>> svn ci -mm wc1 & svn ci -mm wc2 & wait
>>
>> Gives:
>>
>> Sending        wc1/A
>> Adding         wc1/X
>> Sending        wc2/B
>> Adding         wc2/Y
>>
>> Committed revision 3.
>> svn: E160004: Commit failed (details follow):
>> svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4

Is this problem specific to the FSFS backend?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> Moving update_ancestry from tree.c to dag.c is one way to fix the
> problem.

This was applied in r1302613.  I believe this also fixes the minfo-cnt
corruption that has been observed.  To reproduce apply the following
patch to the old client to allow commit to be paused:

Index: ../src/subversion/libsvn_fs_fs/tree.c
===================================================================
--- ../src/subversion/libsvn_fs_fs/tree.c       (revision 1302612)
+++ ../src/subversion/libsvn_fs_fs/tree.c       (working copy)
@@ -1694,6 +1694,7 @@
          any future merges will only be between that node and whatever
          the root node of the youngest rev is by then. */
       err = merge_changes(NULL, youngish_root_node, txn, conflict, iterpool);
+      { char buf[256]; fputs("waiting...", stdout);fgets(buf, 255, stdin); }
       if (err)
         {
           if ((err->apr_err == SVN_ERR_FS_CONFLICT) && conflict_p)

Now the scenario:

svnadmin create repo
svn mkdir -mm file://`pwd`/repo/{A,B,C}
svn co file://`pwd`/repo wc1
svn co file://`pwd`/repo wc2
svn co file://`pwd`/repo wc3
svn ps svn:mergeinfo /P:2 wc1/A
svn ps svn:mergeinfo /Q:2 wc2/B
svn ps svn:mergeinfo /R:2 wc3/C

Now commit wc1 using the patched client running under valgrind. At the
first "waiting..."  prompt commit wc2 using a standard client.  Continue
the first commit to get a second "waiting..." prompt and commit wc3
using a standard client.  Continue the first commit and valgrind reports
errors such as:

==23311== Conditional jump or move depends on uninitialised value(s)
==23311==    at 0x98E0E5A: svn_fs_fs__dag_increment_mergeinfo_count (dag.c:544)
==23311==    by 0x9906847: merge (tree.c:1551)
==23311==    by 0x9906A18: merge_changes (tree.c:1599)
==23311==    by 0x9906C06: svn_fs_fs__commit_txn (tree.c:1696)
==23311==    by 0x6C8C6D9: svn_fs_commit_txn (fs-loader.c:646)
==23311==    by 0x6A5B12C: svn_repos_fs_commit_txn (fs-wrap.c:59)
==23311==    by 0x6A51D0E: close_edit (commit.c:693)
==23311==    by 0x4E4A34F: svn_client__do_commit (commit_util.c:1898)
==23311==    by 0x4E45723: svn_client_commit6 (commit.c:1689)
==23311==    by 0x4097D6: svn_cl__commit (commit-cmd.c:169)
==23311==    by 0x416FA1: main (main.c:2699)

==23311== Syscall param write(buf) points to uninitialised byte(s)
==23311==    at 0x62CC0D0: __write_nocancel (syscall-template.S:82)
==23311==    by 0x5E7D4FC: apr_file_flush_locked (readwrite.c:317)
==23311==    by 0x5E7D7AF: apr_file_flush (readwrite.c:340)
==23311==    by 0x5E7CFA4: apr_unix_file_cleanup (open.c:77)
==23311==    by 0x59FADAF: svn_io_file_close (io.c:3092)
==23311==    by 0x98E9083: svn_fs_fs__put_node_revision (fs_fs.c:2318)
==23311==    by 0x98E1039: svn_fs_fs__dag_increment_mergeinfo_count (dag.c:575)
==23311==    by 0x9906847: merge (tree.c:1551)
==23311==    by 0x9906A18: merge_changes (tree.c:1599)
==23311==    by 0x9906C06: svn_fs_fs__commit_txn (tree.c:1696)
==23311==    by 0x6C8C6D9: svn_fs_commit_txn (fs-loader.c:646)
==23311==    by 0x6A5B12C: svn_repos_fs_commit_txn (fs-wrap.c:59)
==23311==  Address 0xb2ee1dc is 92 bytes inside a block of size 4,096 alloc'd
==23311==    at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==23311==    by 0x5E806D0: pool_alloc (apr_pools.c:1463)
==23311==    by 0x5E7CDFB: apr_file_open (open.c:211)
==23311==    by 0x59F5194: file_open (io.c:280)
==23311==    by 0x59FABF6: svn_io_file_open (io.c:3049)
==23311==    by 0x98E8FC4: svn_fs_fs__put_node_revision (fs_fs.c:2308)
==23311==    by 0x98E1039: svn_fs_fs__dag_increment_mergeinfo_count (dag.c:575)
==23311==    by 0x9906847: merge (tree.c:1551)
==23311==    by 0x9906A18: merge_changes (tree.c:1599)
==23311==    by 0x9906C06: svn_fs_fs__commit_txn (tree.c:1696)
==23311==    by 0x6C8C6D9: svn_fs_commit_txn (fs-loader.c:646)
==23311==    by 0x6A5B12C: svn_repos_fs_commit_txn (fs-wrap.c:59)

The commit fails because the count: is wrong, but if I disable that
check to allow the commit to complete then I see a bogus minfo-cnt such
as:

id: 0.0.r7/322
type: dir
pred: 0.0.r4/180
count: 5
text: 7 198 111 0 242bff40060f22bcf85959dcf552851a
cpath: /
copyroot: 0 /
minfo-cnt: 43

Using the current trunk with the issue 4129 fix applied this no longer
happens: I don't get the valgrind warnings and the minfo-cnt is correct.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> Moving update_ancestry from tree.c to dag.c is one way to fix the
> problem.

This was applied in r1302613.  I believe this also fixes the minfo-cnt
corruption that has been observed.  To reproduce apply the following
patch to the old client to allow commit to be paused:

Index: ../src/subversion/libsvn_fs_fs/tree.c
===================================================================
--- ../src/subversion/libsvn_fs_fs/tree.c       (revision 1302612)
+++ ../src/subversion/libsvn_fs_fs/tree.c       (working copy)
@@ -1694,6 +1694,7 @@
          any future merges will only be between that node and whatever
          the root node of the youngest rev is by then. */
       err = merge_changes(NULL, youngish_root_node, txn, conflict, iterpool);
+      { char buf[256]; fputs("waiting...", stdout);fgets(buf, 255, stdin); }
       if (err)
         {
           if ((err->apr_err == SVN_ERR_FS_CONFLICT) && conflict_p)

Now the scenario:

svnadmin create repo
svn mkdir -mm file://`pwd`/repo/{A,B,C}
svn co file://`pwd`/repo wc1
svn co file://`pwd`/repo wc2
svn co file://`pwd`/repo wc3
svn ps svn:mergeinfo /P:2 wc1/A
svn ps svn:mergeinfo /Q:2 wc2/B
svn ps svn:mergeinfo /R:2 wc3/C

Now commit wc1 using the patched client running under valgrind. At the
first "waiting..."  prompt commit wc2 using a standard client.  Continue
the first commit to get a second "waiting..." prompt and commit wc3
using a standard client.  Continue the first commit and valgrind reports
errors such as:

==23311== Conditional jump or move depends on uninitialised value(s)
==23311==    at 0x98E0E5A: svn_fs_fs__dag_increment_mergeinfo_count (dag.c:544)
==23311==    by 0x9906847: merge (tree.c:1551)
==23311==    by 0x9906A18: merge_changes (tree.c:1599)
==23311==    by 0x9906C06: svn_fs_fs__commit_txn (tree.c:1696)
==23311==    by 0x6C8C6D9: svn_fs_commit_txn (fs-loader.c:646)
==23311==    by 0x6A5B12C: svn_repos_fs_commit_txn (fs-wrap.c:59)
==23311==    by 0x6A51D0E: close_edit (commit.c:693)
==23311==    by 0x4E4A34F: svn_client__do_commit (commit_util.c:1898)
==23311==    by 0x4E45723: svn_client_commit6 (commit.c:1689)
==23311==    by 0x4097D6: svn_cl__commit (commit-cmd.c:169)
==23311==    by 0x416FA1: main (main.c:2699)

==23311== Syscall param write(buf) points to uninitialised byte(s)
==23311==    at 0x62CC0D0: __write_nocancel (syscall-template.S:82)
==23311==    by 0x5E7D4FC: apr_file_flush_locked (readwrite.c:317)
==23311==    by 0x5E7D7AF: apr_file_flush (readwrite.c:340)
==23311==    by 0x5E7CFA4: apr_unix_file_cleanup (open.c:77)
==23311==    by 0x59FADAF: svn_io_file_close (io.c:3092)
==23311==    by 0x98E9083: svn_fs_fs__put_node_revision (fs_fs.c:2318)
==23311==    by 0x98E1039: svn_fs_fs__dag_increment_mergeinfo_count (dag.c:575)
==23311==    by 0x9906847: merge (tree.c:1551)
==23311==    by 0x9906A18: merge_changes (tree.c:1599)
==23311==    by 0x9906C06: svn_fs_fs__commit_txn (tree.c:1696)
==23311==    by 0x6C8C6D9: svn_fs_commit_txn (fs-loader.c:646)
==23311==    by 0x6A5B12C: svn_repos_fs_commit_txn (fs-wrap.c:59)
==23311==  Address 0xb2ee1dc is 92 bytes inside a block of size 4,096 alloc'd
==23311==    at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==23311==    by 0x5E806D0: pool_alloc (apr_pools.c:1463)
==23311==    by 0x5E7CDFB: apr_file_open (open.c:211)
==23311==    by 0x59F5194: file_open (io.c:280)
==23311==    by 0x59FABF6: svn_io_file_open (io.c:3049)
==23311==    by 0x98E8FC4: svn_fs_fs__put_node_revision (fs_fs.c:2308)
==23311==    by 0x98E1039: svn_fs_fs__dag_increment_mergeinfo_count (dag.c:575)
==23311==    by 0x9906847: merge (tree.c:1551)
==23311==    by 0x9906A18: merge_changes (tree.c:1599)
==23311==    by 0x9906C06: svn_fs_fs__commit_txn (tree.c:1696)
==23311==    by 0x6C8C6D9: svn_fs_commit_txn (fs-loader.c:646)
==23311==    by 0x6A5B12C: svn_repos_fs_commit_txn (fs-wrap.c:59)

The commit fails because the count: is wrong, but if I disable that
check to allow the commit to complete then I see a bogus minfo-cnt such
as:

id: 0.0.r7/322
type: dir
pred: 0.0.r4/180
count: 5
text: 7 198 111 0 242bff40060f22bcf85959dcf552851a
cpath: /
copyroot: 0 /
minfo-cnt: 43

Using the current trunk with the issue 4129 fix applied this no longer
happens: I don't get the valgrind warnings and the minfo-cnt is correct.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> If I use the debugger to manually set target->node_revision to NULL
> inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
> I'm not exactly sure how all the FSFS caching layers are supposed to
> interact.  Is tree.c:update_ancestry supposed to update the in-memory
> predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
> change the predecessor count?  Should target->node_revision be set to
> NULL soemwehere?  Something else?

Moving update_ancestry from tree.c to dag.c is one way to fix the
problem.  Daniel also suggested removing the node_revision member of
dag_node_t altogether and relying on new 1.7 caching to give us the
performance.  I suppose we would still need a patch like this for 1.6.


Index: ../src/subversion/libsvn_fs_fs/dag.c
===================================================================
--- ../src/subversion/libsvn_fs_fs/dag.c	(revision 1302591)
+++ ../src/subversion/libsvn_fs_fs/dag.c	(working copy)
@@ -1296,3 +1296,27 @@
 
   return SVN_NO_ERROR;
 }
+
+svn_error_t *
+svn_fs_fs__dag_update_ancestry(dag_node_t *target,
+                               dag_node_t *source,
+                               apr_pool_t *pool)
+{
+  node_revision_t *source_noderev, *target_noderev;
+
+  if (! svn_fs_fs__dag_check_mutable(target))
+    return svn_error_createf
+      (SVN_ERR_FS_NOT_MUTABLE, NULL,
+       _("Attempted to update ancestry of non-mutable node"));
+
+  SVN_ERR(get_node_revision(&source_noderev, source, pool));
+  SVN_ERR(get_node_revision(&target_noderev, target, pool));
+
+  target_noderev->predecessor_id = source->id;
+  target_noderev->predecessor_count = source_noderev->predecessor_count;
+  if (target_noderev->predecessor_count != -1)
+    target_noderev->predecessor_count++;
+
+  return svn_fs_fs__put_node_revision(target->fs, target->id, target_noderev,
+                                      FALSE, pool);
+}
Index: ../src/subversion/libsvn_fs_fs/tree.c
===================================================================
--- ../src/subversion/libsvn_fs_fs/tree.c	(revision 1302591)
+++ ../src/subversion/libsvn_fs_fs/tree.c	(working copy)
@@ -1142,32 +1142,6 @@
 }
 
 
-/* Teach node-revision TARGET_ID that node-revision SOURCE_ID is its
-   predecessor.  TARGET_PATH is used for error messages only. */
-static svn_error_t *
-update_ancestry(svn_fs_t *fs,
-                const svn_fs_id_t *source_id,
-                const svn_fs_id_t *target_id,
-                const char *target_path,
-                int source_pred_count,
-                apr_pool_t *pool)
-{
-  node_revision_t *noderev;
-
-  if (svn_fs_fs__id_txn_id(target_id) == NULL)
-    return svn_error_createf
-      (SVN_ERR_FS_NOT_MUTABLE, NULL,
-       _("Unexpected immutable node at '%s'"), target_path);
-
-  SVN_ERR(svn_fs_fs__get_node_revision(&noderev, fs, target_id, pool));
-  noderev->predecessor_id = source_id;
-  noderev->predecessor_count = source_pred_count;
-  if (noderev->predecessor_count != -1)
-    noderev->predecessor_count++;
-  return svn_fs_fs__put_node_revision(fs, target_id, noderev, FALSE, pool);
-}
-
-
 /* Set the contents of CONFLICT_PATH to PATH, and return an
    SVN_ERR_FS_CONFLICT error that indicates that there was a conflict
    at PATH.  Perform all allocations in POOL (except the allocation of
@@ -1219,7 +1193,6 @@
   apr_hash_index_t *hi;
   svn_fs_t *fs;
   apr_pool_t *iterpool;
-  int pred_count;
   apr_int64_t mergeinfo_increment = 0;
 
   /* Make sure everyone comes from the same filesystem. */
@@ -1543,9 +1516,7 @@
     }
   svn_pool_destroy(iterpool);
 
-  SVN_ERR(svn_fs_fs__dag_get_predecessor_count(&pred_count, source, pool));
-  SVN_ERR(update_ancestry(fs, source_id, target_id, target_path,
-                          pred_count, pool));
+  SVN_ERR(svn_fs_fs__dag_update_ancestry(target, source, pool));
 
   if (svn_fs_fs__fs_supports_mergeinfo(fs))
     SVN_ERR(svn_fs_fs__dag_increment_mergeinfo_count(target,
Index: ../src/subversion/libsvn_fs_fs/dag.h
===================================================================
--- ../src/subversion/libsvn_fs_fs/dag.h	(revision 1302591)
+++ ../src/subversion/libsvn_fs_fs/dag.h	(working copy)
@@ -603,6 +603,10 @@
                                               dag_node_t *node,
                                               apr_pool_t *pool);
 
+svn_error_t *
+svn_fs_fs__dag_update_ancestry(dag_node_t *target,
+                               dag_node_t *source,
+                               apr_pool_t *pool);
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> If I use the debugger to manually set target->node_revision to NULL
> inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
> I'm not exactly sure how all the FSFS caching layers are supposed to
> interact.  Is tree.c:update_ancestry supposed to update the in-memory
> predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
> change the predecessor count?  Should target->node_revision be set to
> NULL soemwehere?  Something else?

Moving update_ancestry from tree.c to dag.c is one way to fix the
problem.  Daniel also suggested removing the node_revision member of
dag_node_t altogether and relying on new 1.7 caching to give us the
performance.  I suppose we would still need a patch like this for 1.6.


Index: ../src/subversion/libsvn_fs_fs/dag.c
===================================================================
--- ../src/subversion/libsvn_fs_fs/dag.c	(revision 1302591)
+++ ../src/subversion/libsvn_fs_fs/dag.c	(working copy)
@@ -1296,3 +1296,27 @@
 
   return SVN_NO_ERROR;
 }
+
+svn_error_t *
+svn_fs_fs__dag_update_ancestry(dag_node_t *target,
+                               dag_node_t *source,
+                               apr_pool_t *pool)
+{
+  node_revision_t *source_noderev, *target_noderev;
+
+  if (! svn_fs_fs__dag_check_mutable(target))
+    return svn_error_createf
+      (SVN_ERR_FS_NOT_MUTABLE, NULL,
+       _("Attempted to update ancestry of non-mutable node"));
+
+  SVN_ERR(get_node_revision(&source_noderev, source, pool));
+  SVN_ERR(get_node_revision(&target_noderev, target, pool));
+
+  target_noderev->predecessor_id = source->id;
+  target_noderev->predecessor_count = source_noderev->predecessor_count;
+  if (target_noderev->predecessor_count != -1)
+    target_noderev->predecessor_count++;
+
+  return svn_fs_fs__put_node_revision(target->fs, target->id, target_noderev,
+                                      FALSE, pool);
+}
Index: ../src/subversion/libsvn_fs_fs/tree.c
===================================================================
--- ../src/subversion/libsvn_fs_fs/tree.c	(revision 1302591)
+++ ../src/subversion/libsvn_fs_fs/tree.c	(working copy)
@@ -1142,32 +1142,6 @@
 }
 
 
-/* Teach node-revision TARGET_ID that node-revision SOURCE_ID is its
-   predecessor.  TARGET_PATH is used for error messages only. */
-static svn_error_t *
-update_ancestry(svn_fs_t *fs,
-                const svn_fs_id_t *source_id,
-                const svn_fs_id_t *target_id,
-                const char *target_path,
-                int source_pred_count,
-                apr_pool_t *pool)
-{
-  node_revision_t *noderev;
-
-  if (svn_fs_fs__id_txn_id(target_id) == NULL)
-    return svn_error_createf
-      (SVN_ERR_FS_NOT_MUTABLE, NULL,
-       _("Unexpected immutable node at '%s'"), target_path);
-
-  SVN_ERR(svn_fs_fs__get_node_revision(&noderev, fs, target_id, pool));
-  noderev->predecessor_id = source_id;
-  noderev->predecessor_count = source_pred_count;
-  if (noderev->predecessor_count != -1)
-    noderev->predecessor_count++;
-  return svn_fs_fs__put_node_revision(fs, target_id, noderev, FALSE, pool);
-}
-
-
 /* Set the contents of CONFLICT_PATH to PATH, and return an
    SVN_ERR_FS_CONFLICT error that indicates that there was a conflict
    at PATH.  Perform all allocations in POOL (except the allocation of
@@ -1219,7 +1193,6 @@
   apr_hash_index_t *hi;
   svn_fs_t *fs;
   apr_pool_t *iterpool;
-  int pred_count;
   apr_int64_t mergeinfo_increment = 0;
 
   /* Make sure everyone comes from the same filesystem. */
@@ -1543,9 +1516,7 @@
     }
   svn_pool_destroy(iterpool);
 
-  SVN_ERR(svn_fs_fs__dag_get_predecessor_count(&pred_count, source, pool));
-  SVN_ERR(update_ancestry(fs, source_id, target_id, target_path,
-                          pred_count, pool));
+  SVN_ERR(svn_fs_fs__dag_update_ancestry(target, source, pool));
 
   if (svn_fs_fs__fs_supports_mergeinfo(fs))
     SVN_ERR(svn_fs_fs__dag_increment_mergeinfo_count(target,
Index: ../src/subversion/libsvn_fs_fs/dag.h
===================================================================
--- ../src/subversion/libsvn_fs_fs/dag.h	(revision 1302591)
+++ ../src/subversion/libsvn_fs_fs/dag.h	(working copy)
@@ -603,6 +603,10 @@
                                               dag_node_t *node,
                                               apr_pool_t *pool);
 
+svn_error_t *
+svn_fs_fs__dag_update_ancestry(dag_node_t *target,
+                               dag_node_t *source,
+                               apr_pool_t *pool);
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> I can reproduce ove ra_local:
>
> svnadmin create repo
> svn mkdir -mm file://`pwd`/repo/A
> svn mkdir -mm file://`pwd`/repo/B
> svn co file://`pwd`/repo wc1
> svn co file://`pwd`/repo wc2
> svn ps svn:mergeinfo /P:2 wc1/A
> svn ps svn:mergeinfo /Q:2 wc2/B
> svn mkdir wc1/X
> svn mkdir wc2/Y
> svn ci -mm wc1 & svn ci -mm wc2 & wait
>
> Gives:
>
> Sending        wc1/A
> Adding         wc1/X
> Sending        wc2/B
> Adding         wc2/Y
>
> Committed revision 3.
> svn: E160004: Commit failed (details follow):
> svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4

This is the problem code in libsvn_fs_fs/tree.c:merge

  SVN_ERR(svn_fs_fs__dag_get_predecessor_count(&pred_count, source, pool));
  SVN_ERR(update_ancestry(fs, source_id, target_id, target_path,
                          pred_count, pool));

  if (svn_fs_fs__fs_supports_mergeinfo(fs))
    SVN_ERR(svn_fs_fs__dag_increment_mergeinfo_count(target,
                                                     mergeinfo_increment,
                                                     pool));


target is dag_node_t* which is opaque outside dag.c and
target->node_revision->predecessor_count is 3 when we reach the above
code.

update_ancestry rewrites the file repo/db/transactions/2-2.txn/node.0.0
with the correctly updated value "count: 4" but nothing updates
target->node_revision->predecessor_count in memory.

svn_fs_fs__dag_increment_mergeinfo_count then rewrites 
repo/db/transactions/2-2.txn/node.0.0
pulling the old value of target->node_revision->predecessor_count from
memory and putting "count: 3" back in the file.

If I use the debugger to manually set target->node_revision to NULL
inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
I'm not exactly sure how all the FSFS caching layers are supposed to
interact.  Is tree.c:update_ancestry supposed to update the in-memory
predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
change the predecessor count?  Should target->node_revision be set to
NULL soemwehere?  Something else?

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> I can reproduce ove ra_local:
>
> svnadmin create repo
> svn mkdir -mm file://`pwd`/repo/A
> svn mkdir -mm file://`pwd`/repo/B
> svn co file://`pwd`/repo wc1
> svn co file://`pwd`/repo wc2
> svn ps svn:mergeinfo /P:2 wc1/A
> svn ps svn:mergeinfo /Q:2 wc2/B
> svn mkdir wc1/X
> svn mkdir wc2/Y
> svn ci -mm wc1 & svn ci -mm wc2 & wait
>
> Gives:
>
> Sending        wc1/A
> Adding         wc1/X
> Sending        wc2/B
> Adding         wc2/Y
>
> Committed revision 3.
> svn: E160004: Commit failed (details follow):
> svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4

This is the problem code in libsvn_fs_fs/tree.c:merge

  SVN_ERR(svn_fs_fs__dag_get_predecessor_count(&pred_count, source, pool));
  SVN_ERR(update_ancestry(fs, source_id, target_id, target_path,
                          pred_count, pool));

  if (svn_fs_fs__fs_supports_mergeinfo(fs))
    SVN_ERR(svn_fs_fs__dag_increment_mergeinfo_count(target,
                                                     mergeinfo_increment,
                                                     pool));


target is dag_node_t* which is opaque outside dag.c and
target->node_revision->predecessor_count is 3 when we reach the above
code.

update_ancestry rewrites the file repo/db/transactions/2-2.txn/node.0.0
with the correctly updated value "count: 4" but nothing updates
target->node_revision->predecessor_count in memory.

svn_fs_fs__dag_increment_mergeinfo_count then rewrites 
repo/db/transactions/2-2.txn/node.0.0
pulling the old value of target->node_revision->predecessor_count from
memory and putting "count: 3" back in the file.

If I use the debugger to manually set target->node_revision to NULL
inside svn_fs_fs__dag_increment_mergeinfo_count then the commit works.
I'm not exactly sure how all the FSFS caching layers are supposed to
interact.  Is tree.c:update_ancestry supposed to update the in-memory
predecessor_count?  Should there be a svn_fs_fs__dag_xxx function to
change the predecessor count?  Should target->node_revision be set to
NULL soemwehere?  Something else?

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <da...@elego.de> writes:

> The bug reproduced with either "ServerLimit 1" or "ThreadLimit 1" in
> httpd.conf.  (That forced both commits to be served by the same process
> (resp., by different processes).)  I use httpd 2.4.1 with event MPM.

I can reproduce ove ra_local:

svnadmin create repo
svn mkdir -mm file://`pwd`/repo/A
svn mkdir -mm file://`pwd`/repo/B
svn co file://`pwd`/repo wc1
svn co file://`pwd`/repo wc2
svn ps svn:mergeinfo /P:2 wc1/A
svn ps svn:mergeinfo /Q:2 wc2/B
svn mkdir wc1/X
svn mkdir wc2/Y
svn ci -mm wc1 & svn ci -mm wc2 & wait

Gives:

Sending        wc1/A
Adding         wc1/X
Sending        wc2/B
Adding         wc2/Y

Committed revision 3.
svn: E160004: Commit failed (details follow):
svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: #4129 is reproducible Re: predecessor count for the root node-revision is wrong message

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <da...@elego.de> writes:

> The bug reproduced with either "ServerLimit 1" or "ThreadLimit 1" in
> httpd.conf.  (That forced both commits to be served by the same process
> (resp., by different processes).)  I use httpd 2.4.1 with event MPM.

I can reproduce ove ra_local:

svnadmin create repo
svn mkdir -mm file://`pwd`/repo/A
svn mkdir -mm file://`pwd`/repo/B
svn co file://`pwd`/repo wc1
svn co file://`pwd`/repo wc2
svn ps svn:mergeinfo /P:2 wc1/A
svn ps svn:mergeinfo /Q:2 wc2/B
svn mkdir wc1/X
svn mkdir wc2/Y
svn ci -mm wc1 & svn ci -mm wc2 & wait

Gives:

Sending        wc1/A
Adding         wc1/X
Sending        wc2/B
Adding         wc2/Y

Committed revision 3.
svn: E160004: Commit failed (details follow):
svn: E160004: predecessor count for the root node-revision is wrong: found 3, committing r4

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com