Posted to dev@subversion.apache.org by Paul Burba <pa...@softlanding.com> on 2006/10/12 16:56:11 UTC

[svn] x [fsfs] basic_tests.py failures

In IRC:

[10:57] pburba: On trunk I'm seeing many [svn] x [fsfs] failures in
basic_tests.py on Windows
[10:57] pburba: Specifically I'm seeing this error on a commit attempt:
[10:57] pburba: svn: Cannot write to the prototype revision file of 
transaction '1-1' because a previous representation is currently being 
written by this process
[10:57] pburba: Is this a problem anyone is aware of already?
[11:01] glasser: Is that an error from the new fsfs corruption fix thing?
[11:03] malcolmr: Okay, that's mine, sure.
[11:03] malcolmr: Is this over ra_local?
[11:03] pburba: glasser: svn://
[11:04] malcolmr: Huh.  I ran my set of tests over svn:// and they seemed 
ok.  Are they failing reproducibly?
[11:04] malcolmr: I did run without SASL though - I wonder if that makes 
any difference at all?
[11:05] pburba: Yes, basic_tests 7,8,12,13,14,15,27,28,29, tried it twice, 
just [svn] x [fsfs]
[11:05] malcolmr: Okay, it's possible that it's a windows-only bug in 
libsvn_fs_fs's new anti-corruption code, then.
[11:05] malcolmr: I'll check out cygwin and see if I can reproduce.
[11:06] pburba: malcolmr: I can take a look too to see if anything obvious
jumps out at me, now that I know where to look
[11:07] malcolmr: Thanks.  r21738 was the commit.
[11:24] malcolmr: pburba: basic_tests passes for me over ra_svn for both 
Cygwin and Slackware.
[11:24] malcolmr: I can't test native win32, unfortunately.
[11:25] malcolmr: If you could narrow down _why_ you're having failures, 
that might give a clue as to what's triggering the corruption.
[11:26] pburba: malcolmr: I'll do what I can, just finishing up a few 
other things then I'll be able to take a look, I'll let you know what I 
find.
[11:26] malcolmr: Thanks!

Malcolm,

Log with the failures for just basic_tests.py:


Here are some initial findings:

1) As you can see from the log, the failures occur mostly on commit 
attempts, but also on a rm URL and import.

2) Somehow the problem is tied to the basic_corruption test.  If that test 
is not run all the other tests pass.

3) If svnserve is restarted, subsequent attempts to commit succeed.

4) The SVN_ERR_FS_REP_BEING_WRITTEN is returned from 
get_writable_proto_rev_body():

get_writable_proto_rev_body(svn_fs_t * 0x005f4518, void * 0x00c4f6a0, 
apr_pool_t * 0x0065a750) line 457
with_txnlist_lock(svn_fs_t * 0x005f4518, svn_error_t * (svn_fs_t *, void 
*, apr_pool_t *)* 0x0043d8e0 get_writable_proto_rev_body(svn_fs_t *, void 
*, apr_pool_t *), void * 0x00c4f6a0, apr_pool_t * 0x0065a750) line 343 + 
17 bytes
get_writable_proto_rev(apr_file_t * * 0x00c4f744, void * * 0x00c65c90, 
svn_fs_t * 0x005f4518, const char * 0x00c659fd, apr_pool_t * 0x0065a750) 
line 556 + 22 bytes
rep_write_get_baton(rep_write_baton * * 0x00c4f7b4, svn_fs_t * 0x005f4518, 
node_revision_t * 0x00c659c8, apr_pool_t * 0x00588210) line 3510 + 46 
bytes
set_representation(svn_stream_t * * 0x00c4f888, svn_fs_t * 0x005f4518, 
node_revision_t * 0x00c659c8, apr_pool_t * 0x00588210) line 3615 + 21 
bytes
svn_fs_fs__set_contents(svn_stream_t * * 0x00c4f888, svn_fs_t * 
0x005f4518, node_revision_t * 0x00c659c8, apr_pool_t * 0x00588210) line 
3634 + 21 bytes
svn_fs_fs__dag_get_edit_stream(svn_stream_t * * 0x00588260, dag_node_t * 
0x00c641d8, const char * 0x0062c670, apr_pool_t * 0x00588210) line 907 + 
23 bytes
apply_textdelta(void * 0x00588248, apr_pool_t * 0x00588210) line 2406 + 30 
bytes
fs_apply_textdelta(svn_error_t * (svn_txdelta_window_t *, void *)* * 
0x00c4faf8, void * * 0x00c4faf4, svn_fs_root_t * 0x0062c560, const char * 
0x005751d0, const char * 0x005b2580, const char * 0x00000000, apr_pool_t * 
0x00588210) line 2462 + 13 bytes
svn_fs_apply_textdelta(svn_error_t * (svn_txdelta_window_t *, void *)* * 
0x00c4faf8, void * * 0x00c4faf4, svn_fs_root_t * 0x0062c560, const char * 
0x005751d0, const char * 0x005b2580, const char * 0x00000000, apr_pool_t * 
0x00588210) line 804 + 39 bytes
apply_textdelta(void * 0x005751e0, const char * 0x005b2580, apr_pool_t * 
0x00588210, svn_error_t * (svn_txdelta_window_t *, void *)* * 0x00c4faf8, 
void * * 0x00c4faf4) line 429 + 39 bytes
ra_svn_handle_apply_textdelta(svn_ra_svn_conn_st * 0x005680c0, apr_pool_t 
* 0x005b2408, apr_array_header_t * 0x005b24d0, ra_svn_driver_state_t * 
0x00c4fb90) line 668 + 36 bytes
svn_ra_svn__drive_editorp(svn_ra_svn_conn_st * 0x005680c0, apr_pool_t * 
0x00573138, const svn_delta_editor_t * 0x005735e0, void * 0x00642668, int 
* 0x00c4fd58, int 0) line 887 + 28 bytes
svn_ra_svn_drive_editor2(svn_ra_svn_conn_st * 0x005680c0, apr_pool_t * 
0x00573138, const svn_delta_editor_t * 0x005735e0, void * 0x00642668, int 
* 0x00c4fd58, int 0) line 751 + 29 bytes
svn_ra_svn_drive_editor(svn_ra_svn_conn_st * 0x005680c0, apr_pool_t * 
0x00573138, const svn_delta_editor_t * 0x005735e0, void * 0x00642668, int 
* 0x00c4fd58) line 772 + 27 bytes
commit(svn_ra_svn_conn_st * 0x005680c0, apr_pool_t * 0x00573138, 
apr_array_header_t * 0x005731f0, void * 0x00c4feac) line 969 + 25 bytes
svn_ra_svn_handle_commands(svn_ra_svn_conn_st * 0x005680c0, apr_pool_t * 
0x00571120, const svn_ra_svn_cmd_entry_t * 0x00527830 main_commands, void 
* 0x00c4feac) line 838 + 31 bytes
serve(svn_ra_svn_conn_st * 0x005680c0, serve_params_t * 0x0012ff30, 
apr_pool_t * 0x00571120) line 2305 + 22 bytes
serve_thread(apr_thread_t * 0x0056a180, void * 0x0056a170) line 253 + 25 
bytes
LIBAPR! 6eed324e()
KERNEL32! 7c80b683()

-       fs      0x005f4518
        pool    0x00571120
+       path    0x005f4b48 
"C:/SVN/svn.trunk.copymove/src-trunk.collabnet.trunk/Release/subversion/tests/cmdline/svn-test-work/repositories/basic_tests-2/db"
        warning 0x0040d820 default_warning_func(void *, svn_error_t *)
        warning_baton   0x00000000
        config  0x00000000
+       access_ctx      0x005734f8
+       vtable  0x00549780 fs_vtable
        fsap_data       0x005f4538
        baton   0x00c4f6a0
        pool    0x0065a750
+       file    0x00c4f744
-       txn     0x006365d8
+       next    0x005a2380
+       txn_id  0x006365dc "1-1"
        being_written   1
        pool    0x006365a0
+       txn_id  0x00c659fd "1-1"
        lockcookie      0x00c65c90
-       b       0x00c4f6a0
+       file    0x00c4f744
        lockcookie      0x00c65c90
+       txn_id  0x00c659fd "1-1"
+       err     0xcccccccc

Does any of this suggest a cause to you?

Paul B.

Re: [svn] x [fsfs] basic_tests.py failures

Posted by Lieven Govaerts <sv...@mobsol.be>.
Quoting "C. Michael Pilato" <cm...@collab.net>:

> Lieven Govaerts wrote:
> > Quoting "C. Michael Pilato" <cm...@collab.net>:
> >
> >> C. Michael Pilato wrote:
> >>> Lieven Govaerts wrote:
> >>>> Malcolm Rowe wrote:
> >>>>> On Wed, Nov 08, 2006 at 11:40:28PM +0100, Lieven Govaerts wrote:
> >>>>>
> >>>>>> FYI: I get this issue with trunk when I'm running basic_tests parallel
> >>>>>> in 10 processes over ra_dav (on windows):
> >>>>>>
> >>>>>> ..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
> >>>>>> svn: Commit failed (details follow):
> >>>>>> ..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
> >>>>>> svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
> >>>>>> ..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
> >>>>>> svn: Cannot write to the prototype revision file of transaction '1-1'
> >>>>>> because a previous representation is currently being written by this process
> >>>>>> FAIL:  basic_tests.py 3: basic commit command
> >>>>>>
> >>>>>> This issue didn't show up during ra_local testing. I'll see if making
> >>>>>> the UUID unique per test repository solves it.
> >>>>>>
> >>>>>>
> >>>>> It will - the intraprocess txn serialisation is keyed on (UUID, txn id),
> >>>>> so this is an expected failure mode for the case where we attempt to access
> >>>>> two different filesystems that have the same UUID.
> >>>>>
> >>>>> You'll also see serialisation in commits across all the filesystems
> >>>>> as the write-lock is keyed on UUID alone :-)
> >>>>>
> >>>>> One thing I'd like to do is make it an error to open a filesystem with
> >>>>> the same UUID as another (different) filesystem opened in the same library
> >>>>> context.  But that rather requires us to fix the test suite first.
> >>>>>
> >>>> To fix the test suite I need to roll back the hotcopy/checkout/copy
> >>>> method of setting up the sandbox, back to the original (slower) svnadmin
> >>>> dump|load/checkout method.
> >>>>
> >>>> I'll do that tomorrow.
> >>> Do we have a general need for an 'svnadmin reset-uuid'?  I've had to
> >>> hack in such a thing for CollabNet's deployments, but there's nothing
> >>> particularly unique about CollabNet's needs, and here's another
> >>> opportunity to make use of such a thing.
> >>>
> >> Doh!  I shoulda read the rest of the thread first.  Okay.  Count this as
> >> "+1 for "svnadmin setuuid [--generate]".
> >>
> >> By the way, in the CollabNet patch, I basically just made
> >> svn_fs_set_uuid() able to accept a NULL value which means, "I don't have
> >> a UUID to suggest; go generate one for me."
> >
> > (Re)setting the UUID doesn't help the test suite at all. OK, we can use
> > hotcopy + reset UUID instead of svnadmin dump|load, but the majority of the
> > performance increase we get is from checking out only one working copy and
> > then doing a copy+relocate of it for each sandbox working copy.
> > That relocate only works when the UUID of the sandbox repo is identical to
> > that of the pristine repo.
>
> Eww.  That's an entirely different matter, huh?
>
> How nasty do we wanna get here?  It's certainly within our
> Python-wielding powers to not use 'svn switch --relocate', and just
> write a custom WC crawler that can tweak both URL and UUID.  (Yep, we've
> done *that* for CollabNet too.)

Hm, I'd prefer that the Python tests only use 'official' svn operations and
don't touch the admin entries.

The tests now have problems with non-unique UUIDs because I'm changing the
framework to run tests in parallel. Running the tests in parallel gives a
2.5x speedup, so I have no problem rolling back to the original sandbox
method (create repos+checkout).

Lieven



----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [svn] x [fsfs] basic_tests.py failures

Posted by "C. Michael Pilato" <cm...@collab.net>.
Lieven Govaerts wrote:
> Quoting "C. Michael Pilato" <cm...@collab.net>:
> 
>> C. Michael Pilato wrote:
>>> Lieven Govaerts wrote:
>>>> Malcolm Rowe wrote:
>>>>> On Wed, Nov 08, 2006 at 11:40:28PM +0100, Lieven Govaerts wrote:
>>>>>
>>>>>> FYI: I get this issue with trunk when I'm running basic_tests parallel
>>>>>> in 10 processes over ra_dav (on windows):
>>>>>>
>>>>>> ..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
>>>>>> svn: Commit failed (details follow):
>>>>>> ..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
>>>>>> svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
>>>>>> ..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
>>>>>> svn: Cannot write to the prototype revision file of transaction '1-1'
>>>>>> because a previous representation is currently being written by this process
>>>>>> FAIL:  basic_tests.py 3: basic commit command
>>>>>>
>>>>>> This issue didn't show up during ra_local testing. I'll see if making
>>>>>> the UUID unique per test repository solves it.
>>>>>>
>>>>>>
>>>>> It will - the intraprocess txn serialisation is keyed on (UUID, txn id),
>>>>> so this is an expected failure mode for the case where we attempt to access
>>>>> two different filesystems that have the same UUID.
>>>>>
>>>>> You'll also see serialisation in commits across all the filesystems
>>>>> as the write-lock is keyed on UUID alone :-)
>>>>>
>>>>> One thing I'd like to do is make it an error to open a filesystem with
>>>>> the same UUID as another (different) filesystem opened in the same library
>>>>> context.  But that rather requires us to fix the test suite first.
>>>>>
>>>> To fix the test suite I need to roll back the hotcopy/checkout/copy
>>>> method of setting up the sandbox, back to the original (slower) svnadmin
>>>> dump|load/checkout method.
>>>>
>>>> I'll do that tomorrow.
>>> Do we have a general need for an 'svnadmin reset-uuid'?  I've had to
>>> hack in such a thing for CollabNet's deployments, but there's nothing
>>> particularly unique about CollabNet's needs, and here's another
>>> opportunity to make use of such a thing.
>>>
>> Doh!  I shoulda read the rest of the thread first.  Okay.  Count this as
>> "+1 for "svnadmin setuuid [--generate]".
>>
>> By the way, in the CollabNet patch, I basically just made
>> svn_fs_set_uuid() able to accept a NULL value which means, "I don't have
>> a UUID to suggest; go generate one for me."
> 
> (Re)setting the UUID doesn't help the test suite at all. OK, we can use
> hotcopy + reset UUID instead of svnadmin dump|load, but the majority of the
> performance increase we get is from checking out only one working copy and
> then doing a copy+relocate of it for each sandbox working copy.
> That relocate only works when the UUID of the sandbox repo is identical to
> that of the pristine repo.

Eww.  That's an entirely different matter, huh?

How nasty do we wanna get here?  It's certainly within our
Python-wielding powers to not use 'svn switch --relocate', and just
write a custom WC crawler that can tweak both URL and UUID.  (Yep, we've
done *that* for CollabNet too.)

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: [svn] x [fsfs] basic_tests.py failures

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 11/9/06, Lieven Govaerts <sv...@mobsol.be> wrote:
> That relocate only works when the UUID of the sandbox repo is identical to
> that of the pristine repo.

--force bypasses that check, IIRC.  -- justin


Re: [svn] x [fsfs] basic_tests.py failures

Posted by Lieven Govaerts <sv...@mobsol.be>.
Quoting "C. Michael Pilato" <cm...@collab.net>:

> C. Michael Pilato wrote:
> > Lieven Govaerts wrote:
> >> Malcolm Rowe wrote:
> >>> On Wed, Nov 08, 2006 at 11:40:28PM +0100, Lieven Govaerts wrote:
> >>>
> >>>> FYI: I get this issue with trunk when I'm running basic_tests parallel
> >>>> in 10 processes over ra_dav (on windows):
> >>>>
> >>>> ..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
> >>>> svn: Commit failed (details follow):
> >>>> ..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
> >>>> svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
> >>>> ..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
> >>>> svn: Cannot write to the prototype revision file of transaction '1-1'
> >>>> because a previous representation is currently being written by this process
> >>>> FAIL:  basic_tests.py 3: basic commit command
> >>>>
> >>>> This issue didn't show up during ra_local testing. I'll see if making
> >>>> the UUID unique per test repository solves it.
> >>>>
> >>>>
> >>> It will - the intraprocess txn serialisation is keyed on (UUID, txn id),
> >>> so this is an expected failure mode for the case where we attempt to access
> >>> two different filesystems that have the same UUID.
> >>>
> >>> You'll also see serialisation in commits across all the filesystems
> >>> as the write-lock is keyed on UUID alone :-)
> >>>
> >>> One thing I'd like to do is make it an error to open a filesystem with
> >>> the same UUID as another (different) filesystem opened in the same library
> >>> context.  But that rather requires us to fix the test suite first.
> >>>
> >> To fix the test suite I need to roll back the hotcopy/checkout/copy
> >> method of setting up the sandbox, back to the original (slower) svnadmin
> >> dump|load/checkout method.
> >>
> >> I'll do that tomorrow.
> >
> > Do we have a general need for an 'svnadmin reset-uuid'?  I've had to
> > hack in such a thing for CollabNet's deployments, but there's nothing
> > particularly unique about CollabNet's needs, and here's another
> > opportunity to make use of such a thing.
> >
>
> Doh!  I shoulda read the rest of the thread first.  Okay.  Count this as
> "+1 for "svnadmin setuuid [--generate]".
>
> By the way, in the CollabNet patch, I basically just made
> svn_fs_set_uuid() able to accept a NULL value which means, "I don't have
> a UUID to suggest; go generate one for me."

(Re)setting the UUID doesn't help the test suite at all. OK, we can use
hotcopy + reset UUID instead of svnadmin dump|load, but the majority of the
performance increase we get is from checking out only one working copy and
then doing a copy+relocate of it for each sandbox working copy.
That relocate only works when the UUID of the sandbox repo is identical to
that of the pristine repo.

Lieven


Re: [svn] x [fsfs] basic_tests.py failures

Posted by "C. Michael Pilato" <cm...@collab.net>.
C. Michael Pilato wrote:
> Lieven Govaerts wrote:
>> Malcolm Rowe wrote:
>>> On Wed, Nov 08, 2006 at 11:40:28PM +0100, Lieven Govaerts wrote:
>>>   
>>>> FYI: I get this issue with trunk when I'm running basic_tests parallel
>>>> in 10 processes over ra_dav (on windows):
>>>>
>>>> ..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
>>>> svn: Commit failed (details follow):
>>>> ..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
>>>> svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
>>>> ..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
>>>> svn: Cannot write to the prototype revision file of transaction '1-1'
>>>> because a previous representation is currently being written by this process
>>>> FAIL:  basic_tests.py 3: basic commit command
>>>>
>>>> This issue didn't show up during ra_local testing. I'll see if making
>>>> the UUID unique per test repository solves it.
>>>>
>>>>     
>>> It will - the intraprocess txn serialisation is keyed on (UUID, txn id),
>>> so this is an expected failure mode for the case where we attempt to access
>>> two different filesystems that have the same UUID.
>>>
>>> You'll also see serialisation in commits across all the filesystems
>>> as the write-lock is keyed on UUID alone :-)
>>>
>>> One thing I'd like to do is make it an error to open a filesystem with
>>> the same UUID as another (different) filesystem opened in the same library
>>> context.  But that rather requires us to fix the test suite first.
>>>   
>> To fix the test suite I need to roll back the hotcopy/checkout/copy
>> method of setting up the sandbox, back to the original (slower) svnadmin
>> dump|load/checkout method.
>>
>> I'll do that tomorrow.
> 
> Do we have a general need for an 'svnadmin reset-uuid'?  I've had to
> hack in such a thing for CollabNet's deployments, but there's nothing
> particularly unique about CollabNet's needs, and here's another
> opportunity to make use of such a thing.
> 

Doh!  I shoulda read the rest of the thread first.  Okay.  Count this as
"+1 for "svnadmin setuuid [--generate]".

By the way, in the CollabNet patch, I basically just made
svn_fs_set_uuid() able to accept a NULL value which means, "I don't have
a UUID to suggest; go generate one for me."

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: [svn] x [fsfs] basic_tests.py failures

Posted by "C. Michael Pilato" <cm...@collab.net>.
Lieven Govaerts wrote:
> Malcolm Rowe wrote:
>> On Wed, Nov 08, 2006 at 11:40:28PM +0100, Lieven Govaerts wrote:
>>   
>>> FYI: I get this issue with trunk when I'm running basic_tests parallel
>>> in 10 processes over ra_dav (on windows):
>>>
>>> ..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
>>> svn: Commit failed (details follow):
>>> ..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
>>> svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
>>> ..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
>>> svn: Cannot write to the prototype revision file of transaction '1-1'
>>> because a previous representation is currently being written by this process
>>> FAIL:  basic_tests.py 3: basic commit command
>>>
>>> This issue didn't show up during ra_local testing. I'll see if making
>>> the UUID unique per test repository solves it.
>>>
>>>     
>> It will - the intraprocess txn serialisation is keyed on (UUID, txn id),
>> so this is an expected failure mode for the case where we attempt to access
>> two different filesystems that have the same UUID.
>>
>> You'll also see serialisation in commits across all the filesystems
>> as the write-lock is keyed on UUID alone :-)
>>
>> One thing I'd like to do is make it an error to open a filesystem with
>> the same UUID as another (different) filesystem opened in the same library
>> context.  But that rather requires us to fix the test suite first.
>>   
> To fix the test suite I need to roll back the hotcopy/checkout/copy
> method of setting up the sandbox, back to the original (slower) svnadmin
> dump|load/checkout method.
> 
> I'll do that tomorrow.

Do we have a general need for an 'svnadmin reset-uuid'?  I've had to
hack in such a thing for CollabNet's deployments, but there's nothing
particularly unique about CollabNet's needs, and here's another
opportunity to make use of such a thing.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: [svn] x [fsfs] basic_tests.py failures

Posted by Daniel Rall <dl...@collab.net>.
On Wed, 08 Nov 2006, Garrett Rooney wrote:

> On 11/8/06, Malcolm Rowe <ma...@farside.org.uk> wrote:
> 
> >[1] It might be nice to have some easier way to manipulate repository
> >    UUIDs as well: svnsync is another instance that currently relies on
> >    that load trick to allow you to set up a mirror you can switch away
> >    from.
> >
> >    Perhaps we should introduce 'svnadmin setuuid <REPOS> [UUID]'?
> >    (If UUID wasn't specified, it would just pick a new random one.)
> 
> +1 to a setuuid command, suitably documented so people don't break
> things with it.

Yeah, this seems like a pretty reasonable administrative tool to have
in your bag.

Re: [svn] x [fsfs] basic_tests.py failures

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 11/8/06, Malcolm Rowe <ma...@farside.org.uk> wrote:

> [1] It might be nice to have some easier way to manipulate repository
>     UUIDs as well: svnsync is another instance that currently relies on
>     that load trick to allow you to set up a mirror you can switch away
>     from.
>
>     Perhaps we should introduce 'svnadmin setuuid <REPOS> [UUID]'?
>     (If UUID wasn't specified, it would just pick a new random one.)

+1 to a setuuid command, suitably documented so people don't break
things with it.

-garrett


Re: [svn] x [fsfs] basic_tests.py failures

Posted by Malcolm Rowe <ma...@farside.org.uk>.
On Thu, Nov 09, 2006 at 01:04:41AM +0100, Lieven Govaerts wrote:
> To fix the test suite I need to roll back the hotcopy/checkout/copy
> method of setting up the sandbox, back to the original (slower) svnadmin
> dump|load/checkout method.
> 

Or just adjust the UUID of the copied repositories (which you can do
with svnadmin load [1]).  The difficult bit would probably be generating
unique, valid, UUIDs, unless we're happy to use invalid ones (as Max
pointed out to me, you can actually have 'I like ponies' as a UUID if
you really want to, since we don't do anything other than string
comparison with it).
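
Generating unique, valid UUIDs is not hard from the Python test suite
itself. The following is only a minimal sketch using the standard uuid
module; the helper names and the sandbox repository names are hypothetical,
and nothing here writes the UUID into a repository.

```python
import uuid


def new_repository_uuid():
    """Return a fresh random (version 4) UUID in canonical text form."""
    return str(uuid.uuid4())


def is_valid_uuid(text):
    """Return True if text parses as a UUID, False otherwise."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


# One UUID per sandbox repository (hypothetical repository names).
sandbox_uuids = {"basic_tests-%d" % i: new_repository_uuid()
                 for i in range(1, 4)}
```

A string such as 'I like ponies' fails is_valid_uuid(), even though
Subversion itself would only ever compare it as a string.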

[1] It might be nice to have some easier way to manipulate repository
    UUIDs as well: svnsync is another instance that currently relies on
    that load trick to allow you to set up a mirror you can switch away
    from.

    Perhaps we should introduce 'svnadmin setuuid <REPOS> [UUID]'?
    (If UUID wasn't specified, it would just pick a new random one.)

Regards,
Malcolm

Re: [svn] x [fsfs] basic_tests.py failures

Posted by Lieven Govaerts <sv...@mobsol.be>.
Malcolm Rowe wrote:
> On Wed, Nov 08, 2006 at 11:40:28PM +0100, Lieven Govaerts wrote:
>   
>> FYI: I get this issue with trunk when I'm running basic_tests parallel
>> in 10 processes over ra_dav (on windows):
>>
>> ..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
>> svn: Commit failed (details follow):
>> ..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
>> svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
>> ..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
>> svn: Cannot write to the prototype revision file of transaction '1-1'
>> because a previous representation is currently being written by this process
>> FAIL:  basic_tests.py 3: basic commit command
>>
>> This issue didn't show up during ra_local testing. I'll see if making
>> the UUID unique per test repository solves it.
>>
>>     
>
> It will - the intraprocess txn serialisation is keyed on (UUID, txn id),
> so this is an expected failure mode for the case where we attempt to access
> two different filesystems that have the same UUID.
>
> You'll also see serialisation in commits across all the filesystems
> as the write-lock is keyed on UUID alone :-)
>
> One thing I'd like to do is make it an error to open a filesystem with
> the same UUID as another (different) filesystem opened in the same library
> context.  But that rather requires us to fix the test suite first.
>   
To fix the test suite I need to roll back the hotcopy/checkout/copy
method of setting up the sandbox, back to the original (slower) svnadmin
dump|load/checkout method.

I'll do that tomorrow.

Lieven


Re: [svn] x [fsfs] basic_tests.py failures

Posted by Malcolm Rowe <ma...@farside.org.uk>.
On Wed, Nov 08, 2006 at 11:40:28PM +0100, Lieven Govaerts wrote:
> FYI: I get this issue with trunk when I'm running basic_tests parallel
> in 10 processes over ra_dav (on windows):
> 
> ..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
> svn: Commit failed (details follow):
> ..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
> svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
> ..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
> svn: Cannot write to the prototype revision file of transaction '1-1'
> because a previous representation is currently being written by this process
> FAIL:  basic_tests.py 3: basic commit command
> 
> This issue didn't show up during ra_local testing. I'll see if making
> the UUID unique per test repository solves it.
> 

It will - the intraprocess txn serialisation is keyed on (UUID, txn id),
so this is an expected failure mode for the case where we attempt to access
two different filesystems that have the same UUID.
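
The collision can be modelled in a few lines. This is an illustrative
Python sketch only, not the libsvn_fs_fs code (which is C): the names
TxnRegistry, begin_write and RepBeingWritten are hypothetical, and the
UUID values are made up.

```python
class RepBeingWritten(Exception):
    """Hypothetical stand-in for SVN_ERR_FS_REP_BEING_WRITTEN."""


class TxnRegistry:
    """Process-wide table of transactions currently being written.

    Entries are keyed on (uuid, txn_id) only. The repository path is
    deliberately not part of the key, as described in the thread.
    """

    def __init__(self):
        self._being_written = set()

    def begin_write(self, uuid, txn_id):
        key = (uuid, txn_id)
        if key in self._being_written:
            raise RepBeingWritten(
                "transaction %r is already being written by this process"
                % txn_id)
        self._being_written.add(key)

    def end_write(self, uuid, txn_id):
        self._being_written.discard((uuid, txn_id))


def try_write(registry, uuid, txn_id):
    """Return True if the write could start, False if it was refused."""
    try:
        registry.begin_write(uuid, txn_id)
    except RepBeingWritten:
        return False
    return True


registry = TxnRegistry()
SHARED_UUID = "00000000-0000-0000-0000-000000000001"  # made-up value

# Two different repositories copied from the same pristine repository
# share a UUID, and both call their first transaction '1-1'.
first = try_write(registry, SHARED_UUID, "1-1")    # e.g. basic_tests-2
second = try_write(registry, SHARED_UUID, "1-1")   # e.g. basic_tests-3
# With a distinct UUID the same transaction id no longer collides.
third = try_write(registry, "00000000-0000-0000-0000-000000000002", "1-1")
```

In this model the second write is refused although it targets a different
repository, while the third succeeds because its UUID differs.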

You'll also see serialisation in commits across all the filesystems
as the write-lock is keyed on UUID alone :-)

One thing I'd like to do is make it an error to open a filesystem with
the same UUID as another (different) filesystem opened in the same library
context.  But that rather requires us to fix the test suite first.

Regards,
Malcolm

Re: [svn] x [fsfs] basic_tests.py failures

Posted by Lieven Govaerts <sv...@mobsol.be>.
Paul Burba wrote:
> In IRC:
>
> [10:57] pburba: On trunk I'm seeing many [svn] x [fsfs] failures in
> basic_tests.py on Windows
> [10:57] pburba: Specifically I'm seeing this error on a commit attempt:
> [10:57] pburba: svn: Cannot write to the prototype revision file of 
> transaction '1-1' because a previous representation is currently being 
> written by this process
FYI: I get this issue with trunk when I'm running basic_tests parallel
in 10 processes over ra_dav (on windows):

..\..\..\subversion\libsvn_client\commit.c:866: (apr_err=160044)
svn: Commit failed (details follow):
..\..\..\subversion\libsvn_ra_dav\util.c:895: (apr_err=160044)
svn: MERGE request failed on '/svn-test-work/repositories/basic_tests-3/A'
..\..\..\subversion\libsvn_ra_dav\util.c:385: (apr_err=160044)
svn: Cannot write to the prototype revision file of transaction '1-1'
because a previous representation is currently being written by this process
FAIL:  basic_tests.py 3: basic commit command

This issue didn't show up during ra_local testing. I'll see if making
the UUID unique per test repository solves it.

Lieven


Re: [svn] x [fsfs] basic_tests.py failures

Posted by David James <dj...@collab.net>.
On 10/13/06, Malcolm Rowe <ma...@farside.org.uk> wrote:
> On Thu, Oct 12, 2006 at 12:56:11PM -0400, Paul Burba wrote:
> > 2) Somehow the problem is tied to the basic_corruption test.  If that test
> > is not run all the other tests pass.
> >
> This is what I'm fairly sure is happening:
>
> [snip theories]
>
> I can reproduce this now by running svnserve in threaded mode on Linux.

It turns out that this problem was caused by an incorrect pointer
dereference. Malcolm fixed this in r21924.

Cheers,

David


Re: [svn] x [fsfs] basic_tests.py failures

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 10/13/06, Malcolm Rowe <ma...@farside.org.uk> wrote:

> 3. Having said that, the thing that's really causing a problem (and
> causing all the remaining tests to fail) is that the peculiar method we're
> using to generate our repositories in the test suite means that all the
> repositories we open have the same UUID.  This is actually quite bad -
> the FSFS code at least uses UUID uniqueness to track per-repository
> information (like the fs-wide write lock and the per-txn lock you've
> encountered here), so all the transactions have the same UUID/txnid.

Yes, I think we should look into making the repositories in the tests
use different UUIDs...

-garrett


Re: [svn] x [fsfs] basic_tests.py failures

Posted by Malcolm Rowe <ma...@farside.org.uk>.
On Thu, Oct 12, 2006 at 12:56:11PM -0400, Paul Burba wrote:
> 2) Somehow the problem is tied to the basic_corruption test.  If that test 
> is not run all the other tests pass.
> 

This is what I'm fairly sure is happening:

1. Whatever the basic corruption test ends up doing on the server side
with the first commit is leaving us with a transaction that we've
started to write to, but then neither finished writing nor aborted,
which is what the FSFS fix was designed to prevent further updates to.

2. There does seem to be a bug in the transaction-locking code that
basic_corruption is hitting, since the second attempt to commit is
getting killed because we think that a transaction with the same ID is
still active (it may well have an open file handle to the proto-rev file
from the first attempt, but I bet we've blown away the transaction
directory by this point, so it is supposed to have forgotten about the
transaction).

3. Having said that, the thing that's really causing a problem (and
causing all the remaining tests to fail) is that the peculiar method we're
using to generate our repositories in the test suite means that all the
repositories we open have the same UUID.  This is actually quite bad -
the FSFS code at least uses UUID uniqueness to track per-repository
information (like the fs-wide write lock and the per-txn lock you've
encountered here), so all the transactions have the same UUID/txnid.

And why doesn't this trigger on Cygwin or Linux?  Because svnserve on
those OSes uses the forking model rather than the threaded model by default,
so the fcntl() inter-process lock comes into play rather than the
intra-process lock (keyed on UUID/txnid) - and the inter-process lock works
just fine in this situation.
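
To illustrate why a file-based lock is unaffected by shared UUIDs, here is
a small Unix-only Python sketch. It is an assumption-laden model, not
Subversion code: the helper lock_repository, the lock file name and the
repository directory names are all illustrative.

```python
import fcntl
import os
import tempfile


def lock_repository(repo_dir):
    """Take an exclusive, non-blocking fcntl() lock on a per-repo lock file.

    The file name 'write-lock' is illustrative. Returns the open file
    object; the lock is held until that object is closed.
    """
    lock_file = open(os.path.join(repo_dir, "write-lock"), "w")
    fcntl.lockf(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return lock_file


with tempfile.TemporaryDirectory() as root:
    repo_a = os.path.join(root, "basic_tests-2")
    repo_b = os.path.join(root, "basic_tests-3")
    os.mkdir(repo_a)
    os.mkdir(repo_b)

    # Both repositories could carry the same UUID; the file locks still
    # do not interfere, because each lock belongs to a different file.
    lock_a = lock_repository(repo_a)
    lock_b = lock_repository(repo_b)
    both_locked = not lock_a.closed and not lock_b.closed
    lock_a.close()
    lock_b.close()
```

The lock here is tied to a file inside each repository directory, so the
UUID never enters into it.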

I can reproduce this now by running svnserve in threaded mode on Linux.

So:

1. We should look at why the second commit in basic_corruption is
failing (or rather, why the transaction hasn't been forgotten when we
purged the transaction directory from the first, aborted, commit).  I
might be able to do that sometime before the end of the Summit.

2. We should re-set the UUID on the new repositories we generate in the
test suite immediately after we hotcopy them.  Though offhand, I'm not
sure how we'd achieve this.

3. I think we should actively look to prevent opening 'different'
repositories with the same UUID, since we are (and have always been)
using the UUID to determine filesystem identity in FSFS.  Is there a
reliable way to determine if two paths point to the same file in APR?
(it exposes an inode, doesn't it? does anyone know if that works on
Windows and for NFS-mounted files?)  If so, we could fail attempts to
open a filesystem that had the same UUID and different inode to an
already-open filesystem.  I _think_ this approach is okay w.r.t.
backwards compatibility - anyone have any objections?

4. It would be nice for the client to abort the transaction for that
failing first commit in basic_corruption, if it's not already, and if
it's not too hard to do.

5. I should have tested using svnserve in threaded mode.
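
The check proposed in point 3 could look roughly like the sketch below. It
is written in Python with os.stat() rather than APR, so it says nothing
about whether APR's inode information is reliable on Windows or NFS; the
class and exception names are hypothetical.

```python
import os
import tempfile


class DuplicateUuidError(Exception):
    """Hypothetical error for a second filesystem reusing an open UUID."""


class OpenFilesystems:
    """Remember which on-disk filesystem each UUID was first opened from."""

    def __init__(self):
        self._identity_by_uuid = {}

    @staticmethod
    def _identity(path):
        # (device, inode) identifies a file regardless of path spelling.
        st = os.stat(path)
        return (st.st_dev, st.st_ino)

    def open(self, uuid, path):
        identity = self._identity(path)
        known = self._identity_by_uuid.setdefault(uuid, identity)
        if known != identity:
            raise DuplicateUuidError(
                "UUID %s is already open from a different filesystem" % uuid)


with tempfile.TemporaryDirectory() as root:
    repo_a = os.path.join(root, "repo-a")
    repo_b = os.path.join(root, "repo-b")
    os.mkdir(repo_a)
    os.mkdir(repo_b)

    opened = OpenFilesystems()
    opened.open("shared-uuid", repo_a)
    # Re-opening the same directory via another spelling is accepted.
    opened.open("shared-uuid", os.path.join(repo_a, "."))
    try:
        opened.open("shared-uuid", repo_b)
        rejected = False
    except DuplicateUuidError:
        rejected = True
```

Only the third open is refused: it presents the same UUID from a
directory with a different (device, inode) pair.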

Regards,
Malcolm