Posted to dev@subversion.apache.org by "Peter N. Lundblad" <pe...@famlundblad.se> on 2005/04/05 09:46:52 UTC

attn: ghudson; FSFS and multi-threading on Unix broken?

Hi,

As D.J. Heap pointed out some days ago, FSFS may be broken in the
multithreaded case on Unix. As it turns out, the write lock in FSFS is
implemented using fcntl (if available). According to POSIX, locks created
by fcntl are per-process. So two threads in the same process can both hold
the write lock at once, and the code that relies on it for mutual exclusion
gets none.

(For people not familiar with FSFS internals, this is code used during the
final phase of a commit - and, from 1.2 onwards, the code that changes the
on-disk data structures for file locks.)

I'll experiment and try to demonstrate that this bug exists in practice,
but I don't see how it could not be a bug.

Note that I, at least, haven't heard of any problems related to this
possible bug. This may be because not many users use FSFS in multiple
threads simultaneously (I think people normally use svnserve in fork mode,
at least on Unix). Also, like all races, it is probably hard to reproduce
reliably.

So then comes the question of how to solve this... We obviously can't change
the interprocess locking scheme used in FSFS. That would break compatibility,
and the current scheme is portable and works on NFS and all that. What we
might be able to do is introduce an intra-process mutex, which gets
acquired before the file is locked (and released after the file is
unlocked). This works since we only need to add serialization inside
processes.
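
A minimal sketch of the mutex-around-the-file-lock idea, assuming POSIX
threads and a raw fcntl lock rather than APR. The names
fs_write_lock_acquire and fs_write_lock_release are hypothetical and not
part of Subversion. The mutex is statically initialized here, which a
pthread mutex allows but which does not settle how a mutex created at run
time would get initialized.

```c
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical process-wide mutex guarding the file lock. */
static pthread_mutex_t write_lock_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Serialize threads first, then take the inter-process fcntl lock.
   Returns 0 on success, -1 on failure (with the mutex released). */
int fs_write_lock_acquire(int fd)
{
  struct flock fl = { 0 };

  if (pthread_mutex_lock(&write_lock_mutex) != 0)
    return -1;

  fl.l_type = F_WRLCK;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;  /* 0 means "to end of file", i.e. the whole file */

  if (fcntl(fd, F_SETLKW, &fl) == -1)
    {
      pthread_mutex_unlock(&write_lock_mutex);
      return -1;
    }
  return 0;
}

/* Drop the file lock first, then let the next thread in.
   Returns 0 on success, -1 if the unlock call failed. */
int fs_write_lock_release(int fd)
{
  struct flock fl = { 0 };
  int rc;

  fl.l_type = F_UNLCK;
  fl.l_whence = SEEK_SET;
  rc = fcntl(fd, F_SETLK, &fl);
  pthread_mutex_unlock(&write_lock_mutex);
  return rc == -1 ? -1 : 0;
}
```

A caller would bracket the commit finalization with the two functions.
This only helps if every thread in the process goes through the same
mutex.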

Déjà vu? Yes. See svn_utf_initialize. And the timing is much the same
with regard to releases... I mean, we need a way to get this mutex
initialized...

We can add svn_fs_initialize, which could be called if you want this bug
fixed. ;) Note that I don't like it, but at least it doesn't make the
situation worse for pre-1.2 libsvn_fs users. If this were called, FS module
initialization could be serialized. Then FSFS could initialize a hash of
mutexes, one for each FS UUID or something.
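
One possible shape for the per-filesystem mutexes, sketched with pthreads
and a linked list instead of an APR hash. Everything here
(fs_mutex_for_uuid, the entry structure, the registry lock) is hypothetical
and only illustrates keying a mutex by UUID. The registry's own lock is
statically initialized, which is the step that would need an
initialization call if the mutex had to be created at run time.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One entry per filesystem UUID seen so far (hypothetical structure). */
struct fs_mutex_entry
{
  char *uuid;
  pthread_mutex_t mutex;
  struct fs_mutex_entry *next;
};

static pthread_mutex_t registry_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct fs_mutex_entry *registry = NULL;

/* Return the mutex for UUID, creating it on first use.  Returns NULL if
   allocation or mutex creation fails.  Entries are never freed. */
pthread_mutex_t *fs_mutex_for_uuid(const char *uuid)
{
  struct fs_mutex_entry *e;
  pthread_mutex_t *result = NULL;

  pthread_mutex_lock(&registry_mutex);

  /* Look for an existing entry with this UUID. */
  for (e = registry; e != NULL; e = e->next)
    if (strcmp(e->uuid, uuid) == 0)
      break;

  if (e == NULL)
    {
      size_t len = strlen(uuid) + 1;

      e = malloc(sizeof(*e));
      if (e != NULL)
        {
          e->uuid = malloc(len);
          if (e->uuid == NULL || pthread_mutex_init(&e->mutex, NULL) != 0)
            {
              free(e->uuid);
              free(e);
              e = NULL;
            }
          else
            {
              memcpy(e->uuid, uuid, len);
              e->next = registry;
              registry = e;
            }
        }
    }

  if (e != NULL)
    result = &e->mutex;
  pthread_mutex_unlock(&registry_mutex);
  return result;
}
```

Two threads that open the same filesystem get the same mutex back, while
commits to unrelated filesystems do not wait on each other.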

As you see, I haven't fleshed out the resolution proposal very much. But I
want people to be aware of the possibility that we have a serious FSFS
(dataloss?) bug. Please someone (ghudson?), tell me that this is a false
alarm! :-)

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Philip Martin <ph...@codematters.co.uk>.
Greg Hudson <gh...@MIT.EDU> writes:

> On Tue, 2005-04-05 at 05:46, Peter N. Lundblad wrote:
>> As you see, I haven't fleshed out the resolution proposal very much. But I
>> want people to be aware of the possibility that we have a serious FSFS
>> (dataloss?) bug. Please someone (ghudson?), tell me that this is a false
>> alarm! :-)
>
> I don't think it's a false alarm.  But as you noted, it's rare to use
> svnserve in threaded mode under Unix.  Under Windows, we know you can
> block against a lock you already hold (see recent deadlock issues), so
> locks are effectively per-thread.  And I think it's currently unheard of
> for a third-party multithreaded Unix program to be interested in more
> than the network client part of the Subversion libraries.

The problem won't show up on older Linux systems because in the
LinuxThreads library fcntl was a per-thread lock, i.e. it had
non-POSIX behaviour.  Newer Linux systems (generally kernel 2.6) use
NPTL and they can show the problem.

I believe I can trigger it by using stress.pl and svnserve in thread
mode.  I ran the server on my Linux-2.6 laptop:

$ svnserve -Tdr.

and three clients on my Linux-2.4 desktop:

$ stress.pl -i1 -d -s0 -Usvn://debian1/repostress
$ stress.pl -i2 -d -s0 -Usvn://debian1/repostress
$ stress.pl -i3 -d -s0 -Usvn://debian1/repostress

One of the clients failed with:

Updated to revision 51.
Committing:
Sending        wcstress.5742/trunk/bar1/foo1
Sending        wcstress.5742/trunk/bar1/foo2
Sending        wcstress.5742/trunk/foo1
Sending        wcstress.5742/trunk/foo2
Transmitting file data ....../svn/subversion/libsvn_client/commit.c:781: (apr_err=160028)
svn: Commit failed (details follow):
../svn/subversion/libsvn_repos/commit.c:120: (apr_err=160028)
svn: Out of date: '/trunk/bar1/foo1' in transaction '52-1'
Updating:
../svn/subversion/libsvn_wc/update_editor.c:1609: (apr_err=155017)
svn: Checksum mismatch for 'wcstress.5742/trunk/.svn/text-base/foo1.svn-base'; expected: 'eed54c9b53e24e2ff953797ea74d7dca', actual: '9a0a1ec9dd62dfc56be947ba36df0ede'
unexpected update fail: exit status: 256

The wc diff looks as expected:

$ svn diff wcstress.5742/trunk/foo1 
Index: wcstress.5742/trunk/foo1
===================================================================
--- wcstress.5742/trunk/foo1    (revision 51)
+++ wcstress.5742/trunk/foo1    (working copy)
@@ -1,7 +1,7 @@
 A0
 0
 A1
-1,1,3,4,5,6,7,9,1,3,4,5,6,7,9
+1,1,3,4,5,6,7,9,1,3,4,5,6,7,9,10
 A2
 2,1,3,4,5,6,7,1,3,4,5,6,7,9,10,12
 A3

But comparing against the repository I get

$ svn diff -r50 wcstress.5742/trunk/foo1 
../svn/subversion/libsvn_delta/text_delta.c:594: (apr_err=200003)
svn: Delta source ended unexpectedly

and doing an explicit cat/diff I see a missing modification:

$ svn cat -r51 svn://debian1/repostress/trunk/foo1 > xx
$ diff -u wcstress.5742/trunk/foo1 xx
--- wcstress.5742/trunk/foo1    Tue Apr  5 17:20:08 2005
+++ xx  Tue Apr  5 17:31:37 2005
@@ -1,11 +1,11 @@
 A0
 0
 A1
-1,1,3,4,5,6,7,9,1,3,4,5,6,7,9,10
+1,1,3,4,5,6,7,9,1,3,4,5,6,7
 A2
 2,1,3,4,5,6,7,1,3,4,5,6,7,9,10,12
 A3
-3,1,1,3,4,5,6,7,9,10
+3,1,1,3,4,5,6,7,9,10,12
 A4
 4
 A5

I assume Apache using the worker MPM will have the same problem.

-- 
Philip Martin


Re: FSFS and multi-threading on Unix broken?

Posted by Mark Phippard <Ma...@softlanding.com>.
"Peter N. Lundblad" <pe...@famlundblad.se> wrote on 04/05/2005 05:18:44 PM:

> > Do we have a way of calling svn_fs_initialize from mod_dav_svn?  We
> > don't seem to call svn_utf_initialize there.
> >
> Someone who knows about apache could probably say something here. Not sure
> if dav_svn_init qualifies. "Called once in a single threaded environment."

I seem to recall a similar discussion took place around BDB a few months 
back when improvements were being discussed.  I thought the conclusion was 
that this would be impossible to do from DAV and that was one of the 
reasons it got pushed into BDB.  The issues could be different though.

> A problem with this fix is that it won't be trivial. So it might be risky
> for 1.2. OTOH, as you point out, we want it in 1.2 also.

While this may be a bit premature until the proper fix is discovered, 
given that we are talking about dataloss, I would hope that a backport to 
1.1.x would also be considered.

Mark


_____________________________________________________________________________
Scanned for SoftLanding Systems, Inc. by IBM Email Security Management Services powered by MessageLabs. 
_____________________________________________________________________________


Re: FSFS and multi-threading on Unix broken?

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Tue, 5 Apr 2005, Greg Hudson wrote:

> On Tue, 2005-04-05 at 15:21, Peter N. Lundblad wrote:
> Because this problem could affect anyone using a DAV server on Unix, I
> don't think we can defer the problem to APR.  Also, since APR has built
> this global mutex thing which looks like a solution but isn't, they
> might be resistant to adding yet another inter-process/inter-thread
> locking mechanism.  So we'll need to do something in Subversion,
> probably along the lines of an svn_fs_initialize like you suggested.
>
> Unlike the situation with UTF-8, if we don't get an svn_fs_initialize()
> call, we can't sacrifice performance for correctness as far as I know.
> But, we can make a best effort:
>
>   if (!static_variable)  {
>     static_variable = 1;
>     create static_mutex;
>   }
>
> If this seems bad, note that this is how neon protects SSL
> initialization.  The window of failure is very small.  And, of course,
> if you do call svn_fs_initialize before going threaded, it would create
> the mutex and set the variable and there would be no failure window.
>
Yeah, this is ugly, but maybe it is acceptable to be pragmatic here. We've
been through this initialize-once stuff before.

> Do we have a way of calling svn_fs_initialize from mod_dav_svn?  We
> don't seem to call svn_utf_initialize there.
>
Someone who knows about apache could probably say something here. Not sure
if dav_svn_init qualifies. "Called once in a single threaded environment."

A problem with this fix is that it won't be trivial. So it might be risky
for 1.2. OTOH, as you point out, we want it in 1.2 also.

Regards,
//Peter


Re: FSFS and multi-threading on Unix broken?

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-04-05 at 15:21, Peter N. Lundblad wrote:
> But won't we have the same initialization problems as I sketched in my
> original mail? A global_mutex creates a thread mutex internally, so you
> have to use the same global_mutex for each thread for this to work.

You're right; the solution isn't as simple as I thought.  I was
imagining that APR's global mutexes were really "global" like the name
implies--that the provided pathname identifies a lock within a namespace
common to all threads and processes.  But it doesn't work like that; the
pathname is only used for inter-process locking and is only used by some
inter-process locking mechanisms.

>  Then,
> is there really a benefit to using this global_mutex instead of just
> wrapping the apr_lock_file with locking/unlocking a mutex with thread
> scope?

Probably not much of one, since we can't use a global mutex on Windows.

So, my revised position:

Because this problem could affect anyone using a DAV server on Unix, I
don't think we can defer the problem to APR.  Also, since APR has built
this global mutex thing which looks like a solution but isn't, they
might be resistant to adding yet another inter-process/inter-thread
locking mechanism.  So we'll need to do something in Subversion,
probably along the lines of an svn_fs_initialize like you suggested.

Unlike the situation with UTF-8, if we don't get an svn_fs_initialize()
call, we can't sacrifice performance for correctness as far as I know. 
But, we can make a best effort:

  if (!static_variable)  {
    static_variable = 1;
    create static_mutex;
  }

If this seems bad, note that this is how neon protects SSL
initialization.  The window of failure is very small.  And, of course,
if you do call svn_fs_initialize before going threaded, it would create
the mutex and set the variable and there would be no failure window.
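
The pseudocode above could look roughly like this in C with pthreads. The
names fs_initialize and fs_get_mutex are hypothetical stand-ins
(fs_initialize plays the role of the proposed svn_fs_initialize), and the
flag is set after the mutex is created rather than before, which is a
choice of this sketch.

```c
#include <pthread.h>

static int mutex_initialized = 0;   /* the "static_variable" */
static pthread_mutex_t fs_mutex;    /* the "static_mutex" */

/* Hypothetical stand-in for svn_fs_initialize().  Called once before any
   thread is started, it leaves no failure window.  Returns 0 on success. */
int fs_initialize(void)
{
  if (!mutex_initialized)
    {
      if (pthread_mutex_init(&fs_mutex, NULL) != 0)
        return -1;
      mutex_initialized = 1;
    }
  return 0;
}

/* Best-effort path for callers that never called fs_initialize().
   This is NOT thread-safe: two threads making the very first call at the
   same moment can both see the flag unset and both create the mutex. */
pthread_mutex_t *fs_get_mutex(void)
{
  if (fs_initialize() != 0)
    return NULL;
  return &fs_mutex;
}
```

If fs_initialize() runs before the program goes threaded, fs_get_mutex()
never takes the racy path. With raw pthreads, pthread_once would also close
the window; the sketch keeps the flag to mirror the proposal.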

Do we have a way of calling svn_fs_initialize from mod_dav_svn?  We
don't seem to call svn_utf_initialize there.



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Tue, 5 Apr 2005, Greg Hudson wrote:

> I'm guessing what we'll wind up doing is using apr_file_lock() under
> Windows, and a global mutex with the fcntl mechanism specified under
> Unix.  But obviously APR is not providing quite what we want here.
>
But won't we have the same initialization problems as I sketched in my
original mail? A global_mutex creates a thread mutex internally, so you
have to use the same global_mutex for each thread for this to work. Then,
is there really a benefit to using this global_mutex instead of just
wrapping the apr_lock_file with locking/unlocking a mutex with thread
scope?

Regards,
//Peter


Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Sander Striker <s....@striker.nl>.
Greg Hudson wrote:
> On Tue, 2005-04-05 at 11:57, Ryan Bloom wrote:
> 
>>ok, the global_mutex (that's the real name), allows you to specify the
>>lock mechanism, just as apr_process_lock does, so that is easily
>>resolved.
> 
> 
> Okay, that's much closer to what we want.
> 
> But we don't want Subversion to be in the business of picking a locking
> mechanism.  That's why we used apr_file_lock() in the first place.  We
> want APR to choose a mechanism, but we want it to work over network
> filesystems.  If we ask APR to choose the default mechanism, it looks
> like APR on Unix will choose flock/sysv/fcntl/pthreads/semaphores in
> that order, which will work/fail/work/fail/fail respectively.
> 
> Worse, on Windows, it appears the locking mechanism is ignored (!) and a
> mutex is used in all cases.  I can only assume that this mutex is
> machine-specific.  We want to continue using LockFileEx() under Windows.
> 
> I'm guessing what we'll wind up doing is using apr_file_lock() under
> Windows, and a global mutex with the fcntl mechanism specified under
> Unix.  But obviously APR is not providing quite what we want here.
> 
> (I'm looking at APR_0_9_BRANCH if that's relevant.)

Meta reply: Greg, you are aware that apr is in SVN nowadays?

  http://svn.apache.org/repos/asf/apr/apr/branches/0.9.x/


Sander




Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-04-05 at 11:57, Ryan Bloom wrote:
> ok, the global_mutex (that's the real name), allows you to specify the
> lock mechanism, just as apr_process_lock does, so that is easily
> resolved.

Okay, that's much closer to what we want.

But we don't want Subversion to be in the business of picking a locking
mechanism.  That's why we used apr_file_lock() in the first place.  We
want APR to choose a mechanism, but we want it to work over network
filesystems.  If we ask APR to choose the default mechanism, it looks
like APR on Unix will choose flock/sysv/fcntl/pthreads/semaphores in
that order, which will work/fail/work/fail/fail respectively.

Worse, on Windows, it appears the locking mechanism is ignored (!) and a
mutex is used in all cases.  I can only assume that this mutex is
machine-specific.  We want to continue using LockFileEx() under Windows.

I'm guessing what we'll wind up doing is using apr_file_lock() under
Windows, and a global mutex with the fcntl mechanism specified under
Unix.  But obviously APR is not providing quite what we want here.

(I'm looking at APR_0_9_BRANCH if that's relevant.)



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Ryan Bloom <rb...@gmail.com>.
ok, the global_mutex (that's the real name), allows you to specify the
lock mechanism, just as apr_process_lock does, so that is easily
resolved.

Ryan

On Apr 5, 2005 11:45 AM, Greg Hudson <gh...@mit.edu> wrote:
> On Tue, 2005-04-05 at 11:32, Ryan Bloom wrote:
> > I'm confused.  APR already has the ability to have a per-thread lock.
> > I believe it is called a global_lock.  The way this is implemented
> > depends on the platform, but for Linux I am pretty sure it is a fcntl
> > lock for inter-process and a pthread lock for intra-process.  For AIX,
> > it is a cross-process pthread lock.
> 
> A cross-process pthread lock would be specific to a particular machine.
> FSFS has to work on a network filesystem.
> 
> 


-- 
Ryan Bloom
rbb@apache.org
rbb@rkbloom.net
rbloom@gmail.com


Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-04-05 at 11:32, Ryan Bloom wrote:
> I'm confused.  APR already has the ability to have a per-thread lock. 
> I believe it is called a global_lock.  The way this is implemented
> depends on the platform, but for Linux I am pretty sure it is a fcntl
> lock for inter-process and a pthread lock for intra-process.  For AIX,
> it is a cross-process pthread lock.

A cross-process pthread lock would be specific to a particular machine. 
FSFS has to work on a network filesystem.



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Ryan Bloom <rb...@gmail.com>.
I'm confused.  APR already has the ability to have a per-thread lock. 
I believe it is called a global_lock.  The way this is implemented
depends on the platform, but for Linux I am pretty sure it is a fcntl
lock for inter-process and a pthread lock for intra-process.  For AIX,
it is a cross-process pthread lock.

There was a lot of work done in APR to allow the global lock code to
work properly on all platforms, so SVN should probably just migrate to
that API.

Ryan

On Apr 5, 2005 11:12 AM, Greg Hudson <gh...@mit.edu> wrote:
> On Tue, 2005-04-05 at 05:46, Peter N. Lundblad wrote:
> > As you see, I haven't fleshed out the resolution proposal very much. But I
> > want people to be aware of the possibility that we have a serious FSFS
> > (dataloss?) bug. Please someone (ghudson?), tell me that this is a false
> > alarm! :-)
> 
> I don't think it's a false alarm.  But as you noted, it's rare to use
> svnserve in threaded mode under Unix.  Under Windows, we know you can
> block against a lock you already hold (see recent deadlock issues), so
> locks are effectively per-thread.  And I think it's currently unheard of
> for a third-party multithreaded Unix program to be interested in more
> than the network client part of the Subversion libraries.
> 
> I would like to see the fix happen in APR.  Subversion is not unusual in
> desiring a per-thread file lock; I know krb5 has run into this issue as
> well.  And APR already has an initialization function, so it doesn't
> have that problem.
> 
> One option for the new APR interface is simply a lock which is
> guaranteed to block even if the lock is already held within the current
> process--that is, the Windows apr_file_lock() behavior, but on all
> platforms.  A more complex notion would be to create the idea of a
> locking context; the new lock interface would not block the current
> thread if the lock is held within the given locking context (it would
> act as a recursive lock), but would block if the lock is held within a
> different context in the same thread.  You can see how this would be
> nice for the FSFS locking code--no more "I already have the lock" flag
> for lock auto-expiry--but it's arguably doing too much of the
> application's job within the APR library.
> 
> 
> 
> 


-- 
Ryan Bloom
rbb@apache.org
rbb@rkbloom.net
rbloom@gmail.com


Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Mark Benedetto King <mb...@lowlatency.com>.
On Tue, Apr 05, 2005 at 11:51:38AM -0400, Mark Phippard wrote:
> Greg Hudson <gh...@MIT.EDU> wrote on 04/05/2005 11:48:52 AM:
> 
> > It uses apr_file_lock().  I have no idea how this is implemented under
> > OS/400.
> 
> Neither do I, but IBM provides support so I will have to call it in.  I 
> guess I would want to ask them whether the lock is scoped to the thread or 
> the process?
> 
> Thanks
> 
> Mark
> 

You could write a short test program that spins up two threads, each
of which tries to acquire a lock, prints success, and then sleeps.

If you get two success messages, you've got a problem.

I'd recommend that you do this regardless of what IBM support tells you.
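
Such a test program might look like the sketch below, written against raw
fcntl and pthreads rather than apr_file_lock (an assumption; an APR or
OS/400 build would need its own variant). The function name
count_simultaneous_locks is made up for illustration. Each thread opens the
file itself, makes a non-blocking lock attempt, and keeps the lock until
both threads have tried.

```c
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

#define NTHREADS 2

/* Shared state for the probe (hypothetical helper structure). */
struct probe
{
  const char *path;
  pthread_mutex_t mu;
  pthread_cond_t cv;
  int attempted;   /* threads that have made their lock attempt */
  int succeeded;   /* threads whose attempt was granted */
};

static void *try_lock(void *arg)
{
  struct probe *p = arg;
  struct flock fl = { 0 };
  int fd = open(p->path, O_RDWR);
  int got = 0;

  if (fd >= 0)
    {
      fl.l_type = F_WRLCK;
      fl.l_whence = SEEK_SET;  /* l_start = l_len = 0: the whole file */
      got = (fcntl(fd, F_SETLK, &fl) == 0);  /* non-blocking attempt */
    }

  /* Report the result, then keep the lock until everyone has tried. */
  pthread_mutex_lock(&p->mu);
  p->attempted++;
  p->succeeded += got;
  pthread_cond_broadcast(&p->cv);
  while (p->attempted < NTHREADS)
    pthread_cond_wait(&p->cv, &p->mu);
  pthread_mutex_unlock(&p->mu);

  if (fd >= 0)
    close(fd);
  return NULL;
}

/* Returns how many threads held a write lock on PATH at the same time,
   or -1 on setup failure. */
int count_simultaneous_locks(const char *path)
{
  struct probe p;
  pthread_t tid[NTHREADS];
  int i, started = 0;

  p.path = path;
  p.attempted = 0;
  p.succeeded = 0;
  if (pthread_mutex_init(&p.mu, NULL) != 0)
    return -1;
  if (pthread_cond_init(&p.cv, NULL) != 0)
    {
      pthread_mutex_destroy(&p.mu);
      return -1;
    }

  for (i = 0; i < NTHREADS; i++)
    {
      if (pthread_create(&tid[i], NULL, try_lock, &p) != 0)
        break;
      started++;
    }

  if (started < NTHREADS)
    {
      /* Let any thread that is already waiting go. */
      pthread_mutex_lock(&p.mu);
      p.attempted = NTHREADS;
      pthread_cond_broadcast(&p.cv);
      pthread_mutex_unlock(&p.mu);
    }

  for (i = 0; i < started; i++)
    pthread_join(tid[i], NULL);

  pthread_cond_destroy(&p.cv);
  pthread_mutex_destroy(&p.mu);
  return started == NTHREADS ? p.succeeded : -1;
}
```

A return value of 2 means both threads held the write lock at once, which
is what POSIX per-process fcntl semantics would produce. A value of 1 means
the lock excluded the second thread.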

--ben



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Tue, 5 Apr 2005, Greg Hudson wrote:

> On Tue, 2005-04-05 at 11:44, Mark Phippard wrote:
> > What would be the easiest way to try to create the problem to see if it
> > happens?  Just get a bunch of people to commit at the same time?
>
> You'd want a very heavy commit load, yeah, and some way to detect that a
> commit didn't stick.
>
Or you can try writing a program creating two threads and lock the same
file in both threads with apr_file_lock. If both threads can have the lock
simultaneously, then you have the problem.

Regards,
//Peter


Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Mark Phippard <Ma...@softlanding.com>.
Greg Hudson <gh...@MIT.EDU> wrote on 04/05/2005 11:48:52 AM:

> It uses apr_file_lock().  I have no idea how this is implemented under
> OS/400.

Neither do I, but IBM provides support so I will have to call it in.  I 
guess I would want to ask them whether the lock is scoped to the thread or 
the process?

Thanks

Mark





Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-04-05 at 11:44, Mark Phippard wrote:
> What would be the easiest way to try to create the problem to see if it 
> happens?  Just get a bunch of people to commit at the same time?

You'd want a very heavy commit load, yeah, and some way to detect that a
commit didn't stick.

> I read the docs for fcntl() on OS/400 and it definitely seems to indicate 
> the lock is for the process, not the thread.  Does the code specifically 
> use fcntl() or does it use an APR routine that possibly the OS/400 port 
> implemented with a per-thread lock?

It uses apr_file_lock().  I have no idea how this is implemented under
OS/400.



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Mark Phippard <Ma...@softlanding.com>.
Greg Hudson <gh...@MIT.EDU> wrote on 04/05/2005 11:39:49 AM:

> On Tue, 2005-04-05 at 11:27, Mark Phippard wrote:
> > I think it is possible that this could affect the OS/400 port, but I am
> > not sure how I could tell.  OS/400 is Unix-like and supports POSIX, but it
> > does not support fork().  svnserve runs multi-threaded and does not spawn
> > any additional processes.  Would this mean that we would be exposed to
> > this sort of error?  If so, how would it manifest?
> 
> The problem would manifest when two commits happening within the same
> process are finalized at the same time.  (Finalization begins when the
> client has finished transmitting all of its data.  At this point, all
> the file deltas have already been computed and the transaction has been
> auto-merged against the head if necessary.  The finalization process is:
> grab the write lock, choose a new revision number, marshal the changed
> directory data into the new proto-rev file, and move the new rev file
> into place.)  It's hard to say exactly what would go wrong, but most
> likely one of the commits would be obliterated by the other.

What would be the easiest way to try to create the problem to see if it 
happens?  Just get a bunch of people to commit at the same time?

I read the docs for fcntl() on OS/400 and it definitely seems to indicate 
the lock is for the process, not the thread.  Does the code specifically 
use fcntl() or does it use an APR routine that possibly the OS/400 port 
implemented with a per-thread lock?

Thanks

Mark


Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-04-05 at 11:27, Mark Phippard wrote:
> I think it is possible that this could affect the OS/400 port, but I am 
> not sure how I could tell.  OS/400 is Unix-like and supports POSIX, but it 
> does not support fork().  svnserve runs multi-threaded and does not spawn 
> any additional processes.  Would this mean that we would be exposed to 
> this sort of error?  If so, how would it manifest?

The problem would manifest when two commits happening within the same
process are finalized at the same time.  (Finalization begins when the
client has finished transmitting all of its data.  At this point, all
the file deltas have already been computed and the transaction has been
auto-merged against the head if necessary.  The finalization process is:
grab the write lock, choose a new revision number, marshal the changed
directory data into the new proto-rev file, and move the new rev file
into place.)  It's hard to say exactly what would go wrong, but most
likely one of the commits would be obliterated by the other.
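
To make the failure mode concrete, here is a toy, single-threaded
illustration of the interleaving described above. It is not FSFS code: the
in-memory structure and the function names are invented, and the two
"threads" are simply run step by step in the bad order.

```c
#define MAX_REVS 64

/* Toy in-memory stand-in for a repository: revs[n] names the commit
   whose data ended up in revision n. */
struct toy_repo
{
  long youngest;
  const char *revs[MAX_REVS];
};

/* Finalization step: choose the new revision number. */
long choose_new_rev(const struct toy_repo *repo)
{
  return repo->youngest + 1;
}

/* Finalization step: move the new rev file into place.  Like a rename,
   this silently replaces whatever was there.  Returns 0, or -1 if REV is
   outside the toy array. */
int install_rev(struct toy_repo *repo, long rev, const char *commit)
{
  if (rev < 0 || rev >= MAX_REVS)
    return -1;
  repo->revs[rev] = commit;
  repo->youngest = rev;
  return 0;
}

/* Interleave two finalizations the way two unserialized threads might.
   Returns the number of revisions actually added to the repository. */
int finalize_two_unserialized(struct toy_repo *repo)
{
  long before = repo->youngest;
  long rev_a = choose_new_rev(repo);   /* thread A reads youngest */
  long rev_b = choose_new_rev(repo);   /* thread B reads the same value */

  install_rev(repo, rev_a, "commit A");
  install_rev(repo, rev_b, "commit B");  /* lands on top of commit A */
  return (int)(repo->youngest - before);
}
```

Starting from revision 51, both commits pick revision 52 and only one new
revision results, holding the data of whichever commit was installed last.
With the write lock actually excluding the second thread, it would read the
youngest revision only after the first commit was in place.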

> We use fsfs exclusively in the OS/400 port with either svnserve or 
> mod_dav_svn for the server.  Wouldn't mod_dav_svn also potentially have 
> this problem?  Apache also runs threaded.

Possibly.  In that case, we may be merely getting lucky by virtue of
finalization being quick.



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Mark Phippard <Ma...@softlanding.com>.
Greg Hudson <gh...@MIT.EDU> wrote on 04/05/2005 11:12:07 AM:

> On Tue, 2005-04-05 at 05:46, Peter N. Lundblad wrote:
> > As you see, I haven't fleshed out the resolution proposal very much. But I
> > want people to be aware of the possibility that we have a serious FSFS
> > (dataloss?) bug. Please someone (ghudson?), tell me that this is a false
> > alarm! :-)
> 
> I don't think it's a false alarm.  But as you noted, it's rare to use
> svnserve in threaded mode under Unix.  Under Windows, we know you can
> block against a lock you already hold (see recent deadlock issues), so
> locks are effectively per-thread.  And I think it's currently unheard of
> for a third-party multithreaded Unix program to be interested in more
> than the network client part of the Subversion libraries.

I think it is possible that this could affect the OS/400 port, but I am 
not sure how I could tell.  OS/400 is Unix-like and supports POSIX, but it 
does not support fork().  svnserve runs multi-threaded and does not spawn 
any additional processes.  Would this mean that we would be exposed to 
this sort of error?  If so, how would it manifest?

We use fsfs exclusively in the OS/400 port with either svnserve or 
mod_dav_svn for the server.  Wouldn't mod_dav_svn also potentially have 
this problem?  Apache also runs threaded.

Thanks

Mark



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-04-05 at 05:46, Peter N. Lundblad wrote:
> As you see, I haven't fleshed out the resolution proposal very much. But I
> want people to be aware of the possibility that we have a serious FSFS
> (dataloss?) bug. Please someone (ghudson?), tell me that this is a false
> alarm! :-)

I don't think it's a false alarm.  But as you noted, it's rare to use
svnserve in threaded mode under Unix.  Under Windows, we know you can
block against a lock you already hold (see recent deadlock issues), so
locks are effectively per-thread.  And I think it's currently unheard of
for a third-party multithreaded Unix program to be interested in more
than the network client part of the Subversion libraries.

I would like to see the fix happen in APR.  Subversion is not unusual in
desiring a per-thread file lock; I know krb5 has run into this issue as
well.  And APR already has an initialization function, so it doesn't
have that problem.

One option for the new APR interface is simply a lock which is
guaranteed to block even if the lock is already held within the current
process--that is, the Windows apr_file_lock() behavior, but on all
platforms.  A more complex notion would be to create the idea of a
locking context; the new lock interface would not block the current
thread if the lock is held within the given locking context (it would
act as a recursive lock), but would block if the lock is held within a
different context in the same thread.  You can see how this would be
nice for the FSFS locking code--no more "I already have the lock" flag
for lock auto-expiry--but it's arguably doing too much of the
application's job within the APR library.



Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Tue, 5 Apr 2005 kfogel@collab.net wrote:

> Meta-comment: Can we please not treat FSFS as the special domain of
> ghudson and/or jpieper?  (While we're at it, same goes for ghudson and
> ra_svn/svnserve).
>
Well, as you guess later on, my intent was to catch him, since this is a
serious dataloss bug in the right circumstances. I wouldn't normally use a
subject line like that. And I didn't mean to imply he has to fix it, just
wanted his opinion since he designed it.

I understand your concern. Point taken.

Regards,
//Peter


Re: attn: ghudson; FSFS and multi-threading on Unix broken?

Posted by kf...@collab.net.
Meta-comment: Can we please not treat FSFS as the special domain of
ghudson and/or jpieper?  (While we're at it, same goes for ghudson and
ra_svn/svnserve).

I think a subject line of just "FSFS and multi-threading on Unix
broken?" would be as likely to catch ghudson's attention, without
implying that FSFS problems are somehow his special responsibility.

Please note that ghudson himself did not make this meta-complaint in
this thread, nor did he ask me to do so on his behalf.  I'm just
trying to stamp out an incipient division before it becomes
semi-official.  FSFS is all our responsibility.

Peter, given all the work you've done in FSFS, I know you don't think
of it as just ghudson's domain, and therefore you weren't implying
that with your subject line.  But I was worried that people might read
it that way, no matter what you intended.

Sorry if I'm being paranoid, just thought this pattern should be
stopped early,

-K
