Posted to dev@subversion.apache.org by Ivan Zhakov <iv...@visualsvn.com> on 2014/08/12 19:22:28 UTC

Possible deadlock in named atomic infrastructure used by revprop caching

Stefan,

I've noticed another severe issue in the named atomic infrastructure used
by the revprop caching code while reviewing the r1611379 fix:
svn_atomic_namespace__create() doesn't release the file lock and the
process-wide mutex (!) on error in libsvn_subr\named_atomic.c:446.

This basically means that the server will deadlock if *any*
named atomic shared file contains invalid data.
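
To illustrate the failure mode described above, here is a minimal Python sketch. It is not the Subversion code: the names `create_namespace_leaky`, `create_namespace_safe`, `mutex_is_free`, and the "ATOM" validity check are all hypothetical, and only the process-wide mutex is modeled (the file lock is left out). It shows how an error return that skips the release leaves the mutex held for every later caller, and how releasing on all exit paths avoids that.

```python
import threading

# Hypothetical stand-in for the process-wide mutex mentioned in the report.
_namespace_mutex = threading.Lock()


class CorruptDataError(Exception):
    """Raised when the (simulated) shared file fails validation."""


def _validate(data: bytes) -> None:
    # Assumed validity check; the real on-disk format is not shown here.
    if not data.startswith(b"ATOM"):
        raise CorruptDataError("invalid shared data")


def create_namespace_leaky(data: bytes) -> str:
    """Mirrors the reported pattern: the error path skips the release."""
    _namespace_mutex.acquire()
    _validate(data)  # raises on bad data, so the mutex stays held
    _namespace_mutex.release()
    return "ok"


def create_namespace_safe(data: bytes) -> str:
    """Releases the mutex on every exit path, including errors."""
    with _namespace_mutex:
        _validate(data)
        return "ok"


def mutex_is_free() -> bool:
    # Non-blocking probe: succeeds only if nobody holds the mutex.
    if _namespace_mutex.acquire(blocking=False):
        _namespace_mutex.release()
        return True
    return False
```

After a failed call to the safe variant, `mutex_is_free()` still reports that the mutex is available. After a failed call to the leaky variant it does not, so any later caller that blocks on the mutex would wait forever.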

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: Possible deadlock in named atomic infrastructure used by revprop caching

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Aug 27, 2014 at 6:49 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 18 August 2014 16:51, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> > On Tue, Aug 12, 2014 at 7:22 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> >>
> >> Stefan,
> >>
> >> I've noticed another severe issue in the named atomic infrastructure used
> >> by the revprop caching code while reviewing the r1611379 fix:
> >> svn_atomic_namespace__create() doesn't release the file lock and the
> >> process-wide mutex (!) on error in libsvn_subr\named_atomic.c:446.
> >
> >
> > Thanks for the report, fixed in r1617687.
> >
> >>
> >> This basically means that the server will deadlock if *any*
> >> named atomic shared file contains invalid data.
> >
> > That specific error condition means that something randomly
> > modified or modifies on-disk data on your server. A deadlock
> > is probably not the worst effect in that case ...
>
> What do you consider to be the worst effect here?
>

Deadlocks are bad (they can be hard to diagnose, depending
on whether you get to see the error message about corruption),
but untreated and / or progressive corruption is arguably worse.

> Have you considered cloud servers with thousands of repositories that
> could be deadlocked because of a SINGLE corrupted repository?
>

Something on at least one of the machines seems to randomly
modify repository files.  Don't you think that stopping service
before accepting any new data into possibly corrupted repos
is not too bad after all? And if this actually happens on more
than one of the replicated servers in your cloud, your problems
are more severe than a deadlock ...

FWIW, this is a moot point now as the SHM code is no longer
being used.

-- Stefan^2.

Re: Possible deadlock in named atomic infrastructure used by revprop caching

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 18 August 2014 16:51, Stefan Fuhrmann <st...@wandisco.com> wrote:
> On Tue, Aug 12, 2014 at 7:22 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>>
>> Stefan,
>>
>> I've noticed another severe issue in the named atomic infrastructure used
>> by the revprop caching code while reviewing the r1611379 fix:
>> svn_atomic_namespace__create() doesn't release the file lock and the
>> process-wide mutex (!) on error in libsvn_subr\named_atomic.c:446.
>
>
> Thanks for the report, fixed in r1617687.
>
>>
>> This basically means that the server will deadlock if *any*
>> named atomic shared file contains invalid data.
>
> That specific error condition means that something randomly
> modified or modifies on-disk data on your server. A deadlock
> is probably not the worst effect in that case ...

What do you consider to be the worst effect here?

Have you considered cloud servers with thousands of repositories that
could be deadlocked because of a SINGLE corrupted repository?


-- 
Ivan Zhakov

Re: Possible deadlock in named atomic infrastructure used by revprop caching

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Tue, Aug 12, 2014 at 7:22 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> Stefan,
>
> I've noticed another severe issue in the named atomic infrastructure used
> by the revprop caching code while reviewing the r1611379 fix:
> svn_atomic_namespace__create() doesn't release the file lock and the
> process-wide mutex (!) on error in libsvn_subr\named_atomic.c:446.
>

Thanks for the report, fixed in r1617687.


> This basically means that the server will deadlock if *any*
> named atomic shared file contains invalid data.
>

That specific error condition means that something randomly
modified or modifies on-disk data on your server. A deadlock
is probably not the worst effect in that case ...

-- Stefan^2.