You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Fuhrmann <st...@wandisco.com> on 2012/10/15 19:32:57 UTC

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

On Mon, Oct 15, 2012 at 7:10 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:

> Daniel Shahaf wrote on Sun, Apr 29, 2012 at 12:58:56 +0300:
> > stefan2@apache.org wrote on Sun, Apr 29, 2012 at 09:12:49 -0000:
> > > Author: stefan2
> > > Date: Sun Apr 29 09:12:48 2012
> > > New Revision: 1331883
> > >
> > > URL: http://svn.apache.org/viewvc?rev=1331883&view=rev
> > > Log:
> > > SvnAdmin should always have revprop caching enabled such that
> > > the infrastructure is being set up to notify *other* processes of
> > > changes done by e.g. setrevprop.
> > >
>
> What happens if 'svnadmin1.7 setrevprop' is run while a 1.8 server is
> running?  Will the 1.8 server miss svnadmin1.7's changes because the
> latter doesn't have revprop caching enabled?
>

Yes. To use revprop caching, all applications
accessing the same repository must be 1.8+.

-- Stefan^2.

-- 
*

Join us this October at Subversion Live
2012<http://www.wandisco.com/svn-live-2012>
 for two days of best practice SVN training, networking, live demos,
committer meet and greet, and more! Space is limited, so get signed up
today<http://www.wandisco.com/svn-live-2012>
!
*

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Johan Corveleyn <jc...@gmail.com>.
On Tue, Oct 16, 2012 at 7:13 AM, Stefan Fuhrmann
<st...@wandisco.com> wrote:
> On Tue, Oct 16, 2012 at 5:51 AM, Branko Čibej <br...@wandisco.com> wrote:
>>
>> On 15.10.2012 17:14, Stefan Fuhrmann wrote:
>> > However, if you have a long-running process like a server, that race
>> > condition extends now extends over its whole lifetime. I.e. once a
>> > revprop got read, any change to its value by a pre-1.8 tool may never
>> > get detected.
>>
>> Ouch. This seems wrong. I'd understand a design like this if the
>> long-running server were the only process accessing the repository, but
>> that has never been the case in Subversion. In this case, a cache that
>> can't detect out-of-band changes to the canonical dataset isn't very
>> useful.
>
>
> *sigh* If you really have to use 1.7 tools on an 1.8 server,
> you can't use revprop caching. Most people, however,
> will use 1.8 svnadmin on 1.8 servers.

That seems okay to me. If you use all 1.8(+) tools, then out-of-band
changes will be detected. I haven't done any measurements, but I
expect the performance advantages of revprop caching to be quite
significant.

Mixing tool versions will be done by say 0.1% of svn admins. I think
it would be a pity to deprive every sane svn administrator of this
performance advantage because of this edge case. As long as this
limitation is documented in the release notes, I'm okay with it.

-- 
Johan

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Tue, Oct 16, 2012 at 5:51 AM, Branko Čibej <br...@wandisco.com> wrote:

> On 15.10.2012 17:14, Stefan Fuhrmann wrote:
> > However, if you have a long-running process like a server, that race
> > condition extends now extends over its whole lifetime. I.e. once a
> > revprop got read, any change to its value by a pre-1.8 tool may never
> > get detected.
>
> Ouch. This seems wrong. I'd understand a design like this if the
> long-running server were the only process accessing the repository, but
> that has never been the case in Subversion. In this case, a cache that
> can't detect out-of-band changes to the canonical dataset isn't very
> useful.
>

*sigh* If you really have to use 1.7 tools on an 1.8 server,
you can't use revprop caching. Most people, however,
will use 1.8 svnadmin on 1.8 servers.

-- Stefan^2.

-- 
*

Join us this October at Subversion Live
2012<http://www.wandisco.com/svn-live-2012>
 for two days of best practice SVN training, networking, live demos,
committer meet and greet, and more! Space is limited, so get signed up
today<http://www.wandisco.com/svn-live-2012>
!
*

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Thu, Oct 18, 2012 at 10:46 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:

> Stefan Fuhrmann wrote on Thu, Oct 18, 2012 at 02:26:13 +0200:
> > On Thu, Oct 18, 2012 at 1:57 AM, Daniel Shahaf <d.s@daniel.shahaf.name
> >wrote:
> >
> > > Branko Čibej wrote on Mon, Oct 15, 2012 at 23:51:43 -0400:
> > > > On 15.10.2012 17:14, Stefan Fuhrmann wrote:
> > > > > However, if you have a long-running process like a server, that
> race
> > > > > condition extends now extends over its whole lifetime. I.e. once a
> > > > > revprop got read, any change to its value by a pre-1.8 tool may
> never
> > > > > get detected.
> > > >
> > > > Ouch. This seems wrong. I'd understand a design like this if the
> > > > long-running server were the only process accessing the repository,
> but
> > > > that has never been the case in Subversion. In this case, a cache
> that
> > > > can't detect out-of-band changes to the canonical dataset isn't very
> > > useful.
> > >
> > > Another questionable part of the revprop cache: the
> > > ATOMIC_REVPROP_TIMEOUT mechanism in fs_fs.c.  (Basically, the code
> > > assumes that if another process is rewriting a revprops file, it will
> > > always finish within 10 seconds.)
> > >
> >
> > Did you read and *understand* the code?
> > What exactly will happen after the 10s timeout?
>
> The two processes will have a different idea of what revprop
> generation N is?  One process will miss the other's changes?
>

The key is that the revprop generation is placed in
shared memory and all processes should always
see a consistent value. A process increasing that
value will automatically invalidate *everyones* caches.
So, increasing is always safe.

I admit I wasn't really sure about the concrete kind of badness that
> would happen if the assumption around ATOMIC_REVPROP_TIMEOUT ---
> I changed my mind at least once while drafting the email --- but I'd be
> surprised if no badness at all happens even if the assumption "the other
> process must have crashed" turns out to be wrong.  (If that is the case,
> why does the comment mention the assumption at all?)
>

Thinking really hard, I finally was able to come up with
a very unlikely constellation that's being addressed by
r1400669. There is no problem when the writer gets
blocked (on a simple atomic file operation) and other
processes simply suspect that they might have missed
a revprop change and therefore bump the generation.

In that case, the write would still complete its operation
eventually and *further* bump the generation. The hole
in my logic had been what happens if the file op eventually
gets through but the writer is being terminated / killed /
crashes just before bumping the generation.

This should now be addressed by the other processes
acquiring the write lock before bumping the generation
after the timeout. At that point, we know that the original
writer will not modify the data afterwards (and potentially
terminate before telling us about it). Once we got the lock,
we may or may not need to bump the generation (the
writer or some other reader might have done that before us).

-- Stefan^2.

-- 
*

Join us this October at Subversion Live
2012<http://www.wandisco.com/svn-live-2012>
 for two days of best practice SVN training, networking, live demos,
committer meet and greet, and more! Space is limited, so get signed up
today<http://www.wandisco.com/svn-live-2012>
!
*

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Thu, Oct 18, 2012 at 02:26:13 +0200:
> On Thu, Oct 18, 2012 at 1:57 AM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:
> 
> > Branko Čibej wrote on Mon, Oct 15, 2012 at 23:51:43 -0400:
> > > On 15.10.2012 17:14, Stefan Fuhrmann wrote:
> > > > However, if you have a long-running process like a server, that race
> > > > condition extends now extends over its whole lifetime. I.e. once a
> > > > revprop got read, any change to its value by a pre-1.8 tool may never
> > > > get detected.
> > >
> > > Ouch. This seems wrong. I'd understand a design like this if the
> > > long-running server were the only process accessing the repository, but
> > > that has never been the case in Subversion. In this case, a cache that
> > > can't detect out-of-band changes to the canonical dataset isn't very
> > useful.
> >
> > Another questionable part of the revprop cache: the
> > ATOMIC_REVPROP_TIMEOUT mechanism in fs_fs.c.  (Basically, the code
> > assumes that if another process is rewriting a revprops file, it will
> > always finish within 10 seconds.)
> >
> 
> Did you read and *understand* the code?
> What exactly will happen after the 10s timeout?

The two processes will have a different idea of what revprop
generation N is?  One process will miss the other's changes?

I admit I wasn't really sure about the concrete kind of badness that
would happen if the assumption around ATOMIC_REVPROP_TIMEOUT ---
I changed my mind at least once while drafting the email --- but I'd be
surprised if no badness at all happens even if the assumption "the other
process must have crashed" turns out to be wrong.  (If that is the case,
why does the comment mention the assumption at all?)

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Thu, Oct 18, 2012 at 02:26:13 +0200:
> No need to go off on a hyperbole. Raise your
> concerns in a factual way and hold off meta-
> discussion until after those facts got confirmed.
> 
> Still appreciate the review, though. And, yes, the
> coded logic may contain holes and I'm absolutely
> willing to work them out, discuss ways to address
> them or even to remove large chunks of work if
> that turns out the be the right thing to do.

And thanks for the levelheaded response.  Yes I was a bit annoyed when
I wrote that email yesterday, but I'll table my conclusions until
further analysis of the issue.

Cheers

Daniel

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Thu, Oct 18, 2012 at 1:57 AM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:

> Branko Čibej wrote on Mon, Oct 15, 2012 at 23:51:43 -0400:
> > On 15.10.2012 17:14, Stefan Fuhrmann wrote:
> > > However, if you have a long-running process like a server, that race
> > > condition extends now extends over its whole lifetime. I.e. once a
> > > revprop got read, any change to its value by a pre-1.8 tool may never
> > > get detected.
> >
> > Ouch. This seems wrong. I'd understand a design like this if the
> > long-running server were the only process accessing the repository, but
> > that has never been the case in Subversion. In this case, a cache that
> > can't detect out-of-band changes to the canonical dataset isn't very
> useful.
>
> Another questionable part of the revprop cache: the
> ATOMIC_REVPROP_TIMEOUT mechanism in fs_fs.c.  (Basically, the code
> assumes that if another process is rewriting a revprops file, it will
> always finish within 10 seconds.)
>

Did you read and *understand* the code?
What exactly will happen after the 10s timeout?


> 1. It's not documented.  (And by that I mean a few paragraphs explaining
> the algorithm; not docstrings to individual functions, nor documentation
> of the on-disk format.)
>

Still a lot of documentation to done for sure.
And it is the right time to that (1.8 stabilization
starting soon-ish).


> 2. It's another instance of breaking FSFS discipline --- if you do
> something that we think is very rare, your filesystem's behaviour may
> lose that 'correctness' property.  FSFS does not have "correctness
> except in rare situations" anywhere else --- and 1.6 and 1.7 features
> had taken some trouble to make sure that had remained the case.
>
> The problem raised earlier in this thread is that a 1.8 server might not
> notice changes made by a 1.7 svnadmin.  The problem I have in mind here
> is that if a writer _does_ happen to take more than 10 seconds, the
> first process will miss the second process's changes.
>
> In short, I believe the FSFS code in 1.8.x sacrifices Subversion's
> unconditional correctness promises on the altar of performance.
>

No need to go off on a hyperbole. Raise your
concerns in a factual way and hold off meta-
discussion until after those facts got confirmed.

Still appreciate the review, though. And, yes, the
coded logic may contain holes and I'm absolutely
willing to work them out, discuss ways to address
them or even to remove large chunks of work if
that turns out the be the right thing to do.

-- Stefan^2.

-- 
*

Join us this October at Subversion Live
2012<http://www.wandisco.com/svn-live-2012>
 for two days of best practice SVN training, networking, live demos,
committer meet and greet, and more! Space is limited, so get signed up
today<http://www.wandisco.com/svn-live-2012>
!
*

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Branko Čibej wrote on Mon, Oct 15, 2012 at 23:51:43 -0400:
> On 15.10.2012 17:14, Stefan Fuhrmann wrote:
> > However, if you have a long-running process like a server, that race
> > condition extends now extends over its whole lifetime. I.e. once a
> > revprop got read, any change to its value by a pre-1.8 tool may never
> > get detected.
> 
> Ouch. This seems wrong. I'd understand a design like this if the
> long-running server were the only process accessing the repository, but
> that has never been the case in Subversion. In this case, a cache that
> can't detect out-of-band changes to the canonical dataset isn't very useful.

Another questionable part of the revprop cache: the
ATOMIC_REVPROP_TIMEOUT mechanism in fs_fs.c.  (Basically, the code
assumes that if another process is rewriting a revprops file, it will
always finish within 10 seconds.)

1. It's not documented.  (And by that I mean a few paragraphs explaining
the algorithm; not docstrings to individual functions, nor documentation
of the on-disk format.)

2. It's another instance of breaking FSFS discipline --- if you do
something that we think is very rare, your filesystem's behaviour may
lose that 'correctness' property.  FSFS does not have "correctness
except in rare situations" anywhere else --- and 1.6 and 1.7 features
had taken some trouble to make sure that had remained the case.

The problem raised earlier in this thread is that a 1.8 server might not
notice changes made by a 1.7 svnadmin.  The problem I have in mind here
is that if a writer _does_ happen to take more than 10 seconds, the
first process will miss the second process's changes.

In short, I believe the FSFS code in 1.8.x sacrifices Subversion's
unconditional correctness promises on the altar of performance.

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Branko Čibej <br...@wandisco.com>.
On 15.10.2012 17:14, Stefan Fuhrmann wrote:
> However, if you have a long-running process like a server, that race
> condition extends now extends over its whole lifetime. I.e. once a
> revprop got read, any change to its value by a pre-1.8 tool may never
> get detected.

Ouch. This seems wrong. I'd understand a design like this if the
long-running server were the only process accessing the repository, but
that has never been the case in Subversion. In this case, a cache that
can't detect out-of-band changes to the canonical dataset isn't very useful.

-- Brane

-- 
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download


Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Philip Martin <ph...@wandisco.com>.
Philip Martin <ph...@wandisco.com> writes:

> Stefan Fuhrmann <st...@wandisco.com> writes:
>
>> On Mon, Oct 15, 2012 at 7:36 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:
>>
>>> (It also has the pretty odd side effect that it's not safe to run
>>> 'svnadmin1.7 setrevprop' and 'svnadmin1.8 dump' concurrently on the
>>> same repository...)
>>>
>>
>> That's a misrepresentation of what is going on.
>>
>> Running svnadmin setrevprop and svnadmin dump
>> concurrently, has never had a well-defined behavior.
>> I.e. the propset may or may not have shown up in
>> the dump (basically a race condition).
>
> That's a bug in 1.7.  When the r1237779 backport is approved 1.7 will be
> using the atomic revprop mechanism for load, and since it already uses
> the atomic mechanism for setrevprop any race will be detected and the
> second operation will fail.

Oops! My mistake.  I read that as "setrevprop and svnadmin load" and
wanted to point out that while 1.7 load isn't atomic it should be.

-- 
Join us this October at Subversion Live 2012
http://www.wandisco.com/svn-live-2012

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Philip Martin <ph...@wandisco.com>.
Stefan Fuhrmann <st...@wandisco.com> writes:

> On Mon, Oct 15, 2012 at 7:36 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:
>
>> (It also has the pretty odd side effect that it's not safe to run
>> 'svnadmin1.7 setrevprop' and 'svnadmin1.8 dump' concurrently on the
>> same repository...)
>>
>
> That's a misrepresentation of what is going on.
>
> Running svnadmin setrevprop and svnadmin dump
> concurrently, has never had a well-defined behavior.
> I.e. the propset may or may not have shown up in
> the dump (basically a race condition).

That's a bug in 1.7.  When the r1237779 backport is approved 1.7 will be
using the atomic revprop mechanism for load, and since it already uses
the atomic mechanism for setrevprop any race will be detected and the
second operation will fail.

> The same
> thing is true for 1.8 and any combination of 1.8 and
> older tools.

1.8 before revprop caching would use the atomic revprop change mechanism
to detect the race and return an error.

There is still a bug in 1.7/1.8 here, we should stop using
svn_repos_fs_change_rev_prop2 in load-fs-vtable.c:change_rev_prop and
use a function that bypasses validation but still uses the atomic
mechanism.  No such function currently exists :-(

-- 
Join us this October at Subversion Live 2012
http://www.wandisco.com/svn-live-2012

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Mon, Oct 15, 2012 at 7:36 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:

> Stefan Fuhrmann wrote on Mon, Oct 15, 2012 at 19:32:57 +0200:
> > On Mon, Oct 15, 2012 at 7:10 PM, Daniel Shahaf <d.s@daniel.shahaf.name
> >wrote:
> >
> > > Daniel Shahaf wrote on Sun, Apr 29, 2012 at 12:58:56 +0300:
> > > > stefan2@apache.org wrote on Sun, Apr 29, 2012 at 09:12:49 -0000:
> > > > > Author: stefan2
> > > > > Date: Sun Apr 29 09:12:48 2012
> > > > > New Revision: 1331883
> > > > >
> > > > > URL: http://svn.apache.org/viewvc?rev=1331883&view=rev
> > > > > Log:
> > > > > SvnAdmin should always have revprop caching enabled such that
> > > > > the infrastructure is being set up to notify *other* processes of
> > > > > changes done by e.g. setrevprop.
> > > > >
> > >
> > > What happens if 'svnadmin1.7 setrevprop' is run while a 1.8 server is
> > > running?  Will the 1.8 server miss svnadmin1.7's changes because the
> > > latter doesn't have revprop caching enabled?
> > >
> >
> > Yes. To use revprop caching, all applications
> > accessing the same repository must be 1.8+.
>
> This does not seem to be documented.
>

Will update the release notes.


> (It also has the pretty odd side effect that it's not safe to run
> 'svnadmin1.7 setrevprop' and 'svnadmin1.8 dump' concurrently on the
> same repository...)
>

That's a misrepresentation of what is going on.

Running svnadmin setrevprop and svnadmin dump
concurrently, has never had a well-defined behavior.
I.e. the propset may or may not have shown up in
the dump (basically a race condition). The same
thing is true for 1.8 and any combination of 1.8 and
older tools.

However, if you have a long-running process like
a server, that race condition extends now extends
over its whole lifetime. I.e. once a revprop got read,
any change to its value by a pre-1.8 tool may never
get detected.

-- Stefan^2.

-- 
*

Join us this October at Subversion Live
2012<http://www.wandisco.com/svn-live-2012>
 for two days of best practice SVN training, networking, live demos,
committer meet and greet, and more! Space is limited, so get signed up
today<http://www.wandisco.com/svn-live-2012>
!
*

Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 10/15/2012 01:36 PM, Daniel Shahaf wrote:
> Stefan Fuhrmann wrote on Mon, Oct 15, 2012 at 19:32:57 +0200:
>> On Mon, Oct 15, 2012 at 7:10 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:
>>> What happens if 'svnadmin1.7 setrevprop' is run while a 1.8 server is
>>> running?  Will the 1.8 server miss svnadmin1.7's changes because the
>>> latter doesn't have revprop caching enabled?
>>>
>>
>> Yes. To use revprop caching, all applications
>> accessing the same repository must be 1.8+.
> 
> This does not seem to be documented.
> 
> (It also has the pretty odd side effect that it's not safe to run
> 'svnadmin1.7 setrevprop' and 'svnadmin1.8 dump' concurrently on the
> same repository...)

Eww...

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development


Re: svn commit: r1331883 - /subversion/trunk/subversion/svnadmin/main.c

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Mon, Oct 15, 2012 at 19:32:57 +0200:
> On Mon, Oct 15, 2012 at 7:10 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:
> 
> > Daniel Shahaf wrote on Sun, Apr 29, 2012 at 12:58:56 +0300:
> > > stefan2@apache.org wrote on Sun, Apr 29, 2012 at 09:12:49 -0000:
> > > > Author: stefan2
> > > > Date: Sun Apr 29 09:12:48 2012
> > > > New Revision: 1331883
> > > >
> > > > URL: http://svn.apache.org/viewvc?rev=1331883&view=rev
> > > > Log:
> > > > SvnAdmin should always have revprop caching enabled such that
> > > > the infrastructure is being set up to notify *other* processes of
> > > > changes done by e.g. setrevprop.
> > > >
> >
> > What happens if 'svnadmin1.7 setrevprop' is run while a 1.8 server is
> > running?  Will the 1.8 server miss svnadmin1.7's changes because the
> > latter doesn't have revprop caching enabled?
> >
> 
> Yes. To use revprop caching, all applications
> accessing the same repository must be 1.8+.

This does not seem to be documented.

(It also has the pretty odd side effect that it's not safe to run
'svnadmin1.7 setrevprop' and 'svnadmin1.8 dump' concurrently on the
same repository...)