You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Ivan Zhakov <iv...@visualsvn.com> on 2013/09/24 18:51:34 UTC

Re: Log-addressing branch ready for review

On 23 August 2013 03:21, Stefan Fuhrmann <st...@wandisco.com> wrote:
> Hi all,
>
> As of r1516665, work on this branch has been completed.
> Please review. See the BRANCH-README for the list of
> major changes.
>
> If there are no objections, I will merge the code in the week
> of Sep 23th.
>
Hi Stefan,

I have looked through the code but I'm not ready to put my +1 on it. I
think this branch is a good candidate for the "three +1" policy
discused some time ago.

BTW, why we are not going to implement this in the FSX only?

-- 
Ivan Zhakov

Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Sat, Nov 30, 2013 at 2:23 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:

> Daniel Shahaf wrote on Sat, Nov 30, 2013 at 15:19:56 +0200:
> > Stefan Fuhrmann wrote on Sat, Nov 30, 2013 at 08:48:30 +0100:
> > > Again, you are missing the point. You are saying that there
> > > are lots of possible configurations and various combinations.
> > > I'm saying that we *are* actually able to test many variants.
> > > One simply needs to update their overall test script.
> >
> > There are classes of problems that our test suite simply isn't designed
> > to catch.
> >
> > For example:
> >
> > - We won't catch 'premature kill -9' bugs (r1494298)
> > - We won't catch concurrency bugs (r1302613)
>
> As another example, we have architectue-dependent code, for example
> the definition of SVN_UNALIGNED_ACCESS_IS_OK.  In those cases, do we
> test the "unlikely" branch (eg: SVN_UNALIGNED_ACCESS_IS_OK=0) too?
>

Well, that is exactly the kind of things that a test script
should cover. Running just another combination of config
options add less than 2:30 total (or 2 minutes with ra_local).
So, being able to quickly run a great number of variants
is clearly useful here.


> My build script defines -DSVN_UNALIGNED_ACCESS_IS_OK=0 for precisely
> this reason.
>

I think we should make a list of combinations that should
be part of release testing. It is certainly a waste to do
something like

[local, serf, svn] x [bdb, fsfs, fsx] x [1.0 .. 1.9] x [P(config options)]

Instead, maybe along the lines of:

[local, serf, svn] x [bdb, fsfs, fsx] x [1.9] x [config defaults]
[local, serf or svn] x [fsfs] x [1.9] x [Selected config option sets]
[local, serf or svn] x [fsfs] x [1.1 .. 1.9] x [config defaults]
[local, serf or svn] x [bdb] x [1.0] x [config defaults]

For the ra layer bit, I feel that having "all in one" and
"separate c/s" may trigger different failure scenarios.
BDB is deprecated and FSX is experimental, so let's
focus on FSFS for the non-standard setups. 1.9 should
cover the most code making it a good test bed for
config option variation. At least the standard config
should work with all format versions.

Makes sense? What combinations of config options
should we test?

-- Stefan^2.

Re: Log-addressing branch ready for review

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Sat, Nov 30, 2013 at 15:19:56 +0200:
> Stefan Fuhrmann wrote on Sat, Nov 30, 2013 at 08:48:30 +0100:
> > Again, you are missing the point. You are saying that there
> > are lots of possible configurations and various combinations.
> > I'm saying that we *are* actually able to test many variants.
> > One simply needs to update their overall test script.
> 
> There are classes of problems that our test suite simply isn't designed
> to catch.
> 
> For example:
> 
> - We won't catch 'premature kill -9' bugs (r1494298)
> - We won't catch concurrency bugs (r1302613)

As another example, we have architectue-dependent code, for example
the definition of SVN_UNALIGNED_ACCESS_IS_OK.  In those cases, do we
test the "unlikely" branch (eg: SVN_UNALIGNED_ACCESS_IS_OK=0) too?

My build script defines -DSVN_UNALIGNED_ACCESS_IS_OK=0 for precisely
this reason.

Re: Log-addressing branch ready for review

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Sat, Nov 30, 2013 at 15:10:00 +0100:
> Now, we are at a point were internal review is certainly
> welcome and will probably find more issues but the real
> test is using it in the real world, e.g. as part of an alpha
> release.

+1

Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Sat, Nov 30, 2013 at 2:19 PM, Daniel Shahaf <d....@daniel.shahaf.name>wrote:

> Stefan Fuhrmann wrote on Sat, Nov 30, 2013 at 08:48:30 +0100:
> > Again, you are missing the point. You are saying that there
> > are lots of possible configurations and various combinations.
> > I'm saying that we *are* actually able to test many variants.
> > One simply needs to update their overall test script.
>
> There are classes of problems that our test suite simply isn't designed
> to catch.
>
> For example:
>
> - We won't catch 'premature kill -9' bugs (r1494298)
> - We won't catch concurrency bugs (r1302613)
>

It is true that our test suite does not cover these scenarios.
This has always been the case and so far did not prevent
releases.

My private testing with larger repositories (a.o. & friends)
had me identify a concurrency / consistency issue with
'svnadmin pack' during a 'svnadmin load'

In the future, however, we can of course add tests that
either kill some test application or trigger some yet-to-be-
designed kill / suicide callbacks in the code. My recent
change to make the C tests execute in parallel is to some
extend intended to simply put "unusual" stress on our code.
And that identified a few points where our code or APR
did not behave as expected by me at least.

1.8 added the ability to generate code coverage reports
for our tests (not sure that currently works). So, we should
also be able to identify critical bits that don't get any
coverage.


> So, no, running "make check" under more configurations (despite still
> being an improvement) won't make me more confident that FSFS7 is in
> release-shape.  It simply doesn't test enough code paths.
>

Well, Ivan's argument was along the lines of "many formats"
and their combination over the lifetime of a given repository.
That's a fair point and this one *can* be addressed be running
our test suite through many (all?) of these combinations.

BTW, the log-addressing branch is not the first incarnation
of that code. Developed in the format7 branch, it underwent
review (by me) and improvement while it got transferred
to the log-addressing branch. There it received stress testing
and further fixes. Finally, Philip gave it an in-depth review
with a specific focus on potential concurrency issues. A
few minor things showed up and got fixed.

Now, we are at a point were internal review is certainly
welcome and will probably find more issues but the real
test is using it in the real world, e.g. as part of an alpha
release.

-- Stefan^2.

Re: Log-addressing branch ready for review

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Sat, Nov 30, 2013 at 15:19:56 +0200:
> So, no, running "make check" under more configurations (despite still
> being an improvement) won't make me more confident that FSFS7 is in
> release-shape.  It simply doesn't test enough code paths.

[ I was not trying to imply that FSFS7 isn't ready.  I was just trying
to say that it will almost certainly pass tests under any existing 'make
check' knob we might throw at it, given that it passes tests under the
default 'make check'. ]

Re: Log-addressing branch ready for review

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Sat, Nov 30, 2013 at 15:19:56 +0200:
> So, no, running "make check" under more configurations (despite still
> being an improvement) won't make me more confident that FSFS7 is in
> release-shape.  It simply doesn't test enough code paths.

[ I was not trying to imply that FSFS7 isn't ready.  I was just trying
to say that it will almost certainly pass tests under any existing 'make
check' knob we might throw at it, given that it passes tests under the
default 'make check'. ]

Re: Log-addressing branch ready for review

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Sat, Nov 30, 2013 at 08:48:30 +0100:
> Again, you are missing the point. You are saying that there
> are lots of possible configurations and various combinations.
> I'm saying that we *are* actually able to test many variants.
> One simply needs to update their overall test script.

There are classes of problems that our test suite simply isn't designed
to catch.

For example:

- We won't catch 'premature kill -9' bugs (r1494298)
- We won't catch concurrency bugs (r1302613)

So, no, running "make check" under more configurations (despite still
being an improvement) won't make me more confident that FSFS7 is in
release-shape.  It simply doesn't test enough code paths.

Daniel

Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Fri, Nov 29, 2013 at 5:29 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 28 November 2013 10:22, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> [...]
>
> >> >
> >> > Thanks a lot, Philip!
> >> > I plan on merging the branch later this week (Friday-ish).
> >> >
> >> Well, I still think that log-addressing branch should *NOT* be merged
> >> and all FS _performance_ and format changes should be implemented in
> >> FSX.
> >
> >
> > One more thing. Are you aware that for all we know today,
> > move support requires log scans? Probably even for updates.
> > This is the very operation that benefits most from format 7.
> > And so do merges.
> What I'm worry about is bottom-up design approach: we're going to
> release fsfs 7 and then someday use it to solve moves and merges.
>

You are confusing two independent topics:
(1) Storing "native" move information in the FS layer.
    There is some experimental code for this on trunk
    that may or may not help people developing the
    move merge feature. We can decide to remove or
    disable it at any point before release.
(2) Ability to deliver the amount of log information
    required for move-aware updates and merges with
    reasonable speed. This is what FSFS format 7 provides.

So, your "bottom-up" argument is false.


> > I think I fully understand your position. Your concerns are:
> >
> > * Minimize code churn (= potential destabilization) in FSFS
> > * Protect users against repository corruption.
> > * Implement features that users crave for, like full move support.
> >
> > The latter two effectively hinge upon format 7. So, let's make
> > the first point as little an issue as we can (see below).
> >
>
> >> My primary concern is that currently FSFS supports reading and writing
> >> all FSFS formats. And there are more than 6 formats now, because they
> >> can differ of how repositories were upgraded. This makes code very
> >> complicated, because it should handle all different combinations of
> >> formats/upgrades.
> >
> >
> > Well, the obvious way to address this concern is to extend
> > testing. You may not be aware that running tests on /trunk
> > takes only 2:00 min in debug and 1:50 in release. Running
> > the test suite [fsfs +svn x all minor server versions] is not
> > a problem any longer. (*)
> >
> Test suite execution time is developers problem, not users.
>

Again, you are missing the point. You are saying that there
are lots of possible configurations and various combinations.
I'm saying that we *are* actually able to test many variants.
One simply needs to update their overall test script.


> > It should also be no problem to test repository upgrades:
> > A new option could tell the tests to create the repo in whatever
> > old format was selected and then do an 'svnadmin upgrade'
> > before running the remainder of the test. Given your observation
> > that FSFS format dependencies have already caused trouble
> > in the past, adding these kinds of tests is a good idea even
> > without format 7,
> Let's do proposed test suite improvements first, instead of pushing
> changes to trunk.


Alright. I think I can do these additions by Sunday night.
Going to merge right after that.


> We also need setup buildbots to test all FSFS
> formats to prevent problems like we got with revprop packing stuff.
>

It is certainly a good idea to have different repository formats
in our build bots to increase code coverage. But I don't see
that holding off an alpha release.

-- Stefan^2.

Re: Log-addressing branch ready for review

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 28 November 2013 10:22, Stefan Fuhrmann <st...@wandisco.com> wrote:
[...]

>> >
>> > Thanks a lot, Philip!
>> > I plan on merging the branch later this week (Friday-ish).
>> >
>> Well, I still think that log-addressing branch should *NOT* be merged
>> and all FS _performance_ and format changes should be implemented in
>> FSX.
>
>
> One more thing. Are you aware that for all we know today,
> move support requires log scans? Probably even for updates.
> This is the very operation that benefits most from format 7.
> And so do merges.
What I'm worry about is bottom-up design approach: we're going to
release fsfs 7 and then someday use it to solve moves and merges.

>
> I think I fully understand your position. Your concerns are:
>
> * Minimize code churn (= potential destabilization) in FSFS
> * Protect users against repository corruption.
> * Implement features that users crave for, like full move support.
>
> The latter two effectively hinge upon format 7. So, let's make
> the first point as little an issue as we can (see below).
>

>> My primary concern is that currently FSFS supports reading and writing
>> all FSFS formats. And there are more than 6 formats now, because they
>> can differ of how repositories were upgraded. This makes code very
>> complicated, because it should handle all different combinations of
>> formats/upgrades.
>
>
> Well, the obvious way to address this concern is to extend
> testing. You may not be aware that running tests on /trunk
> takes only 2:00 min in debug and 1:50 in release. Running
> the test suite [fsfs +svn x all minor server versions] is not
> a problem any longer. (*)
>
Test suite execution time is developers problem, not users.

> It should also be no problem to test repository upgrades:
> A new option could tell the tests to create the repo in whatever
> old format was selected and then do an 'svnadmin upgrade'
> before running the remainder of the test. Given your observation
> that FSFS format dependencies have already caused trouble
> in the past, adding these kinds of tests is a good idea even
> without format 7,
Let's do proposed test suite improvements first, instead of pushing
changes to trunk. We also need setup buildbots to test all FSFS
formats to prevent problems like we got with revprop packing stuff.

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Nov 27, 2013 at 12:19 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 25 November 2013 17:44, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> > On Mon, Nov 25, 2013 at 1:29 PM, Philip Martin <philip@codematters.co.uk
> >
> > wrote:
> >>
> >> Stefan Fuhrmann <st...@wandisco.com> writes:
> >>
> >> > On Tue, Sep 24, 2013 at 6:51 PM, Ivan Zhakov <iv...@visualsvn.com>
> wrote:
> >> >
> >> >> On 23 August 2013 03:21, Stefan Fuhrmann <
> stefan.fuhrmann@wandisco.com>
> >> >> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > As of r1516665, work on this branch has been completed.
> >> >> > Please review. See the BRANCH-README for the list of
> >> >> > major changes.
> >> >> >
> >> >> > If there are no objections, I will merge the code in the week
> >> >> > of Sep 23th.
> >> >> >
> >> >> Hi Stefan,
> >> >>
> >> >> I have looked through the code but I'm not ready to put my +1 on it.
> I
> >> >> think this branch is a good candidate for the "three +1" policy
> >> >> discused some time ago.
> >> >>
> >> >
> >> > Would you vote +1 under the 3-vote policy ("seems o.k. but more
> >> > independent review should take place")?
> >>
> >> I've been reviewing this branch and I am now happy for it to be merged
> >> to trunk.
> >
> >
> > Thanks a lot, Philip!
> > I plan on merging the branch later this week (Friday-ish).
> >
> Well, I still think that log-addressing branch should *NOT* be merged
> and all FS _performance_ and format changes should be implemented in
> FSX.
>

One more thing. Are you aware that for all we know today,
move support requires log scans? Probably even for updates.
This is the very operation that benefits most from format 7.
And so do merges.

I think I fully understand your position. Your concerns are:

* Minimize code churn (= potential destabilization) in FSFS
* Protect users against repository corruption.
* Implement features that users crave for, like full move support.

The latter two effectively hinge upon format 7. So, let's make
the first point as little an issue as we can (see below).


> My primary concern is that currently FSFS supports reading and writing
> all FSFS formats. And there are more than 6 formats now, because they
> can differ of how repositories were upgraded. This makes code very
> complicated, because it should handle all different combinations of
> formats/upgrades.
>

Well, the obvious way to address this concern is to extend
testing. You may not be aware that running tests on /trunk
takes only 2:00 min in debug and 1:50 in release. Running
the test suite [fsfs +svn x all minor server versions] is not
a problem any longer. (*)

It should also be no problem to test repository upgrades:
A new option could tell the tests to create the repo in whatever
old format was selected and then do an 'svnadmin upgrade'
before running the remainder of the test. Given your observation
that FSFS format dependencies have already caused trouble
in the past, adding these kinds of tests is a good idea even
without format 7,

So, this is what I will do. I will merge the branch in time
for the alpha release but keep the source branch around.
If we discover severe problems with the code we can
undo the change on /trunk at any point before going beta.
This is the whole point of using a VCS, right?

-- Stefan^2.

(*) See, speed *does* matter.

Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Nov 27, 2013 at 12:19 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 25 November 2013 17:44, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> > On Mon, Nov 25, 2013 at 1:29 PM, Philip Martin <philip@codematters.co.uk
> >
> > wrote:
> >>
> >> Stefan Fuhrmann <st...@wandisco.com> writes:
> >>
> >> > On Tue, Sep 24, 2013 at 6:51 PM, Ivan Zhakov <iv...@visualsvn.com>
> wrote:
> >> >
> >> >> On 23 August 2013 03:21, Stefan Fuhrmann <
> stefan.fuhrmann@wandisco.com>
> >> >> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > As of r1516665, work on this branch has been completed.
> >> >> > Please review. See the BRANCH-README for the list of
> >> >> > major changes.
> >> >> >
> >> >> > If there are no objections, I will merge the code in the week
> >> >> > of Sep 23th.
> >> >> >
> >> >> Hi Stefan,
> >> >>
> >> >> I have looked through the code but I'm not ready to put my +1 on it.
> I
> >> >> think this branch is a good candidate for the "three +1" policy
> >> >> discused some time ago.
> >> >>
> >> >
> >> > Would you vote +1 under the 3-vote policy ("seems o.k. but more
> >> > independent review should take place")?
> >>
> >> I've been reviewing this branch and I am now happy for it to be merged
> >> to trunk.
> >
> >
> > Thanks a lot, Philip!
> > I plan on merging the branch later this week (Friday-ish).
> >
> Well, I still think that log-addressing branch should *NOT* be merged
> and all FS _performance_ and format changes should be implemented in
> FSX.
>

Format 7 is not simply an enabler for more speed. It adds
significant security features like a *complete* checksum
coverage of all repository content. This is based on the
new indexing infrastructure.

Ask people that had to fix corrupted repositories in the
past how hard it is to identify the source of a corruption.
Sticking with format 6 also means that we will not be able
to detect certain tree modifications / corruptions.

My primary concern is that currently FSFS supports reading and writing
> all FSFS formats. And there are more than 6 formats now, because they
> can differ of how repositories were upgraded. This makes code very
> complicated, because it should handle all different combinations of
> formats/upgrades.
>

That is a valid concern and one of many reasons for FSX
not being backward compatible. But I think the added cost
for supporting another format is not as high as you think.

In recent Subversion 1.8 release we got several critical data
> corruption errors due this complexity and handling older formats:
> * hotcopy: fix hotcopy losing revprop files in packed repos (issue #4448)
> * hotcopy: cleanup unpacked revprops with '--incremental' (r1512300 et al)
>

These two are strictly related to the new 1.8 feature
"incremental hotcopy".


> * fsfs: resolve endless loop problem when repos/db/uuid has \r\n (r1492145)
>

This is was a bug in io.c that happened to manifest
in fsfs and is not related to recent format changes.


> * fsfs: remove revision property buffer limit (r1491770)
>

That one is debatable. Since revprops are not handled
streamingly, there is always an upper limit to the size
of props (and revprops). In fact, not limiting the length
is potentially more risky.

In this case, we simply decided in favour of backward
compatibility over security.


> * fsfs: fixed manifest file growth with revprop changes (r1513874)
> * fsfs: fix packed revprops causing loss of revprops (r1513879 et al)
> * svnadmin upgrade: fix error of non-sharded fsfs repositories (r1494287)
> * svnadmin upgrade: fix data loss when cancelling in last stage (r1494298)
>

All these problems were found shortly after format 6
had been released. Compare that to problems like
"newlines in file names" that lingered in our code for
almost a decade.


>  >> BTW, why we are not going to implement this in the FSX only?
> > Because FSX is not going to be recommended for general before
> > 2017 or so.
> This doesn't make sense for me: if FSX code is not ready for general
> use before 2017 then we should not push FSX-like features to stable
> and time proven FSFS.
>

That is a non-sequitur: not everything that takes a
single *idea* from FSX is as unstable as the complete
FS rewrite that FSX will ultimately be.

-- Stefan^2.

Re: Log-addressing branch ready for review

Posted by Ivan Zhakov <iv...@visualsvn.com>.
On 25 November 2013 17:44, Stefan Fuhrmann <st...@wandisco.com> wrote:
> On Mon, Nov 25, 2013 at 1:29 PM, Philip Martin <ph...@codematters.co.uk>
> wrote:
>>
>> Stefan Fuhrmann <st...@wandisco.com> writes:
>>
>> > On Tue, Sep 24, 2013 at 6:51 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>> >
>> >> On 23 August 2013 03:21, Stefan Fuhrmann <st...@wandisco.com>
>> >> wrote:
>> >> > Hi all,
>> >> >
>> >> > As of r1516665, work on this branch has been completed.
>> >> > Please review. See the BRANCH-README for the list of
>> >> > major changes.
>> >> >
>> >> > If there are no objections, I will merge the code in the week
>> >> > of Sep 23th.
>> >> >
>> >> Hi Stefan,
>> >>
>> >> I have looked through the code but I'm not ready to put my +1 on it. I
>> >> think this branch is a good candidate for the "three +1" policy
>> >> discused some time ago.
>> >>
>> >
>> > Would you vote +1 under the 3-vote policy ("seems o.k. but more
>> > independent review should take place")?
>>
>> I've been reviewing this branch and I am now happy for it to be merged
>> to trunk.
>
>
> Thanks a lot, Philip!
> I plan on merging the branch later this week (Friday-ish).
>
Well, I still think that log-addressing branch should *NOT* be merged
and all FS _performance_ and format changes should be implemented in
FSX.

My primary concern is that currently FSFS supports reading and writing
all FSFS formats. And there are more than 6 formats now, because they
can differ of how repositories were upgraded. This makes code very
complicated, because it should handle all different combinations of
formats/upgrades.

In recent Subversion 1.8 release we got several critical data
corruption errors due this complexity and handling older formats:
* hotcopy: fix hotcopy losing revprop files in packed repos (issue #4448)
* hotcopy: cleanup unpacked revprops with '--incremental' (r1512300 et al)
* fsfs: fixed manifest file growth with revprop changes (r1513874)
* fsfs: fix packed revprops causing loss of revprops (r1513879 et al)
* fsfs: resolve endless loop problem when repos/db/uuid has \r\n (r1492145)
* fsfs: remove revision property buffer limit (r1491770)
* svnadmin upgrade: fix error of non-sharded fsfs repositories (r1494287)
* svnadmin upgrade: fix data loss when cancelling in last stage (r1494298)

>> BTW, why we are not going to implement this in the FSX only?
> Because FSX is not going to be recommended for general before
> 2017 or so.
This doesn't make sense for me: if FSX code is not ready for general
use before 2017 then we should not push FSX-like features to stable
and time proven FSFS.

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com

Re: Log-addressing branch ready for review

Posted by Julian Foad <ju...@btopenworld.com>.
Stefan Fuhrmann wrote:
> On Wed, Nov 27, 2013 at 6:55 PM, Julian Foad wrote:
>> Stefan Fuhrmann wrote:
>>>>>>> As of r1516665, work on this branch has been completed.
>>>>>>> Please review. See the BRANCH-README for the list of
>>>>>>> major changes.
>> 
>> First I am looking for an overall description of the changes. The
>> BRANCH-README says: [...]
> 
> I only used it as a TODO list. [...]
> r1546928 on the fsfs-improvements branch now contains a
> summary of all f7 changes. It does not give function-level
> detail but describes the intended changes on a per-file level.

Thanks, Stefan -- that log message is really useful.

I started reading, and took the liberty of committing a few grammer-ish tweaks to the 'structure' file in r1546956.

[...]
>> Would it make sense for you to merge this to fsfs-improvements right
>> away, to make it easier for other people to run a trial merge to
>> trunk? And then of course the proposal would be to merge
>> fsfs-improvements to trunk, rather than log-addressing to trunk.
>> (I'm assuming this is the only active sub-branch off the
>> fsfs-improvements branch at the moment.)
> 
> That is correct. I just merged everything into fsfs-improvements
> and dropped the sub-branch.

Cool. Thanks.

- Julian

Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Nov 27, 2013 at 6:55 PM, Julian Foad <ju...@btopenworld.com>wrote:

> Stefan Fuhrmann wrote:
> >>>>> As of r1516665, work on this branch has been completed.
> >>>>> Please review. See the BRANCH-README for the list of
> >>>>> major changes.
>
> I have not reviewed, but looked at how to review it.
>
> First I am looking for an overall description of the changes. The
> BRANCH-README says:
>
> [[[
> This is an integration / development branch to copy / implement and
> review the logical addressing feature originally developed on the
> fsfs-format7 branch.  After review and after the fsfs-improvements
> branch got merged into /trunk, this whole branch will get merged
> into /trunk as well.
>
> Features to implement here:
>
> * support for logical addressing
> * switch to format 7; use log addressing after for new shards created
>   after svnadmin upgrade
> * block read / data prefetch for f7 rev / pack file data
> * reorder data upon pack
> * add checksums to index data
> * improve verification to cover all f7 rev / pack file data
>
> More on this here:
> http://svn.haxx.se/dev/archive-2013-06/0755.shtml
> http://svn.haxx.se/dev/archive-2013-07/0059.shtml
> ]]]
>
> I don't know how up to date that is, but anyhow it doesn't precisely
> describe the changes.


I only used it as a TODO list.


> If you could perhaps, starting from this text, turn it into the standard
> log message format that you would eventually include in the merge log
> message, that would be much better. I am in favour of merges having a log
> message that really describes what's being merged, since trying to
> understand it from scanning the original series of branch commits can be
> very inefficient.
>

r1546928 on the fsfs-improvements branch now contains a
summary of all f7 changes. It does not give function-level
detail but describes the intended changes on a per-file level.


> Then, to see the actual changes, we can run a diff if we first discover
> the trunk revision to diff against:


> $ svn mergeinfo ^/subversion/trunk \
>                 ^/subversion/branches/fsfs-improvements
>     youngest common ancestor
>     |         last full merge
>     |         |        tip of branch
>     |         |        |         repository path
>
>     1499980   1545953  1546120
>     |         |        |
>   -------| |------------         subversion/trunk
>      \         \
>       \         \
>        --| |------------         subversion/branches/fsfs-improvements
>                        |
>                        1546120
>
> $ svn mergeinfo ^/subversion/branches/fsfs-improvements \
>                 ^/subversion/branches/log-addressing
>     youngest common ancestor
>     |         last full merge
>     |         |        tip of branch
>     |         |        |         repository path
>
>     1509278   1545955  1546118
>     |         |        |
>   -------| |------------         subversion/branches/fsfs-improvements
>      \         \
>       \         \
>        --| |------------         subversion/branches/log-addressing
>                        |
>                        1546118
>
> $ svn diff --old=^/subversion/trunk@1545953 \
>            --new=^/subversion/branches/log-addressing > d
>
> I find it easier to have the changes in a trunk WC and use diff-aware
> editors than to browse the patch file directly. I can apply this patch:
>
> $ svn patch d
> D         subversion/libsvn_subr/md5.h
> D         subversion/libsvn_subr/sha1.c
> D         subversion/libsvn_subr/sha1.h
> U         build.conf
> A         BRANCH-README
> U         subversion/libsvn_fs_base/fs.c
> [...]
>
> OK, fine. (I got a minor text conflict in a test, presumably due to a
> recent trunk change that's not yet sync'd to the branch.)
>
> Alternatively, I usually just run the proposed merge. At first I tried to
> merge it directly to a trunk WC, using automatic merge: "svn merge
> ^/subversion/branches/log-addressing". No go. Major conflicts.
>
> The branching structure is like this (showing only the most recent merges):
>
> ------------------------ trunk
>    \     ...   \
>     +-----------+------- branches/fsfs-improvements
>        \    ...    \
>         +-----------+--- branches/log-addressing
>
> Subversion's automatic merge doesn't support tracking merges going round a
> cycle (trunk -> fsfs-imp. -> log-addr. -> trunk). That's the basic reason
> it produces conflicts when we try. The "proper" way to merge this back to
> trunk, in terms of what Subversion's automatic merge supports, is first to
> merge it to fsfs-improvements. That works fine in my trial. (I only ran the
> merge, didn't try to build the resulting code.) Then I suppose a merge from
> fsfs-improvements to trunk would work fine.
>

Yes, that is a limitation in the way we currently use merge
tracking: Catch-up merges must follow the branch hierarchy.

Cherry-picks are slightly less critical since one would expect
them to cause (trivial) conflicts anyways when resync'ing
the source branch with the target.

It looks like fsfs-improvements itself currently has no changes on it that
> have not been merged to trunk (apart from BRANCH-README and a mergeinfo
> property).
>
> Stefan, does all this match your understanding? Would it make sense for
> you to merge this to fsfs-improvements right away, to make it easier for
> other people to run a trial merge to trunk? And then of course the proposal
> would be to merge fsfs-improvements to trunk, rather than log-addressing to
> trunk. (I'm assuming this is the only active sub-branch off the
> fsfs-improvements branch at the moment.)
>

That is correct. I just merged everything into fsfs-improvements
and dropped the sub-branch.

-- Stefan^2.

Re: Log-addressing branch ready for review

Posted by Julian Foad <ju...@btopenworld.com>.
Stefan Fuhrmann wrote:
>>>>> As of r1516665, work on this branch has been completed.
>>>>> Please review. See the BRANCH-README for the list of
>>>>> major changes.

I have not reviewed, but looked at how to review it.

First I am looking for an overall description of the changes. The BRANCH-README says:

[[[
This is an integration / development branch to copy / implement and
review the logical addressing feature originally developed on the
fsfs-format7 branch.  After review and after the fsfs-improvements
branch got merged into /trunk, this whole branch will get merged
into /trunk as well.

Features to implement here:

* support for logical addressing
* switch to format 7; use log addressing after for new shards created
  after svnadmin upgrade
* block read / data prefetch for f7 rev / pack file data
* reorder data upon pack
* add checksums to index data
* improve verification to cover all f7 rev / pack file data

More on this here:
http://svn.haxx.se/dev/archive-2013-06/0755.shtml
http://svn.haxx.se/dev/archive-2013-07/0059.shtml
]]]

I don't know how up to date that is, but anyhow it doesn't precisely describe the changes. If you could perhaps, starting from this text, turn it into the standard log message format that you would eventually include in the merge log message, that would be much better. I am in favour of merges having a log message that really describes what's being merged, since trying to understand it from scanning the original series of branch commits can be very inefficient.

Then, to see the actual changes, we can run a diff if we first discover the trunk revision to diff against:

$ svn mergeinfo ^/subversion/trunk \
                ^/subversion/branches/fsfs-improvements
    youngest common ancestor
    |         last full merge
    |         |        tip of branch
    |         |        |         repository path

    1499980   1545953  1546120 
    |         |        |       
  -------| |------------         subversion/trunk
     \         \               
      \         \              
       --| |------------         subversion/branches/fsfs-improvements
                       |       
                       1546120 

$ svn mergeinfo ^/subversion/branches/fsfs-improvements \
                ^/subversion/branches/log-addressing
    youngest common ancestor
    |         last full merge
    |         |        tip of branch
    |         |        |         repository path

    1509278   1545955  1546118 
    |         |        |       
  -------| |------------         subversion/branches/fsfs-improvements
     \         \               
      \         \              
       --| |------------         subversion/branches/log-addressing
                       |       
                       1546118 

$ svn diff --old=^/subversion/trunk@1545953 \
           --new=^/subversion/branches/log-addressing > d

I find it easier to have the changes in a trunk WC and use diff-aware editors than to browse the patch file directly. I can apply this patch:

$ svn patch d
D         subversion/libsvn_subr/md5.h
D         subversion/libsvn_subr/sha1.c
D         subversion/libsvn_subr/sha1.h
U         build.conf
A         BRANCH-README
U         subversion/libsvn_fs_base/fs.c
[...]

OK, fine. (I got a minor text conflict in a test, presumably due to a recent trunk change that's not yet sync'd to the branch.)

Alternatively, I usually just run the proposed merge. At first I tried to merge it directly to a trunk WC, using automatic merge: "svn merge ^/subversion/branches/log-addressing". No go. Major conflicts.

The branching structure is like this (showing only the most recent merges):

------------------------ trunk
   \     ...   \
    +-----------+------- branches/fsfs-improvements
       \    ...    \
        +-----------+--- branches/log-addressing

Subversion's automatic merge doesn't support tracking merges going round a cycle (trunk -> fsfs-imp. -> log-addr. -> trunk). That's the basic reason it produces conflicts when we try. The "proper" way to merge this back to trunk, in terms of what Subversion's automatic merge supports, is first to merge it to fsfs-improvements. That works fine in my trial. (I only ran the merge, didn't try to build the resulting code.) Then I suppose a merge from fsfs-improvements to trunk would work fine.

It looks like fsfs-improvements itself currently has no changes on it that have not been merged to trunk (apart from BRANCH-README and a mergeinfo property).

Stefan, does all this match your understanding? Would it make sense for you to merge this to fsfs-improvements right away, to make it easier for other people to run a trial merge to trunk? And then of course the proposal would be to merge fsfs-improvements to trunk, rather than log-addressing to trunk. (I'm assuming this is the only active sub-branch off the fsfs-improvements branch at the moment.)

- Julian


Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Mon, Nov 25, 2013 at 1:29 PM, Philip Martin <ph...@codematters.co.uk>wrote:

> Stefan Fuhrmann <st...@wandisco.com> writes:
>
> > On Tue, Sep 24, 2013 at 6:51 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
> >
> >> On 23 August 2013 03:21, Stefan Fuhrmann <st...@wandisco.com>
> >> wrote:
> >> > Hi all,
> >> >
> >> > As of r1516665, work on this branch has been completed.
> >> > Please review. See the BRANCH-README for the list of
> >> > major changes.
> >> >
> >> > If there are no objections, I will merge the code in the week
> >> > of Sep 23th.
> >> >
> >> Hi Stefan,
> >>
> >> I have looked through the code but I'm not ready to put my +1 on it. I
> >> think this branch is a good candidate for the "three +1" policy
> >> discused some time ago.
> >>
> >
> > Would you vote +1 under the 3-vote policy ("seems o.k. but more
> > independent review should take place")?
>
> I've been reviewing this branch and I am now happy for it to be merged
> to trunk.
>

Thanks a lot, Philip!
I plan on merging the branch later this week (Friday-ish).

-- Stefan^2.

Re: Log-addressing branch ready for review

Posted by Philip Martin <ph...@codematters.co.uk>.
Stefan Fuhrmann <st...@wandisco.com> writes:

> On Tue, Sep 24, 2013 at 6:51 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:
>
>> On 23 August 2013 03:21, Stefan Fuhrmann <st...@wandisco.com>
>> wrote:
>> > Hi all,
>> >
>> > As of r1516665, work on this branch has been completed.
>> > Please review. See the BRANCH-README for the list of
>> > major changes.
>> >
>> > If there are no objections, I will merge the code in the week
>> > of Sep 23th.
>> >
>> Hi Stefan,
>>
>> I have looked through the code but I'm not ready to put my +1 on it. I
>> think this branch is a good candidate for the "three +1" policy
>> discused some time ago.
>>
>
> Would you vote +1 under the 3-vote policy ("seems o.k. but more
> independent review should take place")?

I've been reviewing this branch and I am now happy for it to be merged
to trunk.

-- 
Philip

Re: Log-addressing branch ready for review

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Tue, Sep 24, 2013 at 6:51 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> On 23 August 2013 03:21, Stefan Fuhrmann <st...@wandisco.com>
> wrote:
> > Hi all,
> >
> > As of r1516665, work on this branch has been completed.
> > Please review. See the BRANCH-README for the list of
> > major changes.
> >
> > If there are no objections, I will merge the code in the week
> > of Sep 23th.
> >
> Hi Stefan,
>
> I have looked through the code but I'm not ready to put my +1 on it. I
> think this branch is a good candidate for the "three +1" policy
> discused some time ago.
>

Would you vote +1 under the 3-vote policy ("seems o.k. but more
independent review should take place")?


> BTW, why we are not going to implement this in the FSX only?
>

Because FSX is not going to be recommended for general before
2017 or so.

While being only moderately complex, the log-addressing feature
revitalizes FSFS development. For instance, it will allow us to
introduce new versions of existing item types (e.g. extended change
list information; move-aware noderevs) in the same repository as
old item types.

Of curse, FSFSv7 will also be massively faster for a wide range of
operations and will continue to benefit from optimizations such as
pre-fetching in future releases. Finally, it provides 100% checksum
coverage of the rev/pack file contents. The source of a corruption
becomes much easier to spot.

-- Stefan^2.