Posted to dev@subversion.apache.org by Stefan Fuhrmann <st...@alice-dsl.de> on 2011/12/27 01:30:38 UTC
Directory deltification
As of r1224839, FSFS now supports directory deltification.
Please review the changes and run tests against different
repositories so that we get a better idea of what the costs
and benefits are. As soon as I'm back home, I will run tests
against the Apache and KDE repositories.
So far, I ran tests against the rather small TSVN repository.
It seems that we get 50% more capacity / 33% size savings
for 0 .. 20% CPU overhead. The savings should be more
significant on larger repositories and some of the extra
overhead should be removed by the file handle caching
code - once merged into /trunk.
In any case, we are still much faster than 1.6.
Detailed results can be found in the attached document.
"/trunk" is at r1224828 (i.e. without deltification tuning).
-- Stefan^2.
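The "50% more capacity / 33% size savings" figures describe the same size ratio from two directions. A minimal sketch of that arithmetic, assuming the TSVN repository sizes from the attached results (414.9MB without and 275.9MB with directory deltification); the function name is made up for illustration:

```python
def size_savings_and_capacity(size_before, size_after):
    """Return (savings, extra_capacity) as fractions.

    savings        = share of the original size that disappears
    extra_capacity = how much more data fits into the same space
    """
    savings = 1.0 - size_after / size_before
    extra_capacity = size_before / size_after - 1.0
    return savings, extra_capacity


# TSVN repository sizes in MB, taken from the attached results.
savings, capacity = size_savings_and_capacity(414.9, 275.9)
print(f"size savings:   {savings:.1%}")
print(f"extra capacity: {capacity:.1%}")
```

The two values differ only in the denominator: savings are measured against the old size, extra capacity against the new one.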
Re: Directory deltification
Posted by Stefan Fuhrmann <st...@alice-dsl.de>.
On 27.12.2011 14:20, Daniel Shahaf wrote:
> Stefan Fuhrmann wrote on Tue, Dec 27, 2011 at 01:30:38 +0100:
>> As of r1224839, FSFS now supports directory deltification.
> If you haven't seen it already, I opened an issue around this just recently:
>
> http://subversion.tigris.org/issues/show_bug.cgi?id=4084
I'm aware of that. That discussion was based
on outdated information and the only reasonable
way to address this issue is by providing updated
information.
-- Stefan^2.
Re: Directory deltification
Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Fuhrmann wrote on Tue, Dec 27, 2011 at 01:30:38 +0100:
> As of r1224839, FSFS now supports directory deltification.
If you haven't seen it already, I opened an issue around this just recently:
http://subversion.tigris.org/issues/show_bug.cgi?id=4084
> Please review the changes and run tests against different
> repositories so that we get a better idea of what the costs
> and benefits are. As soon as I'm back home, I will run tests
> against the Apache and KDE repositories.
>
> So far, I ran tests against the rather small TSVN repository.
> It seems that we get 50% more capacity / 33% size savings
> for 0 .. 20% CPU overhead. The savings should be more
> significant on larger repositories and some of the extra
> overhead should be removed by the file handle caching
> code - once merged into /trunk.
>
> In any case, we are still much faster than 1.6.
> Detailed results can be found in the attached document.
> "/trunk" is at r1224828 (i.e. without deltification tuning).
>
> -- Stefan^2.
> Ubuntu 11.10, 64 bit (ssd), packed repository
>
> repository size
>
>   1.6.12                414.9MB  (100%)
>   trunk                 414.9MB  (100%)
>   trunk+diff            275.9MB  ( 66%)
>
> svnadmin load -q (4.3GB non-deltified dump file)
>
>   1.6.12              5m25.625s  (156%)
>   trunk               3m29.225s  (100%)
>   trunk -M1000        3m 1.455s  ( 87%)
>   trunk+diff          4m 1.313s  (115%)
>   trunk+diff -M1000   3m23.496s  ( 97%)
>
> svnadmin verify -q
>
>   1.6.12             68m21.244s  (3877%)
>   trunk               1m45.786s  (100%)
>   trunk -M1000        1m18.207s  ( 74%)
>   trunk+diff          2m10.199s  (123%)
>   trunk+diff -M1000   1m36.188s  ( 91%)
>
> svnserve -dT (additional flags for trunk: -c0)
> svn export svn://localhost/repo/trunk -q --ignore-externals (cold)
> (85MB, 3840 items)
>
>   1.6.12                18.223s  (883%)
>   trunk                  2.063s  (100%)
>   trunk -M1000           2.888s  (140%)
>   trunk+diff             2.607s  (126%)
>   trunk+diff -M1000      2.691s  (130%)
>
> svn export svn://localhost/repo/trunk -q --ignore-externals (hot)
>
>   1.6.12                17.743s  (841%)
>   trunk                  2.111s  (100%)
>   trunk -M1000           0.901s  ( 43%)
>   trunk+diff             2.412s  (114%)
>   trunk+diff -M1000      0.994s  ( 47%)
>
> svn ls svn://localhost/repo/tags (cold)
> (accuracy +/- 2ms)
>
>   1.6.12                   66ms  (125%)
>   trunk                    53ms  (100%)
>   trunk -M1000            128ms  (242%)
>   trunk+diff               52ms  ( 98%)
>   trunk+diff -M1000       123ms  (232%)
>
> svn ls svn://localhost/repo/tags (hot)
> (accuracy +/- 2ms)
>
>   1.6.12                   66ms  (140%)
>   trunk                    47ms  (100%)
>   trunk -M1000             49ms  (104%)
>   trunk+diff               45ms  ( 96%)
>   trunk+diff -M1000        47ms  (100%)
>
>
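The percentages in the quoted results are each runtime relative to the plain "trunk" row of the same test. A small sketch for recomputing them from the printed timings, assuming the duration formats seen in the results ("5m25.625s", "3m 1.455s", "2.063s", "66ms"); both helper names are hypothetical:

```python
import re

# Optional minutes part, then a number with unit "s" or "ms".
_DURATION = re.compile(r"^(?:(\d+)m\s*)?(\d+(?:\.\d+)?)(ms|s)$")


def to_seconds(text):
    """Convert '5m25.625s', '3m 1.455s', '2.063s' or '66ms' to seconds."""
    match = _DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, value, unit = match.groups()
    seconds = float(value) / 1000.0 if unit == "ms" else float(value)
    return 60.0 * int(minutes or 0) + seconds


def relative_percent(measured, baseline):
    """Runtime of `measured` as a rounded percentage of `baseline`."""
    return round(100.0 * to_seconds(measured) / to_seconds(baseline))
```

For example, relative_percent("5m25.625s", "3m29.225s") gives 156, which matches the 1.6.12 row of the svnadmin load test.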
Re: Directory deltification
Posted by Philip Martin <ph...@wandisco.com>.
Hyrum K Wright <hy...@wandisco.com> writes:
> On Tue, Jan 3, 2012 at 8:23 AM, Philip Martin
> <ph...@wandisco.com> wrote:
>> I've been testing this with the old CollabNet Subversion repository, the
>> first 40,515 Subversion revisions, on my Linux laptop:
>>
>> The db/revs directory (unpacked) is 320MB instead of 490MB.
>> Loading takes about 12% more CPU.
>> Dumping takes about 22% more CPU.
>>
>> which matches your results. Packing removes about 85MB for both
>> repositories.
>>
>> There are operations where reading the directory representations is more
>> dominant. 'svn log' on a path inside the repository uses 100% more CPU.
>
> Is that peak CPU or overall? If the I/O overhead went down, I'd
> expect the peak CPU usage to go up, but the overall operation time to
> drop.
That's overall CPU used, so the runtime for 'svn log' doubles when the
repository is in RAM.
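For anyone wanting to reproduce the distinction between overall CPU and elapsed time, here is a rough sketch of one way to measure both for a single command. It assumes a Unix system (the `resource` module is not available on Windows) and is not the method used for the numbers above; the `measure` helper is hypothetical:

```python
import resource
import subprocess
import time


def measure(command):
    """Run `command` and return (wall_seconds, cpu_seconds).

    cpu_seconds is user + system time summed over the whole run
    of the child process, i.e. overall CPU, not a peak value.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu
```

A call such as measure(["svn", "log", "-q", "file:///path/to/repo/some/path"]) (placeholder URL) would cover the repository access as well, since with file:// the client process does that work itself. When the data is already cached in RAM, the two numbers should be close to each other.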
--
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com
Re: Directory deltification
Posted by Hyrum K Wright <hy...@wandisco.com>.
On Tue, Jan 3, 2012 at 8:23 AM, Philip Martin
<ph...@wandisco.com> wrote:
> Stefan Fuhrmann <st...@alice-dsl.de> writes:
>
>> As of r1224839, FSFS now supports directory deltification.
>> Please review the changes and run tests against different
>> repositories so that we get a better idea of what the costs
>> and benefits are. As soon as I'm back home, I will run tests
>> against the Apache and KDE repositories.
>>
>> So far, I ran tests against the rather small TSVN repository.
>> It seems that we get 50% more capacity / 33% size savings
>> for 0 .. 20% CPU overhead.
>
> I've been testing this with the old CollabNet Subversion repository, the
> first 40,515 Subversion revisions, on my Linux laptop:
>
> The db/revs directory (unpacked) is 320MB instead of 490MB.
> Loading takes about 12% more CPU.
> Dumping takes about 22% more CPU.
>
> which matches your results. Packing removes about 85MB for both
> repositories.
>
> There are operations where reading the directory representations is more
> dominant. 'svn log' on a path inside the repository uses 100% more CPU.
Is that peak CPU or overall? If the I/O overhead went down, I'd
expect the peak CPU usage to go up, but the overall operation time to
drop.
(FWIW, I'm in the midst of loading a copy of the ASF repo using the
new code, and I'll let folks know the results.)
-Hyrum
Re: Directory deltification
Posted by Stefan Fuhrmann <eq...@web.de>.
On 03.01.2012 15:23, Philip Martin wrote:
> Stefan Fuhrmann <st...@alice-dsl.de> writes:
>
>> As of r1224839, FSFS now supports directory deltification.
>> Please review the changes and run tests against different
>> repositories so that we get a better idea of what the costs
>> and benefits are. As soon as I'm back home, I will run tests
>> against the Apache and KDE repositories.
>>
>> So far, I ran tests against the rather small TSVN repository.
>> It seems that we get 50% more capacity / 33% size savings
>> for 0 .. 20% CPU overhead.
> I've been testing this with the old CollabNet Subversion repository, the
> first 40,515 Subversion revisions, on my Linux laptop:
>
> The db/revs directory (unpacked) is 320MB instead of 490MB.
> Loading takes about 12% more CPU.
> Dumping takes about 22% more CPU.
>
> which matches your results. Packing removes about 85MB for both
> repositories.
Thanks for testing!
Packed Apache repo is 29 vs. 41GB.
Packed KDE repo is 39 vs. 69GB.
I also noticed that "svndump dump | svndump load"
is much faster than "svnsync file:// file://". My guess
is that revprop changes are somehow expensive
(even on a RAM disk).
> There are operations where reading the directory representations is more
> dominant. 'svn log' on a path inside the repository uses 100% more CPU.
>
Good finding. I should find some time to look into
this in the next two months or so. My guess is that
combining deltas is the critical operation here.
In that case, almost the whole overhead can be
eliminated.
-- Stefan^2.
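To illustrate why reading a deltified directory costs more, and what "combining deltas" buys, here is a toy model. It is only a sketch under loud assumptions: directories are plain dicts, a delta is a dict with "changed" and "removed" entries, and all names and values are invented. Nothing here reflects FSFS's actual representation format or code.

```python
from functools import reduce


def apply_delta(entries, delta):
    """Return a new directory listing with one delta applied."""
    result = dict(entries)
    for name in delta.get("removed", ()):
        result.pop(name, None)
    result.update(delta.get("changed", {}))
    return result


def read_directory(base, chain):
    """Reconstruct a directory by applying every delta in order."""
    entries = base
    for delta in chain:
        entries = apply_delta(entries, delta)
    return entries


def combine_deltas(first, second):
    """Merge two consecutive deltas into a single equivalent delta."""
    changed = {name: value for name, value in first.get("changed", {}).items()
               if name not in second.get("removed", ())}
    changed.update(second.get("changed", {}))
    removed = set(first.get("removed", ())) | set(second.get("removed", ()))
    return {"changed": changed, "removed": sorted(removed - set(changed))}


# Invented example data: one full listing plus two deltas on top of it.
base = {"README": "file r1", "src": "dir r1"}
chain = [
    {"changed": {"src": "dir r2", "doc": "dir r2"}},
    {"removed": ["README"], "changed": {"src": "dir r3"}},
]
combined = reduce(combine_deltas, chain)
print(read_directory(base, chain) == apply_delta(base, combined))
```

In this model, applying the chain step by step builds one intermediate listing per delta, while the combined delta touches the base listing only once. Whether that is where the 'svn log' overhead actually comes from is the guess stated above, not something this sketch can confirm.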
Re: Directory deltification
Posted by Philip Martin <ph...@wandisco.com>.
Stefan Fuhrmann <st...@alice-dsl.de> writes:
> As of r1224839, FSFS now supports directory deltification.
> Please review the changes and run tests against different
> repositories so that we get a better idea of what the costs
> and benefits are. As soon as I'm back home, I will run tests
> against the Apache and KDE repositories.
>
> So far, I ran tests against the rather small TSVN repository.
> It seems that we get 50% more capacity / 33% size savings
> for 0 .. 20% CPU overhead.
I've been testing this with the old CollabNet Subversion repository, the
first 40,515 Subversion revisions, on my Linux laptop:
The db/revs directory (unpacked) is 320MB instead of 490MB.
Loading takes about 12% more CPU.
Dumping takes about 22% more CPU.
which matches your results. Packing removes about 85MB for both
repositories.
There are operations where reading the directory representations is more
dominant. 'svn log' on a path inside the repository uses 100% more CPU.