Posted to dev@subversion.apache.org by Branko Čibej <br...@xbc.nu> on 2002/07/25 21:16:19 UTC
Delta combiner stress test results
I ran some stress tests on the delta combiner today, using revision
2703, with comparisons against /trunk at the same revision. Here's how I tested:
* I used the dump file for the Subversion repository (up to revision
2662) to create two repositories, one with the branch svn and one
with the trunk svn. Here's what the repositories look like, after
cleaning out the log files:
[brane@silmaril s]$ ll repo-trunk/db/
total 57921
-rw-r--r-- 1 brane None 328 Jul 25 06:06 DB_CONFIG
-rw-r--r-- 1 brane None 8192 Jul 25 06:06 __db.001
-rw-r--r-- 1 brane None 270336 Jul 25 06:06 __db.002
-rw-r--r-- 1 brane None 327680 Jul 25 06:06 __db.003
-rw-r--r-- 1 brane None 704512 Jul 25 06:06 __db.004
-rw-r--r-- 1 brane None 16384 Jul 25 06:06 __db.005
-rw-r--r-- 1 brane None 1826816 Jul 25 22:16 changes
-rw-r--r-- 1 brane None 16384 Jul 25 22:16 copies
-rw-r--r-- 1 brane None 1720320 Jul 25 22:16 nodes
-rw-r--r-- 1 brane None 2097152 Jul 25 22:16 representations
-rw-r--r-- 1 brane None 73728 Jul 25 22:16 revisions
-rw-r--r-- 1 brane None 49037312 Jul 25 22:16 strings
-rw-r--r-- 1 brane None 3211264 Jul 25 22:16 transactions
[brane@silmaril s]$ ll repo-branch/db/
total 41033
-rw-r--r-- 1 brane None 328 Jul 25 10:13 DB_CONFIG
-rw-r--r-- 1 brane None 8192 Jul 25 10:13 __db.001
-rw-r--r-- 1 brane None 270336 Jul 25 10:13 __db.002
-rw-r--r-- 1 brane None 327680 Jul 25 10:13 __db.003
-rw-r--r-- 1 brane None 704512 Jul 25 10:13 __db.004
-rw-r--r-- 1 brane None 16384 Jul 25 10:13 __db.005
-rw-r--r-- 1 brane None 1826816 Jul 25 21:46 changes
-rw-r--r-- 1 brane None 16384 Jul 25 21:46 copies
-rw-r--r-- 1 brane None 1720320 Jul 25 21:46 nodes
-rw-r--r-- 1 brane None 2088960 Jul 25 21:46 representations
-rw-r--r-- 1 brane None 73728 Jul 25 21:46 revisions
-rw-r--r-- 1 brane None 31752192 Jul 25 21:46 strings
-rw-r--r-- 1 brane None 3211264 Jul 25 21:46 transactions
The branch repo is smaller, because it deltifies files regardless
of their size. On the trunk, files larger than the delta window
(100k) aren't deltified.
* I checked out four working copies, to cover all the possible
combinations: trunk svn/trunk repo, trunk svn/branch repo, branch
svn/trunk repo and branch svn/branch repo.
* I then did the following tests in all four flavours:
svn co -r500 # Check out an early version -- lots of
undeltification, few files
svn up -r2500 # Move to recent version -- less undeltification,
many files
svn up -r1500 # Move to older version -- more undeltification,
fewer files
* After each set of tests, I recursively compared the working copies
(except the .svn directories). All the working copies were
identical, _except_ that the trunk svn/branch repo combination
corrupted files larger than 100k. I assume there's a latent bug in
the undeltification code on /trunk that shows up when trying to
undeltify files larger than the window size. We didn't notice it
because there are no such files in our repositories. The branch
Subversion doesn't have this problem.
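For anyone who wants to repeat the working copy comparison, here is a minimal sketch of one way to do it in Python. It is not the tool used for the tests above; the function name tree_differences is made up for illustration. It skips the .svn directories and compares file contents byte for byte.

```python
import filecmp
import os


def tree_differences(left, right, ignore=(".svn",)):
    """Return sorted relative paths that differ between two directory trees.

    Hypothetical helper: directories named in `ignore` (here the .svn
    administrative areas) are skipped at every level.
    """
    diffs = []

    def walk(cmp, prefix=""):
        # Entries that exist in only one of the two trees.
        for name in cmp.left_only + cmp.right_only:
            diffs.append(os.path.join(prefix, name))
        # Entries whose type differs or that could not be compared.
        for name in cmp.common_funny:
            diffs.append(os.path.join(prefix, name))
        # Compare contents byte for byte; shallow=False avoids relying
        # on os.stat() signatures alone.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        for name in mismatch + errors:
            diffs.append(os.path.join(prefix, name))
        # Recurse into directories present on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(left, right, ignore=list(ignore)))
    return sorted(diffs)
```

Calling tree_differences() on two of the four working copies returns an empty list when they match outside the .svn directories, and otherwise lists the paths that are corrupted or missing.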
The timing results are in the attached table. Note that the results are
skewed, especially the "svn up" ones, because I measured ra_local. Based
on how I observed "svn" to behave over ra_dav, I estimate that working
copy handling represents a constant 60s and 6-7M of the results.
The tests show that operations on the branch repo are slightly more
expensive, because there's more undeltification going on. On the plus
side, the branch repo is 30% smaller.
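As a quick check of the size figure, the arithmetic can be redone from the directory listings above. This is only a sketch: the numbers are copied from the two "ll" outputs, and percent_smaller is a made-up helper name. The "total" lines work out to about 29% smaller overall, and the strings table alone to about 35% smaller.

```python
# Figures copied from the directory listings earlier in this message.
TRUNK_TOTAL = 57921         # "total" line for repo-trunk/db
BRANCH_TOTAL = 41033        # "total" line for repo-branch/db
TRUNK_STRINGS = 49037312    # size of the strings table, trunk repo (bytes)
BRANCH_STRINGS = 31752192   # size of the strings table, branch repo (bytes)


def percent_smaller(before, after):
    """Relative size reduction, in percent, going from `before` to `after`."""
    return 100.0 * (before - after) / before


overall = percent_smaller(TRUNK_TOTAL, BRANCH_TOTAL)
strings_only = percent_smaller(TRUNK_STRINGS, BRANCH_STRINGS)
print(f"whole db directory: {overall:.1f}% smaller")
print(f"strings table only: {strings_only:.1f}% smaller")
```

The ratio does not depend on the unit of the "total" line, so the block size used by ls does not matter here.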
Memory usage in the branch svn was much more constant than in the trunk
svn, but both versions displayed a slight but continuous increase in
working set size. We probably have a memory leak somewhere.
Conclusion: The combiner is ready to be merged into the mainline. Please
test the code in /branches/issue-531-dev on your repositories, and let
me know the results. I'd be especially interested in results from really
huge repositories, e.g., the Linux kernel archives I know some of you
have created.
If everything goes well, I'll merge the combiner onto the trunk on Monday.
--
Brane Čibej <br...@xbc.nu> http://www.xbc.nu/brane/
Re: Delta combiner stress test results
Posted by Branko Čibej <br...@xbc.nu>.
cmpilato@collab.net wrote:
>Karl Fogel <kf...@newton.ch.collab.net> writes:
>
>>Branko Čibej <br...@xbc.nu> writes:
>>
>>> * After each set of tests, I recursively compared the working copies
>>> (except the .svn directories). All the working copies were
>>> identical, _except_ that the trunk svn/branch repo combination
>>> corrupted files larger than 100k. I assume there's a latent bug in
>>> the undeltification code on /trunk that shows up when trying to
>>> undeltify files larger than the window size. We didn't notice it
>>> because there are no such files in our repositories. The branch
>>> Subversion doesn't have this problem.
>>>
>>Heh. Nice discovery.
>>
>
>I'm confused. Does large_file_integrity() in fs-test.c not do an
>adequate job of testing?
>
No, because on trunk, large files don't get deltified at all. This bug
only showed up when using trunk svn with a repository created by the
branch svn, which _does_ deltify large files.
--
Brane Čibej <br...@xbc.nu> http://www.xbc.nu/brane/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: Delta combiner stress test results
Posted by Karl Fogel <kf...@newton.ch.collab.net>.
cmpilato@collab.net writes:
> In fact, large_file_integrity() does exactly this sort of thing. Here
> are the in-line comments from the helper function:
Yeah, but the files aren't nearly large enough. They're larger than
the svn delta window size, but they're not anywhere in the
neighborhood of hundreds of MB, nor even tens of MB :-).
Re: Delta combiner stress test results
Posted by cm...@collab.net.
Karl Fogel <kf...@newton.ch.collab.net> writes:
> Branko Čibej <br...@xbc.nu> writes:
> > * After each set of tests, I recursively compared the working copies
> > (except the .svn directories). All the working copies were
> > identical, _except_ that the trunk svn/branch repo combination
> > corrupted files larger than 100k. I assume there's a latent bug in
> > the undeltification code on /trunk that shows up when trying to
> > undeltify files larger than the window size. We didn't notice it
> > because there are no such files in our repositories. The branch
> > Subversion doesn't have this problem.
>
> Heh. Nice discovery.
I'm confused. Does large_file_integrity() in fs-test.c not do an
adequate job of testing?
> > Conclusion: The combiner is ready to be merged into the mainline. Please
> > test the code in /branches/issue-531-dev on your repositories, and let
> > me know the results. I'd be especially interested in results from
> > really huge repositories, e.g., the Linux kernel archives I know some
> > of you have created.
>
> I'd be *most* interested in results for repositories versioning huge
> files (~100MB), in two separate ways:
>
> - Huge file, but each revision is a small delta to it (like adding
> the string "fish" inside a giant image file)
>
> - Huge file, and each revision touches a lot of its bytes
In fact, large_file_integrity() does exactly this sort of thing. Here
are the in-line comments from the helper function:
/* Create a big, ugly, pseudo-random-filled file and commit it. */
/* Now, let's make some edits to the beginning of our file, and
commit those. */
/* Now, let's make some edits to the end of our file. */
/* How about some edits to both the beginning and the end of the
file? */
/* Alright, now we're just going to go crazy. Let's make many more
edits -- pseudo-random numbers and offsets of bytes changed to
more pseudo-random values. */
> Our showstopper scalability issue had been that we didn't deltify
> files larger than the delta window size (~100k). Now that Brane's
> fixed that, wouldn't it be interesting to see where our limits are? :-)
Ooh...perhaps these big files in large_file_integrity() weren't
actually being deltified... ?
Re: Delta combiner stress test results
Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Branko Čibej <br...@xbc.nu> writes:
> * After each set of tests, I recursively compared the working copies
> (except the .svn directories). All the working copies were
> identical, _except_ that the trunk svn/branch repo combination
> corrupted files larger than 100k. I assume there's a latent bug in
> the undeltification code on /trunk that shows up when trying to
> undeltify files larger than the window size. We didn't notice it
> because there are no such files in our repositories. The branch
> Subversion doesn't have this problem.
Heh. Nice discovery.
> The tests show that operations on the branch repo are slightly more
> expensive, because there's more undeltification going on. On the plus
> side, the branch repo is 30% smaller.
These time & mem differences are pretty small, insignificant really.
> Conclusion: The combiner is ready to be merged into the mainline. Please
> test the code in /branches/issue-531-dev on your repositories, and let
> me know the results. I'd be especially interested in results from
> really huge repositories, e.g., the Linux kernel archives I know some
> of you have created.
I'd be *most* interested in results for repositories versioning huge
files (~100MB), in two separate ways:
- Huge file, but each revision is a small delta to it (like adding
the string "fish" inside a giant image file)
- Huge file, and each revision touches a lot of its bytes
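The two cases above could be generated with something like the following sketch. It only produces the file contents for each revision; committing them is left to whatever harness is used. All function names and default values here are invented for illustration, and the sizes should be scaled up to taste.

```python
import random


def make_base(size, seed=0):
    """Return `size` pseudo-random bytes to act as the huge file (size > 0)."""
    rng = random.Random(seed)
    return rng.getrandbits(8 * size).to_bytes(size, "little")


def small_edit(data, offset, payload=b"fish"):
    """Case 1: insert a short string somewhere inside the file."""
    return data[:offset] + payload + data[offset:]


def wide_edit(data, fraction, seed=1):
    """Case 2: overwrite roughly `fraction` of the bytes at random offsets.

    A byte may be overwritten with its old value, so the number of bytes
    that actually change is at most len(data) * fraction.
    """
    rng = random.Random(seed)
    buf = bytearray(data)
    count = int(len(buf) * fraction)
    for offset in rng.sample(range(len(buf)), count):
        buf[offset] = rng.randrange(256)
    return bytes(buf)
```

For example, make_base(100 * 1024 * 1024) gives a file of about 100MB; each revision is then either small_edit() or wide_edit() applied to the previous revision's contents.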
Our showstopper scalability issue had been that we didn't deltify
files larger than the delta window size (~100k). Now that Brane's
fixed that, wouldn't it be interesting to see where our limits are? :-)
> If everything goes well, I'll merge the combiner onto the trunk on Monday.
Bravo, sir!