Posted to users@subversion.apache.org by B Smith-Mannschott <bs...@gmail.com> on 2008/12/03 11:34:46 UTC

Dump file bloating not observed (was How big can a repository get?)

*oops, a copy for the list...*

---------- Forwarded message ----------
From: B Smith-Mannschott <bs...@gmail.com>
Date: Wed, Dec 3, 2008 at 12:32 PM
Subject: Re: Re: Antwort: How big can a repository get?
To: Andreas.Otto@versit.de




On Wed, Dec 3, 2008 at 11:05 AM, <An...@versit.de> wrote:

>
> Hi,
>
>         I want to stop this thread because the issue is clear.
>
>         To everyone not able to see this issue, I would recommend the
> following test:
>
>         1. create an empty rep
>         2. fill it with 100 MB of data using multiple files
>         3. do a dump
>         4. make tags with svn copy using the whole rep
>         5. do a dump again
>
>         -> you will see that the sizes differ
>
> Kind regards
>

Whatever you are doing, you're not describing it accurately. I wrote a
script to reproduce your points above as best I could.  I ignored
"using the whole rep" in your fourth point because I have no idea what
that's supposed to mean.

*I observed no significant difference.*

$ svn --version | head -n 1
svn, version 1.5.2 (r32768)

*Here are the results:*

# fill the trunk
# note the size of the wc/trunk, exported
253M    wc/trunk
112M    wc/trunk.exported
# preserving otto before creating tags
# creating tags for revisions 2 through 42
# preserving otto after creating tags
# note the additional revisions in otto.after
otto.before 42
otto.after 83
# generating dump files
# sizes
15M    otto.after
17M    otto.after.deltas.dump
97M    otto.after.dump
14M    otto.before
17M    otto.before.deltas.dump
97M    otto.before.dump

As you can no doubt see, there's no significant difference in the
sizes of the dump files between before and after.

Using deltas is helping us here not because we've made edits to file
contents (we haven't) but because delta dump output is also
compressed, and the text files I'm using compress well.

There is some difference in the repository sizes, though this is to be
expected, given that FSFS generates two files on my stock ext3 file
system (4 KB block size) for each revision.
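A back-of-envelope sketch of that overhead; the revision count comes
from the otto.after figure above, and the one-block-per-file minimum
is an assumption about ext3 with 4 KB blocks:

```shell
# Rough sketch: each FSFS revision writes two files (a rev file and a
# revprops file); on a 4 KB-block ext3 file system each file occupies
# at least one block, so many small revisions carry fixed overhead.
revisions=83     # youngest revision of otto.after in the run above
files_per_rev=2  # rev file + revprops file
block_kb=4       # ext3 block size assumed above
echo "at least $(( revisions * files_per_rev * block_kb )) KB of block overhead"
```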

*This is the script I used:*

#!/bin/bash
svnadmin create otto
url=file://$PWD/otto
svn -q co $url wc
svn -q mkdir wc/{tags,trunk,branches}
svn -q ci wc -m "added empty tags, trunk, branches"
echo "# fill the trunk"
n=1
while (( $(du -sm wc/trunk|cut -f1) < 250 ))
do  # we keep going until the wc of the trunk is 250 MB, which
    # will get us more than 100 MB of content, even with the
    # overhead of .svn directories
    # assumes a local directory "about-2.8-megs-of-text-files"
    # holding roughly 2.8 MB of text files to commit each pass
    cp -a about-2.8-megs-of-text-files wc/trunk/$n
    svn -q add wc/trunk/$n
    svn -q commit wc -m "added directory $n to trunk"
    n=$((n + 1))
done
echo "# note the size of the wc/trunk, exported"
svn -q export wc/trunk wc/trunk.exported
du -sh wc/trunk wc/trunk.exported
rm -rf wc
echo "# preserving otto before creating tags"
cp -a otto otto.before
echo "# creating tags for revisions 2 through $n"
while (( n > 1 ))
do
    svn -q cp $url/trunk@$n $url/tags/$n -m "creating tag $n of trunk@$n"
    n=$((n - 1))
done
echo "# preserving otto after creating tags"
mv otto otto.after
echo "# note the additional revisions in otto.after"
echo otto.before $(svnlook youngest otto.before)
echo otto.after $(svnlook youngest otto.after)
echo "# generating dump files"
svnadmin -q dump otto.before > otto.before.dump
svnadmin -q dump --deltas otto.before > otto.before.deltas.dump
svnadmin -q dump otto.after > otto.after.dump
svnadmin -q dump --deltas otto.after > otto.after.deltas.dump
echo "# sizes"
du -sh otto.[ab]*

// Ben Smith-Mannschott

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=978828
