Posted to dev@subversion.apache.org by Tobias Ringström <to...@ringstrom.mine.nu> on 2003/07/10 18:07:19 UTC
Simple repository size experiment
After yesterday's thread about my SVN repo that became much larger than
the CVS repo it was converted from, I've been experimenting a bit and
thought I should share the results. The repository I was talking about
yesterday contains data that I cannot share with the world.
Until I've come up with a public repository with the same properties,
I've downloaded a part of the XFree86 CVS tree, namely xc/programs (or
xc-clients according to cvsup). The CVS repository has 2560 files, and
the conversion to SVN produced 1551 revisions. I'm using "bzip2 -9" for
the compression mentioned below.
Size of uncompressed tar file of CVS repo: 53 MiB
Size of compressed tar file of CVS repo: 11 MiB
After running cvs2svn from svn-0.24.2 and removing all log files in the
db directory:
Size of uncompressed dump file from cvs2svn: 246 MiB
Size of compressed dump file from cvs2svn: 48 MiB
Size of uncompressed tar file of SVN repo: 59 MiB
Size of compressed tar file of SVN repo: 20 MiB
From reading this list, I really expected SVN repositories to be
smaller than CVS repositories, but so far I've not seen any evidence of
that. It's not a lot worse than CVS, but it is definitely worse. After
compression, it loses by a factor of two. I'm hoping that there is a
simple bug somewhere... :-)
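
As a rough check of the figures above, the SVN/CVS size ratios can be worked out directly. This is only a sketch: the dictionary keys are made-up labels, and the numbers are the rounded MiB values quoted in this message.

```python
# Sizes in MiB, copied from the figures quoted in this message.
# The key names are illustrative labels, not anything cvs2svn produces.
sizes = {
    "cvs_tar": 53,
    "cvs_tar_bz2": 11,
    "svn_dump": 246,
    "svn_dump_bz2": 48,
    "svn_tar": 59,
    "svn_tar_bz2": 20,
}


def ratio(numerator, denominator):
    """Return how many times larger one measurement is than another."""
    return sizes[numerator] / sizes[denominator]


uncompressed = ratio("svn_tar", "cvs_tar")
compressed = ratio("svn_tar_bz2", "cvs_tar_bz2")

print(f"uncompressed SVN/CVS: {uncompressed:.2f}")
print(f"compressed SVN/CVS:   {compressed:.2f}")
```

With the rounded inputs, the uncompressed repositories differ by roughly ten percent, while the compressed ones differ by a little under a factor of two.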
/Tobias
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: Simple repository size experiment
Posted by Branko Čibej <br...@xbc.nu>.
Tobias Ringström wrote:
> For another (real) repository of mine, the compressed dump file is
> *smaller* than the compressed repository. I mentioned that on this
> list a long time ago, and "everyone" just figured that I must have a
> very strange repository since the vdelta is supposed to be soo great.
> I wish I could see that. The propaganda (lacking a better word, sorry)
> has been soo successful that I'm still very surprised every time I see
> that a subversion repository becomes larger than the CVS equivalent. :-)
Then we have to figure out why that happens. vdelta definitely won't be
very good at compressing differences between binary file types that are
already compressed -- that includes ZIP files, JAR files and many image
types (GIF, PNG, JPEG come to mind, TIFF can be LZW compressed, etc.).
It's possible that, for these kinds of files, we should always store the
fulltext.
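
The general effect can be illustrated with a small sketch. Note the assumptions: it uses zlib from the Python standard library as a stand-in (it is not vdelta and not Subversion's storage code), and the input is synthetic text built from a made-up vocabulary. It only shows that data which has already been compressed has very little redundancy left for a second pass to remove.

```python
import random
import zlib

# Synthetic, compressible input: random picks from a small made-up vocabulary.
rng = random.Random(0)  # fixed seed so the run is repeatable
vocab = [("word%d" % i).encode("ascii") for i in range(50)]
text = b" ".join(rng.choice(vocab) for _ in range(100000))

# First pass: plain text compresses well.
once = zlib.compress(text, 9)

# Second pass: the input is already compressed, so little is gained.
twice = zlib.compress(once, 9)

first_ratio = len(once) / len(text)
second_ratio = len(twice) / len(once)

print(f"text -> compressed:       {first_ratio:.2f} of original size")
print(f"compressed -> recompress: {second_ratio:.2f} of compressed size")
```

A delta between two revisions of an already-compressed file faces the same problem: a small change to the uncompressed content tends to change most of the compressed bytes.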
--
Brane Čibej <br...@xbc.nu> http://www.xbc.nu/brane/
Re: Simple repository size experiment
Posted by Tobias Ringström <to...@ringstrom.mine.nu>.
Justin Erenkrantz wrote:
> --On Thursday, July 10, 2003 8:07 PM +0200 Tobias Ringström
> <to...@ringstrom.mine.nu> wrote:
>
>> After running cvs2svn from svn-0.24.2 and removing all log files in
>> the db
>> directory:
>>
>> Size of uncompressed dump file from cvs2svn: 246 MiB
>> Size of compressed dump file from cvs2svn: 48 MiB
>
> A dump file does not contain any deltas - only fulltexts. So, a dump
> isn't a good estimate of how large your repository is. -- justin
I know that, but I find it somewhat interesting that the compressed dump
size is around the same size as the compressed repository (with log
files removed).
For another (real) repository of mine, the compressed dump file is
*smaller* than the compressed repository. I mentioned that on this list
a long time ago, and "everyone" just figured that I must have a very
strange repository since the vdelta is supposed to be soo great. I wish
I could see that. The propaganda (lacking a better word, sorry) has been
soo successful that I'm still very surprised every time I see that a
subversion repository becomes larger than the CVS equivalent. :-)
The one reason that I converted the repository containing binary files I
wrote about in the thread "SVN dumps much bigger than CVS repository"
was that I wanted to show our CM that a subversion repository would be
much smaller than a CVS repository. It was three times larger than the
CVS repo. Fortunately he forgot about my little demonstration before I
went on vacation... :-)
/Tobias
Re: Simple repository size experiment
Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, July 10, 2003 8:07 PM +0200 Tobias Ringström
<to...@ringstrom.mine.nu> wrote:
> After running cvs2svn from svn-0.24.2 and removing all log files in the db
> directory:
>
> Size of uncompressed dump file from cvs2svn: 246 MiB
> Size of compressed dump file from cvs2svn: 48 MiB
A dump file does not contain any deltas - only fulltexts. So, a dump isn't a
good estimate of how large your repository is. -- justin
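
Why fulltexts inflate a dump relative to delta storage can be sketched as follows. This is a toy model under stated assumptions: the revision history is synthetic, and a unified diff from Python's difflib stands in for a real delta format (it is not svndiff or vdelta).

```python
import difflib

# Build a synthetic history: 50 revisions, each appending one line
# to a file that starts with 200 lines.
lines = [f"line {i}\n" for i in range(200)]
revisions = []
for rev in range(50):
    lines = lines + [f"added in revision {rev}\n"]
    revisions.append(lines)

# Fulltext storage: every revision is stored in full.
fulltext_bytes = sum(len("".join(r)) for r in revisions)

# Delta storage: the first revision in full, then only the changes.
delta_bytes = len("".join(revisions[0]))
for old, new in zip(revisions, revisions[1:]):
    delta = difflib.unified_diff(old, new, n=0)
    delta_bytes += len("".join(delta))

print(f"fulltexts: {fulltext_bytes} bytes")
print(f"deltas:    {delta_bytes} bytes")
```

In this toy history the fulltext total grows with the file size times the number of revisions, while the delta total grows only with the size of each change.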