Posted to dev@subversion.apache.org by Tobias Ringström <to...@ringstrom.mine.nu> on 2003/07/10 18:07:19 UTC

Simple repository size experiment

After yesterday's thread about my SVN repo that became much larger than 
the CVS repo it was converted from, I've been experimenting a bit and 
thought I should share the results. The repository I was talking about 
yesterday contains data that I cannot share with the world.

Until I've come up with a public repository with the same properties, 
I've downloaded a part of the XFree86 CVS tree, namely xc/programs (or 
xc-clients according to cvsup). The CVS repository has 2560 files, and 
the conversion to SVN yields 1551 revisions. I'm using "bzip2 -9" for the 
compression mentioned below.
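
For anyone who wants to repeat this kind of measurement, here is a minimal
sketch in Python. It is not the procedure used for the numbers in this post
(those came from tar and "bzip2 -9" on the command line); the function name
tar_sizes is made up for illustration, and the whole archive is held in
memory, which is only reasonable for small trees.

```python
import bz2
import io
import tarfile


def tar_sizes(path):
    """Return (uncompressed_tar_bytes, bzip2_level9_bytes) for a directory tree.

    Hypothetical helper: builds the tar archive in memory, then compresses
    it with bzip2 at level 9 (the equivalent of "bzip2 -9").
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(path, arcname=".")
    raw = buf.getvalue()
    return len(raw), len(bz2.compress(raw, compresslevel=9))
```

Calling tar_sizes on a CVS repository directory and on the converted SVN
repository directory gives two pairs of numbers that can be compared the
same way as the figures below.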

Size of uncompressed tar file of CVS repo: 53 MiB
Size of compressed tar file of CVS repo: 11 MiB

After running cvs2svn from svn-0.24.2 and removing all log files in the 
db directory:

Size of uncompressed dump file from cvs2svn: 246 MiB
Size of compressed dump file from cvs2svn: 48 MiB

Size of uncompressed tar file of SVN repo: 59 MiB
Size of compressed tar file of SVN repo: 20 MiB

From reading this list, I really expected SVN repositories to be 
smaller than CVS repositories, but so far I've not seen any evidence of 
that. It's not a lot worse than CVS, but it is definitely worse. After 
compression, it loses by a factor of two. I'm hoping that there is a 
simple bug somewhere... :-)
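
The ratios behind these statements can be recomputed from the figures above.
A small sketch in Python; the dictionary keys are labels chosen here, and the
values are the MiB figures from this post:

```python
# Sizes in MiB, copied from the measurements in this post.
sizes_mib = {
    "cvs_tar": 53,
    "cvs_tar_bz2": 11,
    "svn_tar": 59,
    "svn_tar_bz2": 20,
    "dump": 246,
    "dump_bz2": 48,
}


def ratio(a, b):
    """Size of item a relative to item b."""
    return sizes_mib[a] / sizes_mib[b]


print(f"SVN vs CVS, uncompressed: {ratio('svn_tar', 'cvs_tar'):.2f}")        # 1.11
print(f"SVN vs CVS, compressed:   {ratio('svn_tar_bz2', 'cvs_tar_bz2'):.2f}")  # 1.82
print(f"dump vs SVN, compressed:  {ratio('dump_bz2', 'svn_tar_bz2'):.2f}")     # 2.40
```

So "a factor of two" after compression is a rounding of roughly 1.8, while
the uncompressed repositories differ by about 11%.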

/Tobias



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Simple repository size experiment

Posted by Branko Čibej <br...@xbc.nu>.
Tobias Ringström wrote:

> For another (real) repository of mine, the compressed dump file is
> *smaller* than the compressed repository. I mentioned that on this
> list a long time ago, and "everyone" just figured that I must have a
> very strange repository since the vdelta is supposed to be soo great.
> I wish I could see that. The propaganda (lacking a better word, sorry)
> has been soo successful that I'm still very surprised every time I see
> that a subversion repository becomes larger than the CVS equivalent. :-) 

Then we have to figure out why that happens. vdelta definitely won't be
very good at compressing differences between binary file types that are
already compressed -- that includes ZIP files, JAR files and many image
types (GIF, PNG, JPEG come to mind, TIFF can be LZW compressed, etc.).
It's possible that, for these kinds of files, we should always store the
fulltext.
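
The effect is easy to reproduce outside Subversion. The sketch below is an
illustration only: it does not use vdelta, but a deflate preset dictionary
(the zdict argument of Python's zlib.compressobj) as a crude stand-in for a
delta encoder, and the test data is generated pseudo-random text, not a real
repository. The helper names make_lines and delta_size are made up here.

```python
import random
import zlib


def make_lines(seed, count=300):
    """Build deterministic pseudo-random text lines (illustrative data only)."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    vocab = ["".join(rng.choice(letters) for _ in range(6)) for _ in range(200)]
    return [
        f"line {i:04d}: " + " ".join(rng.choice(vocab) for _ in range(8))
        for i in range(count)
    ]


def delta_size(old: bytes, new: bytes) -> int:
    """Bytes needed to encode `new` when the compressor may refer back to `old`.

    A deflate preset dictionary is used as a rough proxy for a delta
    encoder; this is NOT Subversion's vdelta.
    """
    comp = zlib.compressobj(level=9, zdict=old)
    return len(comp.compress(new) + comp.flush())


# Two revisions of a text file: every 30th line is replaced in the new one.
old_lines = make_lines(seed=1)
new_lines = list(old_lines)
replacement = make_lines(seed=2)
for i in range(0, len(new_lines), 30):
    new_lines[i] = replacement[i]

old = "\n".join(old_lines).encode()
new = "\n".join(new_lines).encode()

# Delta between the plain revisions vs. delta between their compressed forms.
plain_delta = delta_size(old, new)
packed_delta = delta_size(zlib.compress(old), zlib.compress(new))
full_packed = len(zlib.compress(new))

print("delta between plain revisions:     ", plain_delta)
print("delta between compressed revisions:", packed_delta)
print("compressed new revision on its own:", full_packed)
```

The delta between the plain revisions should come out much smaller than the
delta between the compressed ones, because a few changed lines alter the
compressed byte stream almost everywhere. When the "delta" of compressed
data is about as large as the compressed file itself, storing the fulltext
costs little extra.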


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: Simple repository size experiment

Posted by Tobias Ringström <to...@ringstrom.mine.nu>.
Justin Erenkrantz wrote:

> --On Thursday, July 10, 2003 8:07 PM +0200 Tobias Ringström 
> <to...@ringstrom.mine.nu> wrote:
> 
>> After running cvs2svn from svn-0.24.2 and removing all log files in 
>> the db
>> directory:
>>
>> Size of uncompressed dump file from cvs2svn: 246 MiB
>> Size of compressed dump file from cvs2svn: 48 MiB
> 
> A dump file does not contain any deltas - only fulltexts.  So, a dump 
> isn't a good estimate of how large your repository is.  -- justin

I know that, but I find it somewhat interesting that the compressed dump 
size is around the same size as the compressed repository (with log 
files removed).

For another (real) repository of mine, the compressed dump file is 
*smaller* than the compressed repository. I mentioned that on this list 
a long time ago, and "everyone" just figured that I must have a very 
strange repository since the vdelta is supposed to be soo great. I wish 
I could see that. The propaganda (lacking a better word, sorry) has been 
soo successful that I'm still very surprised every time I see that a 
subversion repository becomes larger than the CVS equivalent. :-)

The one reason that I converted the repository containing binary files I 
wrote about in the thread "SVN dumps much bigger than CVS repository" 
was that I wanted to show our CM that a subversion repository would be 
much smaller than a CVS repository. It was three times larger than the 
CVS repo. Fortunately he forgot about my little demonstration before I 
went on vacation... :-)

/Tobias



Re: Simple repository size experiment

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, July 10, 2003 8:07 PM +0200 Tobias Ringström 
<to...@ringstrom.mine.nu> wrote:

> After running cvs2svn from svn-0.24.2 and removing all log files in the db
> directory:
>
> Size of uncompressed dump file from cvs2svn: 246 MiB
> Size of compressed dump file from cvs2svn: 48 MiB

A dump file does not contain any deltas - only fulltexts.  So, a dump isn't a 
good estimate of how large your repository is.  -- justin
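
To make the point concrete, here is a back-of-the-envelope model in Python.
It takes the statement above at face value (every revision stored as a
fulltext in the dump) and compares it with storing one fulltext plus a
delta per later revision. The function names and the numbers in the example
(100 revisions, a 10,000-byte file, 200-byte deltas) are hypothetical and
are not measurements from this thread.

```python
def fulltext_total(n_revs: int, file_size: int) -> int:
    """Bytes used when every revision of a file is stored in full."""
    return n_revs * file_size


def delta_total(n_revs: int, file_size: int, delta_size: int) -> int:
    """Bytes used with one fulltext plus a delta for each later revision."""
    return file_size + (n_revs - 1) * delta_size


# Hypothetical file: 100 revisions, 10,000 bytes, 200 bytes changed per commit.
print(fulltext_total(100, 10_000))    # 1000000
print(delta_total(100, 10_000, 200))  # 29800
```

Under these assumed numbers the fulltext form is more than thirty times
larger, which is why the two sizes say little about each other.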
