Posted to users@subversion.apache.org by Ryan Schmidt <su...@ryandesign.com> on 2005/10/05 10:51:05 UTC

Re: Disk space usage

On Sep 30, 2005, at 10:40, Jon Sporring wrote:

> We've recently converted a CVS repository to Subversion using cvs2svn and
> FSFS. We are experiencing the following problem: the repository on the
> server is only 5.5GB, half the size of the original 11GB CVS repository,
> but when I check out the trunk, the resulting working copy is 19GB, almost
> twice the size of the original checkout from the CVS repository. We do
> have large binary files, which we use as a method of file sharing, and I
> realize that this will cause each file to be present in the working copy
> twice: in the .svn directory and in its corresponding place in the working
> tree. However, it would appear that the Subversion repository uses
> compression; does anyone know of a simple way to enforce compression on
> the .svn directories, or any other smart trick to reduce the size of the
> working copy?

A CVS working copy does not store a pristine copy of the checked-out
files; a Subversion working copy does. The advantage is that
Subversion can create diffs quickly and locally, whereas CVS requires a
round trip over the network, which can be slow. Subversion is built on
the philosophy that disk space is cheap and fast, and network bandwidth
is expensive and/or slow. For those who disagree with this idea,
there is currently no way of telling Subversion to behave
differently, although there are already feature requests open to have
the working copy pristine data compressed or omitted altogether.
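
To see how much of a given working copy is taken up by the pristine
copies described above, a small script can total the bytes inside the
.svn directories separately from the working files. This is only a
minimal sketch: the function name working_copy_usage is made up for
illustration, and it sums apparent file sizes rather than allocated
disk blocks, so the numbers will differ somewhat from what du reports.

```python
import os


def working_copy_usage(root):
    """Return (working_bytes, admin_bytes) for the tree under root.

    admin_bytes counts every file that lives somewhere below a directory
    named ".svn"; working_bytes counts all other files.
    """
    working = 0
    admin = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # A directory is administrative if any path component is ".svn".
        parts = os.path.relpath(dirpath, root).split(os.sep)
        in_admin = ".svn" in parts
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow or count symbolic links
            size = os.path.getsize(path)
            if in_admin:
                admin += size
            else:
                working += size
    return working, admin
```

Calling working_copy_usage on the top-level directory of a checkout
returns the two totals in bytes. For a tree dominated by large binary
files, one would expect the two numbers to be of similar size, since
each file is stored once in place and once as a pristine copy.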

A Subversion repository is stored as a sequence of diffs against  
previous revisions, and so is generally fairly space-efficient. I  
cannot comment on how Subversion's repository storage algorithm  
compares with CVS's.



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Disk space usage

Posted by Hari Kodungallur <hk...@gmail.com>.
On 10/5/05, Jon Bendtsen <jo...@laerdal.dk> wrote:
>
> On 5 Oct 2005, at 17:37, Paul Koning wrote:
>
> >>>>>> "Ryan" == Ryan Schmidt <su...@ryandesign.com> writes:
> >
> >  Ryan> A Subversion repository is stored as a sequence of diffs
> >  Ryan> against previous revisions, and so is generally fairly
> >  Ryan> space-efficient. I cannot comment on how Subversion's
> >  Ryan> repository storage algorithm compares with CVS's.
> >
> > One data point:
> >   CVS repository: 7.5 GB -- converted by cvs2svn yields:
> >   SVN repository: 4.8 GB
>
> That's funny, because I converted a 19GB CVS repository, which turned
> into a 20GB Subversion repository...
>

I am guessing here, but I think one of the factors (I don't know how big
a factor it is) is the number of files in the repository and the number
of branches/tags.
In CVS, for each tag/branch, every file that is part of the tag/branch
gets an entry in its respective RCS file, whereas in SVN a
branch/tag is one small file representing a link.


rgds

--
-Hari Kodungallur
SpikeSource Inc.
http://developer.spikesource.com



Re: Disk space usage

Posted by Paul Koning <pk...@equallogic.com>.
>>>>> "Daniel" == Daniel Berlin <db...@dberlin.org> writes:

 >>> One data point: CVS repository: 7.5 GB -- converted by cvs2svn
 >>> yields: SVN repository: 4.8 GB
 >> That's funny, because I converted a 19GB CVS repository, which turned
 >> into a 20GB Subversion repository...

 Daniel> It depends on the cvs repo.  Branch tags/copies created by
 Daniel> cvs2svn use much more disk space than they would if you had
 Daniel> used svn originally.

I've noticed that too, and I've tried to make it better.
Unfortunately, there seem to be a bunch of different reasons why
cvs2svn does things as inefficiently as it does, and they are not
easily fixable. 

       paul



Re: Disk space usage

Posted by Daniel Berlin <db...@dberlin.org>.
>> One data point:
>>   CVS repository: 7.5 GB -- converted by cvs2svn yields:
>>   SVN repository: 4.8 GB
>
> That's funny, because I converted a 19GB CVS repository, which turned
> into a 20GB Subversion repository...

It depends on the cvs repo.
Branch tags/copies created by cvs2svn use much more disk space than they
would if you had used svn originally.

There is a branch in the gcc cvs repo that every time it gets tagged, 
generates 4 meg of just fsfs *metadata*, and it's been tagged about 200 
times.  This is because of how svn has to generate the branch from the cvs 
commits that created it.

If these had been svn cp's, it would take nowhere *near* that amount of 
disk space (maybe a meg or two, total).
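
To put those figures side by side, here is the arithmetic as a short
sketch. The 4 MB per tag and the count of 200 tags come from the
paragraphs above; the 2 MB total for native copies is an assumption
taken from the upper end of the "a meg or two" estimate, so the ratio
is only a rough order of magnitude.

```python
# Figures quoted in the message above (both approximate).
METADATA_PER_TAG_MB = 4   # fsfs metadata generated each time the branch is tagged
TAG_COUNT = 200           # "tagged about 200 times"

# Assumption: upper end of the "maybe a meg or two, total" estimate.
SVN_CP_TOTAL_MB = 2

converted_total_mb = METADATA_PER_TAG_MB * TAG_COUNT
ratio = converted_total_mb / SVN_CP_TOTAL_MB

print(f"cvs2svn-converted tags: about {converted_total_mb} MB of metadata")
print(f"native svn cp estimate: about {SVN_CP_TOTAL_MB} MB in total")
print(f"difference: roughly {ratio:.0f} times as much")
```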

--Dan


Re: Disk space usage

Posted by Jon Bendtsen <jo...@laerdal.dk>.
On 5 Oct 2005, at 17:37, Paul Koning wrote:

>>>>>> "Ryan" == Ryan Schmidt <su...@ryandesign.com> writes:
>
>  Ryan> A Subversion repository is stored as a sequence of diffs
>  Ryan> against previous revisions, and so is generally fairly
>  Ryan> space-efficient. I cannot comment on how Subversion's
>  Ryan> repository storage algorithm compares with CVS's.
>
> One data point:
>   CVS repository: 7.5 GB -- converted by cvs2svn yields:
>   SVN repository: 4.8 GB

That's funny, because I converted a 19GB CVS repository, which turned
into a 20GB Subversion repository...



JonB


Re: Disk space usage

Posted by Paul Koning <pk...@equallogic.com>.
>>>>> "Ryan" == Ryan Schmidt <su...@ryandesign.com> writes:

 Ryan> A Subversion repository is stored as a sequence of diffs
 Ryan> against previous revisions, and so is generally fairly
 Ryan> space-efficient. I cannot comment on how Subversion's
 Ryan> repository storage algorithm compares with CVS's.

One data point:
  CVS repository: 7.5 GB -- converted by cvs2svn yields:
  SVN repository: 4.8 GB

      paul

