Posted to users@subversion.apache.org by Brad O'Hearne <br...@neurofire.com> on 2005/03/23 14:04:18 UTC

Reconstructing a file on checkout and SVN scaling

After reading the documentation that generally describes how files in 
SVN are stored (by storing diffs and references to previous versions), 
and after doing some poking around in the files created in a 
filesystem-type repository, I've begun wondering about how SVN scales. 
Suppose I have a file added to subversion at revision 1, and it is 
changed 5000 times, with each change committed individually. When I 
check out revision 5001 of that file from subversion, does subversion 
have to reconstruct this file by examining every change in sequence from 
revision 1? Will this result in a substantially increased checkout time 
to reconstruct this file to its proper revision?

This is obviously a question aimed at scaling. In a large system, if 
the above scenario is true, there exists the possibility of coming to a 
grinding halt with many files changed over many revisions. Merely 
checking out a project would be a fairly intensive operation.

I'm sure I am missing something.  I was hoping some of the svn gurus out 
there could educate me as to how subversion handles this.

Thanks!

Brad

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Reconstructing a file on checkout and SVN scaling

Posted by Brad O'Hearne <br...@neurofire.com>.
Thanks!  This helped tremendously!

Brad

Garrett Rooney wrote:

> Brad O'Hearne wrote:
>
>> After reading the documentation that generally describes how files in 
>> SVN are stored (by storing diffs and references to previous 
>> versions), and after doing some poking around in the files created in 
>> a filesystem-type repository, I've begun wondering about how SVN 
>> scales. Suppose I have a file added to subversion at revision 1, and 
>> it is changed 5000 times, with each change committed individually. 
>> When I checkout revision 5001 of that file from subversion, does 
>> subversion have to reconstruct this file by examining every change in 
>> sequence from revision 1? Will this result in a substantially 
>> increased checkout time to reconstruct this file to its proper revision?
>>
>> There is obviously a question aimed at scaling. In a large system, if 
>> the above scenario is true, there exists the possibility of coming to 
>> a grinding halt with many files changed over many revisions. Merely 
>> checking out a project would be a fairly intensive operation.
>>
>> I'm sure I am missing something.  I was hoping some of the svn gurus 
>> out there could educate me as to how subversion handles this.
>
>
> Depending on which back end filesystem you're using the answer is 
> different.  In a BDB based filesystem the HEAD revision (most recent 
> version) is stored as fulltext, since that's what you usually want to 
> see, and deltas are applied only when you need previous revisions.  In 
> an FSFS based repository you do indeed have to start at the initial 
> version and apply deltas forward, but the deltas are stored in such a 
> way so as to only require applying of log(n) deltas to get to any 
> given version of a file.  See 
> http://svn.collab.net/repos/svn/trunk/notes/skip-deltas for detailed 
> notes on how the FSFS version of this works.
>
> -garrett




Re: Reconstructing a file on checkout and SVN scaling

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
Brad O'Hearne wrote:
> After reading the documentation that generally describes how files in 
> SVN are stored (by storing diffs and references to previous versions), 
> and after doing some poking around in the files created in a 
> filesystem-type repository, I've begun wondering about how SVN scales. 
> Suppose I have a file added to subversion at revision 1, and it is 
> changed 5000 times, with each change committed individually. When I 
> checkout revision 5001 of that file from subversion, does subversion 
> have to reconstruct this file by examining every change in sequence from 
> revision 1? Will this result in a substantially increased checkout time 
> to reconstruct this file to its proper revision?
> 
> There is obviously a question aimed at scaling. In a large system, if 
> the above scenario is true, there exists the possibility of coming to a 
> grinding halt with many files changed over many revisions. Merely 
> checking out a project would be a fairly intensive operation.
> 
> I'm sure I am missing something.  I was hoping some of the svn gurus out 
> there could educate me as to how subversion handles this.

The answer depends on which back-end filesystem you're using.  In a 
BDB-based repository the HEAD revision (the most recent version) is 
stored as fulltext, since that's what you usually want to see, and 
deltas are applied only when you need previous revisions.  In an 
FSFS-based repository you do indeed have to start at the initial 
version and apply deltas forward, but the deltas are stored in such a 
way that reaching any given version of a file requires applying only 
on the order of log(n) deltas.  See 
http://svn.collab.net/repos/svn/trunk/notes/skip-deltas for detailed 
notes on how the FSFS version of this works.
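
To make the log(n) claim concrete, here is a minimal sketch in Python. 
It assumes the scheme described in the skip-deltas notes: the delta 
base for a file's Nth change is found by writing N in binary and 
clearing its lowest set bit, with change 0 stored as fulltext.  The 
function names are hypothetical and this is not Subversion's actual 
code, only an illustration of how short the delta chain stays.

```python
def delta_base(n):
    """Return the change number that change n is stored as a delta against.

    Assumption taken from the skip-deltas notes: clear the lowest set bit
    of n. The name delta_base is made up for this illustration.
    """
    return n & (n - 1)


def delta_chain(n):
    """List the deltas that must be applied to rebuild change n.

    The list runs from n back toward 0; change 0 itself is the fulltext
    starting point and is not included.
    """
    chain = []
    while n > 0:
        chain.append(n)
        n = delta_base(n)
    return chain


if __name__ == "__main__":
    # The scenario from the question: a file changed 5000 times.
    chain = delta_chain(5000)
    print(chain)
    print(len(chain), "deltas instead of 5000")
```

Under that assumption, the chain for the 5000-change file in the 
question has 5 entries, one per set bit in the binary form of 5000, 
rather than 5000 entries if every delta were stored against its 
immediate predecessor.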

-garrett


Re: Reconstructing a file on checkout and SVN scaling

Posted by Mark Phippard <Ma...@softlanding.com>.
Brad O'Hearne <br...@neurofire.com> wrote on 03/23/2005 09:04:18 AM:

> After reading the documentation that generally describes how files in 
> SVN are stored (by storing diffs and references to previous versions), 
> and after doing some poking around in the files created in a 
> filesystem-type repository, I've begun wondering about how SVN scales. 
> Suppose I have a file added to subversion at revision 1, and it is 
> changed 5000 times, with each change committed individually. When I 
> checkout revision 5001 of that file from subversion, does subversion 
> have to reconstruct this file by examining every change in sequence from 
> revision 1? Will this result in a substantially increased checkout time 
> to reconstruct this file to its proper revision?
> 
> There is obviously a question aimed at scaling. In a large system, if 
> the above scenario is true, there exists the possibility of coming to a 
> grinding halt with many files changed over many revisions. Merely 
> checking out a project would be a fairly intensive operation.
> 
> I'm sure I am missing something.  I was hoping some of the svn gurus out 
> there could educate me as to how subversion handles this.

It doesn't work exactly the way you describe.  Read this:

http://svn.collab.net/repos/svn/trunk/notes/skip-deltas

Mark



