Posted to users@subversion.apache.org by "Loren M. Lang" <lo...@north-winds.org> on 2011/05/05 02:39:57 UTC

Size of Subversion repository

We have been using Subversion 1.4.x for quite some time and just earlier
this year, we upgraded to 1.5.x.  Our repository is still the same as we
did no dump/load between upgrades.  I was curious to see what kind of
space savings we might have if we did.  Our original repository is fsfs
and was created under 1.4.x as far as I remember.  The format file says
3 so I might have made it with 1.3.x.  The format for the new repo is 5.
Here are the numbers for two tests I did, one with fsfs and one with
bdb.

737M	svn-original
690M	svn-fsfs
903M	svn-bdb
2.3G	total

Are these numbers typical?  And why is BDB so significantly larger?  Is
there any real benefit to it nowadays?  Our server is set up so that all
access must go through the apache user via mod_dav_svn.
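
For a rough sense of scale, the figures above can be turned into
percentages.  The Python sketch below is only illustrative arithmetic on
the three sizes reported in this message; the helper name percent_change
is made up for the example.

```python
# Sizes in megabytes, as reported in the message above.
sizes = {"svn-original": 737, "svn-fsfs": 690, "svn-bdb": 903}


def percent_change(new, old):
    """Return the size change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


baseline = sizes["svn-original"]
for name in ("svn-fsfs", "svn-bdb"):
    change = percent_change(sizes[name], baseline)
    print(f"{name}: {change:+.1f}% vs svn-original")
```

By this measure the freshly loaded fsfs repository is roughly 6% smaller
than the original, and the bdb one is roughly 23% larger.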


Re: Size of Subversion repository

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
2011/5/5 Thorsten Schöning <ts...@am-soft.de>:
> Hello Loren M. Lang,
> on Thursday, 5 May 2011 at 02:39 you wrote:
>
>> We have been using Subversion 1.4.x for quite some time and just earlier
>> this year, we upgraded to 1.5.x.  Our repository is still the same as we
>> did no dump/load between upgrades.  I was curious to see what kind of
>> space savings we might have if we did.
>
> I recently started syncing our old repositories, all fsfs and created
> with Subversion versions 1.4.x and earlier, to a new server with most
> repositories created with Subversion 1.6.x and only standard features
> enabled. It's about 7.2 GB vs. 6.2 GB with one of our largest
> repositories on the sync target still in an older fsfs format. It seems
> worthwhile to do a complete dump/load cycle, and the newer repository
> formats also have a feature called rep-sharing, where identical data is
> stored only once for the complete repository.

You also have the option of packing a repository created with or after
1.6.x.  For packing, earlier repositories can be upgraded in place,
rather than being dump/loaded.
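
A minimal sketch of that upgrade-then-pack sequence in Python, assuming
svnadmin 1.6 or later is on the PATH and the repository has been backed
up first.  The function names and the example path are hypothetical;
"svnadmin upgrade" and "svnadmin pack" are the actual subcommands.

```python
import subprocess


def upgrade_and_pack_commands(repo_path):
    """Build the argv lists for an in-place upgrade followed by a pack."""
    return [
        ["svnadmin", "upgrade", repo_path],
        ["svnadmin", "pack", repo_path],
    ]


def upgrade_and_pack(repo_path):
    """Run both commands in order, stopping at the first failure."""
    for argv in upgrade_and_pack_commands(repo_path):
        subprocess.run(argv, check=True)
```

Calling upgrade_and_pack("/srv/svn/repo") would run the two commands in
sequence; check=True makes a failed upgrade abort before packing.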

-Hyrum

Re: Size of Subversion repository

Posted by Thorsten Schöning <ts...@am-soft.de>.
Hello Loren M. Lang,
on Thursday, 5 May 2011 at 02:39 you wrote:

> We have been using Subversion 1.4.x for quite some time and just earlier
> this year, we upgraded to 1.5.x.  Our repository is still the same as we
> did no dump/load between upgrades.  I was curious to see what kind of
> space savings we might have if we did.

I recently started syncing our old repositories, all fsfs and created
with Subversion versions 1.4.x and earlier, to a new server with most
repositories created with Subversion 1.6.x and only standard features
enabled. It's about 7.2 GB vs. 6.2 GB with one of our largest
repositories on the sync target still in an older fsfs format. It seems
worthwhile to do a complete dump/load cycle, and the newer repository
formats also have a feature called rep-sharing, where identical data is
stored only once for the complete repository.

Best regards,

Thorsten Schöning

-- 
Thorsten Schöning
AM-SoFT IT-Systeme - Hameln | Potsdam | Leipzig
 
Phone:   Potsdam: 0331-743881-0
E-Mail:  tschoening@am-soft.de
Web:     http://www.am-soft.de

AM-SoFT GmbH IT-Systeme, Konsumhof 1-5, 14482 Potsdam
Amtsgericht Potsdam HRB 21278 P, Managing Director: Andreas Muchow


Re: Size of Subversion repository

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Hyrum K. Wright wrote on Fri, May 06, 2011 at 07:49:08 -0500:
> On Thu, May 5, 2011 at 7:27 PM, Loren M. Lang <lo...@north-winds.org> wrote:
> > On Fri, 2011-05-06 at 00:37 +0300, Daniel Shahaf wrote:
> >> Loren M. Lang wrote on Thu, May 05, 2011 at 14:32:37 -0700:
> >> > On Thu, 2011-05-05 at 15:43 +0300, Daniel Shahaf wrote:
> >> > > Loren M. Lang wrote on Wed, May 04, 2011 at 17:39:57 -0700:
> >> > > > The format file says 3 so I might have made it with 1.3.x.
> >> > >
> >> > > This conclusion is wrong.  The format number is NOT the minor release
> >> > > number (because we may bump it multiple times between successive minor
> >> > > lines).
> >> >
> >> > Is there a list of these format numbers and their meanings/features?
> >> > I'm curious what I missed by not upgrading to 4 when I had the chance.
> >> >
> >>
> >> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/fs.h
> >> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/fs.h
> >> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
> >
> > There appears to be some confusion here.  I was referring to format
> > under the repository root.  The references you gave me appear to refer
> > to db/format.  My original 1.4.x repository had format = 3 and db/format
> > = 1.  When I created the new repositories, format was bumped to 5.  I do
> > not know what db/format was as I already deleted them, but I'd assume
> > db/format for the 1.5.x fsfs repository was 3.
> >
> > My primary question though, was simply whether bdb was normally as
> > space-inefficient as my test showed and whether I should consider it
> > over fsfs for Subversion 1.5.x or 1.6.x+.
> 
> Prior to 1.6.x, bdb stored full-texts at HEAD and deltified
> backwards, whereas fsfs has always stored an initial full-text and
> deltified going forward.  As a result, when nodes are copied (usually
> via a branch), bdb can end up with multiple full-texts which are
> roughly the same, rather than multiple deltas.  This is one of the
> major reasons the bdb backend takes more space.
> 
> In 1.6, bdb was changed to use the same deltification scheme as fsfs.
> As a result, after bdb-backed repositories are upgraded to the 1.6
> format, they will begin to use forward deltas.

Clarification: upgraded repositories will use forward deltas
(FSFS-direction-deltas) only for new data, not for historical data.

> Dumping and loading in this case will cause forward deltas to be used
> repo-wide,

i.e., in historical revisions too

> thus resulting in space savings.
> 
> But really, if you're going to dump / load a bdb repository, why not
> just use fsfs?
> 
> -Hyrum

Re: Size of Subversion repository

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Thu, May 5, 2011 at 7:27 PM, Loren M. Lang <lo...@north-winds.org> wrote:
> On Fri, 2011-05-06 at 00:37 +0300, Daniel Shahaf wrote:
>> Loren M. Lang wrote on Thu, May 05, 2011 at 14:32:37 -0700:
>> > On Thu, 2011-05-05 at 15:43 +0300, Daniel Shahaf wrote:
>> > > Loren M. Lang wrote on Wed, May 04, 2011 at 17:39:57 -0700:
>> > > > The format file says 3 so I might have made it with 1.3.x.
>> > >
>> > > This conclusion is wrong.  The format number is NOT the minor release
>> > > number (because we may bump it multiple times between successive minor
>> > > lines).
>> >
>> > Is there a list of these format numbers and their meanings/features?
>> > I'm curious what I missed by not upgrading to 4 when I had the chance.
>> >
>>
>> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/fs.h
>> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/fs.h
>> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
>
> There appears to be some confusion here.  I was referring to format
> under the repository root.  The references you gave me appear to refer
> to db/format.  My original 1.4.x repository had format = 3 and db/format
> = 1.  When I created the new repositories, format was bumped to 5.  I do
> not know what db/format was as I already deleted them, but I'd assume
> db/format for the 1.5.x fsfs repository was 3.
>
> My primary question though, was simply whether bdb was normally as
> space-inefficient as my test showed and whether I should consider it
> over fsfs for Subversion 1.5.x or 1.6.x+.

Prior to 1.6.x, bdb stored full-texts at HEAD and deltified
backwards, whereas fsfs has always stored an initial full-text and
deltified going forward.  As a result, when nodes are copied (usually
via a branch), bdb can end up with multiple full-texts which are
roughly the same, rather than multiple deltas.  This is one of the
major reasons the bdb backend takes more space.
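
To see why extra full-texts matter, here is a deliberately crude
back-of-the-envelope model in Python.  It is not how either backend is
implemented; it only charges one full-text per modified copy in the
"backward" case and one delta per modified copy in the "forward" case,
following the description above.  All names, sizes and counts are
invented for illustration.

```python
def backward_cost(full_size, delta_size, revisions, modified_copies):
    """Toy estimate: the main line keeps one full-text plus a delta per
    older revision, and every modified copy adds another full-text."""
    main_line = full_size + (revisions - 1) * delta_size
    return main_line + modified_copies * full_size


def forward_cost(full_size, delta_size, revisions, modified_copies):
    """Toy estimate: one initial full-text, then a delta for every later
    revision and for every modified copy."""
    main_line = full_size + (revisions - 1) * delta_size
    return main_line + modified_copies * delta_size


# Hypothetical file: 100 KB full-text, 2 KB deltas, 50 revisions,
# copied to 10 branches that each modify it once.
print("backward:", backward_cost(100, 2, 50, 10), "KB")
print("forward: ", forward_cost(100, 2, 50, 10), "KB")
```

With these made-up numbers the backward estimate is 1198 KB and the
forward estimate is 218 KB; the gap comes entirely from the ten extra
full-texts.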

In 1.6, bdb was changed to use the same deltification scheme as fsfs.
As a result, after bdb-backed repositories are upgraded to the 1.6
format, they will begin to use forward deltas.  Dumping and loading in
this case will cause forward deltas to be used repo-wide, thus
resulting in space savings.

But really, if you're going to dump / load a bdb repository, why not
just use fsfs?
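
A minimal Python sketch of such a dump/load into a new fsfs repository,
assuming svnadmin is on the PATH.  The function names and paths are
hypothetical; the svnadmin subcommands and options used (create
--fs-type, dump --quiet, load --quiet) are real.

```python
import subprocess


def dump_load_commands(old_repo, new_repo, fs_type="fsfs"):
    """Build the argv lists for create, dump and load."""
    return {
        "create": ["svnadmin", "create", "--fs-type", fs_type, new_repo],
        "dump": ["svnadmin", "dump", "--quiet", old_repo],
        "load": ["svnadmin", "load", "--quiet", new_repo],
    }


def dump_load(old_repo, new_repo, fs_type="fsfs"):
    """Create new_repo, then pipe 'svnadmin dump' into 'svnadmin load'."""
    cmds = dump_load_commands(old_repo, new_repo, fs_type)
    subprocess.run(cmds["create"], check=True)
    dump = subprocess.Popen(cmds["dump"], stdout=subprocess.PIPE)
    load = subprocess.Popen(cmds["load"], stdin=dump.stdout)
    dump.stdout.close()  # so dump is signalled if load exits early
    load_rc = load.wait()
    dump_rc = dump.wait()
    if dump_rc != 0 or load_rc != 0:
        raise RuntimeError(f"dump exited {dump_rc}, load exited {load_rc}")
```

Piping dump straight into load avoids writing an intermediate dump file,
at the cost of having to redo the whole run if either side fails.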

-Hyrum

Re: Size of Subversion repository

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Loren M. Lang wrote on Thu, May 05, 2011 at 17:27:28 -0700:
> On Fri, 2011-05-06 at 00:37 +0300, Daniel Shahaf wrote:
> > Loren M. Lang wrote on Thu, May 05, 2011 at 14:32:37 -0700:
> > > On Thu, 2011-05-05 at 15:43 +0300, Daniel Shahaf wrote:
> > > > Loren M. Lang wrote on Wed, May 04, 2011 at 17:39:57 -0700:
> > > > > The format file says 3 so I might have made it with 1.3.x.
> > > > 
> > > > This conclusion is wrong.  The format number is NOT the minor release
> > > > number (because we may bump it multiple times between successive minor
> > > > lines).
> > > 
> > > Is there a list of these format numbers and their meanings/features?
> > > I'm curious what I missed by not upgrading to 4 when I had the chance.
> > > 
> > 
> > https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/fs.h
> > https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/fs.h
> > https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
> 
> There appears to be some confusion here.  I was referring to format
> under the repository root.  The references you gave me appear to refer
> to db/format.

Correct; I referred to the filesystem format numbers.  The repository
format numbers would be documented wherever SVN_REPOS__FORMAT_NUMBER
lives --- presumably subversion/libsvn_repos/repos.h.

Re: Size of Subversion repository

Posted by "Loren M. Lang" <lo...@north-winds.org>.
On Fri, 2011-05-06 at 00:37 +0300, Daniel Shahaf wrote:
> Loren M. Lang wrote on Thu, May 05, 2011 at 14:32:37 -0700:
> > On Thu, 2011-05-05 at 15:43 +0300, Daniel Shahaf wrote:
> > > Loren M. Lang wrote on Wed, May 04, 2011 at 17:39:57 -0700:
> > > > The format file says 3 so I might have made it with 1.3.x.
> > > 
> > > This conclusion is wrong.  The format number is NOT the minor release
> > > number (because we may bump it multiple times between successive minor
> > > lines).
> > 
> > Is there a list of these format numbers and their meanings/features?
> > I'm curious what I missed by not upgrading to 4 when I had the chance.
> > 
> 
> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/fs.h
> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/fs.h
> https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure

There appears to be some confusion here.  I was referring to format
under the repository root.  The references you gave me appear to refer
to db/format.  My original 1.4.x repository had format = 3 and db/format
= 1.  When I created the new repositories, format was bumped to 5.  I do
not know what db/format was as I already deleted them, but I'd assume
db/format for the 1.5.x fsfs repository was 3.
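
To make the distinction concrete: the two numbers live in different
files, "format" at the repository root and "db/format" inside the
filesystem directory.  A minimal Python sketch for reading both follows,
assuming the first line of each file is the integer format number (any
later lines are ignored).  The helper names are hypothetical.

```python
from pathlib import Path


def read_format_number(path):
    """Return the integer on the first line of a format file."""
    with open(path, encoding="ascii") as f:
        return int(f.readline().strip())


def repository_formats(repo_root):
    """Return (repository format, filesystem format) for a repository."""
    root = Path(repo_root)
    return (
        read_format_number(root / "format"),
        read_format_number(root / "db" / "format"),
    )
```

For the original repository described above, repository_formats would
return (3, 1).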

My primary question though, was simply whether bdb was normally as
space-inefficient as my test showed and whether I should consider it
over fsfs for Subversion 1.5.x or 1.6.x+.


Re: Size of Subversion repository

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Loren M. Lang wrote on Thu, May 05, 2011 at 14:32:37 -0700:
> On Thu, 2011-05-05 at 15:43 +0300, Daniel Shahaf wrote:
> > Loren M. Lang wrote on Wed, May 04, 2011 at 17:39:57 -0700:
> > > The format file says 3 so I might have made it with 1.3.x.
> > 
> > This conclusion is wrong.  The format number is NOT the minor release
> > number (because we may bump it multiple times between successive minor
> > lines).
> 
> Is there a list of these format numbers and their meanings/features?
> I'm curious what I missed by not upgrading to 4 when I had the chance.
> 

https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/fs.h
https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/fs.h
https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure


Re: Size of Subversion repository

Posted by "Loren M. Lang" <lo...@north-winds.org>.
On Thu, 2011-05-05 at 15:43 +0300, Daniel Shahaf wrote:
> Loren M. Lang wrote on Wed, May 04, 2011 at 17:39:57 -0700:
> > The format file says 3 so I might have made it with 1.3.x.
> 
> This conclusion is wrong.  The format number is NOT the minor release
> number (because we may bump it multiple times between successive minor
> lines).

Is there a list of these format numbers and their meanings/features?
I'm curious what I missed by not upgrading to 4 when I had the chance.


Re: Size of Subversion repository

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Loren M. Lang wrote on Wed, May 04, 2011 at 17:39:57 -0700:
> The format file says 3 so I might have made it with 1.3.x.

This conclusion is wrong.  The format number is NOT the minor release
number (because we may bump it multiple times between successive minor
lines).

Re: AW: Size of Subversion repository

Posted by "Loren M. Lang" <lo...@north-winds.org>.
On Thu, 2011-05-05 at 08:00 +0200, Markus Schaber wrote:
> Hi, Loren,
> 
> Did you try "svnadmin pack" on the repositories?

svnadmin pack is a new feature of 1.6.x.  As I stated in my email, I am
using 1.5.x.  Would pack reduce space on a freshly loaded repository?
I'd assume it would pack it tightly on a load operation.

> 
> Best regards
> 
> Markus Schaber
> 
> ___________________________
> We software Automation.
> 
> 3S-Smart Software Solutions GmbH
> Markus Schaber | Developer
> Memminger Str. 151 | 87439 Kempten | Germany | Tel. +49-831-54031-0 | Fax +49-831-54031-50
> 
> Email: m.schaber@3s-software.com | Web: http://www.3s-software.com 
> CoDeSys internet forum: http://forum.3s-software.com
> Download CoDeSys sample projects: http://www.3s-software.com/index.shtml?sample_projects
> 
> Managing Directors: Dipl.Inf. Dieter Hess, Dipl.Inf. Manfred Werner | Trade register: Kempten HRB 6186 | Tax ID No.: DE 167014915 
> 
> > -----Original Message-----
> > From: Loren M. Lang [mailto:lorenl@north-winds.org]
> > Sent: Thursday, 5 May 2011 02:40
> > To: users@subversion.apache.org
> > Subject: Size of Subversion repository
> > 
> > We have been using Subversion 1.4.x for quite some time and just earlier
> > this year, we upgraded to 1.5.x.  Our repository is still the same as we
> > did no dump/load between upgrades.  I was curious to see what kind of
> > space savings we might have if we did.  Our original repository is fsfs
> > and was created under 1.4.x as far as I remember.  The format file says
> > 3 so I might have made it with 1.3.x.  The format for the new repo is 5.
> > Here are the numbers for two tests I did, one with fsfs and one with bdb.
> > 
> > 737M	svn-original
> > 690M	svn-fsfs
> > 903M	svn-bdb
> > 2.3G	total
> > 
> > Are these numbers typical?  And why is BDB so significantly larger?  Is
> > there any real benefit to it nowadays?  Our server is set up so that all
> > access must go through the apache user via mod_dav_svn.
>