Posted to users@subversion.apache.org by Justin Connell <ju...@propylon.com> on 2010/02/12 20:56:09 UTC

Maintaining large repositories

Hi,
I have a repository that has been in use for well over a year and over 
this period the size on disk has grown to over 150 GB, I found that when 
running svnadmin dump, that the resulting dump file was at 46 GB on disk 
and then when loading the dump file into a new repository that the size 
on disk in the repository folder was 8 GB in total.

What's disturbing is the drop in disk usage from 150 --> 46 --> 8 Gig.

Does anyone have an explanation for this?

Or rather, is there a better way of freeing disk space back to the OS? 
(We are using FSFS, not Berkeley DB storage.)

Re: Maintaining large repositories

Posted by Mark Phippard <ma...@gmail.com>.
On Fri, Feb 12, 2010 at 3:56 PM, Justin Connell
<ju...@propylon.com> wrote:

> I have a repository that has been in use for well over a year and over this
> period the size on disk has grown to over 150 GB, I found that when running
> svnadmin dump, that the resulting dump file was at 46 GB on disk and then
> when loading the dump file into a new repository that the size on disk in
> the repository folder was 8 GB in total.
>
> What's disturbing is the drop in disk usage from 150 --> 46 --> 8 Gig.
>
> Does anyone have an explanation for this?
>
> Or rather, is there a better way of freeing disk space back to the OS? (We
> are using FSFS, not Berkeley DB storage.)

Generally speaking, a dumpfile ought to be bigger than the repository.
Are you sure you did not have some other files that had nothing to do
with SVN stored in the same filesystem location?

Also, generally speaking, a dump/load should produce an identical
repository.  A notable exception is when your version changes.  For
example, SVN 1.6 included a space-saving feature called rep-sharing
that allows duplicate content to be stored only once.  One of the
earlier releases, perhaps 1.4?, added zlib compression of contents,
which also saves space.  So if you dumped from an older version and
loaded into a newer one, it would be expected to save space.
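
For reference, rep-sharing can be toggled per repository in the FSFS
config file, db/fsfs.conf, in 1.6 and later.  A minimal fragment (the
default is already enabled, so this is only needed to turn it off):

```
[rep-sharing]
# Set to false to disable sharing of identical representations.
# Defaults to true in SVN 1.6+.
enable-rep-sharing = true
```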

The drop from a 150 GB repository to a 46 GB dump does not sound right,
though.  Was the dump produced with the --deltas option?  That could
explain it.
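
You can compare for yourself; a quick sketch, with placeholder paths:

```shell
# Produce a plain dump and a deltified dump of the same repository.
# With --deltas, each file revision is stored as a delta against its
# predecessor, so the dump can be far smaller than a plain one.
svnadmin dump /var/svn/repo > repo-full.dump
svnadmin dump --deltas /var/svn/repo > repo-deltas.dump

# Compare the resulting file sizes.
ls -lh repo-full.dump repo-deltas.dump
```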

A reduction from 150 GB to 8 GB is not expected.  Most people see only
10-20% savings from these features.  However, both rep-sharing and
compression are heavily influenced by the actual repository content.

To answer what I think is your main question: it is not expected that
you need to do ANY repository maintenance, and a routine dump/load is
not expected to recover disk space.
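
If you do want to rebuild into a fresh repository (e.g. after a version
upgrade), the usual cycle looks like this; paths are placeholders:

```shell
# Create a fresh repository with the current svnadmin, so it uses the
# newest FSFS format (rep-sharing, compression), then pipe the dump of
# the old repository straight into it.
svnadmin create /var/svn/repo-new
svnadmin dump /var/svn/repo | svnadmin load -q /var/svn/repo-new

# Compare on-disk sizes of the old and new repositories.
du -sh /var/svn/repo /var/svn/repo-new
```

Afterwards you would swap the new repository into place and retire the
old one; until then, both copies exist on disk, so make sure you have
room for that.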

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/