Posted to users@subversion.apache.org by Andre Harper <ha...@gmail.com> on 2013/06/04 16:30:49 UTC

large db/revs files

Hi, all –

I am not subscribed and would appreciate being explicitly Cc:ed in any
responses.

I am on a team that began using subversion near the end of last year.
As a part of our process, we tag each successful run of our systems.
This can mean thousands of tags for certain systems every six months.

We’re having an issue with the db/revs directory size, which for all
our projects currently exceeds 289G.  We only use relatively small
working directories containing less than a meg of text files; no
binary files.

In the archives, I found a mention that the db/revs directories are
populated using xdelta, but there didn’t appear to be a solution to
large file sizes at time
[http://svn.haxx.se/users/archive-2011-08/0229.shtml
].  I was hoping someone may have found a work-around or solution.

Would someone be able to:
1)	suggest how to avoid this in the future
2)	suggest how to reduce the current large files (if possible)

Thank you.
André Harper

Re: large db/revs files

Posted by Thorsten Schöning <ts...@am-soft.de>.
Hello Andre Harper,
on Tuesday, June 4, 2013 at 16:30 you wrote:

> We’re having an issue with the db/revs directory size, which for all
> our projects currently exceeds 289G.  We only use relatively small
> working directories containing less than a meg of text files; no
> binary files.

Could you provide some examples of your directory structure, especially
for your tags? Are your tags all on the same directory level? If
so, that is probably the reason for the large revs files, because
each commit in a directory replicates the complete directory structure
for that directory.
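
To get a feeling for how quickly this adds up, here is a rough
back-of-the-envelope sketch in Python. It assumes that every new tag
causes the complete listing of the tags directory to be written again,
and that one entry costs about 64 bytes. Both the function name and the
64-byte figure are assumptions made up for illustration, not measured
values.

```python
def estimated_dir_bytes(num_tags, bytes_per_entry):
    """Total bytes written if every new tag rewrites the full tags/ listing."""
    # After the i-th tag the listing holds i entries, so that commit stores
    # roughly i * bytes_per_entry bytes for the directory alone.
    return sum(i * bytes_per_entry for i in range(1, num_tags + 1))


if __name__ == "__main__":
    # bytes_per_entry=64 is a guess; adjust it to your tag name lengths.
    for n in (1_000, 10_000, 100_000):
        total = estimated_dir_bytes(n, bytes_per_entry=64)
        print(f"{n:>7} tags -> about {total / 1e9:.2f} GB of directory listings")
```

The point is only that the total grows with the square of the number
of tags in one directory, regardless of how small the tagged files are.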

There were some discussions about this in the past, which I'm always
unable to find again. Below are some hints:

Besides that, you could try a complete dump/load cycle to use
representation sharing for all your data, but this only helps if your
file contents are the problem, of course.

http://www.red-bean.com/kfogel/beautiful-code/bc-chapter-02.html
http://stackoverflow.com/questions/6917505/inexplicable-svn-repository-size-increase-from-small-differences-to-big-files

Best regards,

Thorsten Schöning

-- 
Thorsten Schöning       E-Mail:Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Phone.............05151-  9468- 55
Fax...............05151-  9468- 88
Mobile.............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Managing Director: Andreas Muchow


Re: large db/revs files

Posted by olli hauer <oh...@gmx.de>.
On 2013-06-04 16:30, Andre Harper wrote:
> We’re having an issue with the db/revs directory size, which for all
> our projects currently exceeds 289G.  We only use relatively small
> working directories containing less than a meg of text files; no
> binary files.

Wow, how many revisions is that, and what is the average size of the source (1MB)?

Even with several thousand tags and plain text files there is a chance to keep the repository small.

For example your files could be plain text files with
 - max 10 chars per line (good for xdelta)
 - containing only a single line (bad for xdelta)
 - EOL style / white spacing changed during commit (bad for xdelta)
 - ....

With the worst case examples in mind you can inspect your sources and maybe find some improvements.

Have you looked directly at the repository on the server ($repo/db/) to see where all the space is used?
 - rep-cache.db
 - revs
 - transactions (I've seen repos with several GB inside)
 - revprops
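
If it helps, something like the following Python sketch could be run
on the server to see which of these entries takes the space. The
function names are only illustrative, and the path to $repo/db is
whatever it is on your system; nothing here is Subversion-specific, it
just adds up file sizes.

```python
import os
import sys


def dir_size(path):
    """Return the total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def db_usage(db_path):
    """Map each entry directly under db_path to its size in bytes."""
    usage = {}
    for entry in os.scandir(db_path):
        if entry.is_dir(follow_symlinks=False):
            usage[entry.name] = dir_size(entry.path)
        elif entry.is_file(follow_symlinks=False):
            usage[entry.name] = entry.stat().st_size
    return usage


if __name__ == "__main__":
    # Usage: python db_usage.py /path/to/repo/db
    db = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, size in sorted(db_usage(db).items(), key=lambda kv: -kv[1]):
        print(f"{size / 2**20:12.1f} MiB  {name}")
```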



Re: large db/revs files

Posted by Daniel Shahaf <da...@elego.de>.
Andre Harper wrote on Thu, Jun 06, 2013 at 08:56:02 -0400:
> If I understand the skip-delta cost, which I may not completely, it sounds
> like each tagged release based on the trunk will occupy more space as it is
> farther away from the trunk revision number?

Yes, unless directory deltification is enabled (i.e., 1.8 servers or
later).  See Mark's link.
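
For reference, the skip-delta base selection can be sketched in a few
lines of Python. This follows the rule as I understand it from the
Subversion design notes (clear the lowest set bit of the predecessor
count); the function names are made up for the illustration, and the
real FSFS code has more cases than this.

```python
def skip_delta_base(count):
    """Delta base for the change with the given predecessor count.

    Clears the lowest set bit of count. A count of 0 has no base and is
    stored in full.
    """
    return count & (count - 1)


def delta_chain(count):
    """Predecessor counts visited when reconstructing change number count."""
    chain = [count]
    while count:
        count = skip_delta_base(count)
        chain.append(count)
    return chain


if __name__ == "__main__":
    for n in (1, 7, 8, 54):
        print(n, "->", delta_chain(n))
```

The chain stays short (at most one step per set bit), but a base can
lie far behind the change being stored, so the delta has to carry
everything added in between.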

Re: large db/revs files

Posted by Andre Harper <ha...@gmail.com>.
Thanks for everyone’s replies.

Took me a couple days to review the links everyone sent.

Thorsten’s links sent me to several of your posts, Mark -- thanks.  Based
on your feedback it appears the new release of svn within a month or so
should resolve the problem.  It appears the best solution is to wait until
then.

But, I did want to follow-up on the questions Thorsten & Olli had.  Here is
an example of our directory structure:

projectA/
         trunk/
         tags/
           release100_06062013_0809/
           release100_06062013_0810/
           …
         branches/

The tags are all at the same directory level, which supports your
suspicion, Thorsten. I wasn’t aware that tagging a directory replicates the
complete directory structure in db/revs, which contains the largest files in
our repo.  The other directories are reasonable.  The source of the
repository with the largest db/revs, which has 141,573 revisions, is 780K.

If I understand the skip-delta cost, which I may not completely, it sounds
like each tagged release based on the trunk will occupy more space as it is
farther away from the trunk revision number?

Thanks.
André Harper

Re: large db/revs files

Posted by Mark Phippard <ma...@gmail.com>.
On Tue, Jun 4, 2013 at 7:30 AM, Andre Harper <ha...@gmail.com> wrote:
> Would someone be able to:
> 1)      suggest how to avoid this in the future
> 2)      suggest how to reduce the current large files (if possible)


1. Upgrade to 1.8 when it is available.

2. Dump and load your repository

3. Figure out what to do with all of the free disk space you suddenly have.

See:

http://subversion.apache.org/docs/release-notes/1.8.html#fsfs-deltification
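
A minimal sketch of the dump/load step, written in Python around the
svnadmin command line. The repository paths are placeholders and the
helper names are invented for this example; the script assumes
svnadmin is on the PATH and that the new repository does not exist yet.
Try it on a copy first.

```python
import subprocess
import sys


def dump_load_commands(old_repo, new_repo):
    """Return the three svnadmin command lines for a dump/load cycle."""
    return (
        ["svnadmin", "create", new_repo],
        ["svnadmin", "dump", "--quiet", old_repo],
        ["svnadmin", "load", "--quiet", new_repo],
    )


def dump_and_load(old_repo, new_repo):
    """Create new_repo and stream a full dump of old_repo into it."""
    create, dump, load = dump_load_commands(old_repo, new_repo)
    subprocess.run(create, check=True)
    # Equivalent to: svnadmin dump OLD | svnadmin load NEW
    dumper = subprocess.Popen(dump, stdout=subprocess.PIPE)
    loader = subprocess.Popen(load, stdin=dumper.stdout)
    dumper.stdout.close()  # let the dumper see a broken pipe if load exits
    if loader.wait() != 0 or dumper.wait() != 0:
        raise RuntimeError("svnadmin dump/load failed")


if __name__ == "__main__":
    # Usage: python dump_load.py /path/to/old-repo /path/to/new-repo
    dump_and_load(sys.argv[1], sys.argv[2])
```

Note that a dump contains only the versioned data and revision
properties, so hook scripts and the conf directory have to be copied
to the new repository separately.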



--
Thanks

Mark Phippard
http://markphip.blogspot.com/