Posted to users@subversion.apache.org by J Robert Ray <jr...@imageworks.com> on 2004/10/23 03:00:42 UTC

Backing up large repositories

I have read issue 1819, so I know the developers are aware of this 
problem.  This message is to discuss possible workarounds.

My BDB repository has grown to the point that its 'strings' file is 
larger than 2GB.  I still want to use svnadmin hotcopy, but this is 
getting increasingly difficult.

In svn 1.0.x the hotcopy code would successfully copy this large file 
(after upgrading to the APR shipped with Apache 2.0.51), but it would 
fail to copy over the file permissions (because stat fails).  I was 
able to patch the code to make that stat failure non-fatal, and things 
are okay.
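
For reference, the 1.0.x-era change was conceptually along these lines 
(a from-memory sketch against APR, not the literal patch; the function 
name is just illustrative):

    #include <stdio.h>
    #include <apr_general.h>
    #include <apr_errno.h>
    #include <apr_file_info.h>
    #include <apr_file_io.h>

    /* Copy one file, preserving its permissions when stat can
     * describe it, but falling back to default permissions when stat
     * fails (e.g. EOVERFLOW on a >2GB file under a non-largefile
     * APR). */
    static apr_status_t
    copy_tolerating_stat_failure(const char *src, const char *dst,
                                 apr_pool_t *pool)
    {
        apr_finfo_t finfo;
        apr_fileperms_t perms = APR_OS_DEFAULT;  /* the fallback */
        apr_status_t rv = apr_stat(&finfo, src, APR_FINFO_PROT, pool);

        if (rv == APR_SUCCESS && (finfo.valid & APR_FINFO_PROT))
            perms = finfo.protection;
        /* else: treat the failed stat as non-fatal and carry on. */

        return apr_file_copy(src, dst, perms, pool);
    }

    int main(int argc, const char * const *argv)
    {
        apr_pool_t *pool;
        apr_status_t rv;
        char errbuf[256];

        if (argc < 3)
            return 2;
        apr_initialize();
        apr_pool_create(&pool, NULL);
        rv = copy_tolerating_stat_failure(argv[1], argv[2], pool);
        if (rv != APR_SUCCESS)
            fprintf(stderr, "copy failed: %s\n",
                    apr_strerror(rv, errbuf, sizeof(errbuf)));
        apr_terminate();
        return rv == APR_SUCCESS ? 0 : 1;
    }

The point is only that a failed apr_stat() downgrades to default 
permissions instead of killing the copy.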

In svn 1.1.x the hotcopy code has changed drastically: it no longer 
has a hard-coded list of files to copy, but instead uses readdir/stat 
to generically clone an arbitrary directory structure.  It can't 
simply be patched to ignore stat failures and still work correctly, 
because the walk relies on the stat results (in particular the file 
type) to decide how to handle each entry.

The options I can think of are:

* build apr with largefile support

The little information I can find on the mailing lists suggests this 
is not recommended, and there are reports, without any detail, that it 
could be "dangerous."

* switch to fsfs

I'd like to wait a little longer for fsfs to mature before switching 
my repository over.  I wouldn't have this problem with fsfs, though, 
until someone checks in a >2GB file...  *chuckle*

* write a custom hotcopy

This option is looking better and better, although it would be 
disappointing if it turned out to be my best option.  Has anyone done 
this already?

* use svnadmin dump

Dump the repository instead of making a copy.  This would work, but it 
doesn't back up the whole repository (config, hooks).

Are there any other options?  I'm sure other people are dealing with 
this problem too.  Would anyone else like to share their experiences?

Thanks,

- Robert


Re: Backing up large repositories

Posted by J Robert Ray <jr...@imageworks.com>.
J Robert Ray wrote:
> In svn 1.1.x the hotcopy code has changed drastically: it no longer 
> has a hard-coded list of files to copy, but instead uses readdir/stat 
> to generically clone an arbitrary directory structure.  It can't 
> simply be patched to ignore stat failures and still work correctly, 
> because the walk relies on the stat results (in particular the file 
> type) to decide how to handle each entry.

After studying the code a little more, I realized that it still does 
have a hard-coded list of files for BDB repositories, but it first has 
to call svn_io_dir_walk (which uses APR underneath) to find all the 
non-db files to copy.  This is where hotcopy is failing.

I have attached a patch that modifies svn_io_dir_walk so that, in the 
case where APR returns APR_INCOMPLETE and the only missing bit of 
information is APR_FINFO_TYPE, it assumes the entry is a regular file 
and not a directory.
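
The idea, roughly, is this (a sketch of the concept rather than the 
patch itself; WANTED and dir_read_tolerant are illustrative names, not 
Subversion code):

    #include <stdio.h>
    #include <apr_general.h>
    #include <apr_errno.h>
    #include <apr_file_info.h>

    /* Fields the walk asks APR for; illustrative, not svn's exact
     * set. */
    #define WANTED (APR_FINFO_TYPE | APR_FINFO_NAME)

    /* Read one directory entry, but if APR comes back APR_INCOMPLETE
     * and the type bit is the only requested field missing (the usual
     * symptom of stat overflowing on a >2GB file), assume a regular
     * file instead of failing the whole walk. */
    static apr_status_t
    dir_read_tolerant(apr_finfo_t *finfo, apr_dir_t *dir)
    {
        apr_status_t rv = apr_dir_read(finfo, WANTED, dir);

        if (APR_STATUS_IS_INCOMPLETE(rv)
            && (WANTED & ~finfo->valid) == APR_FINFO_TYPE)
        {
            finfo->filetype = APR_REG;
            rv = APR_SUCCESS;
        }
        return rv;
    }

    int main(int argc, const char * const *argv)
    {
        apr_pool_t *pool;
        apr_dir_t *dir;
        apr_finfo_t finfo;

        if (argc < 2)
            return 2;
        apr_initialize();
        apr_pool_create(&pool, NULL);
        if (apr_dir_open(&dir, argv[1], pool) == APR_SUCCESS) {
            while (dir_read_tolerant(&finfo, dir) == APR_SUCCESS)
                printf("%s%s\n", finfo.name,
                       finfo.filetype == APR_DIR ? "/" : "");
            apr_dir_close(dir);
        }
        apr_terminate();
        return 0;
    }

In the real walk, the same fallback just has to happen at the point 
where the walker decides whether to recurse into an entry.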

With this patch applied, hotcopy succeeds on my large repository.  I 
feel the patch is relatively safe: the only place I can find where 
svn_io_dir_walk is used is when it is called by svn_repos_hotcopy. 
Because it is only used to traverse the repository, I can reasonably 
assume that it will never run across a directory larger than 2GB 
(which the patch would misclassify as a regular file), or fail to stat 
some other special kind of file.

Can anybody think of a reason why this patch might be a Very Bad 
Thing?  I'm considering rolling it out to my users so that I can 
upgrade to 1.1.1.

Thanks,

- Robert