Posted to users@subversion.apache.org by Marko Käning <mk...@mch.osram.de> on 2008/07/18 11:46:35 UTC

repo backup

Hi,

Every night I
 a) make hotcopies of my repos,
 b) run consistency checks on these copies,
 c) and even create dumps (incremental on Mon...Thu and full on Fri),
roughly as sketched below.
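
Something like this, with hypothetical paths and a placeholder revision 
number (the svnadmin subcommands themselves are standard):

    # nightly: hot copy, then verify the copy
    svnadmin hotcopy /srv/svn/myrepo /backup/myrepo-hotcopy
    svnadmin verify /backup/myrepo-hotcopy
    # Fri: full dump
    svnadmin dump /srv/svn/myrepo > /backup/myrepo-full.dump
    # Mon...Thu: incremental dump; 1234 stands in for the last
    # revision already captured by the previous dump
    svnadmin dump /srv/svn/myrepo -r 1235:HEAD --incremental > /backup/myrepo-incr.dump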

I keep asking myself whether it makes sense at all to create these 
"dum[pb] files" for real backup purposes:

 1) They are generally much larger than the repo itself.

 2) One ends up with one large file which might get more easily destroyed 
    due to backup media errors than all the little files in a hotcopied 
    repo.

 3) I understood that future versions of svn (like 1.5) will be able to 
    work on older repos. (1.5 might run a bit faster if you do a 
    dump/reload cycle, so one could just dump the hotcopied backup repo 
    and load it into the new one, as sketched below.)
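
For the record, such a cycle might look like this (repo paths made up):

    # create the new repository with the new svn version, then reload
    svnadmin create /srv/svn/myrepo-new
    svnadmin dump /backup/myrepo-hotcopy > /tmp/myrepo.dump
    svnadmin load /srv/svn/myrepo-new < /tmp/myrepo.dump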

Any comments from the list?

Well, I have to add that I just had a hard disk disaster on my main 
server. I had two sets of full backups. Unfortunately the HD error must 
have corrupted the two largest tar.bz2 files in both backups, the ones 
containing my most important CVS and SVN repos. Thanks to the painless 
dump/reload cycle I was able to extract my SVN repos from the half-faulty 
server while it could still access the failing HD. I had no time left to 
recover my CVS repos, since the HD ceased to function just then.

Well, perhaps one should think about finding a more reliable way to back 
up repos... How does one cope with bits flipping somewhere in the middle 
of everything?
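
Maybe something as simple as a checksum manifest would at least detect 
such corruption. A rough sketch, with hypothetical paths:

    # after writing the backup: record a checksum manifest
    find /backup/myrepo-hotcopy -type f -print0 | xargs -0 md5sum > /backup/myrepo.md5
    # before restoring: re-check and flag any flipped bits
    md5sum -c /backup/myrepo.md5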

Regards,
Marko


P.S.: I'm currently using svn 1.4.4.


Re: repo backup

Posted by David Chapman <dc...@earthlink.net>.
Marko Käning wrote:
> Hi,
>
> Every night I
>  a) make hotcopies of my repos,
>  b) run consistency checks on these copies,
>  c) and even create dumps (incremental on Mon...Thu and full on Fri).
>
> I keep asking myself whether it makes sense at all to create these 
> "dum[pb] files" for real backup purposes:
>
>  1) They are generally much larger than the repo itself.
>
>  2) One ends up with one large file which might get more easily destroyed 
>     due to backup media errors than all the little files in a hotcopied 
>     repo.
>   
Corruption likelihood is proportional to the total amount of data, not to 
the size of the individual files; and if one of the "little files" in a 
hot-copied repository is destroyed, you lose that revision and portions 
of your repository become inaccessible.

My dump files are twice the size of the repository (733 MB vs. 347 MB).  
I can still write that to any reasonable removable media.  Your mileage 
may vary.

>  3) I understood that future versions of svn (like 1.5) will be able to 
>     work on older repos. (1.5 might run a bit faster if you do a 
>     dump/reload cycle, so one could just dump the hotcopied backup repo 
>     and load it into the new one.)
>
> Any comments from the list?
>   

A hot copy will work on a machine that is the same processor 
architecture and OS as the machine which created it.  If your backup 
server (or replacement server) is the same as the main server, then a 
hot copy will work just fine.  Dump files are better when your main 
server is old and there is no equivalent replacement.

> Well, I have to add that I just had a hard disk disaster on my main 
> server. I had two sets of full backups. Unfortunately the HD error must 
> have corrupted the two largest tar.bz2 files in both backups, the ones 
> containing my most important CVS and SVN repos. Thanks to the painless 
> dump/reload cycle I was able to extract my SVN repos from the half-faulty 
> server while it could still access the failing HD. I had no time left to 
> recover my CVS repos, since the HD ceased to function just then.
>
>   

Are you saying that the hard disk error corrupted the backups as they 
were being made, or that the backups were on the hard disk which 
failed?  In the former case, you might want to do a hot copy to a second 
(compatible) machine, verify it there, and then write to backup media.  
I never rely on a single piece of hardware to keep my data safe.
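
That workflow might look roughly like this (hostnames and paths are made 
up; svnadmin and rsync are the standard tools):

    # on the main server: hot copy, then push to a second machine
    svnadmin hotcopy /srv/svn/myrepo /tmp/myrepo-hotcopy
    rsync -a /tmp/myrepo-hotcopy/ backuphost:/backup/myrepo-hotcopy/
    # on the second machine: verify before writing to removable media
    svnadmin verify /backup/myrepo-hotcopy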

You might be able to take the hard disk in to a service that specializes 
in extracting data from failed hard disks.  It will cost you hundreds of 
dollars (maybe more now; I haven't had to do it myself for years) but 
might be cheap compared to the value of the lost data.

> Well, perhaps one should think about finding a more reliable way to back 
> up repos... How does one cope with bits flipping somewhere in the middle 
> of everything?
>   

Keeping multiple versions of backup files is the only way to deal with 
media aging (or network transmission) issues.  If you're truly nervous, 
set up a system of multiple permanent off-site backups.  If all of your 
offsite backups are full backups, using hot copies is better because if 
one file is damaged on one of the old backups, it may well be OK on 
another backup (and you can copy that one file into the reconstituted 
repository).  You must, however, ensure that you have a machine that can 
load the hot copies.  Every time you upgrade to a different server 
platform you will have to start with a new set of offsite backups.
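
For example, with two copies of the same full backup on hand, locating 
and replacing a damaged file could go something like this (paths are 
hypothetical; the db/revs/ layout assumes an FSFS repository):

    # find files that differ between two copies of the same backup
    diff -rq /media/set1/myrepo /media/set2/myrepo
    # copy the intact revision file over the damaged one, then re-check
    cp /media/set2/myrepo/db/revs/1234 /restore/myrepo/db/revs/1234
    svnadmin verify /restore/myrepo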

-- 
    David Chapman         dcchapman@earthlink.net
    Chapman Consulting -- San Jose, CA

