Posted to users@subversion.apache.org by Scott Lamb <sl...@slamb.org> on 2006/05/01 23:14:28 UTC

Consistent live fsfs backups without copy

We want to get consistent backups of an fsfs repository [*], and  
we've got a couple of major constraints:

1. We have automated nightly builds, so it's impractical to find a  
time to shut down the server entirely.
2. This repository will likely grow to larger than half the
capacity of the drive, so an "svnadmin hotcopy" will fail. We need to  
transfer the data off the machine directly.

From looking over the source, I see a couple of interesting things:

1. It looks like write locking is done by holding flocks on the file  
"write-lock" or maybe "locks/<whatever>". Probably doesn't help us much.

2. It looks like svn_fs_fs__hotcopy copies stuff like this:

    a. check format file - must be fsfs version 1 or 2.
    b. copy current file
    c. copy uuid
    d. get youngest revision from current file
    e. copy revisions, oldest to youngest listed in current_file
    f. copy revprop files, oldest to youngest
    g. make transactions directory - all actual transactions are  
discarded "for now".
    h. copy locks tree
    i. make format file

    ...all without any sort of locking. Correct? (I guess files must  
be changed with the atomic rename() trick so this works?)
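The copy order listed above can be sketched roughly as follows. This is only an illustration of the sequence as read from the source, not the actual svn_fs_fs__hotcopy code: the function names are made up, it assumes the usual db/ layout (current, uuid, format, revs/N, revprops/N, locks/), it assumes the first field of the current file is the youngest revision, and it assumes the destination starts out empty.

```python
import shutil
from pathlib import Path


def youngest_revision(current_file: Path) -> int:
    # Assumption: the first whitespace-separated field of "current"
    # is the youngest revision number.
    return int(current_file.read_text().split()[0])


def ordered_copy(src_db: Path, dst_db: Path) -> int:
    """Copy an fsfs db directory in the order listed above (hypothetical helper)."""
    # a. check the format file
    fmt = int((src_db / "format").read_text().split()[0])
    if fmt not in (1, 2):
        raise ValueError(f"unsupported fsfs format {fmt}")
    dst_db.mkdir(parents=True, exist_ok=True)

    # b, c. copy current and uuid first
    shutil.copy2(src_db / "current", dst_db / "current")
    shutil.copy2(src_db / "uuid", dst_db / "uuid")

    # d. read youngest from the *copied* current file, so later commits
    # to the live repository do not move the cut-off
    youngest = youngest_revision(dst_db / "current")

    # e, f. copy revisions, then revprops, oldest to youngest
    for sub in ("revs", "revprops"):
        (dst_db / sub).mkdir(exist_ok=True)
        for rev in range(youngest + 1):
            shutil.copy2(src_db / sub / str(rev), dst_db / sub / str(rev))

    # g. empty transactions directory; in-progress transactions are dropped
    (dst_db / "transactions").mkdir(exist_ok=True)

    # h. copy the locks tree if there is one
    if (src_db / "locks").is_dir():
        shutil.copytree(src_db / "locks", dst_db / "locks")

    # i. write the format file last
    (dst_db / "format").write_text(f"{fmt}\n")
    return youngest
```

Because the cut-off comes from the copied current file, a revision file that appears in the source after step b is simply not copied.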

How much of this care is necessary? I'm not sure yet if it's possible  
to get our backup software (NetVault) to copy stuff in a pre-defined  
order. I could easily do this on backup:

1. pre script - copy current file to a location where it won't  
change, erroring out if it already exists
2. copy everything
3. post script - remove my copied current file

and on restore,

- throw away revisions and revprops newer than in my copied current file
- throw away all transactions
- copy the current file back to its proper location

is this good enough?
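For concreteness, the restore steps above might look something like this. It is a sketch under assumptions, not a tested procedure: the helper name and paths are hypothetical, revision and revprop files are assumed to be named by revision number under db/revs and db/revprops, and the first field of the saved current file is assumed to be the youngest revision.

```python
import shutil
from pathlib import Path


def prune_restored_repo(db: Path, saved_current: Path) -> int:
    """Trim a restored db directory back to the saved 'current' (hypothetical helper)."""
    youngest = int(saved_current.read_text().split()[0])

    # Throw away revisions and revprops newer than the saved current file.
    for sub in ("revs", "revprops"):
        for entry in list((db / sub).iterdir()):
            if entry.name.isdigit() and int(entry.name) > youngest:
                entry.unlink()

    # Throw away all transactions, leaving an empty directory behind.
    txns = db / "transactions"
    if txns.exists():
        shutil.rmtree(txns)
    txns.mkdir()

    # Copy the saved current file back to its proper location.
    shutil.copy2(saved_current, db / "current")
    return youngest
```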

Regards,
Scott

[*] - We could probably switch to Berkeley DB, but I _know_ it needs  
the log files to be copied _after_ everything else. As mentioned  
midway through this message, I'm not sure yet if our backup software  
can do that.

-- 
Scott Lamb <http://www.slamb.org/>



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Consistent live fsfs backups without copy

Posted by si <ss...@gmail.com>.
Hi Scott,

> We want to get consistent backups of an fsfs repository [*], and
> we've got a couple of major constraints:
>
> 1. We have automated nightly builds, so it's impractical to find a
> time to shut down the server entirely.
> 2. This repository will likely grow to larger than half the
> capacity of the drive, so an "svnadmin hotcopy" will fail. We need to
> transfer the data off the machine directly.

What about a post-commit hook spawning rsync or unison to a remote
backup site?
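A rough idea of what such a hook could look like, written in Python. Everything specific here is an assumption for illustration: the backup target is a made-up host and path, and the sketch relies only on Subversion passing the repository path as the first argument to post-commit. It says nothing about whether the resulting copy is consistent, which is a separate question.

```python
import subprocess

# Hypothetical backup destination; replace with a real rsync target.
BACKUP_TARGET = "backup.example.org:/srv/svn-backup/"


def build_rsync_command(repos_path, target=BACKUP_TARGET):
    # The trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    return ["rsync", "--archive", repos_path.rstrip("/") + "/", target]


def main(argv):
    # post-commit is invoked with the repository path as argv[1].
    repos_path = argv[1]
    # Start rsync in the background so the committing client does not
    # wait for the transfer to finish.
    subprocess.Popen(
        build_rsync_command(repos_path),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return 0
```

The hooks/post-commit script would then call main(sys.argv).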

peace
si



Re: Consistent live fsfs backups without copy

Posted by Ryan Schmidt <su...@ryandesign.com>.
On May 3, 2006, at 01:42, Scott Lamb wrote:

> I don't think adding a new revision alters any of the old ones...

That is correct. In an FSFS repository, old revisions never change.  
This is one of the advantages of FSFS over BDB, where the  
representation of old revisions does change when new ones are added.




Re: Consistent live fsfs backups without copy

Posted by Scott Lamb <sl...@slamb.org>.
Thanks for all the suggestions. I think what I'm going to end up  
doing is this:

On May 1, 2006, at 4:14 PM, Scott Lamb wrote:
> I could easily do this on backup:
>
> 1. pre script - copy current file to a location where it won't  
> change, erroring out if it already exists

and also error out if "fs-type" doesn't hold "fsfs" or "format"  
doesn't hold "1" or "2".

> 2. copy everything
> 3. post script - remove my copied current file

...and error out if any of the previous revision or revprop files  
have timestamps newer than my copied current file. I don't think  
adding a new revision alters any of the old ones...but if I'm wrong,  
this extra step will mean that I find out about my incorrect  
assumption right away rather than when I need to restore from a  
backup that's been silently corrupted.

Looks like old revprop files can be replaced if we do "svn propset
--revprop", but we just won't do that during a backup. (Or if we do,
we'll try again after getting the 'backup failed' email.)
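Roughly, the pre and post script checks described above could be written like this. It is a sketch with hypothetical function names, assuming the db/ directory holds fs-type, format, current, revs/N and revprops/N, and that the first field of current is the youngest revision. The saved copy is written fresh rather than copied with its timestamp preserved, so its mtime marks when the backup started.

```python
from pathlib import Path


def check_repo_type(db: Path) -> None:
    # Error out unless this is an fsfs repository in format 1 or 2.
    fs_type = (db / "fs-type").read_text().strip()
    fmt = (db / "format").read_text().split()[0]
    if fs_type != "fsfs" or fmt not in ("1", "2"):
        raise ValueError(f"unexpected repository: fs-type={fs_type!r} format={fmt!r}")


def save_current(db: Path, saved: Path) -> None:
    # Mode "x" fails with FileExistsError if an earlier backup left its
    # copy behind, which is the "error out if it already exists" step.
    with open(saved, "xb") as out:
        out.write((db / "current").read_bytes())


def revisions_touched_since(db: Path, saved: Path) -> list:
    # Post script: list revision and revprop files, up to the saved
    # youngest revision, whose mtime is newer than the saved copy.
    cutoff = saved.stat().st_mtime
    youngest = int(saved.read_text().split()[0])
    touched = []
    for sub in ("revs", "revprops"):
        for rev in range(youngest + 1):
            path = db / sub / str(rev)
            if path.stat().st_mtime > cutoff:
                touched.append(path)
    return touched
```

The post script would fail the backup if revisions_touched_since returns anything, then remove the saved copy.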

> and on restore,
>
> - throw away revisions and revprops newer than in my copied current  
> file
> - throw away all transactions
> - copy the current file back to its proper location
>
> is this good enough?


-- 
Scott Lamb <http://www.slamb.org/>




Re: Consistent live fsfs backups without copy

Posted by Scott Lamb <sl...@slamb.org>.
On May 1, 2006, at 4:14 PM, Scott Lamb wrote:
> 2. It looks like svn_fs_fs__hotcopy copies stuff like this:

...

>    ...all without any sort of locking. Correct? (I guess files must  
> be changed with the atomic rename() trick so this works?)

Looks like yes on the rename(). I see svn_fs_fs__move_into_place now,  
and it's used for revision files, revprop files, and the current file.
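For anyone unfamiliar with the rename() trick, here is a generic illustration in Python. It is not a transcription of svn_fs_fs__move_into_place, just the general pattern: write the new contents to a temporary file in the same directory, then rename it over the final name, so a reader sees either the old file or the new one and never a partial write.

```python
import os
import tempfile


def write_atomically(path, data: bytes) -> None:
    # The temporary file must live in the same directory (same
    # filesystem) as the target for the rename to be atomic.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # os.replace maps to rename() on POSIX and replaces any
        # existing file at the target path.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A backup tool that copies the file while this runs gets one complete version or the other, which is what makes an unlocked copy workable.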

-- 
Scott Lamb <http://www.slamb.org/>




Re: Consistent live fsfs backups without copy

Posted by Phil Endecott <sp...@chezphil.org>.
Scott Lamb <sl...@slamb.org> wrote:
> On May 2, 2006, at 3:51 AM, Phil Endecott wrote:
> > Have a look at LVM snapshots.
>
> The LVM HOWTO used to talk about using "xfs_freeze" to guarantee  
> consistency if you're using XFS. I don't see a similar tool for ext3  
> or anything telling me whether it's unnecessary or impossible.

It's unnecessary in reasonably recent kernels and LVM versions.
I'm afraid I can't tell you what "recent" actually means though.

--Phil.



Re: Consistent live fsfs backups without copy

Posted by Scott Lamb <sl...@slamb.org>.
On May 2, 2006, at 3:51 AM, Phil Endecott wrote:
> Have a look at LVM snapshots.  I have posted previously about the  
> method
> on this list, and a search should find the threads.  It seems to  
> work well
> for me and seems to meet your requirements.

We were just discussing this method. It's definitely a promising
idea. We're reluctant, though - we've heard about LVM causing crashes
(e.g., [1]) and seen it cause data loss not so long ago.

The LVM HOWTO used to talk about using "xfs_freeze" to guarantee  
consistency if you're using XFS. I don't see a similar tool for ext3  
or anything telling me whether it's unnecessary or impossible.

Now it says "This facility does require that the snapshot be made at  
a time when the data on the logical volume is in a consistent state -  
the VFS-lock patch for LVM1 makes sure that some filesystems do this  
automatically when a snapshot is created, and many of the filesystems  
in the 2.6 kernel do this automatically when a snapshot is created  
without patching." [2] I'd be a lot more comfortable if it gave me an  
authoritative way to know if my filesystem gives me this important  
guarantee on the kernel I'm currently running. Guess I'll be digging  
through kernel source...

[1] http://lists.debian.org/debian-isp/2005/10/msg00063.html
[2] http://www.tldp.org/HOWTO/LVM-HOWTO/snapshotintro.html

-- 
Scott Lamb <http://www.slamb.org/>




Re: Consistent live fsfs backups without copy

Posted by Scott Lamb <sl...@slamb.org>.
On May 2, 2006, at 2:03 AM, Markus KARG wrote:
> Just out of curiosity: Why not just add another drive? Hard disks
> are cheap these days.

This repository is expected to grow to be crazy large, and the  
machine's full of drives. It's a little 2U thing, it's got six drives  
in it (two RAID-1 for root, four RAID-5 for the repo), and they don't  
sell drives large enough to store the thing single-handedly, so we'd  
need at least two more for that.

-- 
Scott Lamb <http://www.slamb.org/>




Re: Consistent live fsfs backups without copy

Posted by Markus KARG <ma...@quipsy.de>.
Phil Endecott wrote:

>Scott Lamb <sl...@slamb.org> wrote:
>
>>We want to get consistent backups of an fsfs repository [*], and
>>we've got a couple of major constraints:
>>
>>1. We have automated nightly builds, so it's impractical to find a
>>time to shut down the server entirely.
>>2. This repository will likely grow to larger than half the
>>capacity of the drive, so an "svnadmin hotcopy" will fail. We need to
>>transfer the data off the machine directly.
>
Just out of curiosity: Why not just add another drive? Hard disks are
cheap these days.

Markus


Re: Consistent live fsfs backups without copy

Posted by Phil Endecott <sp...@chezphil.org>.
Scott Lamb <sl...@slamb.org> wrote:
> We want to get consistent backups of an fsfs repository [*], and  
> we've got a couple of major constraints:
> 
> 1. We have automated nightly builds, so it's impractical to find a  
> time to shut down the server entirely.
> 2. This repository will likely grow to larger than half the
> capacity of the drive, so an "svnadmin hotcopy" will fail. We need to  
> transfer the data off the machine directly.

Have a look at LVM snapshots.  I have posted previously about the method
on this list, and a search should find the threads.  It seems to work well
for me and seems to meet your requirements.

--Phil.



Re: Consistent live fsfs backups without copy

Posted by Scott Lamb <sl...@slamb.org>.
On May 1, 2006, at 4:39 PM, Russ wrote:
> Isn't it true that you don't need to do a hotcopy with fsfs  
> repositories, as even if a commit is in progress, it will probably  
> just be rolled back on restore?

Not sure. It seems like if that were the case, though,  
svn_fs_fs__hotcopy would just do a recursive copy. It looks like  
there was a lot of care put into this function, and I don't want to  
throw it out without understanding it.

> Also the build tools that you have accessing the repo overnight, do  
> they do any commits or only checkouts and updates?  It doesn't look  
> like you would need to worry about corruption if only readonly  
> operations are being performed.

Actually, yes, they do. This repository holds build _products_ -  
we're basically using Subversion for auditing and atomicity. (That's  
also why we expect the repo to be so huge.)

> Also if you are worried about rogue commits by developers during  
> the time of the backup, perhaps you can change the apache config to  
> make the repo read only during the time of the backup?  Either  
> through changing the conf file with a cron job or somehow rewriting  
> the requests based on the time.

Regards,
Scott

-- 
Scott Lamb <http://www.slamb.org/>


