Posted to users@subversion.apache.org by Josh Kuo <jo...@prioritynetworks.net> on 2006/02/08 19:12:04 UTC

Re: repository on client is newer than on server as a result of server hd failure

Not sure if this helps you, but I have a post-commit hook script that backs
up the repository, so I know I always have an up-to-date backup.

Here's an overly simplified version written in shell:

<post-commit>
#!/bin/sh
svnadmin dump "$1" > /home/backups/backup.svndump &
</post-commit>

I discovered that commits were taking a while because of the dump, so I
put the process in the background (with the & symbol) so that there's no
delay when users commit new revisions.


On Wed, 2006-02-08 at 13:41 -0500, Phillip Susi wrote:
> Jon Scott Stevens wrote:
> > hi there,
> > 
> > i had a hard drive failure on my server and my backups were about a 
> > month old. long story, but normally i'm much better (ie: daily) about 
> > backups. i do have my svn project checked out on my laptop and that is 
> > the latest version of the project so that is good as i haven't lost any 
> > data. i also have all the old commitmessage emails so i can see what 
> > changed over the month of lost data.
> > 
> > now, i would like to now bring the server back up to date with what is 
> > on my laptop. since my laptop thinks the server revision doesn't exist, 
> > i don't know of any other idea than doing a fresh checkout of the older 
> > revision on the server, copying the files on my laptop over the fresh 
> > checkout and then committing back.
> > 
> 
> That's basically what you have to do.  In the future you might want to 
> set up a nightly cron job to backup the repo to another disk/machine.
> 
> > advice on the best practices in this situation is appreciated.
> > 
> > thanks,
> > 
> > jon
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: users-help@subversion.tigris.org
> 
-- 
Josh Kuo <jo...@prioritynetworks.net>

Re: repository on client is newer than on server as a result of server hd failure

Posted by Blair Zajac <bl...@orcaware.com>.
On Feb 8, 2006, at 3:16 PM, Josh Kuo wrote:

>> I'm thinking a quicker process would be to have the post-commit hook
>> only trigger an incremental backup, to a file whose name is based on
>> the revision number, stored in a directory "incremental". Then do a
>> nightly full dump, when the server is less busy, by getting the HEAD
>> revision (svnlook youngest); dumping everything up to that revision
>> into a file whose name is based on that revision, stored in a
>> directory "full"; deleting the now-superfluous incrementals up to
>> that revision; and deleting the older full dumps, possibly leaving
>> the last couple as additional safeguards.
>
> This sounds good; mine was a quick hack... and like I said, the
> repository I am using it in doesn't have that much commit traffic, so
> no one notices the performance issue.
>
> Does Subversion already come with such a backup script? Maybe we should
> submit these scripts back; I think some others may find them helpful...
> where would I submit them?

Subversion comes with three: svnadmin hotcopy, hot-backup.py and
svn-fast-backup.

hot-backup.py is a Python wrapper around 'svnadmin hotcopy' that  
removes older backups, keeping at most N backups around.

My preferred one is svn-fast-backup.  This one only works if you use
FSFS on a Unix-based system that supports hard links.  It's faster and
less disk-intensive than 'svnadmin hotcopy' and hot-backup.py.

http://svn.collab.net/repos/svn/trunk/contrib/server-side/svn-fast-backup

The backups are exact duplicates of the entire repository structure  
(the backup is not a dump file).   So if you need to recover, you  
make a copy of the backup and move it into place.

Any files in the latest backup that do not differ from the last  
backup are hard-links, so it's cheap on disk space and fast, since  
there's no copying being done.
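The hard-link trick can be illustrated with a toy snapshot. This is not svn-fast-backup's actual code; it just assumes GNU cp's -l option and uses made-up /tmp paths:

```shell
#!/bin/sh
# Toy demonstration of hard-link snapshots: the "copy" shares inodes
# with the original, so unchanged files cost no extra disk space.
rm -rf /tmp/snap.0 /tmp/snap.1
mkdir -p /tmp/snap.0
echo 'revision data' > /tmp/snap.0/r1
cp -al /tmp/snap.0 /tmp/snap.1   # -l makes hard links instead of copying data
```

After this runs, /tmp/snap.0/r1 and /tmp/snap.1/r1 are the same inode. The scheme suits FSFS because revision files are immutable once written, so a linked "copy" can never be corrupted by later commits.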

Regards,
Blair

-- 
Blair Zajac, Ph.D.
CTO, OrcaWare Technologies
<bl...@orcaware.com>
Subversion training, consulting and support
http://www.orcaware.com/svn/




Re: repository on client is newer than on server as a result of server hd failure

Posted by Josh Kuo <jo...@prioritynetworks.net>.
> I'm thinking a quicker process would be to have the post-commit hook  
> only trigger an incremental backup, to a file whose name is based on  
> the revision number, stored in a directory "incremental". Then do a  
> nightly full dump, when the server is less busy, by getting the HEAD  
> revision (svnlook youngest); dumping everything up to that revision  
> into a file whose name is based on that revision, stored in a  
> directory "full"; deleting the now-superfluous incrementals up to  
> that revision; and deleting the older full dumps, possibly leaving  
> the last couple as additional safeguards.

This sounds good; mine was a quick hack... and like I said, the
repository I am using it in doesn't have that much commit traffic, so no
one notices the performance issue.

Does Subversion already come with such a backup script? Maybe we should
submit these scripts back; I think some others may find them helpful...
where would I submit them?

-- 
Josh Kuo <jo...@prioritynetworks.net>

Re: repository on client is newer than on server as a result of server hd failure

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Feb 8, 2006, at 23:48, Josh Kuo wrote:

>> And what happens if two users commit changes one right after one the
>> other? I'd expect the two dump processes together to clobber the
>> dumpfile. Doesn't sound like such a great strategy, unless you've
>> built in some mechanism that you're not showing that ensures only one
>> dump process runs at a time.
>
> Yes, the script shown is an overly simplified version of what I did. In
> actuality, I am dumping to a temp file (named by timestamp), and once
> the dump is finished, I copy the temp file to its final destination
> and rename it to the proper name (project.svndump or something).
>
> Of course, this puts a dent in the overall performance, especially when
> you have many commits back to back for the same repository, but it
> works well for me. I have a repository that does not get many commits
> (maybe a dozen a day), but it is crucial that I do not lose any
> revisions, so I am willing to sacrifice a little performance for that.
>
> Actually, this works pretty well for me so far; the backup is completely
> transparent to the users, and they do not see any delay when committing.
>
> Thank you for pointing that out :-)

And thank you for telling me more. That sounds awful to me from a  
performance perspective :-) — our repository is over 1GB now and I  
wouldn't want our server having to push all that data around after  
each commit — but it does sound like you shouldn't lose any data.

I'm thinking a quicker process would be to have the post-commit hook  
only trigger an incremental backup, to a file whose name is based on  
the revision number, stored in a directory "incremental". Then do a  
nightly full dump, when the server is less busy, by getting the HEAD  
revision (svnlook youngest); dumping everything up to that revision  
into a file whose name is based on that revision, stored in a  
directory "full"; deleting the now-superfluous incrementals up to  
that revision; and deleting the older full dumps, possibly leaving  
the last couple as additional safeguards.
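A minimal sketch of the per-commit half of that scheme. The directory layout and zero-padding are my assumptions, not a tested script; the defaults just let the sketch run standalone (Subversion normally passes the repository path and revision as the hook's arguments):

```shell
#!/bin/sh
# post-commit hook sketch: dump only the newly committed revision.
REPOS="${1:-/svnroot/repo1}"
REV="${2:-0}"
PADDED=$(printf '%06d' "$REV")   # zero-pad so incrementals sort by revision
svnadmin dump -q --incremental -r "$REV" "$REPOS" \
    > "/backups/incremental/$PADDED.svndump" 2>/dev/null &
```

The nightly job would then dump -r 0:$(svnlook youngest ...) into the "full" directory and delete incrementals at or below that revision.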





Re: repository on client is newer than on server as a result of server hd failure

Posted by Josh Kuo <jo...@prioritynetworks.net>.
I do have RAID level 1 set up.  I guess I am just paranoid :-)

> If the data is so crucial, then why not use RAID to mirror the drive?  
> Seems like a much better system from a performance perspective.
> 
> jon
> 

-- 
Josh Kuo <jo...@prioritynetworks.net>

Re: repository on client is newer than on server as a result of server hd failure

Posted by Kris Deugau <kd...@vianet.ca>.
Jon Scott Stevens wrote:
> If the data is so crucial, then why not use RAID to mirror the drive?  
> Seems like a much better system from a performance perspective.

RAID does not protect against (accidentally) Doing Something Dumb.

It just protects against hardware-level faults on the physical disks.

My own personal repos have no backups in place at the moment, but the 
ones I set up for work have a daily full backup.  (They're small.)  I 
keep 14 days of daily backups.

I actually back up a number of things in the same script; here's an
extract with the Subversion bits:

=====
#!/bin/sh

curdate=`date +%Y%m%d`
olddate=`date -d '14 days ago' +%Y%m%d`

# Back up Subversion repos

for repo in repo1 repo2; do
   # Dump repo
   echo "Dumping $repo..."
   /usr/bin/svnadmin dump -q /svnroot/$repo | gzip > \
	/backups/svn-$repo-$curdate.gz

   # Delete old repo backup.  Keep 14 days' changes.
   rm -f /backups/svn-$repo-$olddate.gz

   # Get admin area (explicit paths: brace expansion isn't plain sh)
   tar -c /svnroot/$repo/conf /svnroot/$repo/hooks | gzip > \
	/backups/svn-$repo-conf.tar.gz
done
=====

The backup destination is actually an NFS mount to another system in the 
real script, mounted and unmounted within the script.

If I were working with much larger repositories, I'd likely do a weekly 
full dump, and daily incrementals (or per-commit incrementals that get 
copied to the backup NFS mount on a daily basis).
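The daily-incrementals variant could remember the last revision dumped in a small state file. A rough sketch, with hypothetical paths and a state-file convention of my own invention:

```shell
#!/bin/sh
# Daily incremental dump sketch: dump only revisions committed since
# the previous run, recording progress in a small state file.
REPO=/svnroot/repo1
STATE=/backups/last-dumped-rev
LAST=$(cat "$STATE" 2>/dev/null || echo 0)              # 0 if no state yet
YOUNG=$(svnlook youngest "$REPO" 2>/dev/null || echo 0)
if [ "$YOUNG" -gt "$LAST" ]; then
    svnadmin dump -q --incremental -r "$((LAST + 1)):$YOUNG" "$REPO" |
        gzip > "/backups/svn-incr-$((LAST + 1))-$YOUNG.gz"
    echo "$YOUNG" > "$STATE"                            # remember where we stopped
fi
```

Restoring would then be a full load followed by the incrementals in revision order.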

-kgd


Re: repository on client is newer than on server as a result of server hd failure

Posted by Jon Scott Stevens <jo...@latchkey.com>.

On Feb 8, 2006, at 2:48 PM, Josh Kuo wrote:

> Of course, this puts a dent in the overall performance, especially when
> you have many commits back to back for the same repository, but it
> works well for me. I have a repository that does not get many commits
> (maybe a dozen a day), but it is crucial that I do not lose any
> revisions, so I am willing to sacrifice a little performance for that.

If the data is so crucial, then why not use RAID to mirror the drive?  
Seems like a much better system from a performance perspective.

jon



Re: repository on client is newer than on server as a result of server hd failure

Posted by Josh Kuo <jo...@prioritynetworks.net>.
> And what happens if two users commit changes one right after one the  
> other? I'd expect the two dump processes together to clobber the  
> dumpfile. Doesn't sound like such a great strategy, unless you've  
> built in some mechanism that you're not showing that ensures only one  
> dump process runs at a time.

Yes, the script shown is an overly simplified version of what I did. In
actuality, I am dumping to a temp file (named by timestamp), and once
the dump is finished, I copy the temp file to its final destination
and rename it to the proper name (project.svndump or something).

Of course, this puts a dent in the overall performance, especially when
you have many commits back to back for the same repository, but it works
well for me. I have a repository that does not get many commits
(maybe a dozen a day), but it is crucial that I do not lose any
revisions, so I am willing to sacrifice a little performance for that.

Actually, this works pretty well for me so far; the backup is completely
transparent to the users, and they do not see any delay when committing.

Thank you for pointing that out :-)

Re: repository on client is newer than on server as a result of server hd failure

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Feb 8, 2006, at 20:12, Josh Kuo wrote:

> Not sure if this helps you, but I have a post-commit hook script that
> backs up the repository, so I know I always have an up-to-date backup.
>
> Here's an overly simplified version written in shell:
>
> <post-commit>
> #!/bin/sh
> svnadmin dump "$1" > /home/backups/backup.svndump &
> </post-commit>
>
> I discovered that commits were taking a while because of the dump, so I
> put the process in the background (with the & symbol) so that there's
> no delay when users commit new revisions.

And what happens if two users commit changes one right after one the  
other? I'd expect the two dump processes together to clobber the  
dumpfile. Doesn't sound like such a great strategy, unless you've  
built in some mechanism that you're not showing that ensures only one  
dump process runs at a time.
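One way to close that race is to serialize the dumps with flock(1) from util-linux and rename a finished temp file into place. A sketch under those assumptions; the paths are hypothetical (/tmp is used here only so the sketch runs anywhere):

```shell
#!/bin/sh
# post-commit hook sketch: only one dump runs at a time, and the backup
# file is replaced atomically, so a half-written dump never clobbers it.
REPOS="$1"
(
    flock -x 9                                # wait for any in-progress dump
    TMP=$(mktemp /tmp/backups-dump.XXXXXX)
    svnadmin dump -q "$REPOS" > "$TMP" 2>/dev/null &&
        mv "$TMP" /tmp/backups-backup.svndump # rename: atomic on one filesystem
) 9> /tmp/backups-dump.lock &
```

Because the lock is taken inside the backgrounded subshell, the commit still returns immediately; queued dumps simply run one after another.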
