Posted to users@subversion.apache.org by Anton Shepelev <an...@gmail.com> on 2019/08/22 13:16:17 UTC

Questions about a script for regular backups

[Having failed to post this message via Gmane, I am sending it by e-mail]

Hello, all

In order to write a backup script in the Windows batch
language, I was reading the section "Migrating Repository
Data Elsewhere" from "Repository Maintenance":

   http://svnbook.red-bean.com/en/1.7/svn.reposadmin.maint.html

where I found the following interesting paragraph:

   Another neat trick you can perform with this
   --incremental option involves appending to an existing
   dump file a new range of dumped revisions. For example,
   you might have a post-commit hook that simply appends the
   repository dump of the single revision that triggered the
   hook. Or you might have a script that runs nightly to
   append dump file data for all the revisions that were
   added to the repository since the last time the script
   ran. Used like this, svnadmin dump can be one way to back
   up changes to your repository over time in case of a
   system crash or some other catastrophic event.

The book unfortunately does not seem to give any examples of
this usage, leaving the following questions:

  1.  Is "appending" to be understood literally, that is
      using the >> operator on a previously existing dump
      file, or is it a figure of speech describing a
      supplementary dump file that shall be applied "on top"
      of a previous one?

  2.  How does one determine the revision range for a
      routine incremental dump -- by calling
      `svnlook youngest' before dumping?

  3.  Must the backup script somehow store the last revision
      in the dump between calls?  If so, I shall have to
      keep it in a file and not let anybody touch it.
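
My current guess at such a nightly script, taking "appending"
literally (the >> operator) and using made-up paths, is this
untested sketch:

   @echo off
   setlocal
   rem All paths are illustrative.
   set REPO=C:\svn\repo
   set DUMP=D:\backup\repo.dump
   set STATE=D:\backup\repo.lastrev

   rem First revision to dump; the first run starts at r0 so
   rem that the dump file is self-contained.
   if exist "%STATE%" (set /p LAST=<"%STATE%") else (set LAST=-1)
   set /a FROM=LAST+1

   rem Youngest revision currently in the repository.
   for /f %%r in ('svnlook youngest "%REPO%"') do set YOUNG=%%r
   if %YOUNG% lss %FROM% goto :eof

   rem With --incremental each revision is dumped as a change
   rem against the previous one, so the new range can simply
   rem be appended to the existing file.
   svnadmin dump "%REPO%" -r %FROM%:%YOUNG% --incremental >>"%DUMP%" || exit /b 1

   rem Remember how far we got (redirection first, lest the
   rem number be mistaken for a stream descriptor).
   >"%STATE%" echo %YOUNG%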

-- 
Please, do not forward replies to the list to my e-mail.

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Thank you for your comments, Andreas:

> Literally any situation where the undesired change to be
> recovered from happened before this last and single copy
> was taken.

I am going to prevent this by means of `svnadmin verify':

> > Is it practical to call it [verify] daily with a several
> > Gb, several thousand-revision repository?
>
> Again, you don't need to care about how long the backup
> takes. Only about its consistency and the time to restore
> in a restore event.

Then I propose the following strategy:

1. Maintain a repository mirror by calling
     svnadmin hotcopy --incremental
   from a post-commit hook.

2. Store the last verified hot-copy in an archive:

   a. make a hot copy of the mirror

   b. verify this secondary hot-copy in order not to lock
      the mirror for a long time,

   c. archive the secondary hot-copy in a format with error-
      correction, such as .rar.  I haven't found a free,
      stable, and easily available archiver with built-in
      error-correction.

Of course, my script will notify the administrator of errors
at any step.
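
In batch, step 2 might reduce to this untested sketch, where
the paths and the :notify label (the error-notification step)
are my own, and rar's -rr switch adds a recovery record:

   set MIRROR=D:\mirror\repo
   set STAGE=D:\staging\repo
   set ARCHIVE=E:\archive

   rem (a) full hot copy of the mirror into a clean staging area
   rmdir /s /q "%STAGE%" 2>nul
   svnadmin hotcopy "%MIRROR%" "%STAGE%" || goto :notify

   rem (b) verify the staging copy; the mirror stays available
   svnadmin verify --quiet "%STAGE%" || goto :notify

   rem (c) archive with a recovery record for error correction
   rem (%DATE% is locale-dependent and may need massaging)
   rar a -rr "%ARCHIVE%\repo-%DATE:/=-%.rar" "%STAGE%" || goto :notify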

-- 
Please, do not forward replies to my e-mail.


Re: Questions about a script for regular backups

Posted by Andreas Stieger <An...@gmx.de>.
Hi,

> > A hobbyist approach like this has led to many instances
> > of data loss in serious applications.
>
> While planning a backup strategy, one must consider the
> possible malfunctions and ways to counteract them.  How was
> the data lost in the cases you describe?

Literally any situation where the undesired change to be recovered from happened before this last and single copy was taken.

[...]
> Dumps are very slow.  `svnadmin verify' emulates a dump.  Is
> it equally slow?

Pretty much, yes. But at backup time you don't care about that. And the recommendation against the dump format for backups is that the dump/load cycle is much slower, emphasis on the load part.

> Is it practical to call it daily with a
> several Gb, several thousand-revision repository?

To verify a successful backup, yes. Again, you don't need to care about how long the backup takes. Only about its consistency and the time to restore in a restore event.

Good luck,
Andreas

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Andreas Stieger to Anton Shepelev:

> > No, it depends on one's purpose.  If it is to keep the
> > data in case of HDD crashes, a single mirror is
> > sufficient.
>
> A hobbyist approach like this has led to many instances
> of data loss in serious applications.

While planning a backup strategy, one must consider the
possible malfunctions and ways to counteract them.  How was
the data lost in the cases you describe?

> > Then again, since an SVN repository maintains its whole
> > history, a point-in-time recovery is easily effected by
> > `svn up -r N'.
>
> That is application level (versioning), different from
> file level backup.

Yes, but it largely removes the need for file-level
history.  All I need is a backup of a recent version of the
repository without data corruption.

> > The only potential problem is some quiet data
> > corruption, which is why I ask: will `hotcopy' propagate
> > data corruption or will it detect it via internal
> > integrity checks and fail?
>
> Your concern about silent data corruption is not
> consistent with your "a copy is a backup" statement. Why
> would you care about one while accepting the other?

As I said, it is "the only potential problem."

> That being said, hotcopy will copy corruptions that may
> have happened, even if in the incremental case it will only
> do so when first processed. svnadmin verify is suitable
> for an integrity check.

Dumps are very slow.  `svnadmin verify' emulates a dump.  Is
it equally slow?  Is it practical to call it daily with a
several Gb, several thousand-revision repository?
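
One mitigation, if my reading of the manual is right:
`svnadmin verify' accepts the same -r range as `dump', so a
daily job could check only the revisions added since the last
run, e.g.:

   rem the range shown is illustrative
   svnadmin verify -r 4501:4547 C:\svn\repo

A periodic full verification would still be prudent, since
old revisions can rot on disk unnoticed.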

-- 
Please, do not forward replies to my e-mail.


Re: Questions about a script for regular backups

Posted by Andreas Stieger <An...@gmx.de>.
> No, it depends on one's purpose.  If it is to keep the data
> in case of HDD crashes, a single mirror is sufficient.

A hobbyist approach like this has led to many instances of data loss in serious applications.

> again, since an SVN repository maintains its whole history,
> a point-in-time recovery is easily effected by
> `svn up -r N'.

That is application level (versioning), different from file level backup.

> The only potential problem is some quiet data corruption,
> which is why I ask: will `hotcopy' propagate data corruption
> or will it detect it via internal integrity checks and fail?

Your concern about silent data corruption is not consistent with your "a copy is a backup" statement. Why would you care about one while accepting the other? That being said, hotcopy will copy corruptions that may have happened, even if in the incremental case it will only do so when first processed. svnadmin verify is suitable for an integrity check.

Andreas

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Andreas Stieger to Anton Shepelev:

> > Thanks to everybody for their replies.  I now understand
> > that --incremental hot-copies are sufficient for regular
> > backups, which can then be mirrored by content-aware
> > file-synchronisation tools, but the problem remains of
> > preventing an accidental propagation of corrupt data into
> > the backup.  How do you solve it?
>
> What the fruit do you mean?  The whole purpose of a backup
> is that you can restore previous points in time.  That
> means multiple points in time, whenever the backup
> happened to be run.  Don't just make a copy and overwrite
> it every time. That is just a copy, not a backup. Select
> backup software that can do that.

No, it depends on one's purpose.  If it is to keep the data
in case of HDD crashes, a single mirror is sufficient.  Then
again, since an SVN repository maintains its whole history,
a point-in-time recovery is easily effected by
`svn up -r N'.

The only potential problem is some quiet data corruption,
which is why I ask: will `hotcopy' propagate data corruption
or will it detect it via internal integrity checks and fail?

-- 
Please, do not forward replies to my e-mail.


Re: Questions about a script for regular backups

Posted by Andreas Stieger <An...@gmx.de>.
Hello,

> that --incremental hot-copies are sufficient for regular
> backups, which can then be mirrored by content-aware
> file-synchronisation tools, but the problem remains of preventing
> an accidental propagation of corrupt data into the backup.
> How do you solve it?

What the fruit do you mean? The whole purpose of a backup is that you can restore previous points in time. That means multiple points in time, whenever the backup happened to be run. Don't just make a copy and overwrite it every time. That is just a copy, not a backup. Select backup software that can do that.

Andreas

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Thanks to everybody for their replies.  I now understand
that --incremental hot-copies are sufficient for regular
backups, which can then be mirrored by content-aware
file-synchronisation tools, but the problem remains of preventing
an accidental propagation of corrupt data into the backup.
How do you solve it?

-- 
Please, do not forward replies to my e-mail.


Re: Questions about a script for regular backups

Posted by Branko Čibej <br...@apache.org>.
On 14.10.2019 15:25, Anton Shepelev wrote:

>> See the example mentioned in the 1.8 release notes [1]:
>>
>>   svnadmin freeze /svn/my-repos -- rsync -av /svn/my-repos /backup/my-repos
> Hmm.  I should also expect a simple freeze/unfreeze pair
> with the caller responsible for unfreezing a frozen repo...

Nope, you don't need that. If you do need a long-running,
explicitly-unfrozen freeze, you can easily implement it with a smart
enough command. Although, frankly, that way lies a ton of opportunities
for error.

-- Brane

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Johan Corveleyn:

>Just to mention another option: Since 1.8 there is the
>command 'svnadmin freeze', which locks the repository for
>writing while you run another command. That way, you can
>use regular backup / copy commands (like rsync) to create a
>consistent copy.

I think `freeze' is also helpful for atomic runs of `verify'
and `hotcopy' in order to ensure you do not accidentally
copy a corrupt repository.  Without `freeze', how can I make
sure that `hotcopy' applies exclusively to verified data?
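
Something like this untested sketch is what I have in mind.
Whether a second svnadmin command such as `hotcopy' may
safely run against a frozen repository is exactly what I
would have to confirm first, so the copy step here is a plain
file copy, as in the release-notes example:

   svnadmin freeze C:\svn\mirror -- cmd /c D:\scripts\verify-and-copy.bat

where verify-and-copy.bat is:

   rem Verify first; copy only if verification succeeded.
   svnadmin verify -q C:\svn\mirror || exit /b 1
   rem robocopy reports success with exit codes below 8.
   robocopy C:\svn\mirror D:\staging\repo /MIR
   if errorlevel 8 exit /b 1
   exit /b 0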

>See the example mentioned in the 1.8 release notes [1]:
>
>   svnadmin freeze /svn/my-repos -- rsync -av /svn/my-repos /backup/my-repos

Hmm.  I should also expect a simple freeze/unfreeze pair
with the caller responsible for unfreezing a frozen repo...

>Of course, in contrast with hotcopy, the original
>repository is locked for a (hopefully short) while, so
>users might experience errors/timeouts if this takes too
>long.

This is why I am going to apply it to a mirror kept in sync
with the main repo via an incremental hotcopy invoked from
the post-commit hook.  How can the hook skip synchronising
the mirror while it is frozen, without introducing other
errors?  Does SVN provide a reliable lock facility, or must
I invent it myself in my backup script/program using
e.g. lock files?
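
In case I must invent it myself, my tentative plan is a
marker file, e.g. in post-commit.bat (the file name is my own
convention, not an SVN facility):

   rem %1 is the repository path passed to the hook.
   rem Skip the sync while the backup script holds the marker.
   if exist "D:\mirror\repo.frozen" exit /b 0
   svnadmin hotcopy --incremental "%1" "D:\mirror\repo"

The backup script would create the marker before freezing and
delete it afterwards, then run one final hotcopy itself,
because a commit skipped by the guard would otherwise reach
the mirror only on the next commit.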

-- 
Please, do not forward replies to the list to my e-mail.


Re: Questions about a script for regular backups

Posted by Mark Phippard <ma...@gmail.com>.
On Tue, Aug 27, 2019 at 5:06 AM Johan Corveleyn <jc...@gmail.com> wrote:

> On Mon, Aug 26, 2019 at 9:01 PM Mark Phippard <ma...@gmail.com> wrote:
> >
> > On Mon, Aug 26, 2019 at 1:29 PM Anton Shepelev <an...@gmail.com>
> wrote:
> >>
> >> I have now set up a post-commit hook that makes an
> >> --incremental hotcopy.  With the destination on the same
> >> machine's HDD, it takes about two seconds, but with a
> >> network share it lasts 30 seconds.  Is it expected behavior
> >> for committing a tiny change in a text file?  If not, then
> >> where shall I look for the possible performance problems?  I
> >> have svn 1.8.16.
> >
> >
> > It is probably due to slowness of the IO across network to read what is
> in the target repository and then copy over the files. Other than tuning
> NFS or whatever you are using there is not much you can do.  This is why my
> first recommendation was to use svnsync. You could have a second backup
> server running and then use svnsync via https or svn protocol to that
> server.  This basically replays the commit transaction so performs
> comparably to the original commit. It also makes it a lot easier to send
> the backup around the world or to another data center since it is using a
> protocol that is meant for that sort of latency.
> >
>
> Does svnsync also copy locks and hook scripts?
>

No, neither of those are synced.  You would not want the hooks to sync
since you need to run different hooks on the backup server, but locks are a
problem for people using that feature.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Questions about a script for regular backups

Posted by Johan Corveleyn <jc...@gmail.com>.
On Mon, Aug 26, 2019 at 9:01 PM Mark Phippard <ma...@gmail.com> wrote:
>
> On Mon, Aug 26, 2019 at 1:29 PM Anton Shepelev <an...@gmail.com> wrote:
>>
>> I have now set up a post-commit hook that makes an
>> --incremental hotcopy.  With the destination on the same
>> machine's HDD, it takes about two seconds, but with a
>> network share it lasts 30 seconds.  Is it expected behavior
>> for committing a tiny change in a text file?  If not, then
>> where shall I look for the possible performance problems?  I
>> have svn 1.8.16.
>
>
> It is probably due to slowness of the IO across network to read what is in the target repository and then copy over the files. Other than tuning NFS or whatever you are using there is not much you can do.  This is why my first recommendation was to use svnsync. You could have a second backup server running and then use svnsync via https or svn protocol to that server.  This basically replays the commit transaction so performs comparably to the original commit. It also makes it a lot easier to send the backup around the world or to another data center since it is using a protocol that is meant for that sort of latency.
>

Does svnsync also copy locks and hook scripts?

Just to mention another option: Since 1.8 there is the command
'svnadmin freeze', which locks the repository for writing while you
run another command. That way, you can use regular backup / copy
commands (like rsync) to create a consistent copy. See the example
mentioned in the 1.8 release notes [1]:

    svnadmin freeze /svn/my-repos -- rsync -av /svn/my-repos /backup/my-repos

Of course, in contrast with hotcopy, the original repository is locked
for a (hopefully short) while, so users might experience errors /
timeouts if this takes too long.

[1] http://subversion.apache.org/docs/release-notes/1.8.html#svnadmin-freeze

-- 
Johan

Re: Questions about a script for regular backups

Posted by Mark Phippard <ma...@gmail.com>.
On Mon, Aug 26, 2019 at 1:29 PM Anton Shepelev <an...@gmail.com> wrote:

> I have now set up a post-commit hook that makes an
> --incremental hotcopy.  With the destination on the same
> machine's HDD, it takes about two seconds, but with a
> network share it lasts 30 seconds.  Is it expected behavior
> for committing a tiny change in a text file?  If not, then
> where shall I look for the possible performance problems?  I
> have svn 1.8.16.
>

It is probably due to slowness of the IO across network to read what is in
the target repository and then copy over the files. Other than tuning NFS
or whatever you are using there is not much you can do.  This is why my
first recommendation was to use svnsync. You could have a second backup
server running and then use svnsync via https or svn protocol to that
server.  This basically replays the commit transaction so performs
comparably to the original commit. It also makes it a lot easier to send
the backup around the world or to another data center since it is using a
protocol that is meant for that sort of latency.

That said, I have no idea what kind of performance you should be able to
get via NFS.  30 seconds seems slower than it ought to have been.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
I have now set up a post-commit hook that makes an
--incremental hotcopy.  With the destination on the same
machine's HDD, it takes about two seconds, but with a
network share it lasts 30 seconds.  Is it expected behavior
for committing a tiny change in a text file?  If not, then
where shall I look for the possible performance problems?  I
have svn 1.8.16.
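
For reference, the hook itself is a one-line post-commit.bat,
where %1 is the repository path passed in by Subversion and
the destination is illustrative:

   rem post-commit.bat
   svnadmin hotcopy --incremental "%1" "\\backuphost\svn\repo"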

-- 
Please, do not forward replies to the list to my e-mail.


Re: Questions about a script for regular backups

Posted by Pierre Fourès <pi...@gmail.com>.
On Fri, 23 Aug 2019 at 17:10, Mark Phippard <ma...@gmail.com> wrote:
>
> On Fri, Aug 23, 2019 at 11:06 AM Nathan Hartman <ha...@gmail.com> wrote:
>>
>> On Fri, Aug 23, 2019 at 9:53 AM Mark Phippard <ma...@gmail.com> wrote:
>>>
>>> Anyway ... the only danger of a repository format is if you upgrade to latest and then for some reason need to downgrade your server binaries to an older version.  You can always use an older format with a newer version.
>>
>>
>> If you did wish to downgrade to an older version, wouldn't a dump and load make that possible?
>>
>
> Absolutely.  Just pointing out that that is the only time you would run into something that would not just work and would require you to do something.
>

Thanks a lot Mark for your clarifications.

Pierre.

Re: Questions about a script for regular backups

Posted by Mark Phippard <ma...@gmail.com>.
On Fri, Aug 23, 2019 at 11:06 AM Nathan Hartman <ha...@gmail.com>
wrote:

> On Fri, Aug 23, 2019 at 9:53 AM Mark Phippard <ma...@gmail.com> wrote:
>
>> Anyway ... the only danger of a repository format is if you upgrade to
>> latest and then for some reason need to downgrade your server binaries to
>> an older version.  You can always use an older format with a newer version.
>>
>
> If you did wish to downgrade to an older version, wouldn't a dump and load
> make that possible?
>
>
Absolutely.  Just pointing out that that is the only time you would run
into something that would not just work and would require you to do
something.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Questions about a script for regular backups

Posted by Nathan Hartman <ha...@gmail.com>.
On Fri, Aug 23, 2019 at 9:53 AM Mark Phippard <ma...@gmail.com> wrote:

> Anyway ... the only danger of a repository format is if you upgrade to
> latest and then for some reason need to downgrade your server binaries to
> an older version.  You can always use an older format with a newer version.
>

If you did wish to downgrade to an older version, wouldn't a dump and load
make that possible?

Re: Questions about a script for regular backups

Posted by Mark Phippard <ma...@gmail.com>.
On Fri, Aug 23, 2019 at 4:16 AM Pierre Fourès <pi...@gmail.com>
wrote:

> Hello,
>
> On Thu, 22 Aug 2019 at 16:47, Mark Phippard <ma...@gmail.com> wrote:
> >
> >
> >> Cannot they become obsolete when a new version of SVN comes
> >> out?
> >
> >
> > No.  It is a valid copy of the repository.
> >
> >>   Are they portable across operating systems and
> >> filesystems? (I fear not)
> >
> >
> > Yes, they are absolutely portable across OS and FS. As is the repos
> itself.
>
> This proves to work in practice, but is it guaranteed that the fsfs
> repos format remains compatible between subsequent 1.X subversion
> releases?
>

Yes it is.

When you upgrade your server to a new version you do not have to touch
existing repositories. Think what a nightmare that would be for hosting
services or anyone with a lot of repositories.  It is not uncommon for a
new release to introduce a new repository format with some new features ...
though usually it is just some new efficiency in how the data is stored.
You need to dump/load if you are interested in getting these changes but
the server is capable of reading and writing every repository format.



>
> It appears the fsfs repos format sometimes changes between 1.X
> subversion releases. For example, Subversion 1.9 introduced fsfs
> format version 7. The release notes [1] recommend a full dump/load
> cycle to take advantage of the new format's improvements.


Correct, you need to dump/load if you want to use the new format.
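
The cycle itself is short; sketched with illustrative paths
(svnadmin create uses the newest format by default):

   svnadmin dump C:\svn\repo > C:\temp\repo.dump
   svnadmin create C:\svn\repo-new
   svnadmin load C:\svn\repo-new < C:\temp\repo.dump
   rem then take the old repository offline and swap the directories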

There is nothing wrong with having full dumps of your repository, and you
need one to upgrade the format, but hot-copies are totally viable as a
backup and have a lot of advantages when it comes to the recovery process
in the event you need the backup.  I would not rush to using new formats
just because they are available. I have avoided the new format in 1.9 as
its benefits seemed tuned to scenarios that do not match my needs at all,
and it has slower performance for what I think is the most common use case,
which is using the Apache server hosting lots of repositories.

Anyway ... the only danger of a repository format is if you upgrade to
latest and then for some reason need to downgrade your server binaries to
an older version.  You can always use an older format with a newer version.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Questions about a script for regular backups

Posted by Pierre Fourès <pi...@gmail.com>.
Hello,

On Thu, 22 Aug 2019 at 16:47, Mark Phippard <ma...@gmail.com> wrote:
>
>
>> Cannot they become obsolete when a new version of SVN comes
>> out?
>
>
> No.  It is a valid copy of the repository.
>
>>   Are they portable across operating systems and
>> filesystems? (I fear not)
>
>
> Yes, they are absolutely portable across OS and FS. As is the repos itself.

This proves to work in practice, but is it guaranteed that the fsfs
repos format remains compatible between subsequent 1.X subversion
releases?

It appears the fsfs repos format sometimes changes between 1.X
subversion releases. For example, Subversion 1.9 introduced fsfs
format version 7. The release notes [1] recommend a full dump/load
cycle to take advantage of the new format's improvements. Nonetheless,
the notes also say that "older formats remain supported", but this
seems to be a beneficial side effect, not a guarantee. It does not
seem guaranteed that backward compatibility will be preserved across
all subsequent 1.X subversion releases. To my understanding, what is
guaranteed to remain stable and compatible between 1.X releases is the
protocol between client and server, not the underlying storage system.
This is the reason I came to use hot-copies for backups *and* dumps
for migrations / reinstalls. First of all, it ensures that I will use
the latest repos format available for the particular instance of
subversion I run, and not forget to upgrade it in order to get all the
benefits introduced by that subversion release. Second, it ensures
that if an unexpected situation forced me to downgrade the subversion
server version, I would not be left with an upgraded fsfs repos format
that the older instance cannot read.

To my understanding, albeit very slow to load, dumps are absolutely
portable, meaning backward and forward compatible between subversion
server versions. You mention the repos are absolutely portable across
OS and FS. Do you also mean between different subversion server
versions? For instance, how would it have been handled if, back when
Debian Jessie was the stable Debian and provided subversion 1.8, I had
been running subversion 1.9 on Ubuntu Xenial (using repos format
version 7) and then, for some external reason, had to move to Debian
Jessie? I doubt subversion 1.8 would have been able to read the
hot-copies made on the Ubuntu server. Or would it? If not, repos would
not be portable across OSes (at least not in their most current format
at a given date, say early 2017 for the sake of this example).
However, to my understanding, had I used dumps to back up my Ubuntu
server, I would have been able to restore the repos. Admittedly, I
would have lost the new functionality introduced in subversion 1.9,
but I would still have been able to run subversion and access my
repos, which seems not to be the case if I relied on hot-copies alone.
Or would it?

I would be really interested to get your view on all this in order to
see if I misunderstand what to expect from the hot-copies and the
dumps, and if my setup is overkill, or if it doesn't meet the
requirements I thought it would.

[1] https://subversion.apache.org/docs/release-notes/1.9

Best Regards,
Pierre.

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Mark Phippard:

>Almost no one uses the BDB repository format.  The fsfs
>format became the default in SVN 1.1 or 1.2 and it is the
>only format used anymore.

Phew.  We do have FSFS.  Thank you.

-- 
()  ascii ribbon campaign - against html e-mail
/\  http://preview.tinyurl.com/qcy6mjc [archived]

Re: Questions about a script for regular backups

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Aug 22, 2019 at 10:55 AM Anton Shepelev <an...@gmail.com> wrote:

> Mark Phippard to Anton Shepelev about hot copies:
>
> >>Are they portable across operating systems and
> >>filesystems? (I fear not)
> >
> >Yes, they are absolutely portable across OS and FS. As is
> >the repos itself.  The only issue when going across these
> >is managing the OS level permissions of the copy.  IOW, if
> >you run something as root the copy will tend to be owned by
> >root which might make it not ready for consumption without
> >a chown/chmod.
> >
>I used to regularly move fsfs repositories between an AS/400
> >EBCDIC server and Windows without issue.
>
> But the SVN book has this:
>
>    As described in the section called "Berkeley DB", hot-
>    copied Berkeley DB repositories are not portable across
>    operating systems, nor will they work on machines with a
>    different "endianness" than the machine where they were
>    created.
>

Almost no one uses the BDB repository format.  The fsfs format became the
default in SVN 1.1 or 1.2 and it is the only format used anymore.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Mark Phippard to Anton Shepelev about hot copies:

>>Are they portable across operating systems and
>>filesystems? (I fear not)
>
>Yes, they are absolutely portable across OS and FS. As is
>the repos itself.  The only issue when going across these
>is managing the OS level permissions of the copy.  IOW, if
>you run something as root the copy will tend to be owned by
>root which might make it not ready for consumption without
>a chown/chmod.
>
>I used to regularly move fsfs repositories between an AS/400
>EBCDIC server and Windows without issue.

But the SVN book has this:

   As described in the section called "Berkeley DB", hot-
   copied Berkeley DB repositories are not portable across
   operating systems, nor will they work on machines with a
   different "endianness" than the machine where they were
   created.

-- 
()  ascii ribbon campaign - against html e-mail
/\  http://preview.tinyurl.com/qcy6mjc [archived]

Re: Questions about a script for regular backups

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Aug 22, 2019 at 10:38 AM Anton Shepelev <an...@gmail.com> wrote:

> Mark Phippard:
>
> >My first choice option would be to set up a repository on a
> >second server and use svnsync from a post-commit hook
> >script to sync the change.  After that, I would use
> >svnadmin hotcopy with the new --incremental option (as of
> >1.8?).  Dump is not a great choice for backups.
>
> Thank you, but I should prefer a traditional backup
> approach.  You and other posters say that dumps are a poor
> choice, so I shall back up incremental hot copies.  But the
> question I have already asked in another reply remains: are
> hot-copies a reliable means of long-term storage?
>

Yes.  A hotcopy is basically just an intelligent backup/copy of the
repository. It is similar to what a backup/file copy tool might do except
that it is aware of in-progress transactions and makes sure you have a
consistent repository copy.


> Cannot they become obsolete when a new version of SVN comes
> out?

No.  It is a valid copy of the repository.

> Are they portable across operating systems and
> filesystems? (I fear not)

Yes, they are absolutely portable across OS and FS. As is the repos
itself.  The only issue when going across these is managing the OS level
permissions of the copy.  IOW, if you run something as root the copy will
tend to be owned by root which might make it not ready for consumption
without a chown/chmod.

I used to regularly move fsfs repositories between an AS/400 EBCDIC server
and Windows without issue.

The problem with dumps is that they have to be loaded to become usable,
and a dump also only contains the repository content, not other things
like locks and hook scripts.  The hotcopy copies the repository files
directly, so you have everything and could even serve the hotcopy from a
hot-swappable server.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
Mark Phippard:

>My first choice option would be to set up a repository on a
>second server and use svnsync from a post-commit hook
>script to sync the change.  After that, I would use
>svnadmin hotcopy with the new --incremental option (as of
>1.8?).  Dump is not a great choice for backups.

Thank you, but I should prefer a traditional backup
approach.  You and other posters say that dumps are a poor
choice, so I shall back up incremental hot copies.  But the
question I have already asked in another reply remains: are
hot-copies a reliable means of long-term storage?
Cannot they become obsolete when a new version of SVN comes
out?  Are they portable across operating systems and
filesystems? (I fear not)

-- 
()  ascii ribbon campaign - against html e-mail
/\  http://preview.tinyurl.com/qcy6mjc [archived]

RE: Questions about a script for regular backups

Posted by Bo Berglund <bo...@gmail.com>.
On Thu, 22 Aug 2019 09:38:02 -0400, Mark Phippard <ma...@gmail.com> wrote:

>My first choice option would be to set up a repository on a second server
>and use svnsync from a post-commit hook script to sync the change.  After
>that, I would use svnadmin hotcopy with the new --incremental option (as of
>1.8?).  Dump is not a great choice for backups.
>
>The main advantage of svnsync is you can push the change via HTTP or SVN to
>a different system, whereas hotcopy needs FS access, so the only way to get
>the repos on to a second server is if you can mount the FS via NFS or
>something.

That is also what I did!
Our main server runs on a Windows Server on the corporate LAN.
The backup server is a Linux box in a different location altogether.
Both locations have fiber access to the Internet.

The backup server is set up with https access (thanks to LetsEncrypt and Certbot) 
through the router.

I have synced the servers after first loading the backup server from dump files so 
as not to have to use Internet bandwidth for the original data transfer.

On the Windows main server I have set up a nightly task that uses svnsync to 
synchronize the two servers. It has been running just fine for 18 months without fail.
Recommended solution.
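
On Windows the nightly task can be created along these lines
(task name, time and URL are examples):

   schtasks /Create /TN SvnSyncBackup /SC DAILY /ST 02:00 ^
     /TR "svnsync synchronize https://backup.example.com/svn/repo"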

Bo Berglund


Re: Questions about a script for regular backups

Posted by Mark Phippard <ma...@gmail.com>.
On Thu, Aug 22, 2019 at 9:16 AM Anton Shepelev <an...@gmail.com> wrote:

> [Having failed to post this message via Gmane, I am sending it by e-mail]
>
> Hello, all
>
> In order to write a backup script in the Windows batch
> language, I was reading the section "Migrating Repository
> Data Elsewhere" from "Repository Maintenance":
>
>    http://svnbook.red-bean.com/en/1.7/svn.reposadmin.maint.html
>
> where I found the following interesting paragraph:
>
>    Another neat trick you can perform with this
>    --incremental option involves appending to an existing
>    dump file a new range of dumped revisions. For example,
>    you might have a post-commit hook that simply appends the
>    repository dump of the single revision that triggered the
>    hook. Or you might have a script that runs nightly to
>    append dump file data for all the revisions that were
>    added to the repository since the last time the script
>    ran. Used like this, svnadmin dump can be one way to back
>    up changes to your repository over time in case of a
>    system crash or some other catastrophic event.
>
> The book unfortunately does not seem to give any examples of
> this usage, leaving the following questions:
>
>   1.  Is "appending" to be understood literally, that is
>       using the >> operator on a previously existing dump
>       file, or is it a figure of speech describing a
>       supplementary dump file that shall be applied "on top"
>       of a previous one?
>
>   2.  How does one determine the revision range for a
>       routine incremental dump -- by calling
>       `svnlook youngest' before dumping?
>
>   3.  Must the backup script somehow store the last revision
>       in the dump between calls?  If so, I shall have to
>       keep it in a file and not let anybody touch it.
>
>
My first choice option would be to set up a repository on a second server
and use svnsync from a post-commit hook script to sync the change.  After
that, I would use svnadmin hotcopy with the new --incremental option (as of
1.8?).  Dump is not a great choice for backups.

The main advantage of svnsync is you can push the change via HTTP or SVN to
a different system, whereas hotcopy needs FS access, so the only way to get
the repos on to a second server is if you can mount the FS via NFS or
something.
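
The svnsync setup is roughly this (URLs are examples; the
mirror must also have a pre-revprop-change hook that allows
the sync user to change revision properties):

   rem one-time: create an empty mirror repository, then:
   svnsync initialize https://backup.example.com/svn/repo https://svn.example.com/svn/repo

   rem from the master's post-commit hook (or a scheduled job):
   svnsync synchronize https://backup.example.com/svn/repo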

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Questions about a script for regular backups

Posted by Pierre Fourès <pi...@gmail.com>.
Hello,

On Thu, 22 Aug 2019 at 15:52, Anton Shepelev <an...@gmail.com> wrote:
>
> Andreas Stieger:
>
> >The dump format is not the best option for backups. The
> >restore time is much too slow as you need to recover from a
> >serialized format. In many hand-baked scripts the dump
> >method misses point-in-time recovery capabilities, ->
>
> >Just make sure you take a consistent snapshot, which can be
> >achieved by briefly locking it (svnadmin freeze) or operating
> >on a consistent copy (svnadmin hotcopy).
>
> Is a hot-copy portable between SVN versions?  How safe is it
> to rely on a hot copy instead of a dump?
>

Indeed, I've encountered the problem that restoring dumps was way too
slow, and I ended up with a "belt and suspenders" solution: hot-copies
to guarantee timely restoration (on systems with the same software
configuration), but also dumps to guarantee restoration at all (on
systems where the software configuration differs). If, for one reason
or another, but mainly after a subversion upgrade with a breaking
change, the hot-copy restoration did not work, I would admittedly fail
to restore quickly, but I would be able to restore eventually.
Originally I intended to do hot-copies every night and dumps only on
week-ends, but up to now the systems (svn-master and the storage
solution) handle the extra load of doing both every night, so I have
left it that way (though I might reconsider in the future).

Admittedly, this situation should be very unlikely, but I feel more at
ease having taken it into account. Moreover, I set it up to handle two
use cases: the first is timely restoration in emergencies, which is
handled with hot-copies; the second is server upgrades (with software
upgrades), which is handled with dumps. The second case clearly need
not be handled under emergency conditions, so the dump solution fits
it fine, while also ensuring (and being designed for) smooth upgrades
between distinct software revisions. Of course, if dumps were not part
of the backup solution, I would never have one ready when a server
upgrade required it. Having integrated them into the nightly (or
weekly) backups, I know I always have a fresh dump ready for when I
intend to upgrade my server. BTW, I am talking here about the logical
server (the https://svn.company.com/), not physical (or virtual)
instances. I never upgrade a running production server; I prefer to
build the "upgraded one" from a fresh install and take the opportunity
to do a full restoration, to double-check everything is fine and
recoverable. For that purpose, I find the dumps very valuable.

Best Regards,
Pierre

Re: Questions about a script for regular backups

Posted by Nico Kadel-Garcia <nk...@gmail.com>.
On Thu, Aug 22, 2019 at 9:52 AM Anton Shepelev <an...@gmail.com> wrote:
>
> [replying via Gmane]
>
> Andreas Stieger:
>
> >The dump format is not the best option for backups. The
> >restore time is much too slow as you need to recover from a
> >serialized format. In many hand-baked scripts the dump
> >method misses point-in-time recovery capabilities, ->
>
> Why should I need those if SVN repositories store the
> complete history?

Because, on a bulky repository with bulky binaries, it is *butt slow*,
you can't easily prune the bulky binaries, and you will inevitably
have split-brain during the time between the dump and the next dump/load.
Split-brain Is Not Your Friend(tm).

Re: Questions about a script for regular backups

Posted by Anton Shepelev <an...@gmail.com>.
[replying via Gmane]

Andreas Stieger:

>The dump format is not the best option for backups. The
>restore time is much too slow as you need to recover from a
>serialized format. In many hand-baked scripts the dump
>method misses point-in-time recovery capabilities, ->

Why should I need those if SVN repositories store the
complete history?

>-> and few people implement backup usability checks by
>loading the dump.

Is not a dump guaranteed to be usable if `svnadmin dump'
succeeded?  If not, how do I load that dump without
interfering with the current work?
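
I suppose the check could be done against a scratch
repository on the side, e.g.:

   svnadmin create C:\temp\check-repo
   svnadmin load --quiet C:\temp\check-repo < D:\backup\repo.dump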

>If you have content-aware file based backup software
>available, use that on the on-disk repository format.

The Unison file synchroniser, to work efficiently on
Windows, has an option to use file size and modification
date to detect changes.  Would that work with SVN?

Do you suggest that I back up the contents of

  csvn\data\repositories

>Just make sure you take a consistent snapshot, which can be
>achieved by briefly locking it (svnadmin freeze) or operating
>on a consistent copy (svnadmin hotcopy).

Is a hot-copy portable between SVN versions?  How safe is it
to rely on a hot copy instead of a dump?

-- 
Please, do not forward replies to the list to my e-mail.


Re: Questions about a script for regular backups

Posted by Andreas Stieger <An...@gmx.de>.
Hello,

> In order to write a backup script in the Windows batch
> language, I was reading the section "Migrating Repository
> Data Elsewhere" from "Repository Maintenance":
>
>    http://svnbook.red-bean.com/en/1.7/svn.reposadmin.maint.html

The dump format is not the best option for backups. The restore time is much too slow as you need to recover from a serialized format. In many hand-baked scripts the dump method misses point-in-time recovery capabilities, and few people implement backup usability checks by loading the dump. If you have content-aware file based backup software available, use that on the on-disk repository format. Just make sure you take a consistent snapshot, which can be achieved by briefly locking it (svnadmin freeze) or operating on a consistent copy (svnadmin hotcopy).
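
For example, with illustrative paths:

   rem either back up a consistent hot copy of the live repository...
   svnadmin hotcopy C:\svn\repo D:\staging\repo

   rem ...or hold the write lock while the backup tool runs
   rem (robocopy reports success with exit codes below 8)
   svnadmin freeze C:\svn\repo -- robocopy C:\svn\repo D:\backup\repo /MIR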

Andreas