Posted to users@subversion.apache.org by Malcolm Fernandes <mf...@octigabay.com> on 2003/10/17 22:06:15 UTC

hot-backup.py hangs

Was testing the hot-backup.py script on a test repository and it hung on one
occasion.

We are using:
svn, version 0.30.0 on Suse Linux

The hot-backup script ran successfully on several earlier attempts.  

Scenario:
Was committing several hundred files and was simultaneously checking out a 
project from another xterm (http access).
While the two subversion operations were running, I fired up the hot-backup 
script. It ran for a while and then hung after re-copying the logfiles.

   Re-copying logfile 'log.0000000228'...
   Re-copying logfile 'log.0000000229'...
Backup completed.


An strace of the process revealed:
/usr/bin/strace -p27646
write(2, "Log sequence error: page LSN 227"..., 64) = ? ERESTARTSYS (To be restarted)

Interrupted the script after about half an hour and got this traceback message:
Traceback (most recent call last):
  File "./hot-backup.py", line 160, in ?
    stdout_lines = outfile.readlines()
KeyboardInterrupt
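
One possible reading of the strace output and traceback above, not confirmed anywhere in this thread: the script is blocked in readlines() on the child's stdout while the child is blocked writing its errors to a stderr pipe that nobody is draining. A minimal sketch in modern Python of running a child so that both streams are drained together. The helper name `run_and_capture` and the stand-in child command are hypothetical and are not taken from hot-backup.py.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run argv and return (exit_code, stdout_text, stderr_text).

    communicate() reads stdout and stderr concurrently, so the child
    cannot stall on a full stderr pipe while we wait on stdout.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


# Stand-in for a noisy child such as db_recover (an assumption for the
# demo): it writes far more to stderr than a pipe buffer usually holds.
CHATTY_CHILD = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('Log sequence error\\n' * 20000); print('done')",
]

if __name__ == "__main__":
    code, out, err = run_and_capture(CHATTY_CHILD)
    print(code, out.strip(), len(err.splitlines()))
```

If the child's stderr were instead left unread while only stdout was read to end-of-file, a child that emits this much error output could block indefinitely, which would look like the hang reported here.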

Attempted to run db_recover manually and got this error:
# /usr/bin/db_recover -h /home2/mfernandes/subversion/BACKUPS/test_2-73-2/db
db_recover: Log sequence error: page LSN 217:649059; previous LSN 227 947360
db_recover: Log sequence error: page LSN 227:741713; previous LSN 227 986004
db_recover: Log sequence error: page LSN 227:741713; previous LSN 227 992643
...
... --snip--
db_recover: Log sequence error: page LSN 227:759209; previous LSN 229 354875
db_recover: Log sequence error: page LSN 227:741713; previous LSN 229 480106
db_recover: strings: close: 427 blocks left pinned
db_recover: nodes: close: 21 blocks left pinned
db_recover: DB_ENV->open: DB_INCOMPLETE: Cache flush was unable to complete

Tried this several times and got the same error.

A subsequent run of hot-backup worked fine.

Could the bulk commit be the cause for this failure, whereby transactions were 
still getting logged, even while the logfile was getting re-copied to the 
backup area?

Thanks,

Mal



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: hot-backup.py hangs

Posted by Malcolm Fernandes <mf...@octigabay.com>.
On Wednesday 22 October 2003 10:31, C. Michael Pilato wrote:
> Malcolm Fernandes <mf...@octigabay.com> writes:
> > Can the hot-backup.py script be executed while other svn operations
> > are in progress?
>
> Well, that *is* why it's called a hot backup.  If something isn't
> working, we need to examine why this is happening.
>

Do you need any more information from me to examine why this is failing?

Thanks,

Mal



Re: hot-backup.py hangs

Posted by "C. Michael Pilato" <cm...@collab.net>.
Malcolm Fernandes <mf...@octigabay.com> writes:

> Can the hot-backup.py script be executed while other svn operations
> are in progress?

Well, that *is* why it's called a hot backup.  If something isn't
working, we need to examine why this is happening.


Re: Known hot-backup.py problems [was: hot-backup.py hangs]

Posted by "C. Michael Pilato" <cm...@collab.net>.
Philip Martin <ph...@codematters.co.uk> writes:

> Philip Martin <ph...@codematters.co.uk> writes:
> 
> > Overall hot-backup.py is less than perfect
> 
> I don't use hot-backup.py (I back up offline) so I have never really
> taken much interest in it before.  Having now looked at it, I believe
> there are at least 5 problems.

Philip, could you just file a big issue on this, or add it to the
issue (if such exists) with the 'svnadmin hotcopy' patch?  And thanks
for the insightful read.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Known hot-backup.py problems [was: hot-backup.py hangs]

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> Overall hot-backup.py is less than perfect

I don't use hot-backup.py (I back up offline) so I have never really
taken much interest in it before.  Having now looked at it, I believe
there are at least 5 problems.

1. db_archive and db_recover (not serious)
The script calls db_archive and db_recover and hardcodes the path to
these utilities.  It should be using svnadmin from SVN_BINDIR.  This
should be straightforward to fix.

2. 'svnlook youngest' race (not serious)
The name of the backup incorporates a revision number obtained by
running 'svnlook youngest' on the live repository before starting the
backup.  However the youngest revision may change during the time
taken to back up and the name could be misleading.  One solution would
be to make the backup using a temporary name and rename it when
complete using the revision obtained by running 'svnlook youngest' on
the backup rather than the live repository.  Another solution, for use
when hot-backup.py is run from a post-commit script, would be to pass
the commit revision and name the backup using that.

3. backup name race (very serious)
As described in my previous mail.  The name of the backup is chosen by
getting a directory listing, choosing a name that is different from
all the names in the listing and then copying into that name.  Nothing
prevents parallel instances of hot-backup.py choosing the same name so
that the copies overwrite each other, which will lead to missing
and/or corrupt backups.  One solution is for hot-backup.py to mkdir
the chosen name before copying; then only one instance can succeed and
any others will have to try another name (at least that's what I would
do in C, I'm assuming Python works the same).

4. log file removal race (serious)
After making a backup, hot-backup.py deletes those log files that are
no longer in use from the live repository.  However, these log files
may have been in use when the backup was made.  That means that some
log files might never be complete in any of the backups; in an extreme
case some log files may not appear in any backup at all.  If the
complete version of a log file is missing, catastrophic
BDB recovery will not be possible.  One solution would be to get the
list of complete log files before making the backup.  Another would be
to get a list of complete log files from both the live and backup
repositories and only delete those that are present in both lists.

5. backup removal race (not serious)
When deleting old backups, multiple instances of hot-backup.py may
attempt to delete the same backup.  I'm not a Python expert but I
think this is likely to result in some of the scripts returning an
error, thus erroneously indicating that the backup failed.  Simply
ignoring errors while deleting old backups is probably not a good
idea, since it might lead to the underlying backup storage filling up.
I'm not sure what the best solution is, perhaps
some sort of lock file?  If so then perhaps that lock file should also
be used when choosing the backup name?
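
The lock file idea raised in item 5 could be sketched as follows. This is only an illustration under stated assumptions: it uses an advisory flock() lock, is Unix-only, and the helper name `backup_lock` and the lock file location are hypothetical. Nothing here is taken from hot-backup.py.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def backup_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the with-block.

    Other processes using the same helper block in flock() until the
    holder leaves the block or exits.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # waits for any current holder
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A script would wrap both the name selection and the pruning of old backups in `with backup_lock(path):`. Because the kernel drops the lock when the descriptor is closed or the process dies, a crashed run should not leave a stale lock behind, unlike a scheme that relies on the mere existence of a file.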

-- 
Philip Martin


Re: hot-backup.py hangs

Posted by Philip Martin <ph...@codematters.co.uk>.
Malcolm Fernandes <mf...@octigabay.com> writes:

>> We are using:
>> svn, version 0.30.0 on Suse Linux
>>
>> The hot-backup script ran successfully on several earlier attempts.
>>
>> Scenario:
>> Was committing several hundred files and was simultaneously checking out a
>> project from another xterm (http access).
>> While the two subversion operations were running, I fired up the hot-backup
>> script. It ran for a while and then hung after re-copying the logfiles.
>>
>>    Re-copying logfile 'log.0000000228'...
>>    Re-copying logfile 'log.0000000229'...
>> Backup completed.
>>
>>
>> An strace of the process revealed:
>> /usr/bin/strace -p27646
>> write(2, "Log sequence error: page LSN 227"..., 64) = ? ERESTARTSYS (To be restarted)
>>
>> Interrupted the script after about half-hour and got this traceback message:
>> Traceback (most recent call last):
>>   File "./hot-backup.py", line 160, in ?
>>     stdout_lines = outfile.readlines()
>> KeyboardInterrupt
>>
>> Attempted to run db_recover manually and got this error:
>> # /usr/bin/db_recover -h

I see hot-backup.py is currently using db_archive and db_recover
directly, I think using svnadmin would be better.  The script also has
/usr/local/BerkeleyDB.4.0/bin hardcoded, did you modify it to use
/usr/bin?

Looking at hot-backup.py itself I think there is a race choosing the
name of the backup dir.  The script gets a listing of the backup
directory, then chooses a subdir name based on that listing and then
copies into that subdir.  If two scripts run simultaneously (say the
one you start manually and the one started by the commit) they could
get the same directory listing, choose the same name, and then back up
into the same subdir.  That would probably cause the hang you see.
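
The race described above can be closed by letting the filesystem arbitrate: os.mkdir either creates the directory or fails because it already exists, so only one process can claim a given name. A minimal sketch in modern Python; the helper name `claim_backup_dir` and the `<base>-<n>` naming scheme are assumptions for illustration and are not taken from hot-backup.py.

```python
import os


def claim_backup_dir(backup_root, base_name, max_tries=1000):
    """Create and return a backup directory that no other process owns.

    Instead of listing the directory and then picking a free name,
    try to create each candidate; mkdir fails if the name is taken.
    """
    for n in range(1, max_tries + 1):
        candidate = os.path.join(backup_root, "%s-%d" % (base_name, n))
        try:
            os.mkdir(candidate)
        except FileExistsError:
            continue  # another instance got there first, try the next name
        return candidate
    raise RuntimeError("no free backup name under %s" % backup_root)
```

The copy step then has to write into the directory that already exists rather than create it, which may need a small change to however the repository is copied.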

There is also the known race that could potentially delete log files
that have not been backed up fully, but that should not cause your
hang.
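
The "Known hot-backup.py problems" message in this thread proposes deleting only those log files that appear in the no-longer-in-use lists of both the live and the backup repository. A minimal sketch of that rule, assuming the two lists of file names have already been obtained (for example from db_archive, which the thread mentions); the function names are hypothetical.

```python
import os


def logs_safe_to_delete(live_unused, backup_unused):
    """Return, sorted, the log names reported unused in BOTH places."""
    return sorted(set(live_unused) & set(backup_unused))


def delete_logs(live_db_dir, names):
    """Remove the named log files from the live repository's db dir."""
    for name in names:
        os.remove(os.path.join(live_db_dir, name))
```

With this rule a log file that was still being written while the backup ran stays in the live repository until a later backup has a complete copy of it.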

Overall hot-backup.py is less than perfect :-(  It may be time for me
to get round to reviewing the 'svnadmin hotcopy' patch.  I was
initially enthusiastic, but my enthusiasm waned when I realised that it
could not be made robust in a few corner cases (obscure things like
set_data_dir).  However an admin would have to modify hot-backup.py to
handle set_data_dir, and given the history of hot-backup.py any such
modification might well be imperfect.  That might be enough to justify
a less flexible but more reliable 'svnadmin hotcopy' based system.

-- 
Philip Martin


Re: hot-backup.py hangs

Posted by Malcolm Fernandes <mf...@octigabay.com>.
This problem is reproducible.  Tried this on two occasions since I reported
the original problem and hit the same error.

Reversed the steps in these tests, i.e. started the backup first and then the
commits.

This seems like a serious problem. 

Can the hot-backup.py script be executed while other svn operations are in 
progress?

Thanks,

Mal

On Friday 17 October 2003 15:06, Malcolm Fernandes wrote:
> Was testing the hot-backup.py script on a test repository and it hung on
> one occasion.
>
> We are using:
> svn, version 0.30.0 on Suse Linux
>
> The hot-backup script ran successfully on several earlier attempts.
>
> Scenario:
> Was committing several hundred files and was simultaneously checking out a
> project from another xterm (http access).
> While the two subversion operations were running, I fired up the hot-backup
> script. It ran for a while and then hung after re-copying the logfiles.
>
>    Re-copying logfile 'log.0000000228'...
>    Re-copying logfile 'log.0000000229'...
> Backup completed.
>
>
> An strace of the process revealed:
> /usr/bin/strace -p27646
> write(2, "Log sequence error: page LSN 227"..., 64) = ? ERESTARTSYS (To be restarted)
>
> Interrupted the script after about half-hour and got this traceback message:
> Traceback (most recent call last):
>   File "./hot-backup.py", line 160, in ?
>     stdout_lines = outfile.readlines()
> KeyboardInterrupt
>
> Attempted to run db_recover manually and got this error:
> # /usr/bin/db_recover -h /home2/mfernandes/subversion/BACKUPS/test_2-73-2/db
> db_recover: Log sequence error: page LSN 217:649059; previous LSN 227 947360
> db_recover: Log sequence error: page LSN 227:741713; previous LSN 227 986004
> db_recover: Log sequence error: page LSN 227:741713; previous LSN 227 992643
> ...
> ... --snip--
> db_recover: Log sequence error: page LSN 227:759209; previous LSN 229 354875
> db_recover: Log sequence error: page LSN 227:741713; previous LSN 229 480106
> db_recover: strings: close: 427 blocks left pinned
> db_recover: nodes: close: 21 blocks left pinned
> db_recover: DB_ENV->open: DB_INCOMPLETE: Cache flush was unable to complete
>
> Tried this several times and got the same error.
>
> A subsequent run of hot-backup worked fine.
>
> Could the bulk commit be the cause for this failure, whereby transactions
> were still getting logged, even while the logfile was getting re-copied to
> the backup area?
>
> Thanks,
>
> Mal

