Posted to users@subversion.apache.org by Christopher L Merrill <ch...@webperformance.com> on 2006/03/21 19:24:55 UTC

corrupt repository

Subversion 1.1.3 on Linux (RHES3) with Berkeley DB

First, how it happened:

We were getting low on disk space so we remapped the /opt hierarchy (including
/opt/subversion where all our repositories are) and rsynced all the files
to the new area.  This was done by our sysadmin who is pretty good.  He assured
us that all timestamps and permissions were preserved and I have no reason to
doubt it.

He also shut down the svnserve service before the move.


Second, what happened:

After restarting svnserve, most operations resulted in errors like:
$ svn log project.xml
svn: Berkeley DB error while opening environment for filesystem /opt/subversion/Development/db:
DB_RUNRECOVERY: Fatal error, run database recovery
svn: bdb: Invalid log file: log.0000000001: No such file or directory
svn: bdb: PANIC: No such file or directory
svn: bdb: PANIC: DB_RUNRECOVERY: Fatal error, run database recovery


Third, what I've tried:

I shut down svnserve and tried using "svnadmin recover".  It generated errors, too.
So based on a FAQ entry, I deleted the log files and then "svnadmin recover" seemed
to work. Restarted SVN and a simple update worked.  But then a few minutes
later I could not do a diff.  So then I found this:
   http://subversion.tigris.org/faq.html#bdb-recovery
So I installed db42-utils (via yum, to ensure the dependencies were satisfied)
and then ran db_recover -c -v -h <path>.  It said:
   [root@dev1 subversion]# db_recover -c -v -h /opt/subversion/Development/db/
   db_recover: Finding last valid log LSN: file: 1 offset 28
Can't tell if that is good or bad, but we still get errors like the one above.

Can anybody suggest what I should try next?  I am pretty darn sure these
files did not get corrupted by the move process, but I'm starting to get
worried that we've lost our entire repository.

TIA,
C


-- 
------------------------------------------------------------------------ -
Chris Merrill                           |  http://www.webperformance.com
Web Performance Inc.

Website Load Testing and Stress Testing Software & Services
------------------------------------------------------------------------ -

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: corrupt repository - more details

Posted by Christopher L Merrill <ch...@webperformance.com>.
Here is some more info on my previous post.  I've tried a few more things
and it certainly seems like the content of the repository is still intact.
I just can't get to it for any amount of time.

For instance, immediately after deleting the log files from the db folder,
running db_recover and restarting svnserve, I can diff, log, update and commit
a file in the repository.  Five minutes later, running log on a file results in:

 > $ svn log project.xml
 > svn: Berkeley DB error while checkpointing after Berkeley DB transaction for filesystem /opt/subversion/Development/db:
 > Invalid argument
 > svn: bdb: DB_ENV->log_flush: LSN of 1/115206 past current end-of-log of 1/9486
 > svn: bdb: Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
 > svn: bdb: strings: unable to flush page: 0
 > svn: bdb: txn_checkpoint: failed to flush the buffer cache Invalid argument

If I repeat the procedure, I can do a log and see the result of my commit.


Also, I cannot do a dump of the repository.  If I do the above procedure and
start the dump, it gets through a few versions and then dies with hundreds of
lines of:

> svn: bdb: Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
> svn: bdb: DB_ENV->log_flush: LSN of 727/594026 past current end-of-log of 1/6118
> 
> svn: bdb: Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
> svn: bdb: DB_ENV->log_flush: LSN of 683/223525 past current end-of-log of 1/6118
> 
> svn: bdb: Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
> svn: bdb: DB_ENV->log_flush: LSN of 705/514953 past current end-of-log of 1/6118

If I then restart the dump, it fails immediately with the 1st error shown above.

So at this point, we can't even dump/restore the repository.


Thanks for any help you can offer!
Chris



Re: corrupt repository

Posted by Nico Kadel-Garcia <nk...@comcast.net>.
Ryan Schmidt wrote:
> On Mar 21, 2006, at 20:24, Christopher L Merrill wrote:
>
>> We were getting low on disk space so we remapped the /opt hierarchy
>> (including /opt/subversion where all our repositories are) and rsynced
>> all the files to the new area.  This was done by our sysadmin who is
>> pretty good.  He assured us that all timestamps and permissions were
>> preserved and I have no reason to doubt it.
>
> Is this new area on an NFS mount? If so, BerkeleyDB won't work like
> that.

A "good" sysadmin should not have "remapped": he should have shut down the
Apache and svnserve services, done a complete system backup (perhaps using
the hot-backup.py script designed for creating snapshots), then *DUPLICATED*
the repository to the target location.

Only after the new location had also been successfully backed up should he 
have deleted the old one.
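
The duplicate-then-verify sequence described above could be sketched roughly as follows. This is only an illustration: the function name duplicate_repo and the target path are hypothetical, and it uses "svnadmin hotcopy" directly rather than the hot-backup.py wrapper. SVNADMIN can be set to "echo" for a dry run.

```shell
#!/bin/sh
# Sketch only: copy a repository to a new location and check the copy
# before the old one is retired. Override SVNADMIN (e.g. with "echo")
# to see the commands without running them.
SVNADMIN="${SVNADMIN:-svnadmin}"

duplicate_repo() {
    src="$1"   # existing repository, e.g. /opt/subversion/Development
    dst="$2"   # hypothetical target path on the new volume

    # hotcopy copies the repository, including the Berkeley DB
    # environment and its log files, in a consistent state
    "$SVNADMIN" hotcopy "$src" "$dst" || return 1

    # verify reads back every revision in the copy
    "$SVNADMIN" verify "$dst"
}
```

For example, duplicate_repo /opt/subversion/Development /srv/svn/Development (the second path is made up), with the old directory removed only after the verify step succeeds and the new location has been backed up.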

>> Can anybody suggest what I should try next?  I am pretty darn sure these
>> files did not get corrupted by the move process, but I'm starting to get
>> worried that we've lost our entire repository.
>
> Do you have up-to-date backups? Dumpfiles?
>
>
> Once you get this sorted out, consider moving the repository to FSFS,
> since it cannot suffer from these problems. How to do so is in the
> FAQ.

It's not clear to me yet that FSFS is better than Berkeley DB overall. But 
recovering from a corrupted database is an old problem in many venues, 
especially when people mistake doing a file-system snapshot for getting an 
actual database image. 



Re: corrupt repository

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Mar 22, 2006, at 03:08, Christopher L Merrill wrote:

>> Do you have up-to-date backups? Dumpfiles?
>
> No.  I thought about doing a dump...but since it was _supposed_ to be
> a simple file move, I didn't.  My bad :(   I guess I should have researched
> the quirks of BDB more.

Now would also be a good time to start doing daily dumps. You
can do them incrementally to save space, maybe doing a full dump only  
every week or every month. You can even write a post-commit hook to  
make a dump of that revision immediately. There are many  
possibilities, but it's certainly important to just pick one, even if  
it's not the best solution. An inefficient backup is still better  
than no backup.
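
As a rough sketch of the per-revision idea, the function below dumps a single revision incrementally into a backup directory. The function name dump_revision, the file naming scheme and the backup directory are assumptions made for illustration; only "svnadmin dump -r REV --incremental" comes from Subversion itself.

```shell
#!/bin/sh
# Sketch only: write one revision as an incremental dumpfile.
# Override SVNADMIN (e.g. with "echo") for a dry run.
SVNADMIN="${SVNADMIN:-svnadmin}"

dump_revision() {
    repos="$1"        # repository path
    rev="$2"          # revision number to dump
    backup_dir="$3"   # hypothetical directory that collects the dumpfiles

    # e.g. <backup_dir>/Development-r42.dump
    out="$backup_dir/$(basename "$repos")-r$rev.dump"

    # --incremental dumps only the changes made in this revision
    "$SVNADMIN" dump "$repos" -r "$rev" --incremental > "$out" || return 1

    # report where the dump was written
    printf '%s\n' "$out"
}
```

Subversion passes the repository path and the new revision number to hooks/post-commit as its first two arguments, so a hook could call dump_revision "$1" "$2" /var/backups/svn (the directory is made up). The same function run from cron over a revision range would cover the daily variant.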


>> Once you get this sorted out, consider moving the repository to  
>> FSFS,  since it cannot suffer from these problems. How to do so is  
>> in the FAQ.
>
> That is definitely on my list for this weekend.
>
> FWIW, I'm still not sure exactly what instigated this problem, but somewhere
> I found advice to delete some log files and then try a recover.  As it turns
> out, following that advice exacerbated the problem.  Once I realized that
> for BDB the log files are actually _important_, I simply restored the files
> from a backup, did a recover and voilà!  Everything back to normal.

Yes, they're not log files in the traditional sense of text files  
that serve only to inform the administrator later. They're more like  
journal files for a journaled filesystem, in which data is written to  
the logs first, and only later into the actual DB files, and if you  
remove the logs before this happens, you've lost data. There is some  
command you can use to get BerkeleyDB to tell you which log files  
it's done with and which are safe to remove, but I forget the command  
at the moment.
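
The command in question is most likely "svnadmin list-unused-dblogs", which prints the log files Berkeley DB reports it no longer needs (Berkeley DB's own db_archive utility gives similar information). The sketch below moves those files aside instead of deleting them. The function name archive_unused_logs and the archive directory are hypothetical, and it assumes the command prints one usable path per line.

```shell
#!/bin/sh
# Sketch only: move the log files Berkeley DB reports as unused into an
# archive directory rather than deleting them outright.
SVNADMIN="${SVNADMIN:-svnadmin}"

archive_unused_logs() {
    repos="$1"     # repository path
    archive="$2"   # hypothetical directory for retired log files

    # ask Subversion which BDB log files are no longer in use
    logs=$("$SVNADMIN" list-unused-dblogs "$repos") || return 1
    [ -n "$logs" ] || return 0      # nothing to archive

    mkdir -p "$archive" || return 1
    printf '%s\n' "$logs" | while IFS= read -r log; do
        mv -- "$log" "$archive"/ || exit 1
    done
}
```

Only files named by the command are touched; every other log.* file stays in place, which is the point of the journal analogy above.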




Re: corrupt repository

Posted by Christopher L Merrill <ch...@webperformance.com>.
Ryan Schmidt wrote:
> Is this new area on an NFS mount? If so, BerkeleyDB won't work like  that.

nope

> Do you have up-to-date backups? Dumpfiles?

No.  I thought about doing a dump...but since it was _supposed_ to be
a simple file move, I didn't.  My bad :(   I guess I should have researched
the quirks of BDB more.

> Once you get this sorted out, consider moving the repository to FSFS,  
> since it cannot suffer from these problems. How to do so is in the FAQ.

That is definitely on my list for this weekend.

FWIW, I'm still not sure exactly what instigated this problem, but somewhere
I found advice to delete some log files and then try a recover.  As it turns
out, following that advice exacerbated the problem.  Once I realized that
for BDB the log files are actually _important_, I simply restored the files
from a backup, did a recover and voilà!  Everything back to normal.
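
For anyone finding this thread later, the fix described above might look roughly like the sketch below. It is a guess at the steps, not a record of what was actually run: the function name, the backup directory layout, and the choice of "svnadmin recover" (rather than db_recover) are all assumptions. It should only be run with svnserve and any other process using the repository stopped.

```shell
#!/bin/sh
# Sketch only: put saved BDB log files back and run recovery.
# Run this only while nothing else is accessing the repository.
SVNADMIN="${SVNADMIN:-svnadmin}"

restore_logs_and_recover() {
    repos="$1"       # repository path, e.g. /opt/subversion/Development
    backup_db="$2"   # hypothetical directory holding the saved db/ files

    # copy the saved log.* files back, keeping timestamps and permissions
    cp -p "$backup_db"/log.* "$repos/db/" || return 1

    # let Berkeley DB replay the logs, then read back every revision
    "$SVNADMIN" recover "$repos" || return 1
    "$SVNADMIN" verify "$repos"
}
```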


Thanks,
Chris




Re: corrupt repository

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Mar 21, 2006, at 20:24, Christopher L Merrill wrote:

> We were getting low on disk space so we remapped the /opt hierarchy
> (including /opt/subversion where all our repositories are) and rsynced
> all the files to the new area.  This was done by our sysadmin who is
> pretty good.  He assured us that all timestamps and permissions were
> preserved and I have no reason to doubt it.

Is this new area on an NFS mount? If so, BerkeleyDB won't work like  
that.


> Can anybody suggest what I should try next?  I am pretty darn sure these
> files did not get corrupted by the move process, but I'm starting to get
> worried that we've lost our entire repository.

Do you have up-to-date backups? Dumpfiles?


Once you get this sorted out, consider moving the repository to FSFS,  
since it cannot suffer from these problems. How to do so is in the FAQ.
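
The usual route for such a move is a dump of the old repository loaded into a freshly created FSFS one. A minimal sketch, assuming the BDB repository is healthy enough to dump; the function name migrate_to_fsfs and all three paths are hypothetical. Hooks and configuration files are not carried over by dump/load and would need copying by hand.

```shell
#!/bin/sh
# Sketch only: dump a BDB-backed repository and load it into a new
# FSFS-backed one. Override SVNADMIN (e.g. with "echo") for a dry run.
SVNADMIN="${SVNADMIN:-svnadmin}"

migrate_to_fsfs() {
    old="$1"        # existing BDB repository
    new="$2"        # hypothetical path for the new FSFS repository
    dumpfile="$3"   # hypothetical path for the intermediate dumpfile

    # dump to a file first so a failed dump is noticed before loading
    "$SVNADMIN" dump "$old" > "$dumpfile" || return 1

    # create an empty repository using the FSFS back end
    "$SVNADMIN" create --fs-type fsfs "$new" || return 1

    # replay every revision from the dumpfile into the new repository
    "$SVNADMIN" load "$new" < "$dumpfile"
}
```

Writing the dump to a file instead of piping it straight into "svnadmin load" costs disk space, but it lets the dump's exit status be checked and leaves a backup behind.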


