Posted to users@subversion.apache.org by Mike Scott <mi...@mikescommunity.com> on 2004/09/09 16:48:41 UTC

Subversion DB corruption

My repository machine hung just after committing an update to the 
repository. Upon restarting, I received messages about corruption in the 
transaction table. Having run svnadmin recover, I now get the message:

  * Verified revision 0.
  svn: no such representation'2f9'

I get this error when I try to do anything to the repository.

Is there any way to recover it?

Cheers,

Mike Scott.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Subversion DB corruption

Posted by "C. Michael Pilato" <cm...@collab.net>.
"Max Bowsher" <ma...@ukf.net> writes:

> Mike Scott wrote:
> > My repository machine hung just after committing an update to the
> > repository. Upon restarting, I received messages about corruption in the
> > transaction table. Having run svnadmin recover, I now get the message:
> >  * Verified revision 0.
> >  svn: no such representation'2f9'
> > I get this error when I try to do anything to the repository.
> > Is there any way to recover it?
> 
> Put it on a web/ftp site, email me the URL. I'll see what I can do.

But at least try "db_recover -vec -h /path/to/repos/db" before doing
all that.
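
A minimal sketch of that suggestion, assuming the repository is copied first and recovery runs only against the copy. The function names and the scratch path are hypothetical; the commands are the ones named in this thread plus "svnadmin verify". The flags are passed separately here so that -h clearly takes the db directory as its argument.

```python
import shutil
import subprocess
from pathlib import Path


def recovery_commands(repos: Path) -> list[list[str]]:
    """Return the recovery commands discussed in this thread, in order."""
    return [
        # Subversion-level recovery (already tried by the original poster).
        ["svnadmin", "recover", str(repos)],
        # Catastrophic Berkeley DB recovery on the repository's db directory:
        # -c catastrophic, -v verbose, -e retain environment, -h home dir.
        ["db_recover", "-c", "-v", "-e", "-h", str(repos / "db")],
        # Walk every revision afterwards to see what is readable.
        ["svnadmin", "verify", str(repos)],
    ]


def recover_on_copy(repos: Path, scratch: Path) -> None:
    """Copy the repository, then run each command against the copy only."""
    shutil.copytree(repos, scratch)  # scratch must not exist yet
    for cmd in recovery_commands(scratch):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # stop at the first failing step
```

Working on a copy keeps the damaged files untouched in case someone else wants to examine them later.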

Re: Subversion DB corruption

Posted by kf...@collab.net.
Juan Jose Comellas <ju...@comellas.com.ar> writes:
> How does Subversion handle the journalling of its own files? Does it use 
> fsync() to make sure that the data is flushed to the hard drive? 

I believe this is what Berkeley DB does (and therefore what Subversion
does, when used with BDB), unless one passed '--bdb-txn-nosync' to
'svnadmin create'.
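
For readers unfamiliar with the call being asked about, here is a minimal sketch of what "flushed with fsync()" means at the application level. It is an illustration only, not Berkeley DB's or Subversion's code; the function name and the file name are hypothetical.

```python
import os
import tempfile


def durable_append(path: str, payload: bytes) -> None:
    """Append payload and return only after the OS reports it flushed."""
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()             # Python's buffer -> operating system cache
        os.fsync(f.fileno())  # operating system cache -> storage device


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "example.log")  # hypothetical file name
        durable_append(log, b"commit record\n")
        with open(log, "rb") as f:
            print(f.read())
```

Even with fsync(), a drive that acknowledges writes from its own cache may not have the data on stable storage yet, which is the IDE concern raised in the quoted message.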

> BTW, even doing this might not be enough when using IDE drives. There was a 
> thread about this issue in the PostgreSQL users list some time ago.

Sigh :-).

-Karl

Re: Subversion DB corruption

Posted by Juan Jose Comellas <ju...@comellas.com.ar>.
How does Subversion handle the journalling of its own files? Does it use 
fsync() to make sure that the data is flushed to the hard drive? 

BTW, even doing this might not be enough when using IDE drives. There was a 
thread about this issue in the PostgreSQL users list some time ago.


On Saturday 11 September 2004 12:58, kfogel@collab.net wrote:
> Mike Scott <mi...@mikescommunity.com> writes:
> > The machine hung just after committing, so my guess is that there was
> > still unflushed data in the disk cache. I would have expected a
> > journalling system to ensure that its journal updates were flushed to
> > disk before making any changes to the data.
>
> I think what Max means is that something may have corrupted existing
> journal files.  For example, no journaling file system (no matter how
> good) can recover from having its journal files themselves corrupted
> by a skipping disk head.
>
> (Obviously, we don't know exactly what happened with your machine, so
> it's hard to determine the likelihood that old data got blown away.)

-- 
Juan Jose Comellas
(juanjo@comellas.com.ar)


Re: Subversion DB corruption

Posted by Max Bowsher <ma...@ukf.net>.
Mike Scott wrote:
> The machine hung; it wasn't a disk problem. I don't believe there was
> any hardware-level corruption. If the journal files were corrupted, I'm
> curious how that happened. I'd like to know if there's a scenario
> whereby aborting in the middle of a write (e.g. by the processor
> overheating and just stopping running code, which I also suspect is what
> happened here) could cause corruption. Is it possible to rule that out?
> If not, then perhaps there's a weakness in the system that needs to be
> addressed.

I really don't know enough about BDB to answer that.

It does seem that we've had too many broken repositories to be entirely 
written off as hardware faults - especially on OS X.

Unfortunately, it's remarkably hard to debug - no reproduction recipe has 
surfaced, and it's not clear whether the underlying problem is in BDB, in 
Subversion's use of BDB, or whether it really is some problem with the 
particular machines & installations involved.

Max.


Re: Subversion DB corruption

Posted by kf...@collab.net.
Mike Scott <mi...@mikescommunity.com> writes:
> The machine hung; it wasn't a disk problem. I don't believe there was
> any hardware-level corruption. If the journal files were corrupted,
> I'm curious how that happened. I'd like to know if there's a scenario
> whereby aborting in the middle of a write (e.g. by the processor
> overheating and just stopping running code, which I also suspect is
> what happened here) could cause corruption. Is it possible to rule
> that out? If not, then perhaps there's a weakness in the system that
> needs to be addressed.

I agree with you that the likeliest explanation is a weakness in the
software -- after all, an aborted write ought not harm data that was
already written.  I'm not sure how to start looking for the weakness,
though :-(.  I don't believe we've seen this behavior before, and a
reproduction recipe that involves overheating the processor is going
to be a bit impractical... :-)

(Not trying to be defeatist, just not sure what the next step is.)
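
To make the point about aborted writes concrete, here is a minimal sketch of an append-only journal in which a write cut off partway cannot harm records that were already written. It is an assumption-laden illustration, not Berkeley DB's actual log format: the record layout (length plus CRC32 header) and all names are hypothetical.

```python
import os
import struct
import zlib

HEADER = struct.Struct("<II")  # payload length, CRC32 of payload


def append_record(journal: str, payload: bytes) -> None:
    """Append one checksummed record and force it to disk before returning."""
    with open(journal, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())


def read_valid_records(journal: str) -> list[bytes]:
    """Return the records up to the first truncated or corrupt one."""
    with open(journal, "rb") as f:
        data = f.read()
    records = []
    pos = 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or damaged record: keep everything before it
        records.append(payload)
        pos = start + length
    return records
```

In this scheme a write that stops midway only costs the record being written. Damage to an earlier record is a different matter: everything after it becomes unreadable, which matches the case described elsewhere in this thread where the journal files themselves are corrupted.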

Re: Subversion DB corruption

Posted by Mike Scott <mi...@mikescommunity.com>.
The machine hung; it wasn't a disk problem. I don't believe there was
any hardware-level corruption. If the journal files were corrupted, I'm
curious how that happened. I'd like to know if there's a scenario 
whereby aborting in the middle of a write (e.g. by the processor 
overheating and just stopping running code, which I also suspect is what 
happened here) could cause corruption. Is it possible to rule that out? 
If not, then perhaps there's a weakness in the system that needs to be 
addressed.

Cheers,

Mike.

kfogel@collab.net wrote:

> Mike Scott <mi...@mikescommunity.com> writes:
> > The machine hung just after committing, so my guess is that there was
> > still unflushed data in the disk cache. I would have expected a
> > journalling system to ensure that its journal updates were flushed to
> > disk before making any changes to the data.
>
> I think what Max means is that something may have corrupted existing
> journal files.  For example, no journaling file system (no matter how
> good) can recover from having its journal files themselves corrupted
> by a skipping disk head.
>
> (Obviously, we don't know exactly what happened with your machine, so
> it's hard to determine the likelihood that old data got blown away.)

Re: Subversion DB corruption

Posted by kf...@collab.net.
Mike Scott <mi...@mikescommunity.com> writes:
> The machine hung just after committing, so my guess is that there was
> still unflushed data in the disk cache. I would have expected a
> journalling system to ensure that its journal updates were flushed to
> disk before making any changes to the data.

I think what Max means is that something may have corrupted existing
journal files.  For example, no journaling file system (no matter how
good) can recover from having its journal files themselves corrupted
by a skipping disk head.

(Obviously, we don't know exactly what happened with your machine, so
it's hard to determine the likelihood that old data got blown away.)

Re: Subversion DB corruption

Posted by Mike Scott <mi...@mikescommunity.com>.
Hi Max

The machine hung just after committing, so my guess is that there was 
still unflushed data in the disk cache. I would have expected a 
journalling system to ensure that its journal updates were flushed to
disk before making any changes to the data.

Cheers,

Mike.

Max Bowsher wrote:

> Mike Scott wrote:
>
>> Hi Max
>>
>> Thanks for the offer. I've tried db_recover without success. All our
>> source is intact in the working directories so we'll just rebuild the
>> repository and accept we've lost all our revisions prior to now.
>>
>> What I'm a little p****d about is that the system is supposed to be
> >> journaled so that it can roll back in situations like this. So how come
>> it can't?
>
>
> Whatever mysterious corruption hit you seems to have managed to break 
> both the database files and the journalling transaction log files.
>
> Max.

Re: Subversion DB corruption

Posted by Max Bowsher <ma...@ukf.net>.
Mike Scott wrote:
> Hi Max
>
> Thanks for the offer. I've tried db_recover without success. All our
> source is intact in the working directories so we'll just rebuild the
> repository and accept we've lost all our revisions prior to now.
>
> What I'm a little p****d about is that the system is supposed to be
> journaled so that it can roll back in situations like this. So how come
> it can't?

Whatever mysterious corruption hit you seems to have managed to break both 
the database files and the journalling transaction log files.

Max.


Re: Subversion DB corruption

Posted by Mike Scott <mi...@mikescommunity.com>.
Hi Max

Thanks for the offer. I've tried db_recover without success. All our 
source is intact in the working directories so we'll just rebuild the 
repository and accept we've lost all our revisions prior to now.

What I'm a little p****d about is that the system is supposed to be 
journaled so that it can roll back in situations like this. So how come
it can't?

Cheers,

Mike.

Max Bowsher wrote:

> Mike Scott wrote:
>
>> My repository machine hung just after committing an update to the
>> repository. Upon restarting, I received messages about corruption in the
>> transaction table. Having run svnadmin recover, I now get the message:
>>
>>  * Verified revision 0.
>>  svn: no such representation'2f9'
>>
>> I get this error when I try to do anything to the repository.
>>
>> Is there any way to recover it?
>
>
> Put it on a web/ftp site, email me the URL. I'll see what I can do.
>
> Max.

Re: Subversion DB corruption

Posted by Max Bowsher <ma...@ukf.net>.
Mike Scott wrote:
> My repository machine hung just after committing an update to the
> repository. Upon restarting, I received messages about corruption in the
> transaction table. Having run svnadmin recover, I now get the message:
> 
>  * Verified revision 0.
>  svn: no such representation'2f9'
> 
> I get this error when I try to do anything to the repository.
> 
> Is there any way to recover it?

Put it on a web/ftp site, email me the URL. I'll see what I can do.

Max.

