Posted to users@subversion.apache.org by Federico Di Gregorio <fo...@initd.org> on 2004/10/04 17:23:12 UTC

database corruption

Hi everybody,

first of all please keep me in cc: because I am not subscribed to this
list. I hope to be given a neat and fast answer; if not I'll subscribe
to continue the discussion.

Apparently I have found a way to corrupt a repository in an
unrecoverable way. Our website (http://initd.org/) runs apache 2,
subversion 1.0.5 and provides two different accesses to the repository.
The first is the usual subversion mod_dav_svn.

The second is through svnlook: we have a small widget on our homepage
that uses svnlook to provide some information on the last checkin.
Apparently after some hours the "db/nodes" file is corrupted without any
apparent reason. The contents of the file are quite strange too, here
are the first few lines:

svn: File not found: revision '531', path 'psycopg'
<EE><91>^^^@I^C^@^@d^UaA^>^@^@^@^@^@^@^

Note that "psycopg" is not a valid repository but *was* a CVS repository
ported using "cvs2svn" and then "svnadmin load" (the new path is
psycopg1, note the '1').

Every time it happens I have to throw away the repository and recover
from backup (happened 3 times in 2 days).

Is this a known problem?

federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog@debian.org
INIT.D Developer                                           fog@initd.org
  All'inizio ho scritto un programma proprietario, in esclusiva per il
   cliente; è stato tristissimo, perché mi ha succhiato un pezzo di
   anima.                                           -- Alessandro Rubini

Re: database corruption

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2004-10-05 09:28:40 -0600, Jani Averbach wrote:
> There was (and still is in your version (1.0.6)) an error which will
> cause this kind of repository corruption if filedescriptors
> (stdout,stderr) are closed when e.g. svnadmin dump is run[1].

Does this happen if one types Ctrl-C to interrupt svnadmin, for
instance?

> The fixes are in r10819 and r10855@trunk; they were backported to
> 1.0.x and released in 1.0.7. The fix was in libsvn_subr/cmdline.c, so
> the same should apply to svnlook.

Since 1.0.6-2 is the latest Debian package version, all Debian users
of the subversion package are affected by this. :(

> Could you try to upgrade your system at least to the 1.0.7 version?

Does anyone know when the Debian package will be updated?

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: database corruption

Posted by Jani Averbach <ja...@jaa.iki.fi>.
On 2004-10-19 15:12+0200, Federico Di Gregorio wrote:

> after some time. note that we have the db directory with g+sw
> permissions, apache with umask 002 and the twisted server running the
> svnlook widget code is a member of the www-data group (primary group
> for apache). after some web page accesses and some commits the system
> starts to wedge.

This sounds like permission problems.

> "shut down apache, svnadmin recover, start apache" can be
> repeated from 5 to 10 times then there is the unrecoverable corruption.

This should not happen. How did you run recover?

> > So if you do 'head -n 2 db/nodes' you will get something like that:
> > 
> > > > svn: File not found: revision '531', path 'psycopg'
> > > > <EE><91>^^^@I^C^@^@d^UaA^>^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@ ^@^
> 
> yes. i can send the corrupted archive if anybody bothers.

Your repo doesn't matter; that 'svn: File not found...' string should
not be in 'db/nodes' in the first place. The file is a Berkeley DB
database file, and you can't just add some random data to it and hope
that it keeps working.

> 
> I attached the code to this mail. 

Could you try this patch:

--- svn.py~     2004-10-19 07:29:07.130995421 -0600
+++ svn.py      2004-10-19 07:48:01.060027475 -0600
@@ -52,10 +52,12 @@
             cmd += ' '+self._svn_path
         try:
            os.umask(0002)
-           pipe = os.popen(cmd, 'r')
-           return [x.strip() for x in pipe.readlines()]
+           (infile, outfile, errfile) = os.popen3(cmd, 'r')
+           return [x.strip() for x in outfile.readlines()]
         finally:
-           pipe.close()
+           infile.close()
+           outfile.close()
+           errfile.close()

     def _get_last_revision(self):
         """Get last revision number for given path."""


If this helps, then we have to find out why it (svn@r10819) is still
happening.

A couple of possible reasons:

1) You have old libraries around, and you are not using 1.0.9's
   version of libsvn_subr. In fact, what does 'locate libsvn_subr' say?

2) There is a similar bug still lurking somewhere

3) Something else in your system (or in svn)

I will try to reproduce the problem with your py-script.
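
The patch above relies on os.popen3, which exists only in Python 2. For
readers on Python 3, the same idea (give the child process its own
explicit stdin, stdout and stderr) can be sketched with the subprocess
module. This is only an illustrative sketch, not the widget's real code:
the function name run_and_collect and the timeout value are assumptions.

```python
import subprocess
import sys


def run_and_collect(argv, timeout=30):
    """Run argv with all three standard streams explicitly connected.

    Hypothetical helper: returns (exit code, stripped stdout lines, stderr text).
    """
    proc = subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,  # the child never sees a closed stdin
        stdout=subprocess.PIPE,    # output is captured through a pipe
        stderr=subprocess.PIPE,    # error messages go to their own pipe
        text=True,
        timeout=timeout,
    )
    lines = [line.strip() for line in proc.stdout.splitlines()]
    return proc.returncode, lines, proc.stderr
```

A call such as run_and_collect(["svnlook", "youngest", "/path/to/repo"])
(the repository path is a placeholder) would then keep any error text in
the third return value instead of letting it go to an inherited or closed
descriptor.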

BR, Jani

-- 
Jani Averbach


Re: database corruption

Posted by Federico Di Gregorio <fo...@initd.org>.
On Tue, 2004-10-19 at 06:38 -0600, Jani Averbach wrote:
> On 2004-10-19 13:55+0200, Federico Di Gregorio wrote:
> >
> > i finally installed 1.0.9 and after 20 minutes of web access (the
> > svnlook widget) and commits the db was corrupted in an unrecoverable
> > way again.
> 
> How did you upgrade your system to 1.0.9?  Are you absolutely positive
> that you don't have any old libraries lying around?

yes. note that we have another 2 (private) repositories that have always
worked. the only difference is the svnlook on the public repository. I
just committed, checked out and generally did a lot of work on one of
the other repos and everything is fine.

> Do you know if this happens the first time you use the widget or
> only after some time?  Do the commits and the widget access have to
> happen at the same time, or can your widget alone corrupt the repository?

after some time. note that we have the db directory with g+sw
permissions, apache with umask 002 and the twisted server running the
svnlook widget code is a member of the www-data group (primary group
for apache). after some web page accesses and some commits the system
starts to wedge. "shut down apache, svnadmin recover, start apache" can
be repeated from 5 to 10 times, then there is the unrecoverable
corruption.

> So if you do 'head -n 2 db/nodes' you will get something like that:
> 
> > > svn: File not found: revision '531', path 'psycopg'
> > > <EE><91>^^^@I^C^@^@d^UaA^>^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@ ^@^

yes. i can send the corrupted archive if anybody bothers.

> > probably i am using svnlook in a non standard way but i don't
> > understand what the problem is. 
> 
> There should be absolutely no way, standard or non-standard, to get
> this kind of mess.  I would like to take a look at your widget.

I attached the code to this mail. I know that svn has python bindings
but using svnlook was just easier to start with. I was planning to move
to the bindings when I had the time to study them.

> > i am really scared of continuing to use svn now. :(
> 
> I understand; however, this is quite exceptional and out of the
> ordinary, which of course doesn't help you at all...

I am pretty happy with svn on the two other repositories. I am just a
little bit scared: what if a badly placed svnlook during a commit just
makes me lose all my work? :(

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog@debian.org
INIT.D Developer                                           fog@initd.org
  99.99999999999999999999% still isn't 100% but sometimes suffice. -- Me

Re: database corruption

Posted by Jani Averbach <ja...@jaa.iki.fi>.
On 2004-10-19 13:55+0200, Federico Di Gregorio wrote:
>
> i finally installed 1.0.9 and after 20 minutes of web access (the
> svnlook widget) and commits the db was corrupted in an unrecoverable
> way again.

How did you upgrade your system to 1.0.9?  Are you absolutely positive
that you don't have any old libraries lying around?

Do you know if this happens the first time you use the widget or
only after some time?  Do the commits and the widget access have to
happen at the same time, or can your widget alone corrupt the repository?

So if you do 'head -n 2 db/nodes' you will get something like that:

> > svn: File not found: revision '531', path 'psycopg'
> > <EE><91>^^^@I^C^@^@d^UaA^>^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@ ^@^

> probably i am using svnlook in a non standard way but i don't
> understand what the problem is. 

There should be absolutely no way, standard or non-standard, to get
this kind of mess.  I would like to take a look at your widget.

> i am really scared of continuing to use svn now. :(

I understand; however, this is quite exceptional and out of the
ordinary, which of course doesn't help you at all...

BR, Jani

-- 
Jani Averbach


Re: database corruption

Posted by Jani Averbach <ja...@jaa.iki.fi>.
On 2004-10-05 12:58+0200, Federico Di Gregorio wrote:
> Thank you very much for your answer. Here is some more information. Note
> that the svn repository ran perfectly well for about 4 weeks before the
> first corruption. the corruption happened about 2h after we activated
> the "svnlook widget" (I can provide the code if necessary).

ok.

> libdb4.2       4.2.52-17      Berkeley v4.2 Database Libraries [runtime]
> apache2-mpm-pr 2.0.50-11      Traditional model for Apache2
> subversion     1.0.6-2        Advanced version control system (aka. svn)
> libapache2-svn 1.0.6-1.2.1    Apache modules for Subversion (aka. svn)

ok.

> I know db/nodes is corrupted because when I try "svnadmin recover" I
> get:
> 
>         Please wait; recovering the repository may take some time...
>         svn: DB_RUNRECOVERY: Fatal error, run database recovery

yeah, see below.

> > - Have you tried to run recovery when your Apache has been up and running?
> 
> no, obviously not. I shut down every other access method when trying
> recovery.

I just had to ask, no offence. =)


> > - Also which command printed the quoted error message?
> 
> no command. the quote referred to the first few lines of the db/nodes
> file. i.e., "less db/nodes" shows:
> 
> svn: File not found: revision '531', path 'psycopg'
> <EE><91>^^^@I^C^@^@d^UaA^>^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@ ^@^
> ...

Gasp, this is very, very wrong. Your nodes file, which is a Berkeley DB
database file, contains error output from the svn command line tool.

So how did this cmdline error message get there in the first place?

There was (and still is in your version (1.0.6)) a bug which will
cause this kind of repository corruption if file descriptors
(stdout, stderr) are closed when e.g. svnadmin dump is run[1]. The
fixes are in r10819 and r10855@trunk; they were backported to 1.0.x
and released in 1.0.7. The fix was in libsvn_subr/cmdline.c, so the
same should apply to svnlook.
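
To make the mechanism concrete: on POSIX systems, open() returns the
lowest unused file descriptor, so if stderr (fd 2) is closed, the next
file a program opens becomes fd 2 and anything "printed to stderr" ends
up in that file. The sketch below only illustrates this general
behaviour; it is not the Subversion code path, and the file name, the
filler bytes and the function name demo_fd_reuse are made up for the
demonstration.

```python
import os
import subprocess
import sys
import tempfile

# Child process source: stderr is closed before the "database" is opened.
CHILD = """
import os, sys
os.close(2)                               # the caller closed stderr
fd = os.open(sys.argv[1], os.O_RDWR)      # lowest free descriptor: fd == 2
os.write(2, b"svn: File not found\\n")    # "error output" lands in the file
os._exit(0 if fd == 2 else 1)
"""


def demo_fd_reuse():
    """Return the bytes of a stand-in database file after the child ran."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nodes")
        with open(path, "wb") as f:
            f.write(b"B" * 32)            # stand-in for real database pages
        subprocess.run(
            [sys.executable, "-c", CHILD, path],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,                   # fails if the child did not get fd 2
        )
        with open(path, "rb") as f:
            return f.read()
```

The returned bytes begin with the error message and continue with what
is left of the original content, which is the same shape as the
'less db/nodes' output quoted above.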

Could you try to upgrade your system to at least version 1.0.7, or
take extra care that your widget isn't messing with stdin/out/err?

Until you have sorted out the exact reason for this, don't use your
widget, because its use is destroying your repository data
in an unrecoverable way.

BR, Jani

1) http://www.contactor.se/~dast/svn/archive-2004-09/0137.shtml

-- 
Jani Averbach


Re: database corruption

Posted by Federico Di Gregorio <fo...@initd.org>.
Thank you very much for your answer. Here is some more information. Note
that the svn repository ran perfectly well for about 4 weeks before the
first corruption. the corruption happened about 2h after we activated
the "svnlook widget" (I can provide the code if necessary).

Here is the information:

Linux lamu 2.4.20 #1 Mon Jun 2 19:35:02 CEST 2003 i686 GNU/Linux

libdb4.2       4.2.52-17      Berkeley v4.2 Database Libraries [runtime]
apache2-mpm-pr 2.0.50-11      Traditional model for Apache2
subversion     1.0.6-2        Advanced version control system (aka. svn)
libapache2-svn 1.0.6-1.2.1    Apache modules for Subversion (aka. svn)

File system type is ext3 and I am sure subversion and the apache module
use the same berkeley db library (I used ldd to check for the linked
libraries.)
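
A check like this can also be scripted. The sketch below parses the text
printed by ldd for two binaries and reports libraries that resolve to
different paths. It is only an illustration: the function names, the
default name prefixes and the sample paths in the usage note are
assumptions, not output from this system.

```python
import re

# Matches typical ldd lines such as:
#   libdb-4.2.so => /usr/lib/libdb-4.2.so (0x40200000)
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def linked_libs(ldd_output, prefixes=("libsvn", "libdb")):
    """Map library name -> resolved path for libraries of interest."""
    libs = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1).startswith(prefixes):
            libs[match.group(1)] = match.group(2)
    return libs


def mismatches(libs_a, libs_b):
    """Libraries used by both binaries but resolved to different paths."""
    return {
        name: (libs_a[name], libs_b[name])
        for name in libs_a.keys() & libs_b.keys()
        if libs_a[name] != libs_b[name]
    }
```

Feeding it the ldd output of svnlook and of the mod_dav_svn module would
give an empty dictionary when both resolve libdb and libsvn_* to the same
files.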

I know db/nodes is corrupted because when I try "svnadmin recover" I
get:

        Please wait; recovering the repository may take some time...
        svn: DB_RUNRECOVERY: Fatal error, run database recovery

and the berkeley recovery utility says:

        db_recover: Finding last valid log LSN: file: 68 offset 363108
        db_recover: initd/db/nodes: unexpected file type or format
        db_recover: Recovery function for LSN 68 362220 failed
        db_recover: PANIC: Invalid argument
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: PANIC: fatal region error detected; run recovery
        db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database recovery

> - Have you tried to run recovery when your Apache has been up and running?

no, obviously not. I shut down every other access method when trying
recovery.

> - Are you using some other hooks on your repository setup (perhaps
>   dumping it with closed file descriptors (stdout & stderr))?

the repository does not use any kind of hook.

> - Where did Subversion on your system come from (prebuilt, self-built)?

prebuilt debian packages (see above for the version).

> - Also which command printed the quoted error message?

no command. the quote referred to the first few lines of the db/nodes
file. i.e., "less db/nodes" shows:

svn: File not found: revision '531', path 'psycopg'
<EE><91>^^^@I^C^@^@d^UaA^>^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@ ^@^
...

> - Anything else non-standard which deviates from common setup?

nothing.

> If this happens again, could you set aside your repository, just in
> case somebody would like to take a look at it?

already done. I can send the tarball to anyone interested. note that
this is the tarball *after* the svnadmin recover that failed. I can
re-setup the repository but it would be difficult to determine if/when
it breaks because sometimes it just wedges (permission problems between
apache and the svnlook widget I think, but nothing that should break it
completely.)

> I strongly suspect that you have something fishy on your setup, this
> really isn't a known problem. 
> 
> Sorry that I could not help more.

thank you very much anyway,
federico

-- 
Federico Di Gregorio                         http://people.initd.org/fog
Debian GNU/Linux Developer                                fog@debian.org
INIT.D Developer                                           fog@initd.org
                  Beh un bacio, se ben dato, non si rifiuta. --  <laura>

Re: database corruption

Posted by Jani Averbach <ja...@jaa.iki.fi>.
On 2004-10-04 19:23+0200, Federico Di Gregorio wrote:
> Hi everybody,
> 
> first of all please keep me in cc: because I am not subscribed to this
> list. I hope to be given a neat and fast answer; if not I'll subscribe
> to continue the discussion.
> 
> Apparently I have found a way to corrupt a repository in an
> unrecoverable way. Our website (http://initd.org/) runs apache 2,
> subversion 1.0.5 and provides two different accesses to the repository.
> The first is the usual subversion mod_dav_svn.
> 
> The second is through svnlook: we have a small widget on our homepage
> that uses svnlook to provide some information on the last checkin.
> Apparently after some hours the "db/nodes" file is corrupted without any
> apparent reason. The contents of the file are quite strange too, here
> are the first few lines:
> 
> svn: File not found: revision '531', path 'psycopg'
> <EE><91>^^^@I^C^@^@d^UaA^>^@^@^@^@^@^@^
> 
> Note that "psycopg" is not a valid repository but *was* a CVS repository
> ported using "cvs2svn" and then "svnadmin load" (the new path is
> psycopg1, note the '1').
> 
> Every time it happens I have to throw away the repository and recover
> from backup (happened 3 times in 2 days).
> 
> Is this a known problem?
> 

This definitely isn't a known problem, and therefore I can't give you
any precise advice, sorry about that. Frankly, I don't have any
idea what could cause that, so could you provide more information:

- Server OS, 

- Specific Apache version, 

- Berkeley DB version, 

- File system type where repository is installed (it isn't NFS, is it?)

- Will the repository be ok without your svnlook widget?

- Are you sure that mod_dav_svn and svnlook are using exactly the same
  versions of the linked libraries?

- How do you know that db/nodes is corrupted? What was the exact error
  message that indicated it? Which program printed it?

- What will svnadmin recover or 'db_recover -vech ' say about your
  repository? Read the book section 
  http://svnbook.red-bean.com/svnbook-1.0/ch05s03.html#svn-ch-5-sect-3.4
  before trying them out (shut down every other repository access
  method!) or even better, try those actions on a copy of your repository.

- Have you tried to run recovery when your Apache has been up and running?

- Are you using some other hooks on your repository setup (perhaps
  dumping it with closed file descriptors (stdout & stderr))?

- Where did Subversion on your system come from (prebuilt, self-built)?

- Also which command printed the quoted error message?

- Anything else non-standard which deviates from common setup?
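
The suggestion in the list above to try recovery on a copy of the
repository can be sketched roughly as follows. This assumes every other
access method (Apache, the widget) has already been shut down so that
the copy is consistent; the function name recover_on_copy and its run
parameter are hypothetical, and only 'svnadmin recover REPOS_PATH'
itself comes from the Subversion tools.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def recover_on_copy(repo, run=subprocess.run):
    """Copy `repo` to a scratch directory and run `svnadmin recover` there.

    The original repository is never modified. `run` can be replaced
    with a stub for dry runs or testing.
    """
    scratch = Path(tempfile.mkdtemp(prefix="repo-copy-"))
    copy = scratch / Path(repo).name
    shutil.copytree(repo, copy, symlinks=True)
    result = run(
        ["svnadmin", "recover", str(copy)],
        capture_output=True,
        text=True,
    )
    return copy, result
```

The returned path points at the scratch copy, so its db directory can be
inspected or archived whatever the outcome of the recovery attempt.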

If this happens again, could you set aside your repository, just in
case somebody would like to take a look at it?

I strongly suspect that there is something fishy in your setup; this
really isn't a known problem. 

Sorry that I could not help more.


BR, Jani

-- 
Jani Averbach
