Posted to users@subversion.apache.org by Alan Spark <al...@gmail.com> on 2018/10/19 15:04:23 UTC

Repository became corrupt on commit

Hi,

Yesterday one of our repositories became corrupt when someone
committed a simple text file.

In the end we deleted the file and re-added it and it has been fine
since then. We are using SVN 1.9.3 on the server. This is one of many
repositories and the first time we have encountered such a situation.

A verify showed this error:

* Error verifying revision 728.
svnadmin: E160013: Filesystem path 'trunk/scripts/script.py' is
neither a file nor a directory

The content of that file is what had replaced what used to be the
trunk folder. When in that state it was not possible to checkout the
repository or browse its contents (i.e. it was corrupt). We could see
the log messages for the level above trunk but still could not check
it out.

Now that we have deleted and then re-committed exactly the same file,
the repository is back to normal and it seems like an unreproducible
bug at this stage.

Is this a known issue?

Regards,
Alan

Re: Repository became corrupt on commit

Posted by Alan Spark <al...@gmail.com>.
Hi Daniel,

> Odd.  My first assumption is that the subshell on line 39 isn't behaving
> as expected.  If you run `svnlook tree --full-paths --show-ids -r 719
> /path/to/repository /trunk/scripts/script.py`, does it print a path and a
> node-rev id string (which looks like "a.b.c/d-e")?

Here is the output that I got:

svnlook tree --full-paths --show-ids -r 719 /path/to/repository
/trunk/scripts/script.py
svnlook: E160016: Failure opening '/trunk/scripts/script.py'
svnlook: E160016: '/trunk' is not a directory in filesystem
'6b6eeb5b-1909-4e6c-8b33-d77c8e70ace5'

If I run this on 718 then I get this output:

svnlook tree --full-paths --show-ids -r 718 /path/to/repository
/trunk/scripts/script.py
/trunk/scripts/script.py <3.0.r716/391789>
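
For readers unfamiliar with the id in angle brackets, here is a minimal sketch that splits a committed node-rev id of the shape seen above. The field names (node, copy, rev, offset) are an assumed reading of the FSFS id layout `<node-id>.<copy-id>.r<revision>/<number>`, and `parse_noderev_id` is a hypothetical helper, not part of Subversion. Ids belonging to uncommitted transactions are not handled.

```shell
# Split a committed FSFS node-rev id such as "3.0.r716/391789".
# Assumed layout: <node-id>.<copy-id>.r<revision>/<offset>
parse_noderev_id() {
    id=$1
    node_id=${id%%.*}          # text before the first dot
    rest=${id#*.}
    copy_id=${rest%%.*}        # text between the first and second dot
    rest=${rest#*.}            # e.g. "r716/391789"
    case $rest in
        r*/*) ;;
        *) echo "not a committed node-rev id: $id" >&2; return 1 ;;
    esac
    rev=${rest#r}; rev=${rev%%/*}   # revision that created this node-rev
    offset=${rest#*/}               # number after the slash
    echo "node=$node_id copy=$copy_id rev=$rev offset=$offset"
}

parse_noderev_id "3.0.r716/391789"
```

Read this way, the r718 lookup resolves to a node-rev that was written in r716, i.e. the file was last changed two revisions earlier.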

> These all look correct.  I assume that when you do a normal 'verify'
> run, revisions 1:718 (inclusive) are verified successfully and 719
> errors out, correct?  I.e., it prints "Verifying revision 718" and then
> "Error verifying revision 719".

That is correct:

* Verified revision 718.
* Error verifying revision 719.
svnadmin: E160013: Filesystem path 'trunk/doc/EDITS.txt' is neither a
file nor a directory

> There's a slim chance that those "type:" lines had a NUL byte tacked on,
> or something else that got lost in the translation to email.  You can
> rule out this remote possibility by piping the grep to xxd(1) (just
> '| xxd', no option flags needed).

Here is 718:

grep -a '^type:' /path/to/repository/db/revs/718 | xxd
00000000: 7479 7065 3a20 6669 6c65 0a74 7970 653a  type: file.type:
00000010: 2064 6972 0a74 7970 653a 2064 6972 0a74   dir.type: dir.t
00000020: 7970 653a 2064 6972 0a74 7970 653a 2064  ype: dir.type: d
00000030: 6972 0a74 7970 653a 2064 6972 0a         ir.type: dir.

And 719:

grep -a '^type:' /path/to/repository/db/revs/719 | xxd
00000000: 7479 7065 3a20 6669 6c65 0a74 7970 653a  type: file.type:
00000010: 2064 6972 0a

> Otherwise, could you please confirm that the error you're getting on
> r719 is the same one you posted for r728 (except for the revision
> number, of course)?

Yes, although I just noticed that it is a different file in this case
but same error code:

* Verified revision 718.
* Error verifying revision 719.
svnadmin: E160013: Filesystem path 'trunk/doc/EDITS.txt' is neither a
file nor a directory

I re-ran the svnlook commands on this file and get the same results:

svnlook tree --full-paths --show-ids -r 719 /path/to/repository
/trunk/doc/EDITS.txt
svnlook: E160016: Failure opening '/trunk/doc/EDITS.txt'
svnlook: E160016: '/trunk' is not a directory in filesystem
'6b6eeb5b-1909-4e6c-8b33-d77c8e70ace5'

And on 718:

svnlook tree --full-paths --show-ids -r 718 /path/to/repository
/trunk/doc/EDITS.txt
/trunk/doc/EDITS.txt <166.0.r717/510746>

Regards,
Alan


Re: Repository became corrupt on commit

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Alan Spark wrote on Wed, 24 Oct 2018 08:57 +0100:
> Hi Daniel,
> 
> I didn't get anywhere with the perl script. /usr/bin is definitely in
> the path and that is where svnlook is, and the SVNLOOK variable was
> already set... anyway I gave up and went for your alternative.
> 

Odd.  My first assumption is that the subshell on line 39 isn't behaving
as expected.  If you run `svnlook tree --full-paths --show-ids -r 719
/path/to/repository /trunk/scripts/script.py`, does it print a path and a
node-rev id string (which looks like "a.b.c/d-e")?

> Firstly I must apologise as I stated revision 728 in my original email
> but it looks like this was on one of our experimental copies of the
> repository that we no longer have. I do still have a copy of the
> broken repository at revision 719. With that in mind, I ran this:
> 
> grep -a '^type:' /path/to/repository/db/revs/719
> type: file
> type: dir
> 
> I also ran on the last working revision:
> 
> grep -a '^type:' /path/to/repository/db/revs/718
> type: file
> type: dir
> type: dir
> type: dir
> type: dir
> type: dir
> 

These all look correct.  I assume that when you do a normal 'verify'
run, revisions 1:718 (inclusive) are verified successfully and 719
errors out, correct?  I.e., it prints "Verifying revision 718" and then
"Error verifying revision 719".

There's a slim chance that those "type:" lines had a NUL byte tacked on,
or something else that got lost in the translation to email.  You can
rule out this remote possibility by piping the grep to xxd(1) (just
'| xxd', no option flags needed).
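
As an alternative to reading the hex dump by eye, the same check can be sketched as a byte count. This is only an illustration: `count_nul_in_type_lines` is a hypothetical helper name, and it assumes a `grep` that accepts `-a` (as used elsewhere in this thread) and passes NUL bytes through unchanged.

```shell
# Count NUL bytes hiding in the "type:" header lines of a revision file.
# A clean file yields 0.
count_nul_in_type_lines() {
    rev_file=$1
    # grep -a treats the binary rev file as text;
    # tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
    n=$(grep -a '^type:' "$rev_file" | tr -cd '\000' | wc -c)
    echo $((n))   # arithmetic expansion strips any padding wc adds
}

# Usage: count_nul_in_type_lines /path/to/repository/db/revs/719
```

This only looks for NUL bytes; other stray bytes would still need the xxd output to spot.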

Otherwise, could you please confirm that the error you're getting on
r719 is the same one you posted for r728 (except for the revision
number, of course)?

> I note that your path had a /0/ after /revs/ but this appears to be empty.
> 

It's normal for that path not to exist, particularly in older repositories.  It
shouldn't exist empty, but that's harmless.

> I think I have confirmed that our MPM is prefork:
> 
> a2query -M
> prefork
> 
> I hope this helps. Let me know if you need me to check anything else.

Yes, it does.

Thanks,

Daniel

Re: Repository became corrupt on commit

Posted by Alan Spark <al...@gmail.com>.
Hi Daniel,

I didn't get anywhere with the perl script. /usr/bin is definitely in
the path and that is where svnlook is, and the SVNLOOK variable was
already set... anyway I gave up and went for your alternative.

Firstly I must apologise as I stated revision 728 in my original email
but it looks like this was on one of our experimental copies of the
repository that we no longer have. I do still have a copy of the
broken repository at revision 719. With that in mind, I ran this:

grep -a '^type:' /path/to/repository/db/revs/719
type: file
type: dir

I also ran on the last working revision:

grep -a '^type:' /path/to/repository/db/revs/718
type: file
type: dir
type: dir
type: dir
type: dir
type: dir

I note that your path had a /0/ after /revs/ but this appears to be empty.

I think I have confirmed that our MPM is prefork:

a2query -M
prefork

I hope this helps. Let me know if you need me to check anything else.

Regards,
Alan

Re: Repository became corrupt on commit

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Tue, 23 Oct 2018 14:50 +0000:
> Alan Spark wrote on Tue, 23 Oct 2018 14:41 +0100:
> > perl dump-noderev.pl /path/to/repository /trunk/scripts/script.py 728
> > Use of uninitialized value $noderev_id in split at dump-noderev.pl line 41.
> 
> Sorry about that; the script is obviously missing an error check.
> 
> Looks like 'svnlook' does not exist in your $PATH.
> 
> It's not obvious to non Perl speakers, but line 6 does the
> equivalent of:
> .
>     if [ -z "$SVNLOOK" ]; then SVNLOOK=svnlook ; fi
> .
> so if you run the script with svn/svnadmin/svnlook in PATH, _or_ with
> $SVN, $SVNADMIN, $SVNLOOK set to those executables' full paths, then it
> should work.

The above is the preferred way forward, but if you hit roadblocks with
that, there _is_ a crude alternative:

% grep -a '^type:' /path/to/repository/db/revs/0/728

This will extract only one field from the rfc822-formatted node-rev
header.  It's a poor substitute for the Perl script, but better than nothing.
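
The path above assumes a sharded repository, where rev files live in numbered subdirectories. A hedged sketch that tries both layouts follows. `rev_file_path` and `rev_node_types` are hypothetical helper names, and the shard size of 1000 is an assumption (it is the usual FSFS default; the real value, if any, should be listed on the `layout` line of the repository's `db/format` file). Packed shards are not handled.

```shell
# Print the path of the rev file for a revision, trying the sharded layout
# first and falling back to the flat layout used by older repositories.
# SHARD_SIZE=1000 is an assumption; override it to match db/format.
SHARD_SIZE=${SHARD_SIZE:-1000}

rev_file_path() {
    repo=$1 rev=$2
    sharded="$repo/db/revs/$((rev / SHARD_SIZE))/$rev"
    flat="$repo/db/revs/$rev"
    if [ -f "$sharded" ]; then
        echo "$sharded"
    elif [ -f "$flat" ]; then
        echo "$flat"
    else
        echo "no rev file found for r$rev under $repo/db/revs" >&2
        return 1
    fi
}

# List the node kinds recorded in that revision's node-rev headers.
rev_node_types() {
    f=$(rev_file_path "$1" "$2") || return 1
    grep -a '^type:' "$f"
}

# Usage: rev_node_types /path/to/repository 728
```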

Cheers,

Daniel

Re: Repository became corrupt on commit

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Alan Spark wrote on Tue, 23 Oct 2018 14:41 +0100:
> Hi Daniel,
> 
> Sorry, I meant to reply to all. For the benefit of others, here is what I said:
> 
> ---
> Thanks for the response.
> 
> > where dump-noderev.pl is [1]?  The output should be just a few rfc822-like
> > headers specifying the offset and checksum of the file contents representation
> > ("rep").  One of the headers should be "type: file".
> 
> I'll take a look at this as soon as I can.
> 
> > Sorry, I don't follow.  Can you explain again what the contents of the
> > file is and what's its significance?  Has the server been restarted
> > between the two commit attempts?  Is it svnserve (in which mode,
> > -i/-t/-d/-T/service) or mod_dav_svn (which MPM)?
> 
> The file is a python script. It is just a text file. The first time it
> was committed it corrupted the repository. We rolled back to a backup
> and tried to commit the same file and it again broke the repository.
> Only when we deleted, then added exactly the same file did the problem
> go away. So it was reproducible until we rolled back and deleted the
> file.
> 

Thanks, this is important information.

> I'm afraid I'm new to perl so I may be doing something wrong, but I
> checked out the entire infrastructure-puppet repository and ran the
> command from within
> ~/infrastructure-puppet/modules/rootbin_asf/files/bin to get the
> following output:
> 
> perl dump-noderev.pl /path/to/repository /trunk/scripts/script.py 728
> Use of uninitialized value $noderev_id in split at dump-noderev.pl line 41.

Sorry about that; the script is obviously missing an error check.

Looks like 'svnlook' does not exist in your $PATH.

It's not obvious to non-Perl speakers, but line 6 does the
equivalent of:
.
    if [ -z "$SVNLOOK" ]; then SVNLOOK=svnlook ; fi
.
so if you run the script with svn/svnadmin/svnlook in PATH, _or_ with
$SVN, $SVNADMIN, $SVNLOOK set to those executables' full paths, then it
should work.
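
The error check that the script lacks could look roughly like the sketch below, written in shell rather than Perl to match the one-liner above. `resolve_tool` is a hypothetical helper, not something the script contains; it applies the same "override, else default name" rule and then verifies that the result can actually be executed.

```shell
# Resolve a tool: use the override if set, otherwise the default name,
# and fail loudly if the result cannot be found.
resolve_tool() {
    override=$1 default=$2
    tool=${override:-$default}
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool"
    else
        echo "cannot find '$tool'; add it to PATH or set the variable to its full path" >&2
        return 1
    fi
}

# Usage: SVNLOOK=$(resolve_tool "${SVNLOOK:-}" svnlook) || exit 1
```

Failing at this point gives one clear message instead of a cascade of "uninitialized value" warnings further down.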

> Is there a simple command that I can run rather than this script?
> 

Not really.  The script dumps an internal data structure (a node-rev is
basically an inode in Subversion's versioned filesystem) and is the easiest way
to do that.  We could probably instrument the C code instead but (a) I don't have
that patch ready, (b) you'd then have to rebuild svn.

> I tried to find out what our MPM is but I am not sure. I ran this command:
> 
> /usr/sbin/apache2 -l
> Compiled in modules:
> 
> I'm not sure if that tells you anything but I don't think it is
> prefork or worker.

It doesn't, I'm afraid.  It looks like you're using a *.so MPM but
that's not proof of _which_ MPM it is.

I'm asking about MPM because that ties to svn's internal caches.  (If
you used prefork, the bug's happening twice would rule out intra-process
caches as a cause.)
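
Because `-l` lists only compiled-in modules, a dynamically loaded MPM does not show up there. One way to read it instead, sketched under the assumption that the httpd build prints a `Server MPM:` line in its `-V` output, is below. `mpm_from_version_output` is a hypothetical helper, and the control binary's name (`apache2ctl`, `apachectl`, `httpd`) varies by distribution.

```shell
# Extract the MPM name from "apachectl -V" style output read on stdin.
# Expects a line like "Server MPM:     prefork"; prints nothing if absent.
mpm_from_version_output() {
    sed -n 's/^Server MPM:[[:space:]]*//p'
}

# Usage: apache2ctl -V | mpm_from_version_output
```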

Cheers,

Daniel

Re: Repository became corrupt on commit

Posted by Alan Spark <al...@gmail.com>.
Hi Daniel,

Sorry, I meant to reply to all. For the benefit of others, here is what I said:

---
Thanks for the response.

> where dump-noderev.pl is [1]?  The output should be just a few rfc822-like
> headers specifying the offset and checksum of the file contents representation
> ("rep").  One of the headers should be "type: file".

I'll take a look at this as soon as I can.

> Sorry, I don't follow.  Can you explain again what the contents of the
> file is and what's its significance?  Has the server been restarted
> between the two commit attempts?  Is it svnserve (in which mode,
> -i/-t/-d/-T/service) or mod_dav_svn (which MPM)?

The file is a python script. It is just a text file. The first time it
was committed it corrupted the repository. We rolled back to a backup
and tried to commit the same file and it again broke the repository.
Only when we deleted, then added exactly the same file did the problem
go away. So it was reproducible until we rolled back and deleted the
file.

The server has not been restarted. It is mod_dav_svn. I'm not sure
what you mean by MPM, sorry.
---

I'm afraid I'm new to perl so I may be doing something wrong, but I
checked out the entire infrastructure-puppet repository and ran the
command from within
~/infrastructure-puppet/modules/rootbin_asf/files/bin to get the
following output:

perl dump-noderev.pl /path/to/repository /trunk/scripts/script.py 728
Use of uninitialized value $noderev_id in split at dump-noderev.pl line 41.
Use of uninitialized value $txn_id in pattern match (m//) at
dump-noderev.pl line 42.
Use of uninitialized value $txn_id in pattern match (m//) at
dump-noderev.pl line 43.
Use of uninitialized value $REVISION in division (/) at dump-noderev.pl line 10.
Use of uninitialized value $REVISION in modulus (%) at dump-noderev.pl line 11.
Use of uninitialized value $REVISION in concatenation (.) or string at
dump-noderev.pl line 13.
Use of uninitialized value $REVISION in concatenation (.) or string at
dump-noderev.pl line 15.
Use of uninitialized value $REVISION in concatenation (.) or string at
dump-noderev.pl line 16.
Use of uninitialized value $file_offset in concatenation (.) or string
at dump-noderev.pl line 49.
xxd: 1024: No such file or directory

Is there a simple command that I can run rather than this script?

I tried to find out what our MPM is but I am not sure. I ran this command:

/usr/sbin/apache2 -l
Compiled in modules:
  core.c
  mod_so.c
  mod_watchdog.c
  http_core.c
  mod_log_config.c
  mod_logio.c
  mod_version.c
  mod_unixd.c

I'm not sure if that tells you anything but I don't think it is
prefork or worker.

Regards,
Alan


On Mon, Oct 22, 2018 at 4:54 PM Daniel Shahaf <d....@daniel.shahaf.name> wrote:
>
> Please send that to the list, not just to me. :)
>
> (To save a round trip: an MPM is an httpd thing, see https://httpd.apache.org/docs/current/mpm.html .  It's related to how requests are distributed among httpd processes, which factors in due to intra-process caches.)

Re: Repository became corrupt on commit

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Alan Spark wrote on Fri, 19 Oct 2018 16:04 +0100:
> * Error verifying revision 728.
> svnadmin: E160013: Filesystem path 'trunk/scripts/script.py' is
> neither a file nor a directory

E160013 is SVN_ERR_FS_CORRUPT, but this message is a new one.

Can you share the output of
.
    % dump-noderev.pl ${REPO_DIR} /trunk/scripts/script.py 728
.
where dump-noderev.pl is [1]?  The output should be just a few rfc822-like
headers specifying the offset and checksum of the file contents representation
("rep").  One of the headers should be "type: file".

> The content of that file is what had replaced what used to be the trunk
> folder. [...]  Now that we have deleted then re-committed exactly the same
> file, the repository is back to normal and it seems like an unreproducible
> bug at this stage.

Sorry, I don't follow.  Can you explain again what the contents of the
file is and what's its significance?  Has the server been restarted
between the two commit attempts?  Is it svnserve (in which mode,
-i/-t/-d/-T/service) or mod_dav_svn (which MPM)?

Cheers,

Daniel

[1] https://github.com/apache/infrastructure-puppet/blob/0a97d8e60798656d856bb0521bee76b24fbe5574/modules/rootbin_asf/files/bin/dump-noderev.pl