You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by David Glasser <gl...@davidglasser.net> on 2010/03/18 21:39:36 UTC

useful extra check in FS fold_change function

When dealing with some nasty cases in the almost-retired old Google
Code subversion backend, I found a kind of data corruption that the FS
code wasn't catching even though it caught relatively similar issues.
Specifically, find the fold_change function in both of the FS
implementations.  There's a check:

      /* Sanity check: an add, replacement, or reset must be the first
         thing to follow a deletion. */
      if ((old_change->change_kind == svn_fs_path_change_delete)
          && (! ((change->kind == svn_fs_path_change_replace)
                 || (change->kind == svn_fs_path_change_reset)
                 || (change->kind == svn_fs_path_change_add))))
        return svn_error_create
          (SVN_ERR_FS_CORRUPT, NULL,
           _("Invalid change ordering: non-add change on deleted path"));

It might also be nice to check the opposite condition: if change->kind
is add or replace, and old_change is not delete (or reset, I guess),
throw an error.  (We had accidentally recorded the sequence "add,
prop-mod, text-mod" out of order as "prop-mod, add, text-mod", which
was interpreted as an "R" below instead of an "A"; this suggested
check would have caught it earlier.)

(Sorry, but I haven't managed to get a proper Subversion build
environment working in a while or I'd send a real patch.)

--dave

-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

Re: useful extra check in FS fold_change function

Posted by Blair Zajac <bl...@orcaware.com>.
On 03/22/2010 12:56 PM, Hyrum K. Wright wrote:
>
> On Mar 22, 2010, at 1:41 PM, Blair Zajac wrote:
>
>> My name isn't David, but I now have 50+ repositories with 8.7 million revisions between them exposed via a Thrift-like API of svn_fs that anybody can modify the filesystem in any order, so it if a user could make a particular set of modifications that would trigger this, I would feel more comfortable with it being backported if you think there's any change this check would help.
>
> Completely off-topic, but do you have a public description of this API, or some kind of whitepaper or something?  I've been pondering FS backends lately, and am interested in whatever you can share.

Hopefully I'll be able to write something about this soon as we're 
launching a blog.

Blair

Re: useful extra check in FS fold_change function

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Mar 22, 2010, at 1:41 PM, Blair Zajac wrote:

> My name isn't David, but I now have 50+ repositories with 8.7 million revisions between them exposed via a Thrift-like API of svn_fs that anybody can modify the filesystem in any order, so it if a user could make a particular set of modifications that would trigger this, I would feel more comfortable with it being backported if you think there's any change this check would help.

Completely off-topic, but do you have a public description of this API, or some kind of whitepaper or something?  I've been pondering FS backends lately, and am interested in whatever you can share.

-Hyrum

Re: useful extra check in FS fold_change function

Posted by "C. Michael Pilato" <cm...@collab.net>.
Blair Zajac wrote:
> On 03/22/2010 09:46 AM, C. Michael Pilato wrote:
>> C. Michael Pilato wrote:
>>> David Glasser wrote:
>>>>> Is the attached patch what you had in mind?  (Plus similar logic
>>>>> for FSFS,
>>>>> of course.)
>>>> Ah, yes, that's what I meant; that patch looks great, assuming it
>>>> works :)
>>>
>>> I'll try to polish this up, add the FSFS flavor, and add a regression
>>> test
>>> when I get some time.  Thanks for the report and suggested fix.
>>
>> Committed in r926151 and r926167.  I don't feel particularly compelled to
>> backport the changes as the higher-level FS stuffs *should* prevent this
>> scenario from ever occurring anyway.  Do you have an opinion one way
>> or the
>> other, David?
> 
> My name isn't David, but I now have 50+ repositories with 8.7 million
> revisions between them exposed via a Thrift-like API of svn_fs that
> anybody can modify the filesystem in any order, so it if a user could
> make a particular set of modifications that would trigger this, I would
> feel more comfortable with it being backported if you think there's any
> change this check would help.
> 
> If we get a corrupted repository, things would be really bad, as the
> repositories are being used for an asset management system and getting
> 1+ commits/sec, it's not like there's developers who could just wait for
> the software svn repository to be repaired.

Okay.  Proposed for backport.  (I'm still pretty sure that you can't see
this problem if you use the FS API instead of manipulating the filesystem at
lower levels, but...)

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: useful extra check in FS fold_change function

Posted by Blair Zajac <bl...@orcaware.com>.
On 03/22/2010 09:46 AM, C. Michael Pilato wrote:
> C. Michael Pilato wrote:
>> David Glasser wrote:
>>>> Is the attached patch what you had in mind?  (Plus similar logic for FSFS,
>>>> of course.)
>>> Ah, yes, that's what I meant; that patch looks great, assuming it
>>> works :)
>>
>> I'll try to polish this up, add the FSFS flavor, and add a regression test
>> when I get some time.  Thanks for the report and suggested fix.
>
> Committed in r926151 and r926167.  I don't feel particularly compelled to
> backport the changes as the higher-level FS stuffs *should* prevent this
> scenario from ever occurring anyway.  Do you have an opinion one way or the
> other, David?

My name isn't David, but I now have 50+ repositories with 8.7 million 
revisions between them exposed via a Thrift-like API of svn_fs that 
anybody can modify the filesystem in any order, so it if a user could 
make a particular set of modifications that would trigger this, I would 
feel more comfortable with it being backported if you think there's any 
change this check would help.

If we get a corrupted repository, things would be really bad, as the 
repositories are being used for an asset management system and getting 
1+ commits/sec, it's not like there's developers who could just wait for 
the software svn repository to be repaired.

Regards,
Blair

Re: useful extra check in FS fold_change function

Posted by David Glasser <gl...@davidglasser.net>.
On Mon, Mar 22, 2010 at 9:46 AM, C. Michael Pilato <cm...@collab.net> wrote:
> C. Michael Pilato wrote:
>> David Glasser wrote:
>>>> Is the attached patch what you had in mind?  (Plus similar logic for FSFS,
>>>> of course.)
>>> Ah, yes, that's what I meant; that patch looks great, assuming it
>>> works :)
>>
>> I'll try to polish this up, add the FSFS flavor, and add a regression test
>> when I get some time.  Thanks for the report and suggested fix.
>
> Committed in r926151 and r926167.  I don't feel particularly compelled to
> backport the changes as the higher-level FS stuffs *should* prevent this
> scenario from ever occurring anyway.  Do you have an opinion one way or the
> other, David?

Thanks Mike!

I don't have any reason to believe that this is particularly likely to
occur in general use (as mentioned, we saw this issue with the older
(replaced this week!) Bigtable backend at Google for Bigtable-specific
reasons) but given the existence of the *other* checks already, it
seemed reasonable to throw this one in too.


-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

Re: useful extra check in FS fold_change function

Posted by "C. Michael Pilato" <cm...@collab.net>.
C. Michael Pilato wrote:
> David Glasser wrote:
>>> Is the attached patch what you had in mind?  (Plus similar logic for FSFS,
>>> of course.)
>> Ah, yes, that's what I meant; that patch looks great, assuming it
>> works :)
> 
> I'll try to polish this up, add the FSFS flavor, and add a regression test
> when I get some time.  Thanks for the report and suggested fix.

Committed in r926151 and r926167.  I don't feel particularly compelled to
backport the changes as the higher-level FS stuffs *should* prevent this
scenario from ever occurring anyway.  Do you have an opinion one way or the
other, David?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: useful extra check in FS fold_change function

Posted by "C. Michael Pilato" <cm...@collab.net>.
David Glasser wrote:
>> Is the attached patch what you had in mind?  (Plus similar logic for FSFS,
>> of course.)
> 
> Ah, yes, that's what I meant; that patch looks great, assuming it
> works :)

I'll try to polish this up, add the FSFS flavor, and add a regression test
when I get some time.  Thanks for the report and suggested fix.

> I would like to get a working Subversion development
> environment back one of these days, once I re-derive how to work
> around the Debian libtool issue again...

Heh.  Install everything in a subdirectory of /usr/local:

   /usr/local/sqlite
   /usr/local/serf
   /usr/local/apache2
   /usr/local/subversion
   /usr/local/superficial-software-segregation-sucks

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: useful extra check in FS fold_change function

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Mar 19, 2010 at 06:04:45PM -0700, David Glasser wrote:
> I would like to get a working Subversion development
> environment back one of these days, once I re-derive how to work
> around the Debian libtool issue again...

Compile your own APR libs from source, and svn will use APR's libtool.
That worked well for me on Debian.
See tools/dev/unix-build/Makefile.svn for examples.

Stefan

Re: useful extra check in FS fold_change function

Posted by David Glasser <gl...@davidglasser.net>.
On Fri, Mar 19, 2010 at 10:26 AM, C. Michael Pilato <cm...@collab.net> wrote:
> David Glasser wrote:
>> When dealing with some nasty cases in the almost-retired old Google
>> Code subversion backend, I found a kind of data corruption that the FS
>> code wasn't catching even though it caught relatively similar issues.
>> Specifically, find the fold_change function in both of the FS
>> implementations.  There's a check:
>>
>>       /* Sanity check: an add, replacement, or reset must be the first
>>          thing to follow a deletion. */
>>       if ((old_change->change_kind == svn_fs_path_change_delete)
>>           && (! ((change->kind == svn_fs_path_change_replace)
>>                  || (change->kind == svn_fs_path_change_reset)
>>                  || (change->kind == svn_fs_path_change_add))))
>>         return svn_error_create
>>           (SVN_ERR_FS_CORRUPT, NULL,
>>            _("Invalid change ordering: non-add change on deleted path"));
>>
>> It might also be nice to check the opposite condition: if change->kind
>> is add or replace, and old_change is not delete (or reset, I guess),
>> throw an error.  (We had accidentally recorded the sequence "add,
>> prop-mod, text-mod" out of order as "prop-mod, add, text-mod", which
>> was interpreted as an "R" below instead of an "A"; this suggested
>> check would have caught it earlier.)
>
> Shouldn't this be just: "if change->kind is add, and old_change is not
> delete, throw an error"?  It would be legit for someone to record the
> sequence "add, prop-mod, replace, text-mod", for example.
>
> Is the attached patch what you had in mind?  (Plus similar logic for FSFS,
> of course.)

Ah, yes, that's what I meant; that patch looks great, assuming it
works :)  I would like to get a working Subversion development
environment back one of these days, once I re-derive how to work
around the Debian libtool issue again...

thanks,
dave

> --
> C. Michael Pilato <cm...@collab.net>
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>
> Index: subversion/libsvn_fs_base/bdb/changes-table.c
> ===================================================================
> --- subversion/libsvn_fs_base/bdb/changes-table.c       (revision 924936)
> +++ subversion/libsvn_fs_base/bdb/changes-table.c       (working copy)
> @@ -168,6 +168,15 @@
>           (SVN_ERR_FS_CORRUPT, NULL,
>            _("Invalid change ordering: non-add change on deleted path"));
>
> +      /* Sanity check: an add can't follow anything except
> +         a delete or reset.  */
> +      if ((change->kind == svn_fs_path_change_add)
> +          && (old_change->change_kind != svn_fs_path_change_delete)
> +          && (old_change->change_kind != svn_fs_path_change_reset))
> +        return svn_error_create
> +          (SVN_ERR_FS_CORRUPT, NULL,
> +           _("Invalid change ordering: add change on preexisting path"));
> +
>       /* Now, merge that change in. */
>       switch (change->kind)
>         {
>
>



-- 
glasser@davidglasser.net | langtonlabs.org | flickr.com/photos/glasser/

Re: useful extra check in FS fold_change function

Posted by "C. Michael Pilato" <cm...@collab.net>.
David Glasser wrote:
> When dealing with some nasty cases in the almost-retired old Google
> Code subversion backend, I found a kind of data corruption that the FS
> code wasn't catching even though it caught relatively similar issues.
> Specifically, find the fold_change function in both of the FS
> implementations.  There's a check:
> 
>       /* Sanity check: an add, replacement, or reset must be the first
>          thing to follow a deletion. */
>       if ((old_change->change_kind == svn_fs_path_change_delete)
>           && (! ((change->kind == svn_fs_path_change_replace)
>                  || (change->kind == svn_fs_path_change_reset)
>                  || (change->kind == svn_fs_path_change_add))))
>         return svn_error_create
>           (SVN_ERR_FS_CORRUPT, NULL,
>            _("Invalid change ordering: non-add change on deleted path"));
> 
> It might also be nice to check the opposite condition: if change->kind
> is add or replace, and old_change is not delete (or reset, I guess),
> throw an error.  (We had accidentally recorded the sequence "add,
> prop-mod, text-mod" out of order as "prop-mod, add, text-mod", which
> was interpreted as an "R" below instead of an "A"; this suggested
> check would have caught it earlier.)

Shouldn't this be just: "if change->kind is add, and old_change is not
delete, throw an error"?  It would be legit for someone to record the
sequence "add, prop-mod, replace, text-mod", for example.

Is the attached patch what you had in mind?  (Plus similar logic for FSFS,
of course.)

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand