You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Sperling <st...@elego.de> on 2017/05/09 09:38:51 UTC

Re: Progress on SHA-1 fixes in patch releases?

On Tue, Apr 18, 2017 at 12:54:20AM +0000, Daniel Shahaf wrote:
> % svnadmin load r2 < dump 
> <<< Started new transaction, based on original revision 1
>      * editing path : shattered-1.pdf ... done.
>      * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
>    expected:  5bd9d8cabc46041579a311230539b8d1
>      actual:  ee4aa52b139d925f8d8884402b0a750c
> 
> zsh: exit 1     svnadmin load r2 < dump
> % 
> ]]]
> 
> That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
> r1785754 addresses that.

Yes, this is fixed on trunk :-)

<<< Started new transaction, based on original revision 3
     * editing path : trunk/shattered-1.pdf ... done.
svnadmin: warning: apr_err=SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP
svnadmin: warning: W160067: SHA1 of reps '-1 3 381130 422435 5bd9d8cabc46041579a311230539b8d1 38762cf7f55934b34d179ae6a4c80cadccbb7f0a 2-2/_5' and '-1 0 381130 
422435 5bd9d8cabc46041579a311230539b8d1 38762cf7f55934b34d179ae6a4c80cadccbb7f0a 2-2/_5' matches (38762cf7f55934b34d179ae6a4c80cadccbb7f0a) but contents differ
     * editing path : trunk/shattered-2.pdf ... done.

------- Committed revision 3 >>>

Of course, the working copy and RA protocols are still broken.
Editing both files and attempting a commit results in an error:

Sending        shattered-1.pdf
Sending        shattered-2.pdf
Transmitting file data ..subversion/svn/commit-cmd.c:185,
subversion/libsvn_client/commit.c:992,
subversion/libsvn_client/commit.c:156: (apr_err=SVN_ERR_CHECKSUM_MISMATCH)
svn: E200014: Commit failed (details follow):
subversion/libsvn_client/commit.c:904,
subversion/libsvn_client/commit_util.c:1933,
subversion/libsvn_wc/adm_crawler.c:1105,
subversion/libsvn_repos/commit.c:585,
subversion/libsvn_fs/fs-loader.c:1595,
subversion/libsvn_fs_fs/tree.c:3060,
subversion/libsvn_subr/checksum.c:658: (apr_err=SVN_ERR_CHECKSUM_MISMATCH)
svn: E200014: Base checksum mismatch on '/trunk/shattered-2.pdf':
   expected:  ee4aa52b139d925f8d8884402b0a750c
     actual:  5bd9d8cabc46041579a311230539b8d1

I suppose that is acceptable for now. The repository's health has priority.

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Stefan Sperling <st...@elego.de>.
On Tue, May 09, 2017 at 06:48:03PM +0200, Johan Corveleyn wrote:
> If needed, admins
> can (re-)enable rep-sharing for an existing repository (as long as a
> collision hasn't been committed yet), right?

Sure. However, any content committed while rep-sharing was
disabled will not be considered during collision detection.

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Stefan Sperling <st...@elego.de>.
On Tue, May 09, 2017 at 12:27:40PM -0400, Mark Phippard wrote:
> From the best I can tell we have no plan on how or when we could support
> this in the working copy.  I have also seen a lot of people express
> interest in the hook scripts to block the sha1 collisions and not any real
> conversation about wanting to fully support storing collisions.  With that
> in mind, why not just simplify this and say we have no plans to provide
> support for supporting two files with the same sha1 and put the code in
> place in 1.9.x to block this from happening.  This removes the need for
> people to add hooks and it solves the problem.

Indeed. I have committed this change (with an updated test) in r1794611.

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Johan Corveleyn <jc...@gmail.com>.
On Tue, May 9, 2017 at 6:27 PM, Mark Phippard <ma...@gmail.com> wrote:
> On Tue, May 9, 2017 at 12:08 PM, Stefan Sperling <st...@elego.de> wrote:
>>
>> On Tue, May 09, 2017 at 03:44:22PM +0000, Daniel Shahaf wrote:
>> > Stefan Sperling wrote on Tue, May 09, 2017 at 15:25:23 +0200:
>> > > This could be further extended by the config knob to give users a
>> > > choice.
>> > > I don't see a good way of adding such a knob in a patch release,
>> > > though.
>> >
>> > Just give the knob a name that indicates it's not forward compatible?
>> >
>> > For illustration, if the knob in 1.10 will be called "foo", then the
>> > knob in 1.9.6 could be named "SVN_NFC_foo", where the prefix stands for
>> > "svn not forward compatible".
>>
>> For a patch release I would generally prefer a simple solution that
>> does not add knobs. A fix that people can install and forget about
>> is going to be appreciated the most after all the hype and worrying
>> this problem has caused.
>>
>> And I wonder who would really want to tweak such a knob in the first
>> place.
>> People who really wish to store SHA1 collisions in their FSFS repository
>> could just disable rep-sharing, couldn't they?
>
>
>
> From the best I can tell we have no plan on how or when we could support
> this in the working copy.  I have also seen a lot of people express interest
> in the hook scripts to block the sha1 collisions and not any real
> conversation about wanting to fully support storing collisions.  With that
> in mind, why not just simplify this and say we have no plans to provide
> support for supporting two files with the same sha1 and put the code in
> place in 1.9.x to block this from happening.  This removes the need for
> people to add hooks and it solves the problem.
>
> If and when someone can state enough of a need for storing files with the
> same sha1 let them step forward with a proposal and code for how to do this.
> Until then it seems like just flat out blocking this from happening is
> better then a half hearted attempt to support it.
>
> This approach does not need any knobs or hooks.

Yeah, I was pushing on IRC for making it configurable (with some hope
that a future client release might support it, and then a 1.10
repository could be "opened up" if the svn admin wanted to allow
collisions). But I'm reconsidering. I guess you guys are right ...
nobody is really asking today for full support of sha1 collisions. So
better to keep it simple now, and flat out block them (and leave "full
sha1 collision support" for some distant future when someone really
wants / needs it.

And like Stefan said: one can always disable rep-sharing. Speaking of
which: we must make it really clear in docs that this collision
rejection will only work with if rep-sharing is on. If needed, admins
can (re-)enable rep-sharing for an existing repository (as long as a
collision hasn't been committed yet), right?

-- 
Johan

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Mark Phippard <ma...@gmail.com>.
On Tue, May 9, 2017 at 12:08 PM, Stefan Sperling <st...@elego.de> wrote:

> On Tue, May 09, 2017 at 03:44:22PM +0000, Daniel Shahaf wrote:
> > Stefan Sperling wrote on Tue, May 09, 2017 at 15:25:23 +0200:
> > > This could be further extended by the config knob to give users a
> choice.
> > > I don't see a good way of adding such a knob in a patch release,
> though.
> >
> > Just give the knob a name that indicates it's not forward compatible?
> >
> > For illustration, if the knob in 1.10 will be called "foo", then the
> > knob in 1.9.6 could be named "SVN_NFC_foo", where the prefix stands for
> > "svn not forward compatible".
>
> For a patch release I would generally prefer a simple solution that
> does not add knobs. A fix that people can install and forget about
> is going to be appreciated the most after all the hype and worrying
> this problem has caused.
>
> And I wonder who would really want to tweak such a knob in the first place.
> People who really wish to store SHA1 collisions in their FSFS repository
> could just disable rep-sharing, couldn't they?
>


From the best I can tell we have no plan on how or when we could support
this in the working copy.  I have also seen a lot of people express
interest in the hook scripts to block the sha1 collisions and not any real
conversation about wanting to fully support storing collisions.  With that
in mind, why not just simplify this and say we have no plans to provide
support for supporting two files with the same sha1 and put the code in
place in 1.9.x to block this from happening.  This removes the need for
people to add hooks and it solves the problem.

If and when someone can state enough of a need for storing files with the
same sha1 let them step forward with a proposal and code for how to do
this.  Until then it seems like just flat out blocking this from happening
is better then a half hearted attempt to support it.

This approach does not need any knobs or hooks.


-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Stefan Sperling <st...@elego.de>.
On Tue, May 09, 2017 at 03:44:22PM +0000, Daniel Shahaf wrote:
> Stefan Sperling wrote on Tue, May 09, 2017 at 15:25:23 +0200:
> > This could be further extended by the config knob to give users a choice.
> > I don't see a good way of adding such a knob in a patch release, though.
> 
> Just give the knob a name that indicates it's not forward compatible?
> 
> For illustration, if the knob in 1.10 will be called "foo", then the
> knob in 1.9.6 could be named "SVN_NFC_foo", where the prefix stands for
> "svn not forward compatible".

For a patch release I would generally prefer a simple solution that
does not add knobs. A fix that people can install and forget about
is going to be appreciated the most after all the hype and worrying
this problem has caused.

And I wonder who would really want to tweak such a knob in the first place.
People who really wish to store SHA1 collisions in their FSFS repository
could just disable rep-sharing, couldn't they?

Re: [PATCH] reject SHA1 collisions

Posted by Stefan Fuhrmann <st...@apache.org>.
On 09.05.2017 15:25, Stefan Sperling wrote:
> On Tue, May 09, 2017 at 01:39:49PM +0200, Stefan Sperling wrote:
>> On Tue, May 09, 2017 at 11:38:51AM +0200, Stefan Sperling wrote:
>>> On Tue, Apr 18, 2017 at 12:54:20AM +0000, Daniel Shahaf wrote:
>>>> % svnadmin load r2 < dump
>>>> <<< Started new transaction, based on original revision 1
>>>>       * editing path : shattered-1.pdf ... done.
>>>>       * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
>>>>     expected:  5bd9d8cabc46041579a311230539b8d1
>>>>       actual:  ee4aa52b139d925f8d8884402b0a750c
>>>>
>>>> zsh: exit 1     svnadmin load r2 < dump
>>>> %
>>>> ]]]
>>>>
>>>> That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
>>>> r1785754 addresses that.
>>> Yes, this is fixed on trunk :-)
>> I have now voted for and backported whatever I could to 1.9.x.
>>
>> Please add your votes to the STATUS file as well so we can finally
>> release these fixes (or improve them, if there are still concerns).
> On IRC, Branko and Johan raised concerns about the proposed backport.
>
> The proposed backport allows files with SHA1 collisions into the repository
> and avoids de-duplication of such content by the rep-cache. It fixes the
> integrity problem with the rep-cache but other problems remain.
>
> Clients will still suffer from the consequences of having such files in
> the repository. The RA layer cannot de-duplicate content, the working
> copy's pristine store is broken by such content, etc.
>
> So the question is whether we want to even store such content before all
> parts of the system are ready to handle it. Some users will likely keep
> asking for a way to reject this content because of the ripple effects
> it can cause. Many users may perceive a risk even if "only" working copies
> can be rendered unusable by committing such content, because dealing with
> fallout can be highly disruptive to the actual work people have to get done.
>
> We can block such content from entering the repository with a patch such
> as the one below. This check will only be done if rep-sharing is active.
> But since rep-sharing is active by default this patch will work for most
> of our users.
>
> This could be further extended by the config knob to give users a choice.
> I don't see a good way of adding such a knob in a patch release, though.
> Perhaps that could be done for 1.10.

[Hmpf. After answering on two other threads, I now found
this one and I can toss away those other posts ...]

Yeah, I think the patch gives us the most consistent behavior
with the least added effort. And it keeps the door open for
allowing checksum collisions in the future.

So, +1 from my side. And then let's continue with the out-
standing releases.

> Index: subversion/libsvn_fs_fs/transaction.c
> ===================================================================
> --- subversion/libsvn_fs_fs/transaction.c	(revision 1794541)
> +++ subversion/libsvn_fs_fs/transaction.c	(working copy)
> @@ -2430,12 +2430,10 @@ get_shared_rep(representation_t **old_rep,
>                                         scratch_pool);
>   
>         /* A mismatch should be extremely rare.
> -       * If it does happen, log the event and don't share the rep. */
> +       * If it does happen, reject the commit. */
>         if (!same || err)
>           {
> -          /* SHA1 collision or worse.
> -             We can continue without rep-sharing, but warn.
> -            */
> +          /* SHA1 collision or worse. */
>             svn_stringbuf_t *old_rep_str
>               = svn_fs_fs__unparse_representation(*old_rep,
>                                                   ffd->format, FALSE,
> @@ -2449,15 +2447,11 @@ get_shared_rep(representation_t **old_rep,
>             const char *checksum__str
>               = svn_checksum_to_cstring_display(&checksum, scratch_pool);
>   
> -          err = svn_error_createf(SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP,
> -                                  err, "SHA1 of reps '%s' and '%s' "
> -                                  "matches (%s) but contents differ",
> -                                  old_rep_str->data, rep_str->data,
> -                                  checksum__str);
> -
> -          (fs->warning)(fs->warning_baton, err);
> -          svn_error_clear(err);
> -          *old_rep = NULL;
> +          return svn_error_createf(SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP,
> +                                   err, "SHA1 of reps '%s' and '%s' "
> +                                   "matches (%s) but contents differ",
> +                                   old_rep_str->data, rep_str->data,
> +                                   checksum__str);
>           }
>   
>         /* Restore FILE's read / write position. */
>
I have not tested it, but this looks fine,

-- Stefan^2.

[on my way to ApacheCON]


Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Stefan Sperling wrote on Tue, May 09, 2017 at 15:25:23 +0200:
> This could be further extended by the config knob to give users a choice.
> I don't see a good way of adding such a knob in a patch release, though.

Just give the knob a name that indicates it's not forward compatible?

For illustration, if the knob in 1.10 will be called "foo", then the
knob in 1.9.6 could be named "SVN_NFC_foo", where the prefix stands for
"svn not forward compatible".

> Perhaps that could be done for 1.10.

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Jacek Materna <ja...@assembla.com>.
+1 on rejection.

On Tue, May 9, 2017 at 3:37 PM, Mark Phippard <ma...@gmail.com> wrote:
> On Tue, May 9, 2017 at 9:25 AM, Stefan Sperling <st...@elego.de> wrote:
>>
>> On IRC, Branko and Johan raised concerns about the proposed backport.
>>
>> The proposed backport allows files with SHA1 collisions into the
>> repository
>> and avoids de-duplication of such content by the rep-cache. It fixes the
>> integrity problem with the rep-cache but other problems remain.
>
>
> This approach makes a lot of sense to me.
>
>
>
> --
> Thanks
>
> Mark Phippard
> http://markphip.blogspot.com/



-- 

Jacek Materna
CTO

Assembla
210-410-7661

Re: [PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Mark Phippard <ma...@gmail.com>.
On Tue, May 9, 2017 at 9:25 AM, Stefan Sperling <st...@elego.de> wrote:

> On IRC, Branko and Johan raised concerns about the proposed backport.
>
> The proposed backport allows files with SHA1 collisions into the repository
> and avoids de-duplication of such content by the rep-cache. It fixes the
> integrity problem with the rep-cache but other problems remain.
>

This approach makes a lot of sense to me.



-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

[PATCH] reject SHA1 collisions (was: Re: Progress on SHA-1 fixes in patch releases?)

Posted by Stefan Sperling <st...@elego.de>.
On Tue, May 09, 2017 at 01:39:49PM +0200, Stefan Sperling wrote:
> On Tue, May 09, 2017 at 11:38:51AM +0200, Stefan Sperling wrote:
> > On Tue, Apr 18, 2017 at 12:54:20AM +0000, Daniel Shahaf wrote:
> > > % svnadmin load r2 < dump 
> > > <<< Started new transaction, based on original revision 1
> > >      * editing path : shattered-1.pdf ... done.
> > >      * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
> > >    expected:  5bd9d8cabc46041579a311230539b8d1
> > >      actual:  ee4aa52b139d925f8d8884402b0a750c
> > > 
> > > zsh: exit 1     svnadmin load r2 < dump
> > > % 
> > > ]]]
> > > 
> > > That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
> > > r1785754 addresses that.
> > 
> > Yes, this is fixed on trunk :-)
> 
> I have now voted for and backported whatever I could to 1.9.x.
> 
> Please add your votes to the STATUS file as well so we can finally
> release these fixes (or improve them, if there are still concerns).

On IRC, Branko and Johan raised concerns about the proposed backport.

The proposed backport allows files with SHA1 collisions into the repository
and avoids de-duplication of such content by the rep-cache. It fixes the
integrity problem with the rep-cache but other problems remain.

Clients will still suffer from the consequences of having such files in
the repository. The RA layer cannot de-duplicate content, the working
copy's pristine store is broken by such content, etc.

So the question is whether we want to even store such content before all
parts of the system are ready to handle it. Some users will likely keep
asking for a way to reject this content because of the ripple effects
it can cause. Many users may perceive a risk even if "only" working copies
can be rendered unusable by committing such content, because dealing with
fallout can be highly disruptive to the actual work people have to get done.

We can block such content from entering the repository with a patch such
as the one below. This check will only be done if rep-sharing is active.
But since rep-sharing is active by default this patch will work for most
of our users.

This could be further extended by the config knob to give users a choice.
I don't see a good way of adding such a knob in a patch release, though.
Perhaps that could be done for 1.10.

Index: subversion/libsvn_fs_fs/transaction.c
===================================================================
--- subversion/libsvn_fs_fs/transaction.c	(revision 1794541)
+++ subversion/libsvn_fs_fs/transaction.c	(working copy)
@@ -2430,12 +2430,10 @@ get_shared_rep(representation_t **old_rep,
                                       scratch_pool);
 
       /* A mismatch should be extremely rare.
-       * If it does happen, log the event and don't share the rep. */
+       * If it does happen, reject the commit. */
       if (!same || err)
         {
-          /* SHA1 collision or worse.
-             We can continue without rep-sharing, but warn.
-            */
+          /* SHA1 collision or worse. */
           svn_stringbuf_t *old_rep_str
             = svn_fs_fs__unparse_representation(*old_rep,
                                                 ffd->format, FALSE,
@@ -2449,15 +2447,11 @@ get_shared_rep(representation_t **old_rep,
           const char *checksum__str
             = svn_checksum_to_cstring_display(&checksum, scratch_pool);
 
-          err = svn_error_createf(SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP,
-                                  err, "SHA1 of reps '%s' and '%s' "
-                                  "matches (%s) but contents differ",
-                                  old_rep_str->data, rep_str->data,
-                                  checksum__str);
-
-          (fs->warning)(fs->warning_baton, err);
-          svn_error_clear(err);
-          *old_rep = NULL;
+          return svn_error_createf(SVN_ERR_FS_AMBIUGOUS_CHECKSUM_REP,
+                                   err, "SHA1 of reps '%s' and '%s' "
+                                   "matches (%s) but contents differ",
+                                   old_rep_str->data, rep_str->data,
+                                   checksum__str);
         }
 
       /* Restore FILE's read / write position. */

Re: Progress on SHA-1 fixes in patch releases?

Posted by Stefan Sperling <st...@elego.de>.
On Tue, May 09, 2017 at 11:38:51AM +0200, Stefan Sperling wrote:
> On Tue, Apr 18, 2017 at 12:54:20AM +0000, Daniel Shahaf wrote:
> > % svnadmin load r2 < dump 
> > <<< Started new transaction, based on original revision 1
> >      * editing path : shattered-1.pdf ... done.
> >      * editing path : shattered-2.pdf ...svnadmin: E200014: Checksum mismatch for '/shattered-2.pdf':
> >    expected:  5bd9d8cabc46041579a311230539b8d1
> >      actual:  ee4aa52b139d925f8d8884402b0a750c
> > 
> > zsh: exit 1     svnadmin load r2 < dump
> > % 
> > ]]]
> > 
> > That's with 1.9.5.  Is it fixed on trunk now?  I'm not sure whether
> > r1785754 addresses that.
> 
> Yes, this is fixed on trunk :-)

I have now voted for and backported whatever I could to 1.9.x.

Please add your votes to the STATUS file as well so we can finally
release these fixes (or improve them, if there are still concerns).

Thanks!