You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@subversion.apache.org by hw...@apache.org on 2012/02/08 02:55:55 UTC

svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Author: hwright
Date: Wed Feb  8 01:55:54 2012
New Revision: 1241733

URL: http://svn.apache.org/viewvc?rev=1241733&view=rev
Log:
Ev2 shims: Make sure that if we are providing content to alter_file(), we
are also providing a checksum.

* subversion/libsvn_delta/compat.c
  (path_checksum_args): Remove checksum member.
  (process_actions): Checksum the content file.
  (ev2_apply_textdelta): Don't checksum the output stream.

Modified:
    subversion/trunk/subversion/libsvn_delta/compat.c

Modified: subversion/trunk/subversion/libsvn_delta/compat.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/compat.c?rev=1241733&r1=1241732&r2=1241733&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_delta/compat.c (original)
+++ subversion/trunk/subversion/libsvn_delta/compat.c Wed Feb  8 01:55:54 2012
@@ -203,7 +203,6 @@ struct path_checksum_args
 {
   const char *path;
   svn_revnum_t base_revision;
-  svn_checksum_t *checksum;
 };
 
 static svn_error_t *
@@ -340,9 +339,10 @@ process_actions(void *edit_baton,
               /* We can only set text on files. */
               kind = svn_node_file;
 
+              SVN_ERR(svn_io_file_checksum2(&checksum, pca->path,
+                                            svn_checksum_sha1, scratch_pool));
               SVN_ERR(svn_stream_open_readonly(&contents, pca->path,
                                                scratch_pool, scratch_pool));
-              checksum = pca->checksum;
 
               if (!SVN_IS_VALID_REVNUM(text_base_revision))
                 text_base_revision = pca->base_revision;
@@ -779,11 +779,6 @@ ev2_apply_textdelta(void *file_baton,
                                  svn_io_file_del_on_pool_cleanup,
                                  fb->eb->edit_pool, result_pool));
 
-  /* Wrap our target with a checksum'ing stream. */
-  target = svn_stream_checksummed2(target, NULL, &pca->checksum,
-                                   svn_checksum_sha1, TRUE,
-                                   fb->eb->edit_pool);
-
   svn_txdelta_apply(source, target,
                     NULL, NULL,
                     handler_pool,



Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Daniel Shahaf <da...@elego.de>.
Hyrum K Wright wrote on Thu, Feb 09, 2012 at 13:16:13 -0600:
> On Thu, Feb 9, 2012 at 1:05 PM, Daniel Shahaf <da...@elego.de> wrote:
> > Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
> >> (I'll also note that we actually *do* have a checksum by this point,
> >> only it is the md5 provided by close_file(), and Ev2 uses sha1s
> >> exclusively, so we have to recalculate.  I suspect this will be a
> >
> > Only sha1?  It allows neither md5 nor sha256?  Why?
> 
> "Convention" would be my initial response.  It makes much more sense
> to standardize checksum kinds across the system than to have a mixed
> environment.  Any checksum can help detect corruption, but being able
> to answer the question "is this content the same as that content?" is
> much more difficult in a mixed-checksum environment.  sha1 felt like a

It's hardly impossible.  It shouldn't take much effort to whip up an
SQLite table that maps sha1's to sha3's (both in the wc layer and the FS
layer).

> reasonable choice[1].
> 
> I could  be overestimating the negatives, though.  What reasons do you
> suppose we *should* allow arbitrary checksum types?
> 

I didn't say "arbitrary"; but hard-wiring a specific checksum algorithm
into our APIs seems wrong.  Suppose Ev2 is in 1.8 and then 1.9 clients
want to use sha4 with 1.9 servers.  Does the editor interface allow for
that?

I'm assuming we could mandate sha1 support but allow driver/receiver to
negotiate on using a different algorithm.

> -Hyrum
> 
> [1] Though, given the timeline of Ev2 and the proposed timeline for
> sha3, I'm really interested in the possibility of using sha3 when
> standardized, throughout *all* of Subversion.  (But that's a
> completely separate discussion.)
> 
> 
> -- 
> 
> uberSVN: Apache Subversion Made Easy
> http://www.uberSVN.com/

Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Daniel Shahaf <da...@elego.de>.
Hyrum K Wright wrote on Thu, Feb 09, 2012 at 13:16:13 -0600:
> On Thu, Feb 9, 2012 at 1:05 PM, Daniel Shahaf <da...@elego.de> wrote:
> > Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
> >> (I'll also note that we actually *do* have a checksum by this point,
> >> only it is the md5 provided by close_file(), and Ev2 uses sha1s
> >> exclusively, so we have to recalculate.  I suspect this will be a
> >
> > Only sha1?  It allows neither md5 nor sha256?  Why?
> 
> "Convention" would be my initial response.  It makes much more sense
> to standardize checksum kinds across the system than to have a mixed
> environment.  Any checksum can help detect corruption, but being able
> to answer the question "is this content the same as that content?" is
> much more difficult in a mixed-checksum environment.  sha1 felt like a

It's hardly impossible.  It shouldn't take much effort to whip up an
SQLite table that maps sha1's to sha3's (both in the wc layer and the FS
layer).

> reasonable choice[1].
> 
> I could  be overestimating the negatives, though.  What reasons do you
> suppose we *should* allow arbitrary checksum types?
> 

I didn't say "arbitrary"; but hard-wiring a specific checksum algorithm
into our APIs seems wrong.  Suppose Ev2 is in 1.8 and then 1.9 clients
want to use sha4 with 1.9 servers.  Does the editor interface allow for
that?

I'm assuming we could mandate sha1 support but allow driver/receiver to
negotiate on using a different algorithm.

> -Hyrum
> 
> [1] Though, given the timeline of Ev2 and the proposed timeline for
> sha3, I'm really interested in the possibility of using sha3 when
> standardized, throughout *all* of Subversion.  (But that's a
> completely separate discussion.)
> 
> 
> -- 
> 
> uberSVN: Apache Subversion Made Easy
> http://www.uberSVN.com/

Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Hyrum K Wright <hy...@wandisco.com>.
On Thu, Feb 9, 2012 at 1:05 PM, Daniel Shahaf <da...@elego.de> wrote:
> Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
>> (I'll also note that we actually *do* have a checksum by this point,
>> only it is the md5 provided by close_file(), and Ev2 uses sha1s
>> exclusively, so we have to recalculate.  I suspect this will be a
>
> Only sha1?  It allows neither md5 nor sha256?  Why?

"Convention" would be my initial response.  It makes much more sense
to standardize checksum kinds across the system than to have a mixed
environment.  Any checksum can help detect corruption, but being able
to answer the question "is this content the same as that content?" is
much more difficult in a mixed-checksum environment.  sha1 felt like a
reasonable choice[1].

I could  be overestimating the negatives, though.  What reasons do you
suppose we *should* allow arbitrary checksum types?

-Hyrum

[1] Though, given the timeline of Ev2 and the proposed timeline for
sha3, I'm really interested in the possibility of using sha3 when
standardized, throughout *all* of Subversion.  (But that's a
completely separate discussion.)


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Hyrum K Wright <hy...@wandisco.com>.
On Thu, Feb 9, 2012 at 1:05 PM, Daniel Shahaf <da...@elego.de> wrote:
> Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
>> (I'll also note that we actually *do* have a checksum by this point,
>> only it is the md5 provided by close_file(), and Ev2 uses sha1s
>> exclusively, so we have to recalculate.  I suspect this will be a
>
> Only sha1?  It allows neither md5 nor sha256?  Why?

"Convention" would be my initial response.  It makes much more sense
to standardize checksum kinds across the system than to have a mixed
environment.  Any checksum can help detect corruption, but being able
to answer the question "is this content the same as that content?" is
much more difficult in a mixed-checksum environment.  sha1 felt like a
reasonable choice[1].

I could  be overestimating the negatives, though.  What reasons do you
suppose we *should* allow arbitrary checksum types?

-Hyrum

[1] Though, given the timeline of Ev2 and the proposed timeline for
sha3, I'm really interested in the possibility of using sha3 when
standardized, throughout *all* of Subversion.  (But that's a
completely separate discussion.)


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Daniel Shahaf <da...@elego.de>.
Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
> (I'll also note that we actually *do* have a checksum by this point,
> only it is the md5 provided by close_file(), and Ev2 uses sha1s
> exclusively, so we have to recalculate.  I suspect this will be a

Only sha1?  It allows neither md5 nor sha256?  Why?

> somewhat common issue while we use a mixed shim environment.)

Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Daniel Shahaf <da...@elego.de>.
Hyrum K Wright wrote on Thu, Feb 09, 2012 at 12:54:50 -0600:
> (I'll also note that we actually *do* have a checksum by this point,
> only it is the md5 provided by close_file(), and Ev2 uses sha1s
> exclusively, so we have to recalculate.  I suspect this will be a

Only sha1?  It allows neither md5 nor sha256?  Why?

> somewhat common issue while we use a mixed shim environment.)

Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Hyrum K Wright <hy...@wandisco.com>.
On Thu, Feb 9, 2012 at 12:24 PM, Daniel Shahaf <da...@elego.de> wrote:
> hwright@apache.org wrote on Wed, Feb 08, 2012 at 01:55:55 -0000:
>> Author: hwright
>> Date: Wed Feb  8 01:55:54 2012
>> New Revision: 1241733
>>
>> URL: http://svn.apache.org/viewvc?rev=1241733&view=rev
>> Log:
>> Ev2 shims: Make sure that if we are providing content to alter_file(), we
>> are also providing a checksum.
>>
>> * subversion/libsvn_delta/compat.c
>>   (path_checksum_args): Remove checksum member.
>>   (process_actions): Checksum the content file.
>>   (ev2_apply_textdelta): Don't checksum the output stream.
>>
>> Modified:
>>     subversion/trunk/subversion/libsvn_delta/compat.c
>>
>> Modified: subversion/trunk/subversion/libsvn_delta/compat.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/compat.c?rev=1241733&r1=1241732&r2=1241733&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/libsvn_delta/compat.c (original)
>> +++ subversion/trunk/subversion/libsvn_delta/compat.c Wed Feb  8 01:55:54 2012
>> @@ -203,7 +203,6 @@ struct path_checksum_args
>>  {
>>    const char *path;
>>    svn_revnum_t base_revision;
>> -  svn_checksum_t *checksum;
>>  };
>>
>>  static svn_error_t *
>> @@ -340,9 +339,10 @@ process_actions(void *edit_baton,
>>                /* We can only set text on files. */
>>                kind = svn_node_file;
>>
>> +              SVN_ERR(svn_io_file_checksum2(&checksum, pca->path,
>> +                                            svn_checksum_sha1, scratch_pool));
>>                SVN_ERR(svn_stream_open_readonly(&contents, pca->path,
>>                                                 scratch_pool, scratch_pool));
>
> So, you first open the file and read it through to checksum it, and then
> open it again to read it through as a stream.
>
> That's because right now add_file()/alter_file() require the checksum to
> be provided up front.
>
> Would it make sense to relax that requirement?  On the one hand,
> many receivers don't need to know the checksum until after they've
> received the whole file (eg, the fs txn process and the pristine store);
> on the other hand, having the checksum will allow in some cases not
> transmitting the file over the wire at all, and will allow for
> error-checking the transmission in other cases.

Agreed that the current approach taken by the shim is wasteful, though
other attempts were a bit dodgy, and this one Just Works.  If there
are better ways to get this checksum, let's use 'em.

I'm not certain about relaxing the requirement just yet.  I'll have to
think about the design implications.

(I'll also note that we actually *do* have a checksum by this point,
only it is the md5 provided by close_file(), and Ev2 uses sha1s
exclusively, so we have to recalculate.  I suspect this will be a
somewhat common issue while we use a mixed shim environment.)

-Hyrum


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Re: Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Hyrum K Wright <hy...@wandisco.com>.
On Thu, Feb 9, 2012 at 12:24 PM, Daniel Shahaf <da...@elego.de> wrote:
> hwright@apache.org wrote on Wed, Feb 08, 2012 at 01:55:55 -0000:
>> Author: hwright
>> Date: Wed Feb  8 01:55:54 2012
>> New Revision: 1241733
>>
>> URL: http://svn.apache.org/viewvc?rev=1241733&view=rev
>> Log:
>> Ev2 shims: Make sure that if we are providing content to alter_file(), we
>> are also providing a checksum.
>>
>> * subversion/libsvn_delta/compat.c
>>   (path_checksum_args): Remove checksum member.
>>   (process_actions): Checksum the content file.
>>   (ev2_apply_textdelta): Don't checksum the output stream.
>>
>> Modified:
>>     subversion/trunk/subversion/libsvn_delta/compat.c
>>
>> Modified: subversion/trunk/subversion/libsvn_delta/compat.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/compat.c?rev=1241733&r1=1241732&r2=1241733&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/libsvn_delta/compat.c (original)
>> +++ subversion/trunk/subversion/libsvn_delta/compat.c Wed Feb  8 01:55:54 2012
>> @@ -203,7 +203,6 @@ struct path_checksum_args
>>  {
>>    const char *path;
>>    svn_revnum_t base_revision;
>> -  svn_checksum_t *checksum;
>>  };
>>
>>  static svn_error_t *
>> @@ -340,9 +339,10 @@ process_actions(void *edit_baton,
>>                /* We can only set text on files. */
>>                kind = svn_node_file;
>>
>> +              SVN_ERR(svn_io_file_checksum2(&checksum, pca->path,
>> +                                            svn_checksum_sha1, scratch_pool));
>>                SVN_ERR(svn_stream_open_readonly(&contents, pca->path,
>>                                                 scratch_pool, scratch_pool));
>
> So, you first open the file and read it through to checksum it, and then
> open it again to read it through as a stream.
>
> That's because right now add_file()/alter_file() require the checksum to
> be provided up front.
>
> Would it make sense to relax that requirement?  On the one hand,
> many receivers don't need to know the checksum until after they've
> received the whole file (eg, the fs txn process and the pristine store);
> on the other hand, having the checksum will allow in some cases not
> transmitting the file over the wire at all, and will allow for
> error-checking the transmission in other cases.

Agreed that the current approach taken by the shim is wasteful, though
other attempts were a bit dodgy, and this one Just Works.  If there
are better ways to get this checksum, let's use 'em.

I'm not certain about relaxing the requirement just yet.  I'll have to
think about the design implications.

(I'll also note that we actually *do* have a checksum by this point,
only it is the md5 provided by close_file(), and Ev2 uses sha1s
exclusively, so we have to recalculate.  I suspect this will be a
somewhat common issue while we use a mixed shim environment.)

-Hyrum


-- 

uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Daniel Shahaf <da...@elego.de>.
hwright@apache.org wrote on Wed, Feb 08, 2012 at 01:55:55 -0000:
> Author: hwright
> Date: Wed Feb  8 01:55:54 2012
> New Revision: 1241733
> 
> URL: http://svn.apache.org/viewvc?rev=1241733&view=rev
> Log:
> Ev2 shims: Make sure that if we are providing content to alter_file(), we
> are also providing a checksum.
> 
> * subversion/libsvn_delta/compat.c
>   (path_checksum_args): Remove checksum member.
>   (process_actions): Checksum the content file.
>   (ev2_apply_textdelta): Don't checksum the output stream.
> 
> Modified:
>     subversion/trunk/subversion/libsvn_delta/compat.c
> 
> Modified: subversion/trunk/subversion/libsvn_delta/compat.c
> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/compat.c?rev=1241733&r1=1241732&r2=1241733&view=diff
> ==============================================================================
> --- subversion/trunk/subversion/libsvn_delta/compat.c (original)
> +++ subversion/trunk/subversion/libsvn_delta/compat.c Wed Feb  8 01:55:54 2012
> @@ -203,7 +203,6 @@ struct path_checksum_args
>  {
>    const char *path;
>    svn_revnum_t base_revision;
> -  svn_checksum_t *checksum;
>  };
>  
>  static svn_error_t *
> @@ -340,9 +339,10 @@ process_actions(void *edit_baton,
>                /* We can only set text on files. */
>                kind = svn_node_file;
>  
> +              SVN_ERR(svn_io_file_checksum2(&checksum, pca->path,
> +                                            svn_checksum_sha1, scratch_pool));
>                SVN_ERR(svn_stream_open_readonly(&contents, pca->path,
>                                                 scratch_pool, scratch_pool));

So, you first open the file and read it through to checksum it, and then
open it again to read it through as a stream.

That's because right now add_file()/alter_file() require the checksum to
be provided up front.

Would it make sense to relax that requirement?  On the one hand,
many receivers don't need to know the checksum until after they've
received the whole file (eg, the fs txn process and the pristine store);
on the other hand, having the checksum will allow in some cases not
transmitting the file over the wire at all, and will allow for
error-checking the transmission in other cases.

Daniel
(some of the points above by Greg and/or Hyrum on IRC)

> -              checksum = pca->checksum;
>  
>                if (!SVN_IS_VALID_REVNUM(text_base_revision))
>                  text_base_revision = pca->base_revision;
> @@ -779,11 +779,6 @@ ev2_apply_textdelta(void *file_baton,
>                                   svn_io_file_del_on_pool_cleanup,
>                                   fb->eb->edit_pool, result_pool));
>  
> -  /* Wrap our target with a checksum'ing stream. */
> -  target = svn_stream_checksummed2(target, NULL, &pca->checksum,
> -                                   svn_checksum_sha1, TRUE,
> -                                   fb->eb->edit_pool);
> -
>    svn_txdelta_apply(source, target,
>                      NULL, NULL,
>                      handler_pool,
> 
> 

Ev2: providing checksums of files Re: svn commit: r1241733 - /subversion/trunk/subversion/libsvn_delta/compat.c

Posted by Daniel Shahaf <da...@elego.de>.
hwright@apache.org wrote on Wed, Feb 08, 2012 at 01:55:55 -0000:
> Author: hwright
> Date: Wed Feb  8 01:55:54 2012
> New Revision: 1241733
> 
> URL: http://svn.apache.org/viewvc?rev=1241733&view=rev
> Log:
> Ev2 shims: Make sure that if we are providing content to alter_file(), we
> are also providing a checksum.
> 
> * subversion/libsvn_delta/compat.c
>   (path_checksum_args): Remove checksum member.
>   (process_actions): Checksum the content file.
>   (ev2_apply_textdelta): Don't checksum the output stream.
> 
> Modified:
>     subversion/trunk/subversion/libsvn_delta/compat.c
> 
> Modified: subversion/trunk/subversion/libsvn_delta/compat.c
> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/compat.c?rev=1241733&r1=1241732&r2=1241733&view=diff
> ==============================================================================
> --- subversion/trunk/subversion/libsvn_delta/compat.c (original)
> +++ subversion/trunk/subversion/libsvn_delta/compat.c Wed Feb  8 01:55:54 2012
> @@ -203,7 +203,6 @@ struct path_checksum_args
>  {
>    const char *path;
>    svn_revnum_t base_revision;
> -  svn_checksum_t *checksum;
>  };
>  
>  static svn_error_t *
> @@ -340,9 +339,10 @@ process_actions(void *edit_baton,
>                /* We can only set text on files. */
>                kind = svn_node_file;
>  
> +              SVN_ERR(svn_io_file_checksum2(&checksum, pca->path,
> +                                            svn_checksum_sha1, scratch_pool));
>                SVN_ERR(svn_stream_open_readonly(&contents, pca->path,
>                                                 scratch_pool, scratch_pool));

So, you first open the file and read it through to checksum it, and then
open it again to read it through as a stream.

That's because right now add_file()/alter_file() require the checksum to
be provided up front.

Would it make sense to relax that requirement?  On the one hand,
many receivers don't need to know the checksum until after they've
received the whole file (eg, the fs txn process and the pristine store);
on the other hand, having the checksum will allow in some cases not
transmitting the file over the wire at all, and will allow for
error-checking the transmission in other cases.

Daniel
(some of the points above by Greg and/or Hyrum on IRC)

> -              checksum = pca->checksum;
>  
>                if (!SVN_IS_VALID_REVNUM(text_base_revision))
>                  text_base_revision = pca->base_revision;
> @@ -779,11 +779,6 @@ ev2_apply_textdelta(void *file_baton,
>                                   svn_io_file_del_on_pool_cleanup,
>                                   fb->eb->edit_pool, result_pool));
>  
> -  /* Wrap our target with a checksum'ing stream. */
> -  target = svn_stream_checksummed2(target, NULL, &pca->checksum,
> -                                   svn_checksum_sha1, TRUE,
> -                                   fb->eb->edit_pool);
> -
>    svn_txdelta_apply(source, target,
>                      NULL, NULL,
>                      handler_pool,
> 
>