You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Greg Hudson <gh...@MIT.EDU> on 2004/04/23 04:17:30 UTC

RFC: MD5sums in svn_txdelta interface

When Karl and company added MD5 support to Subversion, one of the
things they did was stick MD5 support into svn_txdelta():

  /** Return the @a md5 digest for the complete fulltext deltified by
   * @a stream, or @c NULL if @a stream has not yet returned its final
   * @c NULL window.  The digest is allocated in the same memory as @a
   * STREAM.
   */
  const unsigned char *svn_txdelta_md5_digest (svn_txdelta_stream_t *stream);

It's my belief (a couple of years too late) that (1) this doc string
is pretty confusing (what is actually checksummed is the target data
read during the delta operation; I don't know how that translates into
"the complete fulltext deltified by STREAM"), and more importantly (2)
that MD5 checksums have nothing to do with delta computation and don't
belong in the same interface.

So I'm thinking that in the svn 2.0 API, we should rip out this
interface and instead introduce an "MD5 metastream" which calculates
MD5 digests as data is read from or written to an underlying stream.

I mention this now because I'm about to commit code to add a new
interface:

  /** Return a writable stream which, when fed target data, will send
   * delta windows to @a handler/@a handler_baton which transform the
   * data in @a source to the target data.  As usual, the window handler
   * will receive a NULL window to signify the end of the window stream.
   * The stream handler functions will read data from @a source as
   * necessary.
   */
  svn_stream_t *svn_txdelta_target_push (svn_txdelta_window_handler_t handler,
                                         void *handler_baton,
                                         svn_stream_t *source,
                                         apr_pool_t *pool);

If people agree with me about MD5 digests not belonging in the
text-delta interface, then this signature should be fine, and it will
be the caller's responsibility to compute checksums on the data before
writing it to the target-push stream, because that's the direction
we're eventually heading in anyway.  On the other hand, if people
disagree with me, then for symmetry, svn_txdelta_target_push() should
accept an optional argument for where an MD5 digest will be stored
upon closure of the target-push stream.  (Or a callback, or whatever
is de rigeur these days.)

Of course, it's also possible that people disagree with me on the
whole concept of svn_txdelta_target_push, in which case my commit can
be backed out and the issue will be rendered temporarily moot.  I want
this new function so that the FSFS implementation of
svn_fs_apply_text() can generate a delta on the fly as contents data
are pushed by the caller; I think that's a reasonable thing to want.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: MD5sums in svn_txdelta interface

Posted by kf...@collab.net.
Greg Hudson <gh...@MIT.EDU> writes:
> On Wed, 2004-07-14 at 15:36, kfogel@collab.net wrote:
> > Greg, is this proposal filed or recorded anywhere for 2.0?  If not,
> > would you like me to file an issue?
> 
> You filed it in #1882 back in May.

*mrmph*

Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: MD5sums in svn_txdelta interface

Posted by Greg Hudson <gh...@MIT.EDU>.
On Wed, 2004-07-14 at 15:36, kfogel@collab.net wrote:
> Greg, is this proposal filed or recorded anywhere for 2.0?  If not,
> would you like me to file an issue?

You filed it in #1882 back in May.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: RFC: MD5sums in svn_txdelta interface

Posted by kf...@collab.net.
Greg, is this proposal filed or recorded anywhere for 2.0?  If not,
would you like me to file an issue?

-Karl

Greg Hudson <gh...@MIT.EDU> writes:
> When Karl and company added MD5 support to Subversion, one of the
> things they did was stick MD5 support into svn_txdelta():
> 
>   /** Return the @a md5 digest for the complete fulltext deltified by
>    * @a stream, or @c NULL if @a stream has not yet returned its final
>    * @c NULL window.  The digest is allocated in the same memory as @a
>    * STREAM.
>    */
>   const unsigned char *svn_txdelta_md5_digest (svn_txdelta_stream_t *stream);
> 
> It's my belief (a couple of years too late) that (1) this doc string
> is pretty confusing (what is actually checksummed is the target data
> read during the delta operation; I don't know how that translates into
> "the complete fulltext deltified by STREAM"), and more importantly (2)
> that MD5 checksums have nothing to do with delta computation and don't
> belong in the same interface.
> 
> So I'm thinking that in the svn 2.0 API, we should rip out this
> interface and instead introduce an "MD5 metastream" which calculates
> MD5 digests as data is read from or written to an underlying stream.
> 
> I mention this now because I'm about to commit code to add a new
> interface:
> 
>   /** Return a writable stream which, when fed target data, will send
>    * delta windows to @a handler/@a handler_baton which transform the
>    * data in @a source to the target data.  As usual, the window handler
>    * will receive a NULL window to signify the end of the window stream.
>    * The stream handler functions will read data from @a source as
>    * necessary.
>    */
>   svn_stream_t *svn_txdelta_target_push (svn_txdelta_window_handler_t handler,
>                                          void *handler_baton,
>                                          svn_stream_t *source,
>                                          apr_pool_t *pool);
> 
> If people agree with me about MD5 digests not belonging in the
> text-delta interface, then this signature should be fine, and it will
> be the caller's responsibility to compute checksums on the data before
> writing it to the target-push stream, because that's the direction
> we're eventually heading in anyway.  On the other hand, if people
> disagree with me, then for symmetry, svn_txdelta_target_push() should
> accept an optional argument for where an MD5 digest will be stored
> upon closure of the target-push stream.  (Or a callback, or whatever
> is de rigeur these days.)
> 
> Of course, it's also possible that people disagree with me on the
> whole concept of svn_txdelta_target_push, in which case my commit can
> be backed out and the issue will be rendered temporarily moot.  I want
> this new function so that the FSFS implementation of
> svn_fs_apply_text() can generate a delta on the fly as contents data
> are pushed by the caller; I think that's a reasonable thing to want.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org