You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@subversion.apache.org by ko...@apache.org on 2017/07/27 09:00:43 UTC

svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Author: kotkov
Date: Thu Jul 27 09:00:43 2017
New Revision: 1803143

URL: http://svn.apache.org/viewvc?rev=1803143&view=rev
Log:
Lay the groundwork that would allow ra_serf to stream svndiff deltas
without creating temporary files when working against new servers.

This patch adds a new delta editor callback, apply_textdelta_stream()
that we'll use to stream the deltas, fully implements in the ra_serf's
commit editor, but doesn't yet change the behavior of existing editor
drivers, such as svn_client_import5() or svn_client_commit6().

This requires a minor tweak to the Subversion's HTTP protocol, and
it's the reason why streaming would only work against new servers.

Currently, all PUT requests include a special header that contains the
result checksum, which is used by the server to validate the integrity
of the result after it applies the delta received over the wire.  While
this approach works fine if the client first creates a temporary file with
the delta and only then starts sending it to the server (the result checksum
is calculated while preparing the temporary file), it can't be used in the
stream approach, as with it we'd need to know the result checksum _before_
we start sending data.

So we turn the existing scheme inside out, and teach mod_dav_svn to send the
result checksum in the responses to PUT requests.  Then, the client would
check if the received checksum matches what it calculated, and, possibly,
return a checksum mismatch error (thus aborting the edit and the transaction).

* subversion/include/svn_delta.h
  (svn_txdelta_stream_open_func_t): New callback type.
  (svn_delta_editor_t): Add a forward declaration for this type.
  (svn_delta_editor_t.apply_textdelta_stream): New vtable member.

* subversion/libsvn_delta/default_editor.c
  (apply_textdelta_stream): Provide default implementation for this callback
   that performs a fallback to apply_textdelta().
  (default_editor): Extend the default instance of an svn_delta_editor_t.

* subversion/libsvn_delta/cancel.c
  (apply_textdelta_stream): Implement this forwarding callback, and ...
  (svn_delta_get_cancellation_editor): ...install it here.

* subversion/include/svn_dav.h
  (SVN_DAV_NS_DAV_SVN_PUT_RESULT_CHECKSUM): New.

* subversion/mod_dav_svn/repos.c
  (close_stream): Send the "X-SVN-Result-Fulltext-MD5" header when responding
   to successful PUT requests.

* subversion/mod_dav_svn/version.c
  (get_vsn_options): Advertise new capability.

* subversion/libsvn_ra_serf/ra_serf.h
  (svn_ra_serf__session_t.supports_put_result_checksum): New field.

* subversion/libsvn_ra_serf/options.c
  (capabilities_headers_iterator_callback): Remember if the server sends
   the result checksum in the response to a successful PUT request.

* subversion/libsvn_ra_serf/commit.c
  (file_context_t.svndiff_sent): New field.
  (file_context_t.remote_result_checksum): New field.
  (apply_textdelta): Update the comment stating that it would be nice to
   get rid of temporary files for svndiff delta.  Factor out the svndiff
   format selection logic ...
  (negotiate_put_encoding): ...into this new helper function.
  (open_txdelta_baton_t): New.
  (txdelta_stream_errfunc): New error function for the stream bucket.
  (create_body_from_txdelta_stream): New svn_ra_serf__request_body_delegate_t
   that creates the request body by opening an svn_txdelta_stream_t, turning
   it into svn_stream_t and giving away a bucket wrapping around that stream.
   Use it in ...
  (apply_textdelta_stream): ...this new function that performs a PUT, and
   streams the request body.
  (put_response_ctx_t): New.
  (put_response_handler): New, remembers the result checksum we received
   from the server.
  (close_file): Don't do a PUT if we did it in apply_textdelta_stream().
   Check for a checksum mismatch using the checksum returned from the server.
  (svn_ra_serf__get_commit_editor): Install the new apply_textdelta_stream()
   callback when working against new servers.

Modified:
    subversion/trunk/subversion/include/svn_dav.h
    subversion/trunk/subversion/include/svn_delta.h
    subversion/trunk/subversion/libsvn_delta/cancel.c
    subversion/trunk/subversion/libsvn_delta/default_editor.c
    subversion/trunk/subversion/libsvn_ra_serf/commit.c
    subversion/trunk/subversion/libsvn_ra_serf/options.c
    subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h
    subversion/trunk/subversion/mod_dav_svn/repos.c
    subversion/trunk/subversion/mod_dav_svn/version.c

Modified: subversion/trunk/subversion/include/svn_dav.h
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/include/svn_dav.h?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/include/svn_dav.h (original)
+++ subversion/trunk/subversion/include/svn_dav.h Thu Jul 27 09:00:43 2017
@@ -404,6 +404,15 @@ extern "C" {
 #define SVN_DAV_NS_DAV_SVN_SVNDIFF2\
             SVN_DAV_PROP_NS_DAV "svn/svndiff2"
 
+/** Presence of this in a DAV header in an OPTIONS response indicates
+ * that the transmitter (in this case, the server) sends the result
+ * checksum in the response to a successful PUT request.
+ *
+ * @since New in 1.10.
+ */
+#define SVN_DAV_NS_DAV_SVN_PUT_RESULT_CHECKSUM\
+            SVN_DAV_PROP_NS_DAV "svn/put-result-checksum"
+
 /** @} */
 
 /** @} */

Modified: subversion/trunk/subversion/include/svn_delta.h
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/include/svn_delta.h?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/include/svn_delta.h (original)
+++ subversion/trunk/subversion/include/svn_delta.h Thu Jul 27 09:00:43 2017
@@ -330,6 +330,18 @@ typedef svn_error_t *
 typedef const unsigned char *
 (*svn_txdelta_md5_digest_fn_t)(void *baton);
 
+/** A typedef for a function that opens an #svn_txdelta_stream_t object,
+ * allocated in @a result_pool.  @a baton is provided by the caller.
+ * Any temporary allocations may be performed in @a scratch_pool.
+ *
+ * @since New in 1.10.
+ */
+typedef svn_error_t *
+(*svn_txdelta_stream_open_func_t)(svn_txdelta_stream_t **txdelta_stream,
+                                  void *baton,
+                                  apr_pool_t *result_pool,
+                                  apr_pool_t *scratch_pool);
+
 /** Create and return a generic text delta stream with @a baton, @a
  * next_window and @a md5_digest.  Allocate the new stream in @a
  * pool.
@@ -678,6 +690,9 @@ svn_txdelta_skip_svndiff_window(apr_file
  * @{
  */
 
+/* Forward declarations. */
+typedef struct svn_delta_editor_t svn_delta_editor_t;
+
 /** A structure full of callback functions the delta source will invoke
  * as it produces the delta.
  *
@@ -859,7 +874,7 @@ svn_txdelta_skip_svndiff_window(apr_file
  * dead; the only further operation which may be called on the editor
  * is @c abort_edit.
  */
-typedef struct svn_delta_editor_t
+struct svn_delta_editor_t
 {
   /** Set the target revision for this edit to @a target_revision.  This
    * call, if used, should precede all other editor calls.
@@ -1131,9 +1146,38 @@ typedef struct svn_delta_editor_t
   svn_error_t *(*abort_edit)(void *edit_baton,
                              apr_pool_t *scratch_pool);
 
+  /** Apply a text delta stream, yielding the new revision of a file.
+   *
+   * @a file_baton indicates the file we're creating or updating, and the
+   * ancestor file on which it is based; it is the baton set by some
+   * prior @c add_file or @c open_file callback.
+   *
+   * @a open_func is a function that opens a #svn_txdelta_stream_t object.
+   * @a open_baton is provided by the caller.
+   *
+   * @a base_checksum is the hex MD5 digest for the base text against
+   * which the delta is being applied; it is ignored if NULL, and may
+   * be ignored even if not NULL.  If it is not ignored, it must match
+   * the checksum of the base text against which svndiff data is being
+   * applied; if it does not, @c apply_textdelta_stream call which detects
+   * the mismatch will return the error SVN_ERR_CHECKSUM_MISMATCH
+   * (if there is no base text, there may still be an error if
+   * @a base_checksum is neither NULL nor the hex MD5 checksum of the
+   * empty string).
+   *
+   * Any temporary allocations may be performed in @a scratch_pool.
+   */
+  svn_error_t *(*apply_textdelta_stream)(
+    const svn_delta_editor_t *editor,
+    void *file_baton,
+    const char *base_checksum,
+    svn_txdelta_stream_open_func_t open_func,
+    void *open_baton,
+    apr_pool_t *scratch_pool);
+
   /* Be sure to update svn_delta_get_cancellation_editor() and
    * svn_delta_default_editor() if you add a new callback here. */
-} svn_delta_editor_t;
+};
 
 
 /** Return a default delta editor template, allocated in @a pool.

Modified: subversion/trunk/subversion/libsvn_delta/cancel.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/cancel.c?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_delta/cancel.c (original)
+++ subversion/trunk/subversion/libsvn_delta/cancel.c Thu Jul 27 09:00:43 2017
@@ -222,6 +222,26 @@ apply_textdelta(void *file_baton,
 }
 
 static svn_error_t *
+apply_textdelta_stream(const svn_delta_editor_t *editor,
+                       void *file_baton,
+                       const char *base_checksum,
+                       svn_txdelta_stream_open_func_t open_func,
+                       void *open_baton,
+                       apr_pool_t *scratch_pool)
+{
+  struct file_baton *fb = file_baton;
+  struct edit_baton *eb = fb->edit_baton;
+
+  SVN_ERR(eb->cancel_func(eb->cancel_baton));
+
+  return eb->wrapped_editor->apply_textdelta_stream(eb->wrapped_editor,
+                                                    fb->wrapped_file_baton,
+                                                    base_checksum,
+                                                    open_func, open_baton,
+                                                    scratch_pool);
+}
+
+static svn_error_t *
 close_file(void *file_baton,
            const char *text_checksum,
            apr_pool_t *pool)
@@ -354,6 +374,7 @@ svn_delta_get_cancellation_editor(svn_ca
       tree_editor->add_file = add_file;
       tree_editor->open_file = open_file;
       tree_editor->apply_textdelta = apply_textdelta;
+      tree_editor->apply_textdelta_stream = apply_textdelta_stream;
       tree_editor->change_file_prop = change_file_prop;
       tree_editor->close_file = close_file;
       tree_editor->absent_file = absent_file;

Modified: subversion/trunk/subversion/libsvn_delta/default_editor.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delta/default_editor.c?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_delta/default_editor.c (original)
+++ subversion/trunk/subversion/libsvn_delta/default_editor.c Thu Jul 27 09:00:43 2017
@@ -133,6 +133,33 @@ close_file(void *file_baton,
 }
 
 
+static svn_error_t *
+apply_textdelta_stream(const svn_delta_editor_t *editor,
+                       void *file_baton,
+                       const char *base_checksum,
+                       svn_txdelta_stream_open_func_t open_func,
+                       void *open_baton,
+                       apr_pool_t *scratch_pool)
+{
+  svn_txdelta_window_handler_t handler;
+  void *handler_baton;
+
+  SVN_ERR(editor->apply_textdelta(file_baton, base_checksum,
+                                  scratch_pool, &handler,
+                                  &handler_baton));
+  if (handler != svn_delta_noop_window_handler)
+    {
+      svn_txdelta_stream_t *txdelta_stream;
+
+      SVN_ERR(open_func(&txdelta_stream, open_baton, scratch_pool,
+                        scratch_pool));
+      SVN_ERR(svn_txdelta_send_txstream(txdelta_stream, handler,
+                                        handler_baton, scratch_pool));
+    }
+
+  return SVN_NO_ERROR;
+}
+
 
 static const svn_delta_editor_t default_editor =
 {
@@ -151,7 +178,8 @@ static const svn_delta_editor_t default_
   close_file,
   absent_xxx_func,
   single_baton_func,
-  single_baton_func
+  single_baton_func,
+  apply_textdelta_stream
 };
 
 svn_delta_editor_t *

Modified: subversion/trunk/subversion/libsvn_ra_serf/commit.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_serf/commit.c?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_ra_serf/commit.c (original)
+++ subversion/trunk/subversion/libsvn_ra_serf/commit.c Thu Jul 27 09:00:43 2017
@@ -176,12 +176,18 @@ typedef struct file_context_t {
   /* Buffer holding the svndiff (can spill to disk). */
   svn_ra_serf__request_body_t *svndiff;
 
+  /* Did we send the svndiff in apply_textdelta_stream()? */
+  svn_boolean_t svndiff_sent;
+
   /* Our base checksum as reported by the WC. */
   const char *base_checksum;
 
   /* Our resulting checksum as reported by the WC. */
   const char *result_checksum;
 
+  /* Our resulting checksum as reported by the server. */
+  svn_checksum_t *remote_result_checksum;
+
   /* Changed properties (const char * -> svn_prop_t *) */
   apr_hash_t *prop_changes;
 
@@ -1859,42 +1865,12 @@ open_file(const char *path,
   return SVN_NO_ERROR;
 }
 
-static svn_error_t *
-apply_textdelta(void *file_baton,
-                const char *base_checksum,
-                apr_pool_t *pool,
-                svn_txdelta_window_handler_t *handler,
-                void **handler_baton)
+static void
+negotiate_put_encoding(int *svndiff_version,
+                       int *svndiff_compression_level,
+                       svn_ra_serf__session_t *session)
 {
-  file_context_t *ctx = file_baton;
-  int svndiff_version;
-  int compression_level;
-
-  /* Construct a holder for the request body; we'll give it to serf when we
-   * close this file.
-   *
-   * TODO: There should be a way we can stream the request body instead of
-   * possibly writing to a temporary file (ugh). A special svn stream serf
-   * bucket that returns EAGAIN until we receive the done call?  But, when
-   * would we run through the serf context?  Grr.
-   *
-   * BH: If you wait to a specific event... why not use that event to
-   *     trigger the operation?
-   *     Having a request (body) bucket return EAGAIN until done stalls
-   *     the entire HTTP pipeline after writing the first part of the
-   *     request. It is not like we can interrupt some part of a request
-   *     and continue later. Or somebody else must use tempfiles and
-   *     always assume that clients work this bad... as it only knows
-   *     for sure after the request is completely available.
-   */
-
-  ctx->svndiff =
-    svn_ra_serf__request_body_create(SVN_RA_SERF__REQUEST_BODY_IN_MEM_SIZE,
-                                     ctx->pool);
-  ctx->stream = svn_ra_serf__request_body_get_stream(ctx->svndiff);
-
-  if (ctx->commit_ctx->session->supports_svndiff1 &&
-      ctx->commit_ctx->session->using_compression)
+  if (session->supports_svndiff1 && session->using_compression)
     {
       /* Prefer svndiff1 when using http compression, as svndiff2 is not a
        * substitute for svndiff1 with default compression level.  (It gives
@@ -1906,14 +1882,13 @@ apply_textdelta(void *file_baton,
        * _including_ the preferred order.  This would allow us to dynamically
        * pick svndiff2 if that's what the server thinks is appropriate.
        */
-      svndiff_version = 1;
-      compression_level = SVN_DELTA_COMPRESSION_LEVEL_DEFAULT;
+      *svndiff_version = 1;
+      *svndiff_compression_level = SVN_DELTA_COMPRESSION_LEVEL_DEFAULT;
     }
-  else if (ctx->commit_ctx->session->supports_svndiff2 &&
-           ctx->commit_ctx->session->using_compression)
+  else if (session->supports_svndiff2 && session->using_compression)
     {
-      svndiff_version = 2;
-      compression_level = SVN_DELTA_COMPRESSION_LEVEL_NONE;
+      *svndiff_version = 2;
+      *svndiff_compression_level = SVN_DELTA_COMPRESSION_LEVEL_NONE;
     }
   else
     {
@@ -1927,10 +1902,40 @@ apply_textdelta(void *file_baton,
        * the usage of the uncompressed format by setting the corresponding
        * client configuration option, if they want to.
        */
-      svndiff_version = 0;
-      compression_level = SVN_DELTA_COMPRESSION_LEVEL_NONE;
+      *svndiff_version = 0;
+      *svndiff_compression_level = SVN_DELTA_COMPRESSION_LEVEL_NONE;
     }
+}
 
+static svn_error_t *
+apply_textdelta(void *file_baton,
+                const char *base_checksum,
+                apr_pool_t *pool,
+                svn_txdelta_window_handler_t *handler,
+                void **handler_baton)
+{
+  file_context_t *ctx = file_baton;
+  int svndiff_version;
+  int compression_level;
+
+  /* Construct a holder for the request body; we'll give it to serf when we
+   * close this file.
+   *
+   * Please note that if this callback is used, large request bodies will
+   * be spilled into temporary files (that requires disk space and prevents
+   * simultaneous processing by the server and the client).  A better approach
+   * that streams the request body is implemented in apply_textdelta_stream().
+   * It will be used with most recent servers having the "send result checksum
+   * in response to a PUT" capability, and only if the editor driver uses the
+   * new callback.
+   */
+  ctx->svndiff =
+    svn_ra_serf__request_body_create(SVN_RA_SERF__REQUEST_BODY_IN_MEM_SIZE,
+                                     ctx->pool);
+  ctx->stream = svn_ra_serf__request_body_get_stream(ctx->svndiff);
+
+  negotiate_put_encoding(&svndiff_version, &compression_level,
+                         ctx->commit_ctx->session);
   /* Disown the stream; we'll close it explicitly in close_file(). */
   svn_txdelta_to_svndiff3(handler, handler_baton,
                           svn_stream_disown(ctx->stream, pool),
@@ -1942,6 +1947,146 @@ apply_textdelta(void *file_baton,
   return SVN_NO_ERROR;
 }
 
+typedef struct open_txdelta_baton_t
+{
+  svn_ra_serf__session_t *session;
+  svn_txdelta_stream_open_func_t open_func;
+  void *open_baton;
+  svn_error_t *err;
+} open_txdelta_baton_t;
+
+static void
+txdelta_stream_errfunc(void *baton, svn_error_t *err)
+{
+  open_txdelta_baton_t *b = baton;
+
+  /* Remember extended error info from the stream bucket.  Note that
+   * theoretically this errfunc could be called multiple times -- say,
+   * if the request gets restarted after an error.  Compose the errors
+   * so we don't leak one of them if this happens. */
+  b->err = svn_error_compose_create(b->err, svn_error_dup(err));
+}
+
+/* Implements svn_ra_serf__request_body_delegate_t */
+static svn_error_t *
+create_body_from_txdelta_stream(serf_bucket_t **body_bkt,
+                                void *baton,
+                                serf_bucket_alloc_t *alloc,
+                                apr_pool_t *pool /* request pool */,
+                                apr_pool_t *scratch_pool)
+{
+  open_txdelta_baton_t *b = baton;
+  svn_txdelta_stream_t *txdelta_stream;
+  svn_stream_t *stream;
+  int svndiff_version;
+  int compression_level;
+
+  SVN_ERR(b->open_func(&txdelta_stream, b->open_baton, pool, scratch_pool));
+
+  negotiate_put_encoding(&svndiff_version, &compression_level, b->session);
+  stream = svn_txdelta_to_svndiff_stream(txdelta_stream, svndiff_version,
+                                         compression_level, pool);
+  *body_bkt = svn_ra_serf__create_stream_bucket(stream, alloc,
+                                                txdelta_stream_errfunc, b);
+
+  return SVN_NO_ERROR;
+}
+
+/* Handler baton for PUT request. */
+typedef struct put_response_ctx_t
+{
+  svn_ra_serf__handler_t *handler;
+  file_context_t *file_ctx;
+} put_response_ctx_t;
+
+/* Implements svn_ra_serf__response_handler_t */
+static svn_error_t *
+put_response_handler(serf_request_t *request,
+                     serf_bucket_t *response,
+                     void *baton,
+                     apr_pool_t *scratch_pool)
+{
+  put_response_ctx_t *prc = baton;
+  serf_bucket_t *hdrs;
+  const char *val;
+
+  hdrs = serf_bucket_response_get_headers(response);
+  val = serf_bucket_headers_get(hdrs, SVN_DAV_RESULT_FULLTEXT_MD5_HEADER);
+  SVN_ERR(svn_checksum_parse_hex(&prc->file_ctx->remote_result_checksum,
+                                 svn_checksum_md5, val, prc->file_ctx->pool));
+
+  return svn_error_trace(
+           svn_ra_serf__expect_empty_body(request, response,
+                                          prc->handler, scratch_pool));
+}
+
+static svn_error_t *
+apply_textdelta_stream(const svn_delta_editor_t *editor,
+                       void *file_baton,
+                       const char *base_checksum,
+                       svn_txdelta_stream_open_func_t open_func,
+                       void *open_baton,
+                       apr_pool_t *scratch_pool)
+{
+  file_context_t *ctx = file_baton;
+  open_txdelta_baton_t open_txdelta_baton = {0};
+  svn_ra_serf__handler_t *handler;
+  put_response_ctx_t *prc;
+  int expected_result;
+  svn_error_t *err;
+
+  /* Remember that we have sent the svndiff.  A case when we need to
+   * perform a zero-byte file PUT (during add_file, close_file editor
+   * sequences) is handled in close_file().
+   */
+  ctx->svndiff_sent = TRUE;
+  ctx->base_checksum = base_checksum;
+
+  handler = svn_ra_serf__create_handler(ctx->commit_ctx->session,
+                                        scratch_pool);
+  handler->method = "PUT";
+  handler->path = ctx->url;
+
+  prc = apr_pcalloc(scratch_pool, sizeof(*prc));
+  prc->handler = handler;
+  prc->file_ctx = ctx;
+
+  handler->response_handler = put_response_handler;
+  handler->response_baton = prc;
+
+  open_txdelta_baton.session = ctx->commit_ctx->session;
+  open_txdelta_baton.open_func = open_func;
+  open_txdelta_baton.open_baton = open_baton;
+  open_txdelta_baton.err = SVN_NO_ERROR;
+
+  handler->body_delegate = create_body_from_txdelta_stream;
+  handler->body_delegate_baton = &open_txdelta_baton;
+  handler->body_type = SVN_SVNDIFF_MIME_TYPE;
+
+  handler->header_delegate = setup_put_headers;
+  handler->header_delegate_baton = ctx;
+
+  err = svn_ra_serf__context_run_one(handler, scratch_pool);
+  /* Do we have an error from the stream bucket?  If yes, use it. */
+  if (open_txdelta_baton.err)
+    {
+      svn_error_clear(err);
+      return svn_error_trace(open_txdelta_baton.err);
+    }
+  else if (err)
+    return svn_error_trace(err);
+
+  if (ctx->added && !ctx->copy_path)
+    expected_result = 201; /* Created */
+  else
+    expected_result = 204; /* Updated */
+
+  if (handler->sline.code != expected_result)
+    return svn_error_trace(svn_ra_serf__unexpected_status(handler));
+
+  return SVN_NO_ERROR;
+}
+
 static svn_error_t *
 change_file_prop(void *file_baton,
                  const char *name,
@@ -1977,8 +2122,8 @@ close_file(void *file_baton,
   if ((!ctx->svndiff) && ctx->added && (!ctx->copy_path))
     put_empty_file = TRUE;
 
-  /* If we had a stream of changes, push them to the server... */
-  if (ctx->svndiff || put_empty_file)
+  /* If we have a stream of changes, push them to the server... */
+  if ((ctx->svndiff || put_empty_file) && !ctx->svndiff_sent)
     {
       svn_ra_serf__handler_t *handler;
       int expected_result;
@@ -2043,6 +2188,22 @@ close_file(void *file_baton,
                                  proppatch, scratch_pool));
     }
 
+  if (ctx->result_checksum && ctx->remote_result_checksum)
+    {
+      svn_checksum_t *result_checksum;
+
+      SVN_ERR(svn_checksum_parse_hex(&result_checksum, svn_checksum_md5,
+                                     ctx->result_checksum, scratch_pool));
+
+      if (!svn_checksum_match(result_checksum, ctx->remote_result_checksum))
+        return svn_checksum_mismatch_err(result_checksum,
+                                         ctx->remote_result_checksum,
+                                         scratch_pool,
+                                         _("Checksum mismatch for '%s'"),
+                                         svn_dirent_local_style(ctx->relpath,
+                                                                scratch_pool));
+    }
+
   ctx->commit_ctx->open_batons--;
 
   return SVN_NO_ERROR;
@@ -2218,6 +2379,12 @@ svn_ra_serf__get_commit_editor(svn_ra_se
   editor->close_file = close_file;
   editor->close_edit = close_edit;
   editor->abort_edit = abort_edit;
+  /* Only install the callback that allows streaming PUT request bodies
+   * if the server has the necessary capability.  Otherwise, this will
+   * fallback to the default implementation using the temporary files.
+   * See default_editor.c:apply_textdelta_stream(). */
+  if (session->supports_put_result_checksum)
+    editor->apply_textdelta_stream = apply_textdelta_stream;
 
   *ret_editor = editor;
   *edit_baton = ctx;

Modified: subversion/trunk/subversion/libsvn_ra_serf/options.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_serf/options.c?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_ra_serf/options.c (original)
+++ subversion/trunk/subversion/libsvn_ra_serf/options.c Thu Jul 27 09:00:43 2017
@@ -237,6 +237,10 @@ capabilities_headers_iterator_callback(v
           /* Same for svndiff2. */
           session->supports_svndiff2 = TRUE;
         }
+      if (svn_cstring_match_list(SVN_DAV_NS_DAV_SVN_PUT_RESULT_CHECKSUM, vals))
+        {
+          session->supports_put_result_checksum = TRUE;
+        }
     }
 
   /* SVN-specific headers -- if present, server supports HTTP protocol v2 */

Modified: subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h (original)
+++ subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h Thu Jul 27 09:00:43 2017
@@ -264,6 +264,10 @@ struct svn_ra_serf__session_t {
 
   /* Indicates whether the server can understand svndiff version 2. */
   svn_boolean_t supports_svndiff2;
+
+  /* Indicates whether the server sends the result checksum in the response
+   * to a successful PUT request. */
+  svn_boolean_t supports_put_result_checksum;
 };
 
 #define SVN_RA_SERF__HAVE_HTTPV2_SUPPORT(sess) ((sess)->me_resource != NULL)

Modified: subversion/trunk/subversion/mod_dav_svn/repos.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/mod_dav_svn/repos.c?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/mod_dav_svn/repos.c (original)
+++ subversion/trunk/subversion/mod_dav_svn/repos.c Thu Jul 27 09:00:43 2017
@@ -2987,6 +2987,26 @@ close_stream(dav_stream *stream, int com
            pool);
     }
 
+  if (stream->wstream != NULL || stream->delta_handler != NULL)
+    {
+      request_rec *r = stream->res->info->r;
+      svn_checksum_t *checksum;
+
+      serr = svn_fs_file_checksum(&checksum, svn_checksum_md5,
+                                  stream->res->info->root.root,
+                                  stream->res->info->repos_path,
+                                  FALSE, pool);
+      if (serr)
+        return dav_svn__convert_err
+          (serr, HTTP_INTERNAL_SERVER_ERROR,
+            "mod_dav_svn close_stream: error getting file checksum",
+            pool);
+
+      if (checksum)
+        apr_table_set(r->headers_out, SVN_DAV_RESULT_FULLTEXT_MD5_HEADER,
+                      svn_checksum_to_cstring(checksum, pool));
+    }
+
   return NULL;
 }
 

Modified: subversion/trunk/subversion/mod_dav_svn/version.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/mod_dav_svn/version.c?rev=1803143&r1=1803142&r2=1803143&view=diff
==============================================================================
--- subversion/trunk/subversion/mod_dav_svn/version.c (original)
+++ subversion/trunk/subversion/mod_dav_svn/version.c Thu Jul 27 09:00:43 2017
@@ -154,6 +154,7 @@ get_vsn_options(apr_pool_t *p, apr_text_
   apr_text_append(p, phdr, SVN_DAV_NS_DAV_SVN_REVERSE_FILE_REVS);
   apr_text_append(p, phdr, SVN_DAV_NS_DAV_SVN_SVNDIFF1);
   apr_text_append(p, phdr, SVN_DAV_NS_DAV_SVN_SVNDIFF2);
+  apr_text_append(p, phdr, SVN_DAV_NS_DAV_SVN_PUT_RESULT_CHECKSUM);
   /* Mergeinfo is a special case: here we merely say that the server
    * knows how to handle mergeinfo -- whether the repository does too
    * is a separate matter.



Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by James McCoy <ja...@jamessan.com>.
On Sat, Sep 02, 2017 at 08:18:00AM +0000, Daniel Shahaf wrote:
> [replying to multiple]
> 
> James McCoy wrote on Fri, Sep 01, 2017 at 08:15:05 -0400:
> > On Fri, Sep 01, 2017 at 03:07:16AM +0000, Daniel Shahaf wrote:
> > > I'm not sure I'm getting through here.
> > > 
> > > The note does say "Don't allocate one of these yourself" but in the
> > > second sentence implies that the reason for this is that if you allocate
> > > it yourself and don't initialize all pointer-to-function members,
> > > Trouble will ensue.  Therefore, the note could be construed as meaning
> > > that it is okay to allocate one of these yourself if you take care to
> > > initialize all pointer members.
> > 
> > The fact that a user can choose not to use the recommended API doesn't
> > mean that there is an ABI break.  A user can already choose to allocate
> > the structure themselves and not initialize everything.  Adding an item
> > to the struct doesn't change that.
> > 
> 
> I'm afraid I don't follow.  If a user allocates the struct themselves
> and doesn't initialize all members, then (1) they are in violation of
> the API's precondition, (2) their code will (depending on the repository
> data) dereference an uninitialized pointer-to-function at runtime.
> That's undefined behaviour de jure and some form of crash de facto.

Right, but it's not an ABI break.  It's misuing the API.

> > I ran abi-compliance-checker[0] against the head of branches/1.9.x and
> > trunk.  It shows no incompatibility issues.  The report is attached for
> > anyone that wants to view it.
> > 
> > [0]: http://ispras.linuxbase.org/index.php/ABI_compliance_checker
> 
> Thanks for adding this datapoint, but, how does it support your
> argument?  The checker's output confirms what we already knew without
> it: that adding a member to a struct type can result in breakage when
> code compiled against 1.9 is run against 1.10 without rebuilding:

It confirms that there's not an ABI break.  Code that's misuing the API
can break, yes, but not code that's using the API correctly.

Cheers,
-- 
James
GPG Key: 4096R/91BF BF4D 6956 BD5D F7B7  2D23 DFE6 91AE 331B A3DB

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Evgeny Kotkov <ev...@visualsvn.com> writes:

> Daniel Shahaf <d....@daniel.shahaf.name> writes:
>
>>> +   *
>>> +   * Any temporary allocations may be performed in @a scratch_pool.
>>
>> Need to add an @since tag here.
>
> [...]
>
>>> +   */
>>> +  svn_error_t *(*apply_textdelta_stream)(
>>
>> Could you update the docstring of svn_delta_editor_t itself to mention this
>> new callback?  All other callbacks are discussed there.
>
> [...]
>
>>> +    const svn_delta_editor_t *editor,
>>
>> This parameter is undocumented.
>
> Thank you for the review.  I agree with these three points, and will try
> to make the suggested tweaks to the documentation.

Committed in r1816063.


Daniel Shahaf <d....@daniel.shahaf.name> writes:

> +1, but note that the "shouldn't" language in current HEAD, which this
> patch [once rebased to HEAD] will remove, was copied from some other
> docstring.  I made no note of which one specifically, because I think
> that "shouldn't" language is our standard formula for docstrings of
> non-opaque struct types.

Committed in r1816066.


Regards,
Evgeny Kotkov

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
> > - * @note Don't try to allocate one of these yourself.  Instead, always
> > - * use svn_delta_default_editor() or some other constructor, to ensure
> > - * that unused slots are filled in with no-op functions.
> > + * @note Fields may be added to the end of this structure in future
> > + * versions.  Therefore, users shouldn't allocate structures of this
> > + * type, to preserve binary compatibility.
> > + *
> > + * @note It is recommended to use svn_delta_default_editor() or some other
> > + * constructor, to ensure that unused slots are filled in with no-op functions.
>
> --- subversion/include/svn_delta.h      (revision 1806549)
> +++ subversion/include/svn_delta.h      (working copy)
> @@ -694,8 +694,10 @@ svn_txdelta_skip_svndiff_window(apr_file_t *file,
>   * as it produces the delta.
>   *
>   * @note Don't try to allocate one of these yourself.  Instead, always
> - * use svn_delta_default_editor() or some other constructor, to ensure
> - * that unused slots are filled in with no-op functions.
> + * use svn_delta_default_editor() or some other constructor, to avoid
> + * backwards compatibility problems if the structure is extended in
> + * future releases and to ensure that unused slots are filled in with
> + * no-op functions.

+1, but note that the "shouldn't" language in current HEAD, which this
patch [once rebased to HEAD] will remove, was copied from some other
docstring.  I made no note of which one specifically, because I think
that "shouldn't" language is our standard formula for docstrings of
non-opaque struct types.

Let's also backport this docs patch to 1.9.8.  (As a docs patch it needs
just one vote)

Cheers,

Daniel

P.S. I still think a release notes entry would be warranted, but not
strongly enough to reopen that angle.

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Daniel Shahaf <d....@daniel.shahaf.name> writes:

>> Also, I am not against the idea of tweaking the note by saying something
>> like "...because we may add new fields in the future", but I don't think
>> that it is required either.  (In other words, I am +0 to that.)
>
> Done in r1807028.

I think that although r1807028 provides the additional information to the
users, it simultaneously makes the API requirements less strict:

> - * @note Don't try to allocate one of these yourself.  Instead, always
> - * use svn_delta_default_editor() or some other constructor, to ensure
> - * that unused slots are filled in with no-op functions.
> + * @note Fields may be added to the end of this structure in future
> + * versions.  Therefore, users shouldn't allocate structures of this
> + * type, to preserve binary compatibility.
> + *
> + * @note It is recommended to use svn_delta_default_editor() or some other
> + * constructor, to ensure that unused slots are filled in with no-op functions.

While it does clarify that new fields can be added to the struct, this
changeset also replaces words like "don't try" and "always" with "shouldn't"
and "is recommended" thus making the requirements a recommendation:

 - "Don't try to allocate one of these yourself" became "users shouldn't
   allocate structures"

 - "Instead, always use svn_delta_default_editor()" is now "It is
   recommended to use svn_delta_default_editor()"

Perhaps, a better way would be to keep the original wording, but mention
that the structure may be extended, as in this alternative patch:

[[[
--- subversion/include/svn_delta.h      (revision 1806549)
+++ subversion/include/svn_delta.h      (working copy)
@@ -694,8 +694,10 @@ svn_txdelta_skip_svndiff_window(apr_file_t *file,
  * as it produces the delta.
  *
  * @note Don't try to allocate one of these yourself.  Instead, always
- * use svn_delta_default_editor() or some other constructor, to ensure
- * that unused slots are filled in with no-op functions.
+ * use svn_delta_default_editor() or some other constructor, to avoid
+ * backwards compatibility problems if the structure is extended in
+ * future releases and to ensure that unused slots are filled in with
+ * no-op functions.
  *
  * <h3>Function Usage</h3>
  *
]]]


Regards,
Evgeny Kotkov

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Sat, Sep 02, 2017 at 08:18:00 +0000:
> However, I find the docstring ambiguous and you do not.

.oO ( Doesn't this _ipso facto_ mean the docstring is ambiguous? )

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by James McCoy <ja...@jamessan.com>.
On Fri, Sep 01, 2017 at 03:07:16AM +0000, Daniel Shahaf wrote:
> Evgeny Kotkov wrote on Thu, 31 Aug 2017 19:05 +0300:
> > Daniel Shahaf <d....@daniel.shahaf.name> writes:
> > > Is adding this function an ABI-compatible change?  The docstring of
> > > svn_delta_editor_t does say """
> > >
> > >  * @note Don't try to allocate one of these yourself.  Instead, always
> > >  * use svn_delta_default_editor() or some other constructor, to ensure
> > >  * that unused slots are filled in with no-op functions.
> > >
> > > """, but an API consumer might have interpreted this note as meaning "You may
> > > use malloc(..., sizeof(svn_delta_editor_t)) if you take care to initialize
> > > all struct members", in which case, his code will not be ABI compatible
> > > with 1.10.
> > 
> > I think that adding this callback does not affect the ABI compatibility.
> > The note says "Don't try to allocate one of these yourself", whereas the
> > malloc(..., sizeof(svn_delta_editor_t)) example does exactly the opposite.
> 
> I'm not sure I'm getting through here.
> 
> The note does say "Don't allocate one of these yourself" but in the
> second sentence implies that the reason for this is that if you allocate
> it yourself and don't initialize all pointer-to-function members,
> Trouble will ensue.  Therefore, the note could be construed as meaning
> that it is okay to allocate one of these yourself if you take care to
> initialize all pointer members.

The fact that a user can choose not to use the recommended API doesn't
mean that there is an ABI break.  A user can already choose to allocate
the structure themselves and not initialize everything.  Adding an item
to the struct doesn't change that.

I ran abi-compliance-checker[0] against the head of branches/1.9.x and
trunk.  It shows no incompatibility issues.  The report is attached for
anyone that wants to view it.

[0]: http://ispras.linuxbase.org/index.php/ABI_compliance_checker

Cheers,
-- 
James
GPG Key: 4096R/91BF BF4D 6956 BD5D F7B7  2D23 DFE6 91AE 331B A3DB

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Daniel Shahaf <d....@daniel.shahaf.name> writes:

> I'm not sure I'm getting through here.
>
> The note does say "Don't allocate one of these yourself" but in the
> second sentence implies that the reason for this is that if you allocate
> it yourself and don't initialize all pointer-to-function members,
> Trouble will ensue.  Therefore, the note could be construed as meaning
> that it is okay to allocate one of these yourself if you take care to
> initialize all pointer members.
>
> In other words: the comment could have been interpreted as a piece
> of advice --- "it is more robust to use _default_editor() than to allocate
> with malloc" --- as opposed to a blanket prohibition on allocating this
> struct type with malloc.
>
> (If one uses malloc and doesn't initialize all pointers, the result
> would be that an uninitialized pointer-to-function struct member is
> dereferenced.)
>
> In contrast, most of our other structs explicitly say that "Don't
> allocate yourself because we may add new fields in the future".  This
> struct does not give that reason.
>
> Makes sense?
>
> I suppose can just add an API erratum and/or release notes entry about this.

I think that the important thing about this documentation is that it
states that:

 (1) The API user shouldn't try to allocate the struct himself.

 (2) A constructor such as svn_delta_default_editor() should *always*
     be used.

To my mind, the current statement is clear and it is not possible to
interpret it as if it's allowed to malloc() the struct and initialize it
manually.

My opinion here is that neither the API erratum nor including this in the
release notes are required, and doing so would just unnecessarily restate
the information that's already available.

Also, I am not against the idea of tweaking the note by saying something
like "...because we may add new fields in the future", but I don't think
that it is required either.  (In other words, I am +0 to that.)


Regards,
Evgeny Kotkov

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Stefan Hett <st...@egosoft.com>.
On 9/1/2017 5:07 AM, Daniel Shahaf wrote:
> Evgeny Kotkov wrote on Thu, 31 Aug 2017 19:05 +0300:
>> Daniel Shahaf <d....@daniel.shahaf.name> writes:
>>> Is adding this function an ABI-compatible change?  The docstring of
>>> svn_delta_editor_t does say """
>>>
>>>   * @note Don't try to allocate one of these yourself.  Instead, always
>>>   * use svn_delta_default_editor() or some other constructor, to ensure
>>>   * that unused slots are filled in with no-op functions.
>>>
>>> """, but an API consumer might have interpreted this note as meaning "You may
>>> use malloc(..., sizeof(svn_delta_editor_t)) if you take care to initialize
>>> all struct members", in which case, his code will not be ABI compatible
>>> with 1.10.
>> I think that adding this callback does not affect the ABI compatibility.
>> The note says "Don't try to allocate one of these yourself", whereas the
>> malloc(..., sizeof(svn_delta_editor_t)) example does exactly the opposite.
> I'm not sure I'm getting through here.
>
> The note does say "Don't allocate one of these yourself" but in the
> second sentence implies that the reason for this is that if you allocate
> it yourself and don't initialize all pointer-to-function members,
> Trouble will ensue.  Therefore, the note could be construed as meaning
> that it is okay to allocate one of these yourself if you take care to
> initialize all pointer members.
>
> In other words: the comment could have been interpreted as a piece
> of advice --- "it is more robust to use _default_editor() than to allocate
> with malloc" --- as opposed to a blanket prohibition on allocating this
> struct type with malloc.
>
> (If one uses malloc and doesn't initialize all pointers, the result
> would be that an uninitialized pointer-to-function struct member is
> dereferenced.)
>
> In contrast, most of our other structs explicitly say that "Don't
> allocate yourself because we may add new fields in the future".  This
> struct does not give that reason.
>
> Makes sense?
>
> I suppose can just add an API erratum and/or release notes entry about this.
>
> Thanks,
>
> Daniel
> (I'll reply to your other points separately)
>
I think your proposal to add an erratum and "correct/update" the 
documentation about allocating the struct is the right way to go here.
IMO this is only a minor inconsistency in the documentation (i.e. it 
should be as clear as the rest of the API documentation with regards to 
what the implications of not using the appropriate allocation methods are).

FWIW: I always read that current note already like this and interpreted 
the sentence as just giving an example of why this is important and what 
might go wrong if you don't follow the advice. But I certainly also see 
that it might be interpreted in another way. So no harm in being precise 
here and adding an erratum, I guess...

-- 
Regards,
Stefan Hett


Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Evgeny Kotkov wrote on Thu, 31 Aug 2017 19:05 +0300:
> Daniel Shahaf <d....@daniel.shahaf.name> writes:
> > Is adding this function an ABI-compatible change?  The docstring of
> > svn_delta_editor_t does say """
> >
> >  * @note Don't try to allocate one of these yourself.  Instead, always
> >  * use svn_delta_default_editor() or some other constructor, to ensure
> >  * that unused slots are filled in with no-op functions.
> >
> > """, but an API consumer might have interpreted this note as meaning "You may
> > use malloc(..., sizeof(svn_delta_editor_t)) if you take care to initialize
> > all struct members", in which case, his code will not be ABI compatible
> > with 1.10.
> 
> I think that adding this callback does not affect the ABI compatibility.
> The note says "Don't try to allocate one of these yourself", whereas the
> malloc(..., sizeof(svn_delta_editor_t)) example does exactly the opposite.

I'm not sure I'm getting through here.

The note does say "Don't allocate one of these yourself" but in the
second sentence implies that the reason for this is that if you allocate
it yourself and don't initialize all pointer-to-function members,
Trouble will ensue.  Therefore, the note could be construed as meaning
that it is okay to allocate one of these yourself if you take care to
initialize all pointer members.

In other words: the comment could have been interpreted as a piece
of advice --- "it is more robust to use _default_editor() than to allocate
with malloc" --- as opposed to a blanket prohibition on allocating this
struct type with malloc.

(If one uses malloc and doesn't initialize all pointers, the result
would be that an uninitialized pointer-to-function struct member is
dereferenced.)

In contrast, most of our other structs explicitly say that "Don't
allocate yourself because we may add new fields in the future".  This
struct does not give that reason.

Makes sense?

I suppose can just add an API erratum and/or release notes entry about this.

Thanks,

Daniel
(I'll reply to your other points separately)

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Daniel Shahaf <d....@daniel.shahaf.name> writes:

>> +   *
>> +   * Any temporary allocations may be performed in @a scratch_pool.
>
> Need to add an @since tag here.

[...]

>> +   */
>> +  svn_error_t *(*apply_textdelta_stream)(
>
> Could you update the docstring of svn_delta_editor_t itself to mention this
> new callback?  All other callbacks are discussed there.

[...]

>> +    const svn_delta_editor_t *editor,
>
> This parameter is undocumented.

Thank you for the review.  I agree with these three points, and will try
to make the suggested tweaks to the documentation.

>> +  /** Apply a text delta stream, yielding the new revision of a file.
>> +   *
>> +   * @a file_baton indicates the file we're creating or updating, and the
>> +   * ancestor file on which it is based; it is the baton set by some
>> +   * prior @c add_file or @c open_file callback.
>> +   *
>> +   * @a open_func is a function that opens a #svn_txdelta_stream_t object.
>> +   * @a open_baton is provided by the caller.
>> +   *
>> +   * @a base_checksum is the hex MD5 digest for the base text against
>> +   * which the delta is being applied; it is ignored if NULL, and may
>> +   * be ignored even if not NULL.  If it is not ignored, it must match
>
> What's the rationale for allowing drivees to ignore the checksum?
>
> This leeway enables failure modes that wouldn't be possible without it.
> (Drivers that are aware of this leeway will validate checksums even if the
> drivee doesn't, leading to duplicate work; drivers that are unaware of this
> requirement might not get checksum errors they should have.)
>
> I get that you just copied this part of the docstring from apply_textdelta(),
> but I'd like to understand what's the rationale here.  (And to see if this
> leeway should be deprecated)

My interpretation of the documentation is that "the result checksum must
be validated by the implementation, whereas validating the checksum of the
base text may be omitted".

Perhaps, there are cases where the base checksum won't be validated by
our existing implementations (what about BDB, for instance?), but in the
meanwhile, I'm not too sure about the gain from _always_ requiring such
checks.

I think that, from the editor driver's point of view, the important thing
is the result checksum.  If the implementation also validates the base
checksum, that may allow it to skip actually applying the delta in case
of the early mismatch, give away better error messages ("I have a base
checksum mismatch" instead of "I can't apply instruction 7 at offset
12345") and maybe even detect the coding mistakes which cause the
delta to be applied to an unexpected base, but still yielding the
expected resulting fulltext.

Having all these properties is nice, but probably not mandatory.  And I
think that lifting the optionality of this check could shoot us in the foot
in the future, if we find ourselves writing an implementation where it is
particularly tricky to always implement the base text checks — for instance,
due to the used protocol or any other technical reasons.

Hope that I am not missing something subtle here :)

>> +   * the checksum of the base text against which svndiff data is being
>> +   * applied; if it does not, @c apply_textdelta_stream call which detects
>> +   * the mismatch will return the error SVN_ERR_CHECKSUM_MISMATCH
>> +   * (if there is no base text, there may still be an error if
>> +   * @a base_checksum is neither NULL nor the hex MD5 checksum of the
>> +   * empty string).
>
> To the best of my knowledge, we don't special case the empty string's
> checksum, d41d8cd98f00b204e9800998ecf8427e, anywhere.  We do special-case
> 00000000000000000000000000000000, though.  I assume the parenthetical should
> be fixed accordingly (both here and in apply_textdelta())?

I agree that this part of the docstring (same in svn_fs_apply_textdelta())
looks odd, and I don't think that the digest of an empty string is specially
handled somewhere in the implementations.

Perhaps, it would make sense to check on whether this has been true at
some point in the past and also tweak the docstring.

> Is adding this function an ABI-compatible change?  The docstring of
> svn_delta_editor_t does say """
>
>  * @note Don't try to allocate one of these yourself.  Instead, always
>  * use svn_delta_default_editor() or some other constructor, to ensure
>  * that unused slots are filled in with no-op functions.
>
> """, but an API consumer might have interpreted this note as meaning "You may
> use malloc(..., sizeof(svn_delta_editor_t)) if you take care to initialize
> all struct members", in which case, his code will not be ABI compatible
> with 1.10.

I think that adding this callback does not affect the ABI compatibility.
The note says "Don't try to allocate one of these yourself", whereas the
malloc(..., sizeof(svn_delta_editor_t)) example does exactly the opposite.


Thanks,
Evgeny Kotkov

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Daniel Shahaf <d....@daniel.shahaf.name> writes:

>> +   *
>> +   * Any temporary allocations may be performed in @a scratch_pool.
>
> Need to add an @since tag here.

[...]

>> +   */
>> +  svn_error_t *(*apply_textdelta_stream)(
>
> Could you update the docstring of svn_delta_editor_t itself to mention this
> new callback?  All other callbacks are discussed there.

[...]

>> +    const svn_delta_editor_t *editor,
>
> This parameter is undocumented.

Thank you for the review.  I agree with these three points, and will try
to make the suggested tweaks to the documentation.

>> +  /** Apply a text delta stream, yielding the new revision of a file.
>> +   *
>> +   * @a file_baton indicates the file we're creating or updating, and the
>> +   * ancestor file on which it is based; it is the baton set by some
>> +   * prior @c add_file or @c open_file callback.
>> +   *
>> +   * @a open_func is a function that opens a #svn_txdelta_stream_t object.
>> +   * @a open_baton is provided by the caller.
>> +   *
>> +   * @a base_checksum is the hex MD5 digest for the base text against
>> +   * which the delta is being applied; it is ignored if NULL, and may
>> +   * be ignored even if not NULL.  If it is not ignored, it must match
>
> What's the rationale for allowing drivees to ignore the checksum?
>
> This leeway enables failure modes that wouldn't be possible without it.
> (Drivers that are aware of this leeway will validate checksums even if the
> drivee doesn't, leading to duplicate work; drivers that are unaware of this
> requirement might not get checksum errors they should have.)
>
> I get that you just copied this part of the docstring from apply_textdelta(),
> but I'd like to understand what's the rationale here.  (And to see if this
> leeway should be deprecated)

My interpretation of the documentation is that "the result checksum must
be validated by the implementation, whereas validating the checksum of the
base text may be omitted".

Perhaps, there are cases where the base checksum won't be validated by
our existing implementations (what about BDB, for instance?), but in the
meanwhile, I'm not too sure about the gain from _always_ requiring such
checks.

I think that, from the editor driver's point of view, the important thing
is the result checksum.  If the implementation also validates the base
checksum, that may allow it to skip actually applying the delta in case
of the early mismatch, give away better error messages ("I have a base
checksum mismatch" instead of "I can't apply instruction 7 at offset
12345") and maybe even detect the coding mistakes which cause the
delta to be applied to an unexpected base, but still yielding the
expected resulting fulltext.

Having all these properties is nice, but probably not mandatory.  And I
think that lifting the optionality of this check could shoot us in the foot
in the future, if we find ourselves writing an implementation where it is
particularly tricky to always implement the base text checks — for instance,
due to the used protocol or any other technical reasons.

Hope that I am not missing something subtle here :)

>> +   * the checksum of the base text against which svndiff data is being
>> +   * applied; if it does not, @c apply_textdelta_stream call which detects
>> +   * the mismatch will return the error SVN_ERR_CHECKSUM_MISMATCH
>> +   * (if there is no base text, there may still be an error if
>> +   * @a base_checksum is neither NULL nor the hex MD5 checksum of the
>> +   * empty string).
>
> To the best of my knowledge, we don't special case the empty string's
> checksum, d41d8cd98f00b204e9800998ecf8427e, anywhere.  We do special-case
> 00000000000000000000000000000000, though.  I assume the parenthetical should
> be fixed accordingly (both here and in apply_textdelta())?

I agree that this part of the docstring (same in svn_fs_apply_textdelta())
looks odd, and I don't think that the digest of an empty string is specially
handled somewhere in the implementations.

Perhaps, it would make sense to check on whether this has been true at
some point in the past and also tweak the docstring.

> Is adding this function an ABI-compatible change?  The docstring of
> svn_delta_editor_t does say """
>
>  * @note Don't try to allocate one of these yourself.  Instead, always
>  * use svn_delta_default_editor() or some other constructor, to ensure
>  * that unused slots are filled in with no-op functions.
>
> """, but an API consumer might have interpreted this note as meaning "You may
> use malloc(..., sizeof(svn_delta_editor_t)) if you take care to initialize
> all struct members", in which case, his code will not be ABI compatible
> with 1.10.

I think that adding this callback does not affect the ABI compatibility.
The note says "Don't try to allocate one of these yourself", whereas the
malloc(..., sizeof(svn_delta_editor_t)) example does exactly the opposite.


Thanks,
Evgeny Kotkov

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Bert Huijben wrote on Wed, 30 Aug 2017 12:24 +0200:
> I think this just documents current behavior. Yes a 1.9+ client
> against a 1.9+ server will always have a checksum, but this is not the
> case when mixing older clients and servers.
>
> Original serf versions (form before we declared this stable) typically
> never provided the checksum. And in some cases bulk requests didn't
> have all the checksums either. I remember fixing a few cases around
> WC-NG to make sure all ra layers reported the same errors in some
> exceptional cases.

Thanks for the clarification, Bert.  It sounds like the various commit
editors (libsvn_ra's and libsvn_repos's, at least) should document that
they do verify checksums --- I didn't check whether they already
document that --- and the svn_delta_editor_t docs should be updated to
state that the leeway for receivers not to verify checksums is deprecated.


RE: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Daniel Shahaf [mailto:d.s@daniel.shahaf.name]
> Sent: woensdag 30 augustus 2017 07:07
> To: kotkov@apache.org
> Cc: dev@subversion.apache.org; commits@subversion.apache.org
> Subject: Re: svn commit: r1803143 - in /subversion/trunk/subversion:
> include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/
> 
> Good morning Evgeny,
> 
> kotkov@apache.org wrote on Thu, 27 Jul 2017 09:00 +0000:
> > +++ subversion/trunk/subversion/include/svn_delta.h Thu Jul 27 09:00:43
> 2017
> > @@ -678,6 +690,9 @@ svn_txdelta_skip_svndiff_window(apr_file
> >   * @{
> >   */
> >
> > +/* Forward declarations. */
> > +typedef struct svn_delta_editor_t svn_delta_editor_t;
> > +
> >  /** A structure full of callback functions the delta source will invoke
> >   * as it produces the delta.
> >   *
> > @@ -859,7 +874,7 @@ svn_txdelta_skip_svndiff_window(apr_file
> >   * dead; the only further operation which may be called on the editor
> >   * is @c abort_edit.
> >   */
> > -typedef struct svn_delta_editor_t
> > +struct svn_delta_editor_t
> >  {
> >    /** Set the target revision for this edit to @a target_revision.  This
> >     * call, if used, should precede all other editor calls.
> > @@ -1131,9 +1146,38 @@ typedef struct svn_delta_editor_t
> >    svn_error_t *(*abort_edit)(void *edit_baton,
> >                               apr_pool_t *scratch_pool);
> >
> > +  /** Apply a text delta stream, yielding the new revision of a file.
> > +   *
> > +   * @a file_baton indicates the file we're creating or updating, and the
> > +   * ancestor file on which it is based; it is the baton set by some
> > +   * prior @c add_file or @c open_file callback.
> > +   *
> > +   * @a open_func is a function that opens a #svn_txdelta_stream_t
> object.
> > +   * @a open_baton is provided by the caller.
> > +   *
> > +   * @a base_checksum is the hex MD5 digest for the base text against
> > +   * which the delta is being applied; it is ignored if NULL, and may
> > +   * be ignored even if not NULL.  If it is not ignored, it must match
> 
> What's the rationale for allowing drivees to ignore the checksum?
> 
> This leeway enables failure modes that wouldn't be possible without it.
> (Drivers that are aware of this leeway will validate checksums even if the
> drivee doesn't, leading to duplicate work; drivers that are unaware of this
> requirement might not get checksum errors they should have.)
> 
> I get that you just copied this part of the docstring from apply_textdelta(),
> but I'd like to understand what's the rationale here.  (And to see if this
> leeway should be deprecated)

I think this just documents current behavior. Yes a 1.9+ client against a 1.9+ server will always have a checksum, but this is not the case when mixing older clients and servers.

Original serf versions (form before we declared this stable) typically never provided the checksum. And in some cases bulk requests didn't have all the checksums either. I remember fixing a few cases around WC-NG to make sure all ra layers reported the same errors in some exceptional cases.

	Bert


RE: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Daniel Shahaf [mailto:d.s@daniel.shahaf.name]
> Sent: woensdag 30 augustus 2017 07:07
> To: kotkov@apache.org
> Cc: dev@subversion.apache.org; commits@subversion.apache.org
> Subject: Re: svn commit: r1803143 - in /subversion/trunk/subversion:
> include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/
> 
> Good morning Evgeny,
> 
> kotkov@apache.org wrote on Thu, 27 Jul 2017 09:00 +0000:
> > +++ subversion/trunk/subversion/include/svn_delta.h Thu Jul 27 09:00:43
> 2017
> > @@ -678,6 +690,9 @@ svn_txdelta_skip_svndiff_window(apr_file
> >   * @{
> >   */
> >
> > +/* Forward declarations. */
> > +typedef struct svn_delta_editor_t svn_delta_editor_t;
> > +
> >  /** A structure full of callback functions the delta source will invoke
> >   * as it produces the delta.
> >   *
> > @@ -859,7 +874,7 @@ svn_txdelta_skip_svndiff_window(apr_file
> >   * dead; the only further operation which may be called on the editor
> >   * is @c abort_edit.
> >   */
> > -typedef struct svn_delta_editor_t
> > +struct svn_delta_editor_t
> >  {
> >    /** Set the target revision for this edit to @a target_revision.  This
> >     * call, if used, should precede all other editor calls.
> > @@ -1131,9 +1146,38 @@ typedef struct svn_delta_editor_t
> >    svn_error_t *(*abort_edit)(void *edit_baton,
> >                               apr_pool_t *scratch_pool);
> >
> > +  /** Apply a text delta stream, yielding the new revision of a file.
> > +   *
> > +   * @a file_baton indicates the file we're creating or updating, and the
> > +   * ancestor file on which it is based; it is the baton set by some
> > +   * prior @c add_file or @c open_file callback.
> > +   *
> > +   * @a open_func is a function that opens a #svn_txdelta_stream_t
> object.
> > +   * @a open_baton is provided by the caller.
> > +   *
> > +   * @a base_checksum is the hex MD5 digest for the base text against
> > +   * which the delta is being applied; it is ignored if NULL, and may
> > +   * be ignored even if not NULL.  If it is not ignored, it must match
> 
> What's the rationale for allowing drivees to ignore the checksum?
> 
> This leeway enables failure modes that wouldn't be possible without it.
> (Drivers that are aware of this leeway will validate checksums even if the
> drivee doesn't, leading to duplicate work; drivers that are unaware of this
> requirement might not get checksum errors they should have.)
> 
> I get that you just copied this part of the docstring from apply_textdelta(),
> but I'd like to understand what's the rationale here.  (And to see if this
> leeway should be deprecated)

I think this just documents current behavior. Yes a 1.9+ client against a 1.9+ server will always have a checksum, but this is not the case when mixing older clients and servers.

Original serf versions (form before we declared this stable) typically never provided the checksum. And in some cases bulk requests didn't have all the checksums either. I remember fixing a few cases around WC-NG to make sure all ra layers reported the same errors in some exceptional cases.

	Bert


Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Good morning Evgeny,

kotkov@apache.org wrote on Thu, 27 Jul 2017 09:00 +0000:
> +++ subversion/trunk/subversion/include/svn_delta.h Thu Jul 27 09:00:43 2017
> @@ -678,6 +690,9 @@ svn_txdelta_skip_svndiff_window(apr_file
>   * @{
>   */
>  
> +/* Forward declarations. */
> +typedef struct svn_delta_editor_t svn_delta_editor_t;
> +
>  /** A structure full of callback functions the delta source will invoke
>   * as it produces the delta.
>   *
> @@ -859,7 +874,7 @@ svn_txdelta_skip_svndiff_window(apr_file
>   * dead; the only further operation which may be called on the editor
>   * is @c abort_edit.
>   */
> -typedef struct svn_delta_editor_t
> +struct svn_delta_editor_t
>  {
>    /** Set the target revision for this edit to @a target_revision.  This
>     * call, if used, should precede all other editor calls.
> @@ -1131,9 +1146,38 @@ typedef struct svn_delta_editor_t
>    svn_error_t *(*abort_edit)(void *edit_baton,
>                               apr_pool_t *scratch_pool);
>  
> +  /** Apply a text delta stream, yielding the new revision of a file.
> +   *
> +   * @a file_baton indicates the file we're creating or updating, and the
> +   * ancestor file on which it is based; it is the baton set by some
> +   * prior @c add_file or @c open_file callback.
> +   *
> +   * @a open_func is a function that opens a #svn_txdelta_stream_t object.
> +   * @a open_baton is provided by the caller.
> +   *
> +   * @a base_checksum is the hex MD5 digest for the base text against
> +   * which the delta is being applied; it is ignored if NULL, and may
> +   * be ignored even if not NULL.  If it is not ignored, it must match

What's the rationale for allowing drivees to ignore the checksum?

This leeway enables failure modes that wouldn't be possible without it.
(Drivers that are aware of this leeway will validate checksums even if the
drivee doesn't, leading to duplicate work; drivers that are unaware of this
requirement might not get checksum errors they should have.)

I get that you just copied this part of the docstring from apply_textdelta(),
but I'd like to understand what's the rationale here.  (And to see if this
leeway should be deprecated)

> +   * the checksum of the base text against which svndiff data is being
> +   * applied; if it does not, @c apply_textdelta_stream call which detects
> +   * the mismatch will return the error SVN_ERR_CHECKSUM_MISMATCH
> +   * (if there is no base text, there may still be an error if
> +   * @a base_checksum is neither NULL nor the hex MD5 checksum of the
> +   * empty string).

To the best of my knowledge, we don't special case the empty string's checksum,
d41d8cd98f00b204e9800998ecf8427e, anywhere.  We do special-case
00000000000000000000000000000000, though.  I assume the parenthetical should be
fixed accordingly (both here and in apply_textdelta())?

> +   *
> +   * Any temporary allocations may be performed in @a scratch_pool.

Need to add an @since tag here.

> +   */
> +  svn_error_t *(*apply_textdelta_stream)(

Could you update the docstring of svn_delta_editor_t itself to mention this new
callback?  All other callbacks are discussed there.

>

Is adding this function an ABI-compatible change?  The docstring of
svn_delta_editor_t does say """

 * @note Don't try to allocate one of these yourself.  Instead, always
 * use svn_delta_default_editor() or some other constructor, to ensure
 * that unused slots are filled in with no-op functions.

""", but an API consumer might have interpreted this note as meaning "You may
use malloc(..., sizeof(svn_delta_editor_t)) if you take care to initialize all
struct members", in which case, his code will not be ABI compatible with 1.10.

> +    const svn_delta_editor_t *editor,

This parameter is undocumented.

Cheers,

Daniel

> +    void *file_baton,
> +    const char *base_checksum,
> +    svn_txdelta_stream_open_func_t open_func,
> +    void *open_baton,
> +    apr_pool_t *scratch_pool);
> +
>    /* Be sure to update svn_delta_get_cancellation_editor() and
>     * svn_delta_default_editor() if you add a new callback here. */
> -} svn_delta_editor_t;
> +};
>  
>  
>  /** Return a default delta editor template, allocated in @a pool.
> 

Re: svn commit: r1803143 - in /subversion/trunk/subversion: include/ libsvn_delta/ libsvn_ra_serf/ mod_dav_svn/

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Good morning Evgeny,

kotkov@apache.org wrote on Thu, 27 Jul 2017 09:00 +0000:
> +++ subversion/trunk/subversion/include/svn_delta.h Thu Jul 27 09:00:43 2017
> @@ -678,6 +690,9 @@ svn_txdelta_skip_svndiff_window(apr_file
>   * @{
>   */
>  
> +/* Forward declarations. */
> +typedef struct svn_delta_editor_t svn_delta_editor_t;
> +
>  /** A structure full of callback functions the delta source will invoke
>   * as it produces the delta.
>   *
> @@ -859,7 +874,7 @@ svn_txdelta_skip_svndiff_window(apr_file
>   * dead; the only further operation which may be called on the editor
>   * is @c abort_edit.
>   */
> -typedef struct svn_delta_editor_t
> +struct svn_delta_editor_t
>  {
>    /** Set the target revision for this edit to @a target_revision.  This
>     * call, if used, should precede all other editor calls.
> @@ -1131,9 +1146,38 @@ typedef struct svn_delta_editor_t
>    svn_error_t *(*abort_edit)(void *edit_baton,
>                               apr_pool_t *scratch_pool);
>  
> +  /** Apply a text delta stream, yielding the new revision of a file.
> +   *
> +   * @a file_baton indicates the file we're creating or updating, and the
> +   * ancestor file on which it is based; it is the baton set by some
> +   * prior @c add_file or @c open_file callback.
> +   *
> +   * @a open_func is a function that opens a #svn_txdelta_stream_t object.
> +   * @a open_baton is provided by the caller.
> +   *
> +   * @a base_checksum is the hex MD5 digest for the base text against
> +   * which the delta is being applied; it is ignored if NULL, and may
> +   * be ignored even if not NULL.  If it is not ignored, it must match

What's the rationale for allowing drivees to ignore the checksum?

This leeway enables failure modes that wouldn't be possible without it.
(Drivers that are aware of this leeway will validate checksums even if the
drivee doesn't, leading to duplicate work; drivers that are unaware of this
requirement might not get checksum errors they should have.)

I get that you just copied this part of the docstring from apply_textdelta(),
but I'd like to understand what's the rationale here.  (And to see if this
leeway should be deprecated)

> +   * the checksum of the base text against which svndiff data is being
> +   * applied; if it does not, @c apply_textdelta_stream call which detects
> +   * the mismatch will return the error SVN_ERR_CHECKSUM_MISMATCH
> +   * (if there is no base text, there may still be an error if
> +   * @a base_checksum is neither NULL nor the hex MD5 checksum of the
> +   * empty string).

To the best of my knowledge, we don't special case the empty string's checksum,
d41d8cd98f00b204e9800998ecf8427e, anywhere.  We do special-case
00000000000000000000000000000000, though.  I assume the parenthetical should be
fixed accordingly (both here and in apply_textdelta())?

> +   *
> +   * Any temporary allocations may be performed in @a scratch_pool.

Need to add an @since tag here.

> +   */
> +  svn_error_t *(*apply_textdelta_stream)(

Could you update the docstring of svn_delta_editor_t itself to mention this new
callback?  All other callbacks are discussed there.

>

Is adding this function an ABI-compatible change?  The docstring of
svn_delta_editor_t does say """

 * @note Don't try to allocate one of these yourself.  Instead, always
 * use svn_delta_default_editor() or some other constructor, to ensure
 * that unused slots are filled in with no-op functions.

""", but an API consumer might have interpreted this note as meaning "You may
use malloc(..., sizeof(svn_delta_editor_t)) if you take care to initialize all
struct members", in which case, his code will not be ABI compatible with 1.10.

> +    const svn_delta_editor_t *editor,

This parameter is undocumented.

Cheers,

Daniel

> +    void *file_baton,
> +    const char *base_checksum,
> +    svn_txdelta_stream_open_func_t open_func,
> +    void *open_baton,
> +    apr_pool_t *scratch_pool);
> +
>    /* Be sure to update svn_delta_get_cancellation_editor() and
>     * svn_delta_default_editor() if you add a new callback here. */
> -} svn_delta_editor_t;
> +};
>  
>  
>  /** Return a default delta editor template, allocated in @a pool.
>