You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Sperling <st...@elego.de> on 2020/01/30 19:34:19 UTC

ra_serf redirect URL canonicalization fix

I managed to trigger a client-side assertion failure today with a
simple HTTP redirect. I have fixed this issue in r1873375 (see below).

When I tried to merge this fix to 1.10.x I learned that this behaviour
was intentionally changed already in r1866899. We used to canonicalize
these URLs, but then we stopped doing so because we could end up
misreporting an unrelated error condition (see below).

My patch doesn't cleanly merge to 1.10.x because svn_uri_canonicalize_safe()
does not yet exist there. Should we backport it to 1.10.x as a private
function, or revert r1866899 from 1.10.x, or do we leave 1.10.x as is?

To me it looks as if the consequences of r1866899 weren't fully considered.
The server should not be able to trigger assertions in the client's libraries.

Would svn_uri_canonicalize_safe() allow for detecting a redirect loop problem?
My patch is not making use of the non-canonicalized version of the URL but
this version remains available when svn_uri_canonicalize_safe() is used.
Perhaps we could somehow take advantage of that to fix both problems?

------------------------------------------------------------------------
r1873375 | stsp | 2020-01-30 20:14:49 +0100 (Thu, 30 Jan 2020) | 16 lines

Canonicalize redirect URLs in ra_serf, rather than using them as-is.
This prevents an assertion failure in the client if the server sends
a redirect to a non-canonical URL.

If Apache HTTPD uses a redirect statement such as this:
 Redirect permanent "/svn" https://svn.example.com/svn/
then the redirect URL won't be canonical. For example, access to the path
"/svn/trunk" will be redirected to https://svn.example.com/svn//trunk

Note the double-slash which eventually triggers an assertion failure when
the redirect URL gets checked at an API boundary outside of ra_serf.

* subversion/libsvn_ra_serf/options.c
  (svn_ra_serf__exchange_capabilities): Treat redirect URLs as untrusted
   input and attempt to canonicalize them.

------------------------------------------------------------------------
r1866899 | julianfoad | 2019-09-13 14:24:01 +0200 (Fri, 13 Sep 2019) | 26 lines

When following an HTTP redirect, use the Location header URL exactly.

Previously we canonicalized the redirect URL, which could lead to a redirect
loop. Then Subversion would report a redirect loop as the error, potentially
hiding a more interesting error such as when the target is not in fact a
Subversion repository.

A manual test case (on a non-repository):
  before:
    $ svn ls https://archive.apache.org/dist
    Redirecting to URL 'https://archive.apache.org/dist':
    Redirecting to URL 'https://archive.apache.org/dist':
    svn: E195019: Redirect cycle detected for URL 'https://archive.apache.org/dist'

  after:
    $ svn ls https://archive.apache.org/dist
    Redirecting to URL 'https://archive.apache.org/dist/':
    svn: E170013: Unable to connect to a repository at URL 'https://archive.apache.org/dist/'
    svn: E175003: The server at 'https://archive.apache.org/dist/' does not support the HTTP/DAV protocol

* subversion/libsvn_ra_serf/options.c
  (svn_ra_serf__exchange_capabilities): Don't canonicalize the redirect URL.

* subversion/libsvn_ra_serf/util.c
  (response_get_location): Don't canonicalize the redirect URL.

------------------------------------------------------------------------

Re: ra_serf redirect URL canonicalization fix

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Feb 01, 2020 at 02:23:49PM +0100, Stefan Sperling wrote:
> Here is a patch I am running tests on now. With this, the RA layer returns
> the redirect URL in both non-canonicalized and canonicalized forms.

I'll note also that apparently we don't have tests for problematic URLs
in redirect_tests.py. I can look at adding some, but would also be glad
to get some help with this.

Re: ra_serf redirect URL canonicalization fix

Posted by Branko Čibej <br...@apache.org>.
On 01.02.2020 14:23, Stefan Sperling wrote:
> On Sat, Feb 01, 2020 at 12:31:16PM +0100, Branko Čibej wrote:
>> On 01.02.2020 12:17, Stefan Sperling wrote:
>>> On Fri, Jan 31, 2020 at 04:12:20AM +0000, Daniel Shahaf wrote:
>>>> So in balance, how about —
>>>>
>>>> - ra_serf should not canonicalize redirection URLs before accessing them.
>>>>
>>>> - After following all redirections (that is, once we get a 2xx response
>>>>   or a 5xx response), canonicalize the resultant URL.  If it is then
>>>>   equal to the input URL, error out with a "Not a repository" error
>>>>   [SVN_ERR_RA_SVN_REPOS_NOT_FOUND seems appropriate]; otherwise, return
>>>>   SVN_ERR_RA_SESSION_URL_MISMATCH (the error code that svn_ra_open4()
>>>>   promises for this situation).
>>>>
>>>> This would fix both of the original bugs, wouldn't it?
>>> This idea won't work without some restructuring because the redirect
>>> retry loop is currently in libsvn_client, not within ra_serf.
>> And for good reason. The RA layer is supposed do be stupid^Wsimple.
>>
>> -- Brane
> Here is a patch I am running tests on now. With this, the RA layer returns
> the redirect URL in both non-canonicalized and canonicalized forms.
> The idea is to allow libsvn_client to cache the non-canonicalized redirect
> URL and detect loops based on it, while using the canonicalized version for
> any other purposes.


This seems like a good fix, given that we can't just return a
non-canonical URL from the RA layer, due to API compatibility constraints.


> I would not propose to backport this. Rather, this patch would go to 1.14 only.

Ack. Note that there's redirect-loop detection in JavaHL, too, separate
from libsvn_client. It should receive the same kind of polishing.

-- Brane


Re: ra_serf redirect URL canonicalization fix

Posted by Stefan Sperling <st...@elego.de>.
On Sat, Feb 01, 2020 at 12:31:16PM +0100, Branko Čibej wrote:
> On 01.02.2020 12:17, Stefan Sperling wrote:
> > On Fri, Jan 31, 2020 at 04:12:20AM +0000, Daniel Shahaf wrote:
> >> So in balance, how about —
> >>
> >> - ra_serf should not canonicalize redirection URLs before accessing them.
> >>
> >> - After following all redirections (that is, once we get a 2xx response
> >>   or a 5xx response), canonicalize the resultant URL.  If it is then
> >>   equal to the input URL, error out with a "Not a repository" error
> >>   [SVN_ERR_RA_SVN_REPOS_NOT_FOUND seems appropriate]; otherwise, return
> >>   SVN_ERR_RA_SESSION_URL_MISMATCH (the error code that svn_ra_open4()
> >>   promises for this situation).
> >>
> >> This would fix both of the original bugs, wouldn't it?
> > This idea won't work without some restructuring because the redirect
> > retry loop is currently in libsvn_client, not within ra_serf.
> 
> And for good reason. The RA layer is supposed do be stupid^Wsimple.
> 
> -- Brane

Here is a patch I am running tests on now. With this, the RA layer returns
the redirect URL in both non-canonicalized and canonicalized forms.
The idea is to allow libsvn_client to cache the non-canonicalized redirect
URL and detect loops based on it, while using the canonicalized version for
any other purposes.
I would not propose to backport this. Rather, this patch would go to 1.14 only.

With my proposed patch in STATUS 1.13.x will no longer be able to detect loops
but will no longer run into assertion failures which takes priority IMO.

The 1.10.x branch still canonicalizes unconditionally. It won't detect loops
either but unlike 1.13/trunk it could abort if the URL fails to canonicalize.
I think backporting canonicalize_safe as a private function and using it to
canonicalize the redirect URL would be the best fix for 1.10.x but I have not
yet written a patch for this.

Index: subversion/include/svn_ra.h
===================================================================
--- subversion/include/svn_ra.h	(revision 1873372)
+++ subversion/include/svn_ra.h	(working copy)
@@ -65,7 +65,7 @@ svn_ra_version(void);
  * @a close_baton as appropriate.
  *
  * @a path is relative to the "root" of the session, defined by the
- * @a repos_URL passed to svn_ra_open4() vtable call.
+ * @a repos_URL passed to svn_ra_open5() vtable call.
  *
  * @a name is the name of the property to fetch. If the property is present,
  * then it is returned in @a value. Otherwise, @a *value is set to @c NULL.
@@ -369,7 +369,7 @@ typedef struct svn_ra_reporter3_t
    * implementor should assume the directory has no entries or props.
    *
    * This will *override* any previous set_path() calls made on parent
-   * paths.  @a path is relative to the URL specified in svn_ra_open4().
+   * paths.  @a path is relative to the URL specified in svn_ra_open5().
    *
    * If @a lock_token is non-NULL, it is the lock token for @a path in the WC.
    *
@@ -520,7 +520,7 @@ typedef struct svn_ra_reporter_t
 /** A collection of callbacks implemented by libsvn_client which allows
  * an RA layer to "pull" information from the client application, or
  * possibly store information.  libsvn_client passes this vtable to
- * svn_ra_open4().
+ * svn_ra_open5().
  *
  * Each routine takes a @a callback_baton originally provided with the
  * vtable.
@@ -710,6 +710,14 @@ typedef struct svn_ra_session_t svn_ra_session_t;
  * within the new repository root URL that @a repos_URL pointed to within
  * the old repository root URL.
  *
+ * If @a redirect_url is not NULL and a @corrected_url is returned, then
+ * @a redirect_url contains a non-canonicalized version of @a corrected_url,
+ * as communicated in the network protocol used by the RA provider.
+ * THe @a redirect_url should be used for to detect redirection loops.
+ * Canonicalization may change the protocol-level URL in a way that
+ * makes detection of redirect loops impossible in some cases since URLs which
+ * are different at the protocol layer could map to the same canonicalized URL.
+ *
  * Return @c SVN_ERR_RA_UUID_MISMATCH if @a uuid is non-NULL and not equal
  * to the UUID of the repository at @c repos_URL.
  *
@@ -728,7 +736,24 @@ typedef struct svn_ra_session_t svn_ra_session_t;
  *
  * @see svn_client_open_ra_session().
  *
+ * @since New in 1.14.
+ */
+svn_error_t *
+svn_ra_open5(svn_ra_session_t **session_p,
+             const char **corrected_url,
+             const char **redirect_url,
+             const char *repos_URL,
+             const char *uuid,
+             const svn_ra_callbacks2_t *callbacks,
+             void *callback_baton,
+             apr_hash_t *config,
+             apr_pool_t *pool);
+
+/** Similar to svn_ra_open5(), but with @a redirect_url always passed
+ * as @c NULL.
+ *
  * @since New in 1.7.
+ * @deprecated Provided for backward compatibility with the 1.13 API.
  */
 svn_error_t *
 svn_ra_open4(svn_ra_session_t **session_p,
Index: subversion/libsvn_client/ra.c
===================================================================
--- subversion/libsvn_client/ra.c	(revision 1873372)
+++ subversion/libsvn_client/ra.c	(working copy)
@@ -402,8 +402,7 @@ svn_client__open_ra_session_internal(svn_ra_sessio
         }
     }
 
-  /* If the caller allows for auto-following redirections, and the
-     RA->open() call above reveals a CORRECTED_URL, try the new URL.
+  /* If the caller allows for auto-following redirections, try the new URL.
      We'll do this in a loop up to some maximum number follow-and-retry
      attempts.  */
   if (corrected_url)
@@ -414,12 +413,14 @@ svn_client__open_ra_session_internal(svn_ra_sessio
       *corrected_url = NULL;
       while (attempts_left--)
         {
-          const char *corrected = NULL;
+          const char *corrected = NULL; /* canonicalized version */
+          const char *redirect_url = NULL; /* non-canonicalized version */
 
           /* Try to open the RA session.  If this is our last attempt,
              don't accept corrected URLs from the RA provider. */
-          SVN_ERR(svn_ra_open4(ra_session,
+          SVN_ERR(svn_ra_open5(ra_session,
                                attempts_left == 0 ? NULL : &corrected,
+                               attempts_left == 0 ? NULL : &redirect_url,
                                base_url, uuid, cbtable, cb, ctx->config,
                                result_pool));
 
@@ -441,13 +442,22 @@ svn_client__open_ra_session_internal(svn_ra_sessio
           *corrected_url = corrected;
 
           /* Make sure we've not attempted this URL before. */
-          if (svn_hash_gets(attempted, corrected))
+          if (svn_hash_gets(attempted, redirect_url))
             return svn_error_createf(SVN_ERR_CLIENT_CYCLE_DETECTED, NULL,
                                      _("Redirect cycle detected for URL '%s'"),
-                                     corrected);
+                                     redirect_url);
 
-          /* Remember this CORRECTED_URL so we don't wind up in a loop. */
-          svn_hash_sets(attempted, corrected, (void *)1);
+          /*
+           * Remember this redirect URL so we don't wind up in a loop.
+           *
+           * Store the non-canonicalized version of the URL. The canonicalized
+           * version is insufficient for loop detection because we might not get
+           * an exact match against URLs used by the RA protocol-layer (the URL
+           * used by the protocol may contain trailing slashes, for example,
+           * which are stripped during canonicalization).
+           */
+          svn_hash_sets(attempted, redirect_url, (void *)1);
+
           base_url = corrected;
         }
     }
Index: subversion/libsvn_ra/deprecated.c
===================================================================
--- subversion/libsvn_ra/deprecated.c	(revision 1873372)
+++ subversion/libsvn_ra/deprecated.c	(working copy)
@@ -151,6 +151,19 @@ static svn_ra_reporter2_t reporter_3in2_wrapper =
   abort_report
 };
 
+svn_error_t *svn_ra_open4(svn_ra_session_t **session_p,
+                          const char **corrected_url_p,
+                          const char *repos_URL,
+                          const char *uuid,
+                          const svn_ra_callbacks2_t *callbacks,
+                          void *callback_baton,
+                          apr_hash_t *config,
+                          apr_pool_t *pool)
+{
+  return svn_ra_open5(session_p, corrected_url_p, NULL, repos_URL, uuid,
+                      callbacks, callback_baton, config, pool);
+}
+
 svn_error_t *svn_ra_open3(svn_ra_session_t **session_p,
                           const char *repos_URL,
                           const char *uuid,
Index: subversion/libsvn_ra/ra_loader.c
===================================================================
--- subversion/libsvn_ra/ra_loader.c	(revision 1873372)
+++ subversion/libsvn_ra/ra_loader.c	(working copy)
@@ -256,8 +256,9 @@ svn_ra_create_callbacks(svn_ra_callbacks2_t **call
   return SVN_NO_ERROR;
 }
 
-svn_error_t *svn_ra_open4(svn_ra_session_t **session_p,
+svn_error_t *svn_ra_open5(svn_ra_session_t **session_p,
                           const char **corrected_url_p,
+                          const char **redirect_url_p,
                           const char *repos_URL,
                           const char *uuid,
                           const svn_ra_callbacks2_t *callbacks,
@@ -381,7 +382,7 @@ svn_ra_create_callbacks(svn_ra_callbacks2_t **call
   session->pool = sesspool;
 
   /* Ask the library to open the session. */
-  err = vtable->open_session(session, corrected_url_p,
+  err = vtable->open_session(session, corrected_url_p, redirect_url_p,
                              repos_URL,
                              callbacks, callback_baton, auth_baton,
                              config, sesspool, scratch_pool);
@@ -406,6 +407,8 @@ svn_ra_create_callbacks(svn_ra_callbacks2_t **call
     {
       /* *session_p = NULL; */
       *corrected_url_p = apr_pstrdup(pool, *corrected_url_p);
+      if (redirect_url_p && *redirect_url_p)
+        *redirect_url_p = apr_pstrdup(pool, *redirect_url_p);
       svn_pool_destroy(sesspool); /* Includes scratch_pool */
       return SVN_NO_ERROR;
     }
Index: subversion/libsvn_ra/ra_loader.h
===================================================================
--- subversion/libsvn_ra/ra_loader.h	(revision 1873372)
+++ subversion/libsvn_ra/ra_loader.h	(working copy)
@@ -64,11 +64,12 @@ typedef struct svn_ra__vtable_t {
 
   /* Implementations of the public API functions. */
 
-  /* See svn_ra_open4(). */
+  /* See svn_ra_open5(). */
   /* All fields in SESSION, except priv, have been initialized by the
      time this is called.  SESSION->priv may be set by this function. */
   svn_error_t *(*open_session)(svn_ra_session_t *session,
                                const char **corrected_url,
+                               const char **redirect_url,
                                const char *session_URL,
                                const svn_ra_callbacks2_t *callbacks,
                                void *callback_baton,
Index: subversion/libsvn_ra/wrapper_template.h
===================================================================
--- subversion/libsvn_ra/wrapper_template.h	(revision 1873372)
+++ subversion/libsvn_ra/wrapper_template.h	(working copy)
@@ -90,7 +90,7 @@ static svn_error_t *compat_open(void **session_bat
   callbacks2->progress_func = NULL;
   callbacks2->progress_baton = NULL;
 
-  SVN_ERR(VTBL.open_session(sess, &session_url, repos_URL,
+  SVN_ERR(VTBL.open_session(sess, &session_url, NULL, repos_URL,
                             callbacks2, callback_baton,
                             callbacks ? callbacks->auth_baton : NULL,
                             config, sesspool, sesspool));
Index: subversion/libsvn_ra_local/ra_plugin.c
===================================================================
--- subversion/libsvn_ra_local/ra_plugin.c	(revision 1873372)
+++ subversion/libsvn_ra_local/ra_plugin.c	(working copy)
@@ -554,6 +554,7 @@ ignore_warnings(void *baton,
 static svn_error_t *
 svn_ra_local__open(svn_ra_session_t *session,
                    const char **corrected_url,
+                   const char **redirect_url,
                    const char *repos_URL,
                    const svn_ra_callbacks2_t *callbacks,
                    void *callback_baton,
@@ -576,6 +577,8 @@ svn_ra_local__open(svn_ra_session_t *session,
   /* We don't support redirections in ra-local. */
   if (corrected_url)
     *corrected_url = NULL;
+  if (redirect_url)
+    *redirect_url = NULL;
 
   /* Allocate and stash the session_sess args we have already. */
   sess = apr_pcalloc(pool, sizeof(*sess));
Index: subversion/libsvn_ra_serf/options.c
===================================================================
--- subversion/libsvn_ra_serf/options.c	(revision 1873375)
+++ subversion/libsvn_ra_serf/options.c	(working copy)
@@ -546,6 +546,7 @@ svn_ra_serf__v1_get_activity_collection(const char
 svn_error_t *
 svn_ra_serf__exchange_capabilities(svn_ra_serf__session_t *serf_sess,
                                    const char **corrected_url,
+                                   const char **redirect_url,
                                    apr_pool_t *result_pool,
                                    apr_pool_t *scratch_pool)
 {
@@ -553,6 +554,8 @@ svn_ra_serf__exchange_capabilities(svn_ra_serf__se
 
   if (corrected_url)
     *corrected_url = NULL;
+  if (redirect_url)
+    *redirect_url = NULL;
 
   /* This routine automatically fills in serf_sess->capabilities */
   SVN_ERR(create_options_req(&opt_ctx, serf_sess, scratch_pool));
@@ -577,6 +580,9 @@ svn_ra_serf__exchange_capabilities(svn_ra_serf__se
         {
           SVN_ERR(svn_uri_canonicalize_safe(corrected_url, NULL,
               opt_ctx->handler->location, result_pool, scratch_pool));
+          if (redirect_url)
+            *redirect_url = apr_pstrdup(result_pool,
+                                        opt_ctx->handler->location);
         }
       else
         {
@@ -593,6 +599,8 @@ svn_ra_serf__exchange_capabilities(svn_ra_serf__se
           absolute_uri = apr_uri_unparse(scratch_pool, &corrected_URI, 0);
           SVN_ERR(svn_uri_canonicalize_safe(corrected_url, NULL,
               absolute_uri, result_pool, scratch_pool));
+          if (redirect_url)
+            *redirect_url = apr_pstrdup(result_pool, absolute_uri);
         }
 
       return SVN_NO_ERROR;
@@ -700,7 +708,8 @@ svn_ra_serf__has_capability(svn_ra_session_t *ra_s
 
   /* If any capability is unknown, they're all unknown, so ask. */
   if (cap_result == NULL)
-    SVN_ERR(svn_ra_serf__exchange_capabilities(serf_sess, NULL, pool, pool));
+    SVN_ERR(svn_ra_serf__exchange_capabilities(serf_sess, NULL, NULL,
+                                               pool, pool));
 
   /* Try again, now that we've fetched the capabilities. */
   cap_result = svn_hash_gets(serf_sess->capabilities, capability);
Index: subversion/libsvn_ra_serf/ra_serf.h
===================================================================
--- subversion/libsvn_ra_serf/ra_serf.h	(revision 1873372)
+++ subversion/libsvn_ra_serf/ra_serf.h	(working copy)
@@ -1472,6 +1472,7 @@ svn_ra_serf__get_mergeinfo(svn_ra_session_t *ra_se
 svn_error_t *
 svn_ra_serf__exchange_capabilities(svn_ra_serf__session_t *serf_sess,
                                    const char **corrected_url,
+                                   const char **redirect_url,
                                    apr_pool_t *result_pool,
                                    apr_pool_t *scratch_pool);
 
Index: subversion/libsvn_ra_serf/serf.c
===================================================================
--- subversion/libsvn_ra_serf/serf.c	(revision 1873372)
+++ subversion/libsvn_ra_serf/serf.c	(working copy)
@@ -476,6 +476,7 @@ get_user_agent_string(apr_pool_t *pool)
 static svn_error_t *
 svn_ra_serf__open(svn_ra_session_t *session,
                   const char **corrected_url,
+                  const char **redirect_url,
                   const char *session_URL,
                   const svn_ra_callbacks2_t *callbacks,
                   void *callback_baton,
@@ -492,6 +493,8 @@ svn_ra_serf__open(svn_ra_session_t *session,
 
   if (corrected_url)
     *corrected_url = NULL;
+  if (redirect_url)
+    *redirect_url = NULL;
 
   serf_sess = apr_pcalloc(result_pool, sizeof(*serf_sess));
   serf_sess->pool = result_pool;
@@ -599,6 +602,7 @@ svn_ra_serf__open(svn_ra_session_t *session,
   serf_sess->conn_latency = -1;
 
   err = svn_ra_serf__exchange_capabilities(serf_sess, corrected_url,
+                                           redirect_url,
                                            result_pool, scratch_pool);
 
   /* serf should produce a usable error code instead of APR_EGENERAL */
Index: subversion/libsvn_ra_svn/client.c
===================================================================
--- subversion/libsvn_ra_svn/client.c	(revision 1873372)
+++ subversion/libsvn_ra_svn/client.c	(working copy)
@@ -841,6 +841,7 @@ is_valid_hostinfo(const char *hostinfo)
 
 static svn_error_t *ra_svn_open(svn_ra_session_t *session,
                                 const char **corrected_url,
+                                const char **redirect_url,
                                 const char *url,
                                 const svn_ra_callbacks2_t *callbacks,
                                 void *callback_baton,
@@ -858,6 +859,8 @@ static svn_error_t *ra_svn_open(svn_ra_session_t *
   /* We don't support server-prescribed redirections in ra-svn. */
   if (corrected_url)
     *corrected_url = NULL;
+  if (redirect_url)
+    *redirect_url = NULL;
 
   SVN_ERR(parse_url(url, &uri, sess_pool));
 
@@ -913,7 +916,7 @@ static svn_error_t *ra_svn_dup_session(svn_ra_sess
 {
   svn_ra_svn__session_baton_t *old_sess = old_session->priv;
 
-  SVN_ERR(ra_svn_open(new_session, NULL, new_session_url,
+  SVN_ERR(ra_svn_open(new_session, NULL, NULL, new_session_url,
                       old_sess->callbacks, old_sess->callbacks_baton,
                       old_sess->auth_baton, old_sess->config,
                       result_pool, scratch_pool));



Re: ra_serf redirect URL canonicalization fix

Posted by Branko Čibej <br...@apache.org>.
On 01.02.2020 12:17, Stefan Sperling wrote:
> On Fri, Jan 31, 2020 at 04:12:20AM +0000, Daniel Shahaf wrote:
>> So in balance, how about —
>>
>> - ra_serf should not canonicalize redirection URLs before accessing them.
>>
>> - After following all redirections (that is, once we get a 2xx response
>>   or a 5xx response), canonicalize the resultant URL.  If it is then
>>   equal to the input URL, error out with a "Not a repository" error
>>   [SVN_ERR_RA_SVN_REPOS_NOT_FOUND seems appropriate]; otherwise, return
>>   SVN_ERR_RA_SESSION_URL_MISMATCH (the error code that svn_ra_open4()
>>   promises for this situation).
>>
>> This would fix both of the original bugs, wouldn't it?
> This idea won't work without some restructuring because the redirect
> retry loop is currently in libsvn_client, not within ra_serf.

And for good reason. The RA layer is supposed do be stupid^Wsimple.

-- Brane

Re: ra_serf redirect URL canonicalization fix

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Jan 31, 2020 at 04:12:20AM +0000, Daniel Shahaf wrote:
> So in balance, how about —
> 
> - ra_serf should not canonicalize redirection URLs before accessing them.
> 
> - After following all redirections (that is, once we get a 2xx response
>   or a 5xx response), canonicalize the resultant URL.  If it is then
>   equal to the input URL, error out with a "Not a repository" error
>   [SVN_ERR_RA_SVN_REPOS_NOT_FOUND seems appropriate]; otherwise, return
>   SVN_ERR_RA_SESSION_URL_MISMATCH (the error code that svn_ra_open4()
>   promises for this situation).
> 
> This would fix both of the original bugs, wouldn't it?

This idea won't work without some restructuring because the redirect
retry loop is currently in libsvn_client, not within ra_serf.

Re: ra_serf redirect URL canonicalization fix

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Julian Foad wrote on Thu, 30 Jan 2020 20:05 +0000:
> Stefan Sperling wrote:
> > I don't think that will help much in this case. ra_serf is supposed to
> > return a "corrected URL" the client can use, instead of the orignal URL
> > which was already canonical. By definition, the output of this operation
> > must be canocalized if the "corrected URL" is going to be used anywhere.
> 
> Checking the code...  OK, I see: what I'm suggesting is exactly what I 
> failed to do in my previous commit.  I somehow thought it was normal in 
> this code area to return a non-canonical URL, but it isn't.
> 
> I'm not sure what is the best thing to do.  Maybe the original "redirect 
> loop" failure mode, which I previously regarded as a bug, really is the 
> most appropriate thing, or at least good enough.

That case is an impedance mismatch between our internal conventions and
the HTTP protocol.

In HTTP, I believe trailing slashes are significant: http://archive.apache.org/dist
and http://archive.apache.org/dist/ could be different resources.
Therefore, the redirect cannot be assumed to be a bug in the server's
configuration.

However, I think it's reasonable for a caller of svn_ra_open4() to
assume that if *CORRECTED_URL is set at all, then it is set to a
canonical URL that is different from REPOS_URL [the input URL].

So in balance, how about —

- ra_serf should not canonicalize redirection URLs before accessing them.

- After following all redirections (that is, once we get a 2xx response
  or a 5xx response), canonicalize the resultant URL.  If it is then
  equal to the input URL, error out with a "Not a repository" error
  [SVN_ERR_RA_SVN_REPOS_NOT_FOUND seems appropriate]; otherwise, return
  SVN_ERR_RA_SESSION_URL_MISMATCH (the error code that svn_ra_open4()
  promises for this situation).

This would fix both of the original bugs, wouldn't it?

Cheers,

Daniel

Re: ra_serf redirect URL canonicalization fix

Posted by Julian Foad <ju...@foad.me.uk>.
Stefan Sperling wrote:
> On Thu, Jan 30, 2020 at 07:43:00PM +0000, Julian Foad wrote:
>>
>> Meh... What a mess.
>>
>> One part of the clean up: we should make very clear where a URL variable
>> holds a 'canonical' URL (in this conversation, that means according to svn's
>> own cannonicalisation rules) and where it does not. As our general rule is
>> URLs in Subversion source code should be canonical, the way to make this
>> clear is to label any non-canonical URL as such, by it's variable name, if it
>> exists for more than about one line of code from where it's received to where
>> it's validated/canonicalized

(I pressed 'Send' too early here, but was not going to add anything 
significant.)

> I don't think that will help much in this case. ra_serf is supposed to
> return a "corrected URL" the client can use, instead of the orignal URL
> which was already canonical. By definition, the output of this operation
> must be canocalized if the "corrected URL" is going to be used anywhere.

Checking the code...  OK, I see: what I'm suggesting is exactly what I 
failed to do in my previous commit.  I somehow thought it was normal in 
this code area to return a non-canonical URL, but it isn't.

I'm not sure what is the best thing to do.  Maybe the original "redirect 
loop" failure mode, which I previously regarded as a bug, really is the 
most appropriate thing, or at least good enough.

- Julian

Re: ra_serf redirect URL canonicalization fix

Posted by Stefan Sperling <st...@elego.de>.
On Thu, Jan 30, 2020 at 07:43:00PM +0000, Julian Foad wrote:
> 
> Meh... What a mess.
> 
> One part of the clean up: we should make very clear where a URL variable
> holds a 'canonical' URL (in this conversation, that means according to svn's
> own cannonicalisation rules) and where it does not. As our general rule is
> URLs in Subversion source code should be canonical, the way to make this
> clear is to label any non-canonical URL as such, by it's variable name, if it
> exists for more than about one line of code from where it's received to where
> it's validated/canonicalized

I don't think that will help much in this case. ra_serf is supposed to
return a "corrected URL" the client can use, instead of the orignal URL
which was already canonical. By definition, the output of this operation
must be canocalized if the "corrected URL" is going to be used anywhere.

Re: ra_serf redirect URL canonicalization fix

Posted by Julian Foad <ju...@foad.me.uk>.
Meh... What a mess.

One part of the clean up: we should make very clear where a URL variable holds a 'canonical' URL (in this conversation, that means according to svn's own cannonicalisation rules) and where it does not. As our general rule is URLs in Subversion source code should be canonical, the way to make this clear is to label any non-canonical URL as such, by it's variable name, if it exists for more than about one line of code from where it's received to where it's validated/canonicalized

Sent with FairEmail [https://email.faircode.eu/] , an open source, privacy friendly email app for Android

30 Jan 2020 19:34:42 Stefan Sperling <st...@elego.de>:

> I managed to trigger a client-side assertion failure today with a
> simple HTTP redirect. I have fixed this issue in r1873375 (see below).
> 
> When I tried to merge this fix to 1.10.x I learned that this behaviour
> was intentionally changed already in r1866899. We used to canonicalize
> these URLs, but then we stopped doing so because we could end up
> misreporting an unrelated error condition (see below).
> 
> My patch doesn't cleanly merge to 1.10.x because svn_uri_canonicalize_safe()
> does not yet exist there. Should we backport it to 1.10.x as a private
> function, or revert r1866899 from 1.10.x, or do we leave 1.10.x as is?
> 
> To me it looks as if the consequences of r1866899 weren't fully considered.
> The server should not be able to trigger assertions in the client's libraries.
> 
> Would svn_uri_canonicalize_safe() allow for detecting a redirect loop problem?
> My patch is not making use of the non-canonicalized version of the URL but
> this version remains available when svn_uri_canonicalize_safe() is used.
> Perhaps we could somehow take advantage of that to fix both problems?
> 
> ------------------------------------------------------------------------
> r1873375 | stsp | 2020-01-30 20:14:49 +0100 (Thu, 30 Jan 2020) | 16 lines
> 
> Canonicalize redirect URLs in ra_serf, rather than using them as-is.
> This prevents an assertion failure in the client if the server sends
> a redirect to a non-canonical URL.
> 
> If Apache HTTPD uses a redirect statement such as this:
> Redirect permanent "/svn" https://svn.example.com/svn/
> then the redirect URL won't be canonical. For example, access to the path
> "/svn/trunk" will be redirected to https://svn.example.com/svn//trunk
> 
> Note the double-slash which eventually triggers an assertion failure when
> the redirect URL gets checked at an API boundary outside of ra_serf.
> 
> * subversion/libsvn_ra_serf/options.c
> (svn_ra_serf__exchange_capabilities): Treat redirect URLs as untrusted
> input and attempt to canonicalize them.
> 
> ------------------------------------------------------------------------
> r1866899 | julianfoad | 2019-09-13 14:24:01 +0200 (Fri, 13 Sep 2019) | 26 lines
> 
> When following an HTTP redirect, use the Location header URL exactly.
> 
> Previously we canonicalized the redirect URL, which could lead to a redirect
> loop. Then Subversion would report a redirect loop as the error, potentially
> hiding a more interesting error such as when the target is not in fact a
> Subversion repository.
> 
> A manual test case (on a non-repository):
> before:
> $ svn ls https://archive.apache.org/dist
> Redirecting to URL 'https://archive.apache.org/dist':
> Redirecting to URL 'https://archive.apache.org/dist':
> svn: E195019: Redirect cycle detected for URL 'https://archive.apache.org/dist'
> 
> after:
> $ svn ls https://archive.apache.org/dist
> Redirecting to URL 'https://archive.apache.org/dist/':
> svn: E170013: Unable to connect to a repository at URL 'https://archive.apache.org/dist/'
> svn: E175003: The server at 'https://archive.apache.org/dist/' does not support the HTTP/DAV protocol
> 
> * subversion/libsvn_ra_serf/options.c
> (svn_ra_serf__exchange_capabilities): Don't canonicalize the redirect URL.
> 
> * subversion/libsvn_ra_serf/util.c
> (response_get_location): Don't canonicalize the redirect URL.
> 
> ------------------------------------------------------------------------
>