You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by David Crooke <da...@convio.com> on 2001/04/11 05:51:46 UTC

[PATCH] mod_alias RedirectMatch glitch


I came across a problem with Apache the other day, a very tiny thing but an
issue which seems to have been lurking for a couple of years - there are a
number of open GNATS entries I found for it, with no feedback against them. I
have done a bit of digging and convinced myself it's a bug, or at least it would
work better the way I expected it to rather than the way it does now, and found
the (trivial) code patch to change it.

The details...........

Here's a sample of open GNATS issues - the latter I submitted myself and
included the patch below.

http://bugs.apache.org/index.cgi/full/3333

http://bugs.apache.org/index.cgi/full/4379

http://bugs.apache.org/index.cgi/full/5948

http://bugs.apache.org/index.cgi/full/7503


If the target URI of a RedirectMatch contains a querystring or an anchor it will
not work as intended by the user, because mod_alias url-encodes the URI before
passing it back to the browser, e.g.

RedirectMatch 301 /foo.html http://www.wibble.org/cgi-bin/bar.cgi?page=foo

will return the following URL to the browser as the redirect target:

http://www.wibble.org/cgi-bin/bar.cgi%3fpage=foo

which of course results in a 404 on the subsequent request.

I suspect this may be by design, and that the reason for doing it this way is
because the target of the RedirectMatch is intended to replace only the base
part of the URI without the querystring segment; if the original URI contains a
querystring this is appended (correctly, without being re-encoded) to the result
of encoding the 3rd argument to RedirectMatch.

However, with Redirect instead of RedirectMatch, the above statement behaves as
expected and intended by a typical user.

My observations (IMHO only):

1. The behaviour of Redirect and RedirectMatch is definitely inconsistent; both
replace only the base part of the URI, but the latter forcibly escapes the
replacement string and the former does not.

2. My feeling is that he use case of wanting to redirect from a "base" URI with
no querystring (especially the case of redirecting "/" but not the whole URI
namesapce, which requires RedirectMatch) to a URI including a querystring is
more
common than wanting to redirect from one base URI to another while preserving
the
incoming querystring, and much more common than the decidedly questionable idea
of
wanting to have a target base URI which contains escaped special characters.

3. If RedirectMatch was changed to work like Redirect, and anyone was to want
the latter option in 2 above, they could easily write the urlencoded format in
httpd.conf to start with, whereas the current behaviour totally blocks the more
common case

4. In short, I cannot see a reason to urlencode, but I may be missing something
in the architecture of the module - I have not looked into how it works with
declarations in .htaccess files, for instance.

5. For these reasons, I believe that on balance taking the urlencode call out
makes more sense than leaving it in.

6. The correct and complete fix would be to change RedirectMatch so that the
target is not automagically url-encoded (as per the code change below), and to
also add an optional fourth argument of flags (I would suggest syntax in the
style of the "[PT]" etc. options to mod_rewrite) with one which would mean "take
the third argument as the entire URL and discard the inbound querystring, if
any, which came in with the previous URL"

6. Arguably, for downward compatibility there should be a second flag for "don't
urlencode", while leaving the current behaviour as the default even if it is
considered erroneous.

I have not investigated how to implement the flag option, but if the Apache
project people agree in principle with the idea, I will sign up to produce a
scrupulously documented patch which implements it in a form ready to apply to
the main source trees (1.3 and 2.0).

The change to turn off the urlencoding only, is as follows against the 1.3.19
version of src/modules/standard/mod_alias.c (also works with versions back as
far as 1.3.12)


*** mod_alias.c Mon Jan 15 11:05:35 2001

-- mod_alias.c.new     Sat Mar 31 17:25:46 2001
***************
*** 308,320 ****
--- 308,324 ----
        if (p->regexp) {
            if (!ap_regexec(p->regexp, r->uri, p->regexp->re_nsub + 1, regm, 0))
{
                if (p->real) {
                    found = ap_pregsub(r->pool, p->real, r->uri,
                                    p->regexp->re_nsub + 1, regm);
+ /*
+       Removed to make RedirectMatch work like Redirect - DCC
+
                    if (found && doesc) {
                        found = ap_escape_uri(r->pool, found);
                    }
+ */
                }
                else {
                    /* need something non-null */
                    found = ap_pstrdup(r->pool, "");
                }

Look forward to hearing your thoughts

Cheers
Dave

--
David Crooke, Chief Technology Officer
Convio Inc. - the online partner for nonprofits
4801 Plaza on the Lake, Suite 1500, Austin TX 78746
Tel: (512) 652 2600 - Fax: (512) 652 2699