Posted to bugs@httpd.apache.org by bu...@apache.org on 2007/07/04 22:08:45 UTC

DO NOT REPLY [Bug 42810] New: - Wrong interactions between mod_negotiation, mod_rewrite, and mod_proxy. Apache is doing an extra-canonicalization of the query string separator

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED
COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=42810>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=42810

           Summary: Wrong interactions between mod_negotiation, mod_rewrite,
                    and mod_proxy.  Apache is doing an extra-
                    canonicalization of the query string separator
           Product: Apache httpd-2
           Version: 2.2.4
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: mod_rewrite
        AssignedTo: bugs@httpd.apache.org
        ReportedBy: jose@w3.org


This bug is consistently reproducible in Apache 2.2.4 (official tarball
compiled in place). It was not a bug in Apache 1.3.6 (and older 1.3.x versions).

How to reproduce:

1. Set up an Apache server and enable mod_proxy, mod_proxy_http, and 
mod_rewrite. 

2. In the configuration file, enable mod_rewrite logging:
[[
 RewriteLog /var/log/apache2/rewrite_log
 # set to 0 to disable logging (no longer debugging)
 RewriteLogLevel 9
]]

3. Create a directory called red-test. Inside this directory, put an HTML
file named index.html (any content will do).

4. In the same red-test directory, create an .htaccess file with the following
content:
[[
RewriteEngine on

RewriteBase /red-test/
RewriteRule ^index$ http://www.inrialpes.fr/index.html [NE,QSA,P,L]
RewriteRule ^index.html$ http://www.inrialpes.fr/index.html [NE,QSA,P,L]
]]

N.B.: you could point the rules at any other page. www.inrialpes.fr/index.html
displays "Hello, World" regardless of the query string. If the page is not
found, you get an error page instead (written in French).

5. If you access the following URL (using curl or some other tool):

    curl "http://localhost/red-test/index.html?var1=32&var2=33"

Apache doesn't do any content negotiation (that's normal). The rewrite
log shows that we sent the following proxy request:

[perdir /home/kahan/httpd/red-test/] go-ahead with proxy request 
         proxy:http://www.inrialpes.fr/index.html?var1=32&var2=33 [OK]

With www.inrialpes.fr/index.html, we get our "Hello, World" page.

This is the expected result.

6. If you access the following URL:
    curl "http://localhost/red-test/index?var1=32&var2=33"

Apache's content-negotiation kicks in and selects index.html. The
rewrite log shows that the query string separator has been escaped:

[perdir /home/kahan/httpd/red-test/] pass through 
   proxy:http://www.inrialpes.fr/index.html%3Fvar1=32?var1=32

Also note, in case it is significant, that the log message now says "pass
through" instead of "go-ahead". In this case, as expected,
the remote server does not recognize %3F as a query string separator
and tries to find a different file. On the inrialpes.fr server,
we get a "page not found" message. That is
normal, as there is no file named index.html%3F...

This is the wrong result. We were expecting to get "Hello, World" again,
but the query string separator was canonicalized (escaped to %3F).

7. If you remove the file index.html from /red-test and try again, you
get the same result as when no content negotiation takes place, regardless
of whether you access index.html or index. Only the rewrite rules are
taken into account. This is normal.

[perdir /home/kahan/httpd/red-test/] go-ahead with proxy request 
         proxy:http://www.inrialpes.fr/index.html?var1=32&var2=33 [OK]

And we always get the "Hello, World" page from the inrialpes server. This is
the expected result.
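
To make the difference between steps 5 and 6 concrete, the two proxied URLs
from the rewrite log can be run through a generic URL parser. This is only an
illustration using Python's standard urllib.parse module, not something Apache
or the remote server executes; the helper name path_and_query is made up for
this sketch. It shows that only a literal "?" starts the query string, so the
escaped %3F ends up inside the path.

```python
from urllib.parse import urlsplit

# Proxied URLs copied from the rewrite log entries in steps 5 and 6.
good = "http://www.inrialpes.fr/index.html?var1=32&var2=33"
bad = "http://www.inrialpes.fr/index.html%3Fvar1=32?var1=32"


def path_and_query(url):
    """Split a URL the way a generic parser does: only a literal '?' starts the query."""
    parts = urlsplit(url)
    return parts.path, parts.query


if __name__ == "__main__":
    for label, url in (("step 5", good), ("step 6", bad)):
        path, query = path_and_query(url)
        print(f"{label}: path={path!r} query={query!r}")
```

For the step 6 URL the path still contains "%3Fvar1=32", which is why the
remote server looks for a file that does not exist.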

--------------
Some hints for a solution:

The problem may be in (or near)
mod_proxy:proxy_fixup. In the content negotiation case, this function is
called twice. On the first pass we have:

[[
(gdb) p r->uri
$24 = 0x81df5a0 "/red-test/index.html"
(gdb) p r->filename
$25 = 0x81df5c8 "/home/kahan/httpd/red-test/index.html"
(gdb) p r->canonical_filename 
$26 = 0x81df550 "/home/kahan/httpd/red-test/index.html"
]]

In the second pass we have (notice that the query string now appears
in r->filename):

[[
(gdb) p r->uri
$29 = 0x81df5a0 "/red-test/index.html"
(gdb) p r->filename
$30 = 0x81dff30 "proxy:http://www.inrialpes.fr/index.html?var1=32&var2=33"
(gdb) p r->canonical_filename 
$31 = 0x81df550 "/home/kahan/httpd/red-test/index.html"
(gdb) 
]]

And, at the end of the function call, the query string was wrongly
canonicalized:

[[
(gdb) p r->uri
$32 = 0x81df5a0 "/red-test/index.html"
(gdb) p r->filename
$33 = 0x81ebeb8 "proxy:http://www.inrialpes.fr/index.html%3Fvar1=32&var2=33?var1=32&var2=33"
(gdb) p r->canonical_filename 
$34 = 0x81df550 "/home/kahan/httpd/red-test/index.html"
]]
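
The change in r->filename between the last two gdb dumps can be imitated with
a few lines of Python. This is a rough model and not the mod_proxy code: the
function name canonicalize_as_path is hypothetical, and the set of characters
left unescaped ("/&=") is an assumption picked to match the observed string.
The idea is that a URL which already carries its query string gets escaped as
if it were all path, and the query string is then appended a second time.

```python
from urllib.parse import quote


def canonicalize_as_path(filename_url, args):
    """Escape everything after the host as a path, then append the query again.

    Simplified stand-in for the behaviour seen in gdb. The safe set "/&=" is an
    assumption chosen to reproduce the observed output.
    """
    scheme, rest = filename_url.split("://", 1)
    host, _, tail = rest.partition("/")
    # '?' is not in the safe set, so it is percent-encoded to %3F.
    escaped = quote("/" + tail, safe="/&=")
    return f"{scheme}://{host}{escaped}?{args}"


if __name__ == "__main__":
    before = "http://www.inrialpes.fr/index.html?var1=32&var2=33"
    print(canonicalize_as_path(before, "var1=32&var2=33"))
```

Apart from the "proxy:" prefix, the result has the same shape as the value of
r->filename in the last dump.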

I'm not sure if a good solution to the problem would be
in mod_proxy:proxy_http_canon() which has this apparently wrong
assumption:

[[
    /* now parse path/search args, according to rfc1738 */
    /* N.B. if this isn't a true proxy request, then the URL _path_
     * has already been decoded.  True proxy requests have r->uri
     * == r->unparsed_uri, and no others have that property.
     */
    if (r->uri == r->unparsed_uri) {
        search = strchr(url, '?');
        if (search != NULL)
            *(search++) = '\0';
    }
    else
        search = r->args;

]]

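r->uri and r->unparsed_uri differ in the broken case, so the else branch is
presumably taken while the URL still carries the query string: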

[[
(gdb) p r->uri
$41 = 0x81e85c0 "/red-test/index.html"
(gdb) p r->unparsed_uri
$42 = 0x81df878 "/red-test/index?var1=32&var2=33"
]]
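
The branch quoted above can be modelled in Python to see what it does with the
values from the gdb session. This is a sketch under assumptions: the Request
class and split_search function are hypothetical stand-ins, the url argument is
assumed to be the part of r->filename after the host, and Python's "is" is used
to imitate the C pointer comparison.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical stand-in for the request_rec fields used by the quoted branch."""
    uri: str
    unparsed_uri: str
    args: str


def split_search(r, url):
    """Mirror the quoted branch and return (path_part, search)."""
    # The C code compares pointers, so identity ('is') is used here on purpose.
    if r.uri is r.unparsed_uri:
        path, sep, search = url.partition("?")
        return path, (search if sep else None)
    # Not treated as a true proxy request: url is left untouched, r->args is used.
    return url, r.args


if __name__ == "__main__":
    broken = Request(
        uri="/red-test/index.html",
        unparsed_uri="/red-test/index?var1=32&var2=33",
        args="var1=32&var2=33",
    )
    print(split_search(broken, "/index.html?var1=32&var2=33"))
```

In the broken case the returned path part still contains "?var1=32&var2=33",
which would then be escaped as path data before the search part is appended.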

Thanks for your help. We have this bug at W3C. I'm only using
the inrialpes.fr server to show how to reproduce it.

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org


DO NOT REPLY [Bug 42810] - Wrong interactions between mod_negotiation, mod_rewrite, and mod_proxy. Apache is doing an extra-canonicalization of the query string separator

Posted by bu...@apache.org.

http://issues.apache.org/bugzilla/show_bug.cgi?id=42810


jose@w3.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From jose@w3.org  2007-08-06 07:32 -------
This was a bug in request.c. The patch I made for bug 41960 fixed this one too.

http://issues.apache.org/bugzilla/show_bug.cgi?id=41960
