Posted to dev@httpd.apache.org by Brian Candler <B....@pobox.com> on 2005/09/30 20:00:18 UTC

%{REQUEST_URI} inconsistency

I've spent an afternoon nailing down a problem, and it's turned out to be
some anomalous behaviour which I think ought to be documented. (Note that
I'm talking about Apache-1.3 here, but a brief glance at the 2.1.6 source
suggests it has the same issue.)

Issue: the REQUEST_URI value passed to a CGI environment is _not_ the same
as the %{REQUEST_URI} substitution in mod_rewrite. See here:

[main/util_script.c]

    ap_table_setn(e, "REQUEST_URI", original_uri(r));

[modules/standard/mod_rewrite.c]

    else if (strcasecmp(var, "REQUEST_URI") == 0) { /* non-standard */
        result = r->uri;

Looks an innocuous difference, doesn't it? :-)

Where it bit me is when a request has been associated with an action
handler, and so the original URI is not the same as the rewritten one. You
can have

    AddType application/x-php php
    Action application/x-php /common-cgi/php_wrapper

and if you actually *execute* php_wrapper on this box, then the script gets
the correct REQUEST_URI as expected. However if you try to do something
clever in mod_rewrite, such as trigger the request to be proxied to another
machine where it will be executed:

    RewriteRule ^/common-cgi/php_wrapper http://foo.com%{REQUEST_URI} [P]

then the REQUEST_URI is not what you expect. (In fact it was made worse in
my case because a user had a .htaccess file doing per-directory rewrites,
and I really needed to proxy the original URI, not the rewritten one.)

Now, what the CGI's original_uri() function does is to take the original
request line, strip off everything before the first space and after the
second space, and use that. You can simulate this in mod_rewrite:

    RewriteCond %{THE_REQUEST} "^[^ ]+ +(/[^ ]*)"
    RewriteRule ^ - [E=REQUEST_URI:%1]

and so I have a solution to my problem. There doesn't seem to be any other
mod_rewrite expansion apart from THE_REQUEST which contains the original
URI.
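
To make the extraction concrete, here is a minimal Python sketch that applies
the same pattern as the RewriteCond above to a raw request line. It only
illustrates the "take the token between the first and second space" idea; it
is not Apache's C implementation, and the Python function name original_uri
is borrowed here purely for illustration.

```python
import re

# Same pattern as the RewriteCond: a method token, one or more spaces,
# then a URI that starts with "/" and runs up to the next space.
REQUEST_LINE = re.compile(r"^[^ ]+ +(/[^ ]*)")


def original_uri(request_line):
    """Return the URI token from an HTTP request line, or None if no match."""
    match = REQUEST_LINE.match(request_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(original_uri("GET /index.php?id=6 HTTP/1.0"))
```

Note that the pattern has a single capture group, and that a request line
carrying an absolute URI (http://host/path) does not match it at all.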

I very much doubt you can change the behaviour of either the %{REQUEST_URI}
mod_rewrite function or the CGI REQUEST_URI environment variable without
breaking people everywhere, but perhaps a nice juicy note could be put into
the documentation, or at least in the source code at the two points above?
It might save someone else having to spend a whole afternoon scratching
their head :-)

Also, I think it would be nice in mod_rewrite to have %{ORIGINAL_URI} which
gives the same value as REQUEST_URI in a CGI environment. However I can live
without that, given that it can be simulated using a regular expression as
shown above.

Regards,

Brian Candler.

Re: %{REQUEST_URI} inconsistency

Posted by Brian Candler <B....@pobox.com>.
On Sun, Oct 02, 2005 at 12:49:07PM +0200, Ruediger Pluem wrote:
> >     RewriteRule ^/common-cgi/php_wrapper http://foo.com%{REQUEST_URI} [P]
> 
> Have you tried
> 
> RewriteRule ^/common-cgi/php_wrapper http://foo.com%{REQUEST_URI}?%{QUERY_STRING} [P]
> 
> instead?

No, because it wouldn't do what I needed. Once a request has been through
mod_rewrite several times, I want to be able to get hold of the original URI
for the request, not the one which started the last round of mod_rewrite
operation. This is what REQUEST_URI gives you in a CGI environment, but not
what %{REQUEST_URI} gives you in mod_rewrite.

Query strings are a separate problem which I didn't mention. I've discovered
that mod_rewrite/mod_proxy also adds back the query string even when you
don't want it. For example, if you have

    RewriteRule /foo.html /bar.php?id=6

then if you later do

    RewriteRule ^ /foo.html [P]

then it actually proxies to /foo.html?id=6. So to fix this, any
substitution which does not contain a question mark needs one appended: in
this case I would do

    RewriteRule ^ /foo.html? [P]

to stop mod_proxy adding the query string.

The actual magic I'm using is:

RewriteCond %{THE_REQUEST} "^[^ ]+ +(https?://[^/]+)?(/[^ ]*)"
RewriteRule ^ - [E=REQUEST_URI:%2]

RewriteCond %{ENV:REQUEST_URI} !\?
RewriteRule ^ - [E=REQUEST_URI:%{ENV:REQUEST_URI}?]
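
As a rough illustration of what those two rule pairs compute, the following
Python sketch applies the same regular expression to a request line and then
appends a trailing "?" when the result has no query string. The function name
proxy_safe_uri is hypothetical, and this only models the string handling, not
mod_rewrite or mod_proxy themselves.

```python
import re

# Same pattern as the first RewriteCond: group 1 is an optional
# scheme://host prefix, group 2 is the path (plus any query string).
PATTERN = re.compile(r"^[^ ]+ +(https?://[^/]+)?(/[^ ]*)")


def proxy_safe_uri(request_line):
    """Return the original path+query, with "?" appended if no query is present."""
    match = PATTERN.match(request_line)
    if match is None:
        return None
    uri = match.group(2)  # corresponds to %2 in the RewriteRule
    if "?" not in uri:
        # Mirrors the second rule pair: an explicit empty query string
        # keeps a later query string from being added back.
        uri += "?"
    return uri


if __name__ == "__main__":
    print(proxy_safe_uri("GET http://foo.com/a/b.php HTTP/1.1"))
    print(proxy_safe_uri("GET /a/b.php?id=6 HTTP/1.0"))
```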

But that wasn't the point I was trying to make: the point is that
mod_rewrite's %{REQUEST_URI} is significantly different from a CGI
environment REQUEST_URI, this is non-obvious, and I consider it a bug in the
documentation that this is not made clear.

Regards,

Brian.

Re: %{REQUEST_URI} inconsistency

Posted by Ruediger Pluem <rp...@apache.org>.

Brian Candler wrote:

[..cut..]

> 
> Where it bit me is when a request has been associated with an action
> handler, and so the original URI is not the same as the rewritten one. You
> can have
> 
>     AddType application/x-php php
>     Action application/x-php /common-cgi/php_wrapper
> 
> and if you actually *execute* php_wrapper on this box, then the script gets
> the correct REQUEST_URI as expected. However if you try to do something
> clever in mod_rewrite, such as trigger the request to be proxied to another
> machine where it will be executed:
> 
>     RewriteRule ^/common-cgi/php_wrapper http://foo.com%{REQUEST_URI} [P]

Have you tried

RewriteRule ^/common-cgi/php_wrapper http://foo.com%{REQUEST_URI}?%{QUERY_STRING} [P]

instead?

Regards

Rüdiger

[..cut..]