Posted to users@httpd.apache.org by Matt Liggett <ma...@socialtext.com> on 2006/10/03 19:17:57 UTC
[users@httpd] combining AllowEncodedSlashes, reverse proxy, and apache 1.x
Introduction
At Socialtext[1], we use, in many installations, Apache 2 (hereafter
"front end") as a server for static content and a reverse proxy for
Apache 1 with mod_perl (hereafter "back end"). It recently came up,
in the course of developing a REST API[2], that we need to be able
to handle URIs with encoded '/' (%2F) characters in them[3].
In addition to needing this all to work with Apache 2 acting as a
front end, it also needs to work in an alternate configuration where
Apache 1 runs alone.
AllowEncodedSlashes bug
According to the docs[4],
Allowing encoded slashes does not imply decoding. Occurrences of
%2F or %5C (only on according systems) will be left as such in the
otherwise decoded URL string.
but it is our experience that if a URL like in [3] is passed to
Apache 2, it gets passed to the reverse proxy as
/data/workspaces/ambivalent/pages/either/or
which seems to be a bug.[5]
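To illustrate why the difference matters, here is a minimal Python sketch (not Apache code) that compares the URI from [3] before and after a full percent-decode. The standard library's `unquote` is only a stand-in for whatever decoding the front end performs; the point is that decoding '%2F' adds a path segment, so the back end can no longer tell the page name 'either/or' from a page 'or' under 'either'.

```python
from urllib.parse import unquote

# Canonical URI of the page 'either/or' from footnote [3].
uri = "/data/workspaces/ambivalent/pages/either%2For"

# Stand-in for a decoder that does not preserve %2F.
fully_decoded = unquote(uri)

# Splitting on '/' shows how the path structure changes.
segments_before = uri.split("/")
segments_after = fully_decoded.split("/")

print(segments_before[-1])   # last segment still carries the encoded slash
print(segments_after[-2:])   # the page name is now spread over two segments
```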
In addition to this, I believe it's important not to decode '%25' if
one has AllowEncodedSlashes turned on, otherwise the URLs
'/foo/%252F' and '/foo/%2F' become indistinguishable.[6]
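The collision can be shown with a small Python sketch. The function `decode_keep_slash` and its `decode_percent` flag are hypothetical names made up for this illustration; it is not ap_unescape_url_keep2f(), only an ASCII-only approximation of "decode everything except %2F", with an option to also leave %25 alone.

```python
import re

def decode_keep_slash(path, decode_percent=True):
    """Percent-decode `path`, always leaving %2F escaped.

    With decode_percent=False, %25 is left escaped as well.
    ASCII-only sketch: each escape is mapped to a single character.
    """
    keep = {"2F"} if decode_percent else {"2F", "25"}

    def repl(match):
        code = match.group(1).upper()
        if code in keep:
            return "%" + code          # leave this escape untouched
        return chr(int(code, 16))      # decode everything else

    return re.sub(r"%([0-9A-Fa-f]{2})", repl, path)

# Decoding %25 makes two different URLs identical.
a = decode_keep_slash("/foo/%252F")
b = decode_keep_slash("/foo/%2F")

# Leaving %25 escaped keeps them apart.
c = decode_keep_slash("/foo/%252F", decode_percent=False)
d = decode_keep_slash("/foo/%2F", decode_percent=False)
```

Here `a` and `b` come out equal, while `c` and `d` stay distinct, which is the reason for not decoding '%25' when encoded slashes are preserved.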
The assorted backports of AllowEncodedSlashes to Apache 1 have these
bugs as well.
Changed URL decoding behaviour in 2.0.55.
Prior to 2.0.55, the rewrite rule for our reverse proxy looked like
RewriteMap escape int:escape
RewriteRule (.*) http://BACK_END${escape:$1}
where BACK_END is the back end hostname and port. This was because
the URL was getting decoded prior to this rule, and an encoded '%3F'
would become a '?', which would parse incorrectly on the back end.
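A rough Python sketch of the problem the escape map works around follows. It assumes a hypothetical back end address `backend.example:8080` and a made-up page name, and it uses `urllib.parse.quote` as a stand-in for mod_rewrite's `int:escape`, which it resembles but does not exactly match. Once an encoded question mark (%3F) has been decoded, forwarding the path unescaped moves part of it into the query string.

```python
from urllib.parse import quote, unquote, urlsplit

BACK_END = "backend.example:8080"  # hypothetical host and port

# Hypothetical request path containing an encoded '?'.
original = "/data/workspaces/ambivalent/pages/why%3Fnot"

# What the rewrite rule sees if the URL was decoded beforehand.
decoded = unquote(original)

# Forwarding the decoded path as-is: the '?' starts a query string.
naive = urlsplit("http://" + BACK_END + decoded)

# Re-escaping first (the role of ${escape:$1}) keeps the path whole.
escaped = urlsplit("http://" + BACK_END + quote(decoded))

print(naive.path, naive.query)
print(escaped.path, escaped.query)
```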
As of 2.0.55, this extra decoding seems cleaned up, _except_ for
'%2F' if AllowEncodedSlashes is on. That is, the bug described
above is still present.
As a result, it seems that if we want standard decode/escape
semantics on the front end, we must insist on 2.0.55+.
Do we need all this?
It would seem that we need patched versions of Apache 2.0.55+ and Apache 1
as described above to solve the problem in both configurations (with
and without Apache 2 acting as reverse proxy). Have we
overcomplicated the problem? If so, is there a simpler combination
of configuration, versions, or patches that accomplishes the same
result?
Have I misunderstood anything above? Requiring specially patched
versions of both Apaches is a bit of a hardship, so we want to make
sure we aren't being super dumb here.
Thanks.
----
[1] http://www.socialtext.com/
[2] https://www.socialtext.net/st-rest-docs/index.cgi
[3] An example would be the canonical URI of a page named 'either/or'
in the workspace 'ambivalent':
/data/workspaces/ambivalent/pages/either%2For
[4] http://httpd.apache.org/docs/2.0/mod/core.html#allowencodedslashes
[5] I have a patch that fixes ap_unescape_url_keep2f() and can submit
it.
[6] I have a patch for this behaviour too, but the docs would need to
be modified if it were to be accepted.
--
Matt Liggett
Senior Software Engineer
Socialtext, Inc.
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
" from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org
Re: [users@httpd] combining AllowEncodedSlashes, reverse proxy, and apache 1.x
Posted by Joshua Slive <jo...@slive.ca>.
I'm not really an expert in this stuff, but a couple comments anyway...
On 10/3/06, Matt Liggett <ma...@socialtext.com> wrote:
> AllowEncodedSlashes bug
>
> According to the docs[4],
>
> Allowing encoded slashes does not imply decoding. Occurrences of
> %2F or %5C (only on according systems) will be left as such in the
> otherwise decoded URL string.
>
> but it is our experience that if a URL like in [3] is passed to
> Apache 2, it gets passed to the reverse proxy as
>
> /data/workspaces/ambivalent/pages/either/or
>
> which seems to be a bug.[5]
I don't believe that is really a bug. The docs mean that activating
AllowEncodedSlashes does not in itself do any decoding. But if you
have other stuff in the works that does decoding, all bets are off.
And in general, I don't think the unescaping algorithm has a bug
either. RFC 2396 section 2.4.2 says "If the given URI scheme defines
a canonicalization algorithm, then unreserved characters may be
unescaped according to that algorithm."
The slash is not a reserved character and hence can be unescaped,
according to my reading. And there are good reasons for doing just
that.
If I were you, the first thing I would try is to make your back-end
application deal with this, either by accepting a raw slash, or by
generating URLs that use some other character in place of slash.
But I have to admit that the escaping and unescaping in mod_proxy and
mod_rewrite has always mystified me, and I wish it were better
documented and more configurable.
Joshua.