Posted to dev@httpd.apache.org by Thomas Eckert <th...@gmail.com> on 2013/09/25 11:06:33 UTC

ProxyPassReverse and regex

I'm facing the problem that I have to use ProxyPassReverse inside a
<LocationMatch> container, which is not really supported as documented in
the last paragraph at
http://httpd.apache.org/docs/current/mod/mod_proxy.html#proxypassreverse

I find the 'workaround' mentioned in the docs quite useless:
"The same occurs inside a <LocationMatch> section, but will probably not
work as intended, as ProxyPassReverse will interpret the regexp literally
as a path; if needed in this situation, specify the ProxyPassReverse
outside the section, or in a separate <Location> section."

How is this supposed to help me when facing such a situation? If I need
ProxyPassReverse to understand a regex, it will not do so just because I
placed it outside of the <LocationMatch> container, since it *always*
treats the path argument as a literal string - or did I miss anything when
looking at the ProxyPassReverse code?

In my concrete situation I have a <LocationMatch> container with a negative
lookahead which I need to have ProxyPassReverse understand somehow. I'm
thinking of patching ProxyPassReverse using the ProxyPassMatch code so it
understands regexps correctly. However, this has surely been considered
before and I'm wondering why it was not put in - after all similar code
exists for ProxyPassMatch. Are there pitfalls which I haven't seen yet ?

Some time ago I dug into some issues I had with using directives inside a
<LocationMatch> container instead of a <Location> container. I remember
being told on IRC that <LocationMatch> behaves less like a <Location> and
more like a <Directory> internally. Might this be connected to
'ProxyPassReverseMatch' not existing?

Cheers,
  Thomas

Re: ProxyPassReverse and regex

Posted by Thomas Eckert <th...@gmail.com>.
Given something like this

<LocationMatch ^/(foo|bar)>
  ProxyPass balancer://abc123/
  ProxyPassReverse balancer://abc123/
  ...
</LocationMatch>

it is obvious the regexp ^/(foo|bar) is used to determine the correct
location container to use for a given request. But after this, what is its
value for ProxyPassReverse? The path usually given in <Location> and
passed on to ProxyPassReverse by putting it inside the location container
is no real path - it is only an evaluation statement. If a request was
"matched into" the location above we know that the request's path is now
equivalent to the path in a normal location container. For example, compare
the above LocationMatch with this

<Location /other>
  ProxyPass balancer://abc123/
  ProxyPassReverse balancer://abc123/
  ...
</Location>

both can be used to "catch" requests with paths along the lines of "/other".
The second example will pass on the path information to ProxyPassReverse
directly while the first will not. However, for the mod_proxy logic we
still have that information in the request structure. So as long as we can
translate an origin server's name to the one used by the client to query
the reverse proxy and have access to the original request's path we are
fine.

'proof of concept' below works for me:

diff --git a/modules/proxy/proxy_util.c b/modules/proxy/proxy_util.c
index 4fa53dc..febb581 100644
--- a/modules/proxy/proxy_util.c
+++ b/modules/proxy/proxy_util.c
@@ -895,7 +895,8 @@ PROXY_DECLARE(const char *) ap_proxy_location_reverse_map(request_rec *r,
                 }
                 else if (l1 >= l2 && strncasecmp((*worker)->s->name, url, l2) == 0) {
                     /* edge case where fake is just "/"... avoid double slash */
-                    if ((ent[i].fake[0] == '/') && (ent[i].fake[1] == 0) && (url[l2] == '/')) {
+                    if (((ent[i].fake[0] == '/') && (ent[i].fake[1] == 0) && (url[l2] == '/')) ||
+                        apr_fnmatch_test(ent[i].fake)) {
                         u = apr_pstrdup(r->pool, &url[l2]);
                     } else {
                         u = apr_pstrcat(r->pool, ent[i].fake, &url[l2], NULL);

I'm using ProxyPassReverse in a rather limited fashion. Do you see
situations where the above fails?
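
For anyone who wants to experiment with the idea outside of httpd, here is a
minimal standalone C sketch of the branch the patch touches. It is not the
mod_proxy code: reverse_map() and looks_like_pattern() are hypothetical
names, looks_like_pattern() is a simplified stand-in for apr_fnmatch_test()
that only checks for '*' and '?', and backend.example is a made-up host.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Hypothetical, simplified stand-in for apr_fnmatch_test():
 * reports whether the string contains '*' or '?'. */
static int looks_like_pattern(const char *s)
{
    return strpbrk(s, "*?") != NULL;
}

/* Rewrite a backend URL (e.g. from a Location header) for the client.
 *   fake: path taken from ProxyPassReverse or the enclosing container
 *   real: backend URL prefix
 *   url:  URL sent by the backend
 * Returns 1 and fills 'out' when 'url' starts with 'real', else 0. */
static int reverse_map(const char *fake, const char *real, const char *url,
                       char *out, size_t outlen)
{
    size_t l1 = strlen(url);
    size_t l2 = strlen(real);

    assert(fake != NULL && out != NULL && outlen > 0);

    if (l1 < l2 || strncasecmp(real, url, l2) != 0) {
        return 0;
    }
    if ((strcmp(fake, "/") == 0 && url[l2] == '/') || looks_like_pattern(fake)) {
        /* fake is "/" or looks like a pattern: keep only the backend path */
        snprintf(out, outlen, "%s", url + l2);
    } else {
        /* literal fake path: prepend it to the backend path */
        snprintf(out, outlen, "%s%s", fake, url + l2);
    }
    return 1;
}
```

Under these assumptions a fake path such as "^/(?!admin)" is dropped from the
rewritten URL because it contains '?', whereas "^/(foo|bar)" contains neither
'*' nor '?' and would still be prepended literally. It may be worth checking
how the real apr_fnmatch_test() treats such patterns before relying on this.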



Re: ProxyPassReverse and regex

Posted by Nick Kew <ni...@webthing.com>.
On 25 Sep 2013, at 10:06, Thomas Eckert wrote:

> I'm facing the problem that I have to use ProxyPassReverse inside a <LocationMatch>

Just a thought: could you hack a workaround with Header Edit?

> In my concrete situation I have a <LocationMatch> container with a negative lookahead which I need to have ProxyPassReverse understand somehow. I'm thinking of patching ProxyPassReverse using the ProxyPassMatch code so it understands regexps correctly. However, this has surely been considered before and I'm wondering why it was not put in - after all similar code exists for ProxyPassMatch. Are there pitfalls which I haven't seen yet ?

ProxyPass(Match) applies to the Request, ProxyPassReverse to the Response.

From memory and without looking in the code, the missing link is per-request
memory of how a regexp was expanded in the ProxyPass so that ProxyPassReverse
can apply an equivalent rule.  It just requires someone to do the work.

If you hack it, you might give some consideration to making an API for the
ProxyPassReverse regexp expansion, so output filters like mod_proxy_html
can use it.

-- 
Nick Kew
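
One possible reading of the "per-request memory" idea, as a minimal
standalone C sketch rather than httpd code: the request side records which
part of the path the regexp matched, and the response side reuses that
record to rewrite a backend URL. Everything here is an assumption made for
illustration: struct fwd_match, remember_match() and reverse_with_memory()
are hypothetical names, backend.example is a made-up host, and POSIX
regex.h is used in place of httpd's regex wrappers (POSIX extended regexps
have no lookahead, so the example uses ^/(foo|bar)).

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-request record; not an httpd structure. */
struct fwd_match {
    char matched_prefix[256]; /* part of the request path the regexp matched */
};

/* Request side: match 'path' against 'pattern' and remember the matched
 * prefix. Returns 1 on a match anchored at the start of the path, else 0. */
static int remember_match(const char *pattern, const char *path,
                          struct fwd_match *m)
{
    regex_t re;
    regmatch_t pm[1];
    size_t len;
    int rc;

    assert(m != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0) {
        return 0;
    }
    rc = regexec(&re, path, 1, pm, 0);
    regfree(&re);
    if (rc != 0 || pm[0].rm_so != 0) {
        return 0;
    }
    len = (size_t)(pm[0].rm_eo - pm[0].rm_so);
    if (len >= sizeof m->matched_prefix) {
        return 0;
    }
    memcpy(m->matched_prefix, path, len);
    m->matched_prefix[len] = '\0';
    return 1;
}

/* Response side: if 'location' starts with the backend prefix 'real',
 * replace that prefix with the remembered client-side prefix. */
static int reverse_with_memory(const struct fwd_match *m, const char *real,
                               const char *location, char *out, size_t outlen)
{
    size_t l2 = strlen(real);

    assert(m != NULL && out != NULL && outlen > 0);

    if (strncmp(location, real, l2) != 0) {
        return 0;
    }
    snprintf(out, outlen, "%s%s", m->matched_prefix, location + l2);
    return 1;
}
```

With these definitions, a request for /foo/page matched by ^/(foo|bar)
remembers "/foo", and a backend redirect to http://backend.example/login
would be rewritten to /foo/login. Where such a record would live in httpd
(request notes, per-request module config, or elsewhere) is left open here.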