You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Jim Jagielski <ji...@jaguNET.com> on 2009/03/24 14:28:21 UTC

ProxyPassReverse and paths

There have been a few times when people get caught up when
using ProxyPassReverse with balancers that contain a path...
After all, the normal convention is everywhere you see a
ProxyPass there should be a corresponding ProxyPassReverse
that follows the same format. However in cases where
ProxyPass contains a path, PPR doesn't work correctly...

This fixes that but is also safe for the vast majority of
existing sites. I plan to commit unless there are objections:

Index: modules/proxy/proxy_util.c
===================================================================
--- modules/proxy/proxy_util.c	(revision 757753)
+++ modules/proxy/proxy_util.c	(working copy)
@@ -1080,11 +1080,9 @@
        * or may not be the right one... basically, we need
        * to find which member actually handled this request.
        *
-         * TODO: Recover the path from real and use that
-         *       for more exact matching
        */
       if ((strncasecmp(real, "balancer:", 9) == 0) &&
-            (balancer = ap_proxy_get_balancer(r->pool, sconf, real))) {
+            (balancer = ap_proxy_get_balancerwpath(r->pool, sconf,  
real))) {
           int n;
           proxy_worker *worker;
           worker = (proxy_worker *)balancer->workers->elts;
@@ -1245,9 +1243,10 @@
   return ret;
}

-PROXY_DECLARE(proxy_balancer *) ap_proxy_get_balancer(apr_pool_t *p,
-                                                       
proxy_server_conf *conf,
-                                                      const char *url)
+static proxy_balancer *proxy_find_balancer(apr_pool_t *p,
+                                           proxy_server_conf *conf,
+                                           const char *url,
+                                           int usepath)
{
   proxy_balancer *balancer;
   char *c, *uri = apr_pstrdup(p, url);
@@ -1257,8 +1256,13 @@
   if (c == NULL || c[1] != '/' || c[2] != '/' || c[3] == '\0') {
      return NULL;
   }
-    /* remove path from uri */
-    if ((c = strchr(c + 3, '/'))) {
+    /*
+     * remove path from uri if desired. For compatibility, if
+     * we want to use the path, but there is no real path, just
+     * a trailing '/', strip it and continue
+     */
+    c = strchr(c + 3, '/');
+    if ( (usepath && c && !*(c+1)) || (!usepath && c)) {
       *c = '\0';
   }
   balancer = (proxy_balancer *)conf->balancers->elts;
@@ -1271,6 +1275,21 @@
   return NULL;
}

+PROXY_DECLARE(proxy_balancer *) ap_proxy_get_balancer(apr_pool_t *p,
+                                                       
proxy_server_conf *conf,
+                                                      const char *url)
+{
+    return proxy_find_balancer(p, conf, url, 0);
+}
+
+PROXY_DECLARE(proxy_balancer *) ap_proxy_get_balancerwpath(apr_pool_t  
*p,
+                                                            
proxy_server_conf *conf,
+                                                           const char  
*url)
+{
+    return proxy_find_balancer(p, conf, url, 1);
+}
+
+
PROXY_DECLARE(const char *) ap_proxy_add_balancer(proxy_balancer  
**balancer,
                                                 apr_pool_t *p,
                                                 proxy_server_conf  
*conf,
Index: modules/proxy/mod_proxy.h
===================================================================
--- modules/proxy/mod_proxy.h	(revision 757753)
+++ modules/proxy/mod_proxy.h	(working copy)
@@ -583,6 +583,17 @@
                                                     proxy_server_conf  
*conf,
                                                     const char *url);
/**
+ * Get the balancer from proxy configuration without dropping the path
+ * associated with that balancer
+ * @param p     memory pool used for finding balancer
+ * @param conf  current proxy server configuration
+ * @param url   url to find the worker from. Has to have balancer://  
prefix
+ * @return      proxy_balancer or NULL if not found
+ */
+PROXY_DECLARE(proxy_balancer *) ap_proxy_get_balancerwpath(apr_pool_t  
*p,
+                                                       
proxy_server_conf *conf,
+                                                      const char *url);
+/**
* Add the balancer to proxy configuration
* @param balancer the new balancer
* @param p      memory pool to allocate balancer from


Re: ProxyPassReverse and paths

Posted by Ruediger Pluem <rp...@apache.org>.
On 24.03.2009 14:57, Jim Jagielski wrote:
> 
> On Mar 24, 2009, at 9:44 AM, Ruediger Pluem wrote:
> 
>> On 24.03.2009 14:28, Jim Jagielski wrote:
>>> There have been a few times when people get caught up when
>>> using ProxyPassReverse with balancers that contain a path...
>>> After all, the normal convention is everywhere you see a
>>> ProxyPass there should be a corresponding ProxyPassReverse
>>> that follows the same format. However in cases where
>>> ProxyPass contains a path, PPR doesn't work correctly...
>>>
>>> This fixes that but is also safe for the vast majority of
>>> existing sites. I plan to commit unless there are objections:
>>>
>>> Index: modules/proxy/proxy_util.c
>>> ===================================================================
>>> --- modules/proxy/proxy_util.c    (revision 757753)
>>> +++ modules/proxy/proxy_util.c    (working copy)
>>> @@ -1080,11 +1080,9 @@
>>>       * or may not be the right one... basically, we need
>>>       * to find which member actually handled this request.
>>>       *
>>> -         * TODO: Recover the path from real and use that
>>> -         *       for more exact matching
>>>       */
>>>      if ((strncasecmp(real, "balancer:", 9) == 0) &&
>>> -            (balancer = ap_proxy_get_balancer(r->pool, sconf, real))) {
>>> +            (balancer = ap_proxy_get_balancerwpath(r->pool, sconf,
>>> real))) {
>>
>> Doesn't it make more sense to cut off the path of real here to get the
>> balancer
>> instead of adding a new function to the API (which also requires a
>> minor bump,
>> but thats just nitpicking)?
>>
> 
> Well, cutting off path from real is what ap_proxy_get_balancer() actually
> *does* and is what I'm trying to adjust :)

Hm. I am getting confused now. We do *not* store the path in balancer->name.
So if we do not cut it off in ap_proxy_get_balancer our strcmp will never match
correct?

Regards

RĂ¼diger



Re: ProxyPassReverse and paths

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 24, 2009, at 9:44 AM, Ruediger Pluem wrote:

> On 24.03.2009 14:28, Jim Jagielski wrote:
>> There have been a few times when people get caught up when
>> using ProxyPassReverse with balancers that contain a path...
>> After all, the normal convention is everywhere you see a
>> ProxyPass there should be a corresponding ProxyPassReverse
>> that follows the same format. However in cases where
>> ProxyPass contains a path, PPR doesn't work correctly...
>>
>> This fixes that but is also safe for the vast majority of
>> existing sites. I plan to commit unless there are objections:
>>
>> Index: modules/proxy/proxy_util.c
>> ===================================================================
>> --- modules/proxy/proxy_util.c    (revision 757753)
>> +++ modules/proxy/proxy_util.c    (working copy)
>> @@ -1080,11 +1080,9 @@
>>       * or may not be the right one... basically, we need
>>       * to find which member actually handled this request.
>>       *
>> -         * TODO: Recover the path from real and use that
>> -         *       for more exact matching
>>       */
>>      if ((strncasecmp(real, "balancer:", 9) == 0) &&
>> -            (balancer = ap_proxy_get_balancer(r->pool, sconf,  
>> real))) {
>> +            (balancer = ap_proxy_get_balancerwpath(r->pool, sconf,
>> real))) {
>
> Doesn't it make more sense to cut off the path of real here to get  
> the balancer
> instead of adding a new function to the API (which also requires a  
> minor bump,
> but thats just nitpicking)?
>

Well, cutting off path from real is what ap_proxy_get_balancer()  
actually
*does* and is what I'm trying to adjust :)

Re: ProxyPassReverse and paths

Posted by Ruediger Pluem <rp...@apache.org>.
On 24.03.2009 14:28, Jim Jagielski wrote:
> There have been a few times when people get caught up when
> using ProxyPassReverse with balancers that contain a path...
> After all, the normal convention is everywhere you see a
> ProxyPass there should be a corresponding ProxyPassReverse
> that follows the same format. However in cases where
> ProxyPass contains a path, PPR doesn't work correctly...
> 
> This fixes that but is also safe for the vast majority of
> existing sites. I plan to commit unless there are objections:
> 
> Index: modules/proxy/proxy_util.c
> ===================================================================
> --- modules/proxy/proxy_util.c    (revision 757753)
> +++ modules/proxy/proxy_util.c    (working copy)
> @@ -1080,11 +1080,9 @@
>        * or may not be the right one... basically, we need
>        * to find which member actually handled this request.
>        *
> -         * TODO: Recover the path from real and use that
> -         *       for more exact matching
>        */
>       if ((strncasecmp(real, "balancer:", 9) == 0) &&
> -            (balancer = ap_proxy_get_balancer(r->pool, sconf, real))) {
> +            (balancer = ap_proxy_get_balancerwpath(r->pool, sconf,
> real))) {

Doesn't it make more sense to cut off the path of real here to get the balancer
instead of adding a new function to the API (which also requires a minor bump,
but thats just nitpicking)?

Regards

RĂ¼diger


Re: ProxyPassReverse and paths

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 24, 2009, at 11:46 AM, William A. Rowe, Jr. wrote:

> Jim Jagielski wrote:
>> On 2nd thought.... the whole idea of a balancer itself
>> incorporating a path in conjunction with each ind. members also
>> possibly having one is a big can of worms. Need to mull this over...
>
> Let me try mind-mapping this out - please sanity check a whole group
> of assumptions.  (And I'm not surprised, I had the same reaction after
> I thought I had the right patch, and realized this is much more  
> tangled.)
> ....
>
> <Proxy balancer://foo/bar/>
> BalancerMember http://hash/bang/
> BalancerMember http://backup/app2/
>
> ProxyPass /app/ balancer://foo/bar
>
> What does <Proxy URL /bar/  mean in this context?  Can you have a  
> Member
> which is on one <Proxy > balancer block of balancer://foo and not in  
> the
> another with another URL?  This is impossible today due to the fact  
> that
> the balancer set struct doesn't contain a URI member at all.
>
> Is this construct interesting for other directives?  I suspect so.
>
> I don't think that <Proxy > URL's are valid for declaring  
> BalancerMember's;
> my gut instinct is to WARN in 2.2 and to fail parsing this in 2.3.
>
> Now let's look at the ProxyPass[reverse] - what do the directives  
> above
> do with a request to /app/login ?  This pretty clearly maps to either
> http://hash/bang/bar/login (or http://backup/app2/bar/login) ...  
> right?
> The reverse must strip the balancer member target, including URL, PLUS
> the RHS URL expression of the ProxyPassReverse.
>
> Sanity check this please, because we can't fix it until we understand
> what it was meant to do :)
>

Again looking over all this, the confusion, I think comes about because
historically you were able to pass a full URL as the 2nd arg to
ProxyPass. But when the balancer was added, you are instead passing
a "name" that looks like a URL. And the more I think about it, whether
this is optimal or not, logically it makes sense. Because what,
exactly, does a path in a balancer://foo/ *name* really mean? In the
above example, is the URL passed to the members http://hash/bang/bar ?
Do 2 balancers named balancer://foo/bar and balancer://foo/boo share
the exact same connection pool and other resources?

I think that forcing balancer://.. to always ignore paths and require
users to set paths as member definitions is the easier course of
action. So adding error-log entries which say "You added path
info to balancer://foo/ this is ignored" is likely enuff.

I'm trying to think of scenarios where paths in balancers make sense
and don't make things harder to understand... if anyone can, then
let me know :)


Re: ProxyPassReverse and paths

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Jim Jagielski wrote:
> On 2nd thought.... the whole idea of a balancer itself
> incorporating a path in conjunction with each ind. members also
> possibly having one is a big can of worms. Need to mull this over...

Let me try mind-mapping this out - please sanity check a whole group
of assumptions.  (And I'm not surprised, I had the same reaction after
I thought I had the right patch, and realized this is much more tangled.)
....

<Proxy balancer://foo/bar/>
   BalancerMember http://hash/bang/
   BalancerMember http://backup/app2/

ProxyPass /app/ balancer://foo/bar

What does <Proxy URL /bar/  mean in this context?  Can you have a Member
which is on one <Proxy > balancer block of balancer://foo and not in the
another with another URL?  This is impossible today due to the fact that
the balancer set struct doesn't contain a URI member at all.

Is this construct interesting for other directives?  I suspect so.

I don't think that <Proxy > URL's are valid for declaring BalancerMember's;
my gut instinct is to WARN in 2.2 and to fail parsing this in 2.3.

Now let's look at the ProxyPass[reverse] - what do the directives above
do with a request to /app/login ?  This pretty clearly maps to either
http://hash/bang/bar/login (or http://backup/app2/bar/login) ... right?
The reverse must strip the balancer member target, including URL, PLUS
the RHS URL expression of the ProxyPassReverse.

Sanity check this please, because we can't fix it until we understand
what it was meant to do :)

Re: ProxyPassReverse and paths

Posted by Jim Jagielski <ji...@jaguNET.com>.
On 2nd thought.... the whole idea of a balancer itself
incorporating a path in conjunction with each ind. members also
possibly having one is a big can of worms. Need to mull this over...