You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Jan Kaluza <jk...@redhat.com> on 2014/03/18 12:26:15 UTC

Re: [PATCH] Fix settings options with ProxyPassMatch

This is also needed to fix ProxyPassMatch with UDS. Without this patch 
following configuration does not work:

ProxyPassMatch ^/web/(.*\.php)$ 
unix:/run/php-fpm/fpm.sock|fcgi://127.0.0.1/var/www/html/$1

If nobody is against, I will commit it to trunk later this week.

Regards,
Jan Kaluza

On 09/16/2013 02:44 PM, Jan Kaluža wrote:
> Hi,
>
> I have found out that if you use ProxyPassMatch with regexp and "$1"
> (see PR 43513 and PR 54973), the right proxy_worker is not used and
> "reverse" proxy_worker is used instead. The result is situation where no
> options like "timeout" can be set in this case.
>
> The real problem here is that proxy_worker is chosen according to its
> name which contains "$1" sequence in our case.
>
> I have found out that there is already patch fixing ProxyPassMatch
> behaviour with "$1" in PR 43513. I have rebased the patch for trunk and
> changed its style a bit.
>
> The patch strips everything after "$", so it for example changes the
> name of proxy_worker from "http://127.0.0.1/cgi-bin/$1" to
> "http://127.0.0.1/cgi-bin/". Later when request arrives, proper
> proxy_worker is chosen. Without this change, proxy_worker's name with
> "$1" would be compared against real request URL, but real request URL
> does not contain this "$1" sequence and therefore this correct
> proxy_worker wouldn't be chosen.
>
> Do you see any problems with this patch?
>
> Regards,
> Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluza <jk...@redhat.com>.
On 03/18/2014 02:46 PM, Yann Ylavic wrote:
> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
>> Wouldn't it be possible to define wildcard workers when the URL is
>> known to be a regexp substitution?
>> For these workers' URLs, the dollars (plus the following digit) could
>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>> use ap_strcasecmp_match() against the requested URL.
>
> I meant ap_strcmp_match(), this has to be case sensitive...

That looks like good way too. I will give it a try. Maybe it will look 
better than this original patch (with which I'm not so sure...)

Regards,
Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 04/25/2014 02:57 AM, Yann Ylavic wrote:
> Hi Jan,
>
> sorry for the late.

No problem :).

>
> On Tue, Apr 22, 2014 at 3:39 PM, Jan Kaluža <jk...@redhat.com> wrote:
>> Hi again,
>>
>> the patch has been here for some time already. I hesitate to commit it to
>> trunk without any review, because it changes the core code in mod_proxy and
>> I'm afraid that there could exist more corner-cases I'm not aware of.
>>
>> On the other side (and to motivate someone with deeper knowledge of
>> mod_proxy :) ), it would be great to have UDS support also for
>> ProxyPassMatch and fix some old bugs related to setting options together
>> with ProxyPassMatch (like PR 43513 and PR 54973).
>>
>> So, anyone who would have time to review this patch, please? :)
>
> I slightly modified your patch with the following changes :
> 1. Use \-escaping instead of %-escaping for wildcard_match so to
> consume less space and unescape quickly.

Hm, I'm not sure right now if that's right. The corner-case here is 
following configuration:

ScriptAlias /cgi?bin/ "/usr/local/apache2/cgi-bin/"
ProxyPassMatch ^/x/(.*\.x)$ http://127.0.0.1/cgi?bin/$1 timeout=1

I'm not sure how valid this configuration is, but it leads to following 
request:

"http://127.0.0.1/cgi%3Fbin/printenv.x"

while the proxy_worker has following escaped string:

"http://127.0.0.1/cgi\\?bin/*"

So the right proxy_worker is not matched in this case (not sure if 
"right" is really "right" in this case :) ).

> 2. Fix the recursion in ap_proxy_wildcard_match(), which was calling
> the legacy ap_strcmp_match(), and unescape according to 1.
> 2'. Shouldn't this function be made iterative (it is not final
> recursive currently)?
> 3. Always try to get the worker with its de_socketfy()ed name, and
> only if not found use ap_proxy_define[_wildcard]_worker() with the
> configured URL (add_pass() and proxysection() were not consistent
> about ap_proxy_get_worker()'s given name/URL).

+1 to this.

> 4. Don't match both the "normal" name with strcnmp() and the wildcard
> name with strcmp_match() in ap_proxy_get_worker(), since the worker is
> either wildcard or not (this enforcement is also added to add_pass()
> and proxysection() so that it is not possible to declare the same
> worker name once with a Match, again without, or vice versa).
> 4'. Does it makes sense to use the longest match_name or should we go
> for the first matching wildcard worker (if any) wins?

Hm, I think it should be the longest match_name, if you mean that 
"http://127.0.0.1/cgi-bin/*" should be matched if there exist two 
workers like "http://127.0.0.1/*" and "http://127.0.0.1/cgi-bin/*".

But if I'm right, in this case, there will be just one *shared* worker. 
Can you write an example of two non-shared workers with matching 
wirldcard name?

> 5. Rename ap_proxy_wildcard_match() as ap_proxy_strcmp_ematch() to
> suggest it could go to server/util.ch (if/when the associated
> ap_matchexp_escape() and ap_proxy_strcasecmp_ematch() are coded).
> 6. Move proxy_worker_shared { char
> wildcard_name[PROXY_WORKER_MAX_NAME_SIZE]; } to proxy_worker { char
> *match_name; } to save shm space and avoid length issues (I don't see
> how this could be updated at runtime, for now, balancer-manager can't
> declare Match workers).
> 6. Keep ap_proxy_escape_wildcard_url() static (private) since it is
> not used outside proxy_util (now).

+1 to all above.

> This is not very tested (yet), I'll come back if I detect issues, but
> first I'd like your feedbacks on this modifications...

I've tested the patch and generally it works for me (except the things I 
wrote above).

> Patch attached.
>
> Regards,
> Yann.
>

Regards,
Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 04/29/2014 03:29 PM, Jim Jagielski wrote:
>
> On Apr 29, 2014, at 8:41 AM, Jan Kaluža <jk...@redhat.com> wrote:
>>
>> Because later we have to match the URL of request with some proxy_worker.
>>
>> If you configure ProxyPassMatch like this:
>> ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>>
>> Then the proxy_worker name would be "http://x/$1/foo.jpg".
>>
>> If you receive request with URL "http://x/something/foo.jpg", ap_proxy_get_worker() will have to find out the worker with name "http://x/$1/foo.jpg". The question here is how it would do that?
>>
>> The answer used in the patch is "we change the worker name to http://x/*/foo.jpg" and check if the URL ("http://x/something/foo.jpg" in our case) matches that worker.
>>
>> If we store the original name with $N, we will have to find out different way how to match the worker (probably emulating wildcard pattern matching)
>>
>> It would be possible to store only the original name (with "$N" variables), store the flag that the proxy worker is using regex and change ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I don't see any real advantage here.
>
> Huh???
>
> I thought we were talking about this simple, streamlined patch that
> does JUST matching/awareness of regex's, not the one which seems
> to fold in a BUNCH of other semi-related stuff.
>

Sorry, I was talking about the latest Yann's patch all the time :(.

>
>
>
>>
>>> The version I looked at had some 'use_regex' logic outside of
>>> ap_proxy_get_worker(), right before we call it, in fact. Maybe
>>> I'm seeing an older version of the patch?
>>>
>>
>> I think that's the correct patch. It has use_regex only for creating the proxy_worker to distinguish between "normal" worker and "regex" worker.
>>
>> Note that "use_regex" does not influence the call of "ap_proxy_get_worker()". As I said, it's used only to distinguish whether the newly created worker is "normal" worker or "regex" worker.
>>
>> Regards,
>> Jan Kaluza
>


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Apr 29, 2014, at 8:41 AM, Jan Kaluža <jk...@redhat.com> wrote:
> 
> Because later we have to match the URL of request with some proxy_worker.
> 
> If you configure ProxyPassMatch like this:
> ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
> 
> Then the proxy_worker name would be "http://x/$1/foo.jpg".
> 
> If you receive request with URL "http://x/something/foo.jpg", ap_proxy_get_worker() will have to find out the worker with name "http://x/$1/foo.jpg". The question here is how it would do that?
> 
> The answer used in the patch is "we change the worker name to http://x/*/foo.jpg" and check if the URL ("http://x/something/foo.jpg" in our case) matches that worker.
> 
> If we store the original name with $N, we will have to find out different way how to match the worker (probably emulating wildcard pattern matching)
> 
> It would be possible to store only the original name (with "$N" variables), store the flag that the proxy worker is using regex and change ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I don't see any real advantage here.

Huh???

I thought we were talking about this simple, streamlined patch that
does JUST matching/awareness of regex's, not the one which seems
to fold in a BUNCH of other semi-related stuff.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 07/10/2014 03:57 PM, Yann Ylavic wrote:
> On Thu, Jul 10, 2014 at 9:12 AM, Jan Kaluža <jk...@redhat.com> wrote:
>> On 07/09/2014 04:26 PM, Yann Ylavic wrote:
>>> I forgot proxysection(), why not handle the
>>> ap_proxy_define_match_worker() case there too?
>>>
>>
>> I'm not sure I see what you mean. There's match_worker created in
>> proxysection(), or do you mean something else?
>
> Indeed I missed the code in proxysection()...
>
> That's +1 for me.
>

Committed in r1609680.

Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Jul 10, 2014 at 9:12 AM, Jan Kaluža <jk...@redhat.com> wrote:
> On 07/09/2014 04:26 PM, Yann Ylavic wrote:
>> I forgot proxysection(), why not handle the
>> ap_proxy_define_match_worker() case there too?
>>
>
> I'm not sure I see what you mean. There's match_worker created in
> proxysection(), or do you mean something else?

Indeed I missed the code in proxysection()...

That's +1 for me.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 07/09/2014 04:26 PM, Yann Ylavic wrote:
> On Wed, Jul 9, 2014 at 4:14 PM, Yann Ylavic <yl...@gmail.com> wrote:
>> On Wed, Jul 9, 2014 at 3:03 PM, Jan Kaluža <jk...@redhat.com> wrote:
>>> Hi,
>>>
>>> could you please check the patch I've attached to this email?
>>
>> Looks good to me.
>>
>>>
>>> It changes following parts of Yann's patch:
>>>
>>> 1. keep only single name of the worker stored in shared memory.
>>>
>>> 2. when ProxyPassMatch is used, wshared->is_name_matchable = 1.
>>>
>>> 3. if is_name_matchable == 1, ap_proxy_get_worker() uses
>>> ap_proxy_strcmp_ematch() which treats "$N" as '*'.
>>
>> Much simpler, no need to store the pattern.
>>
>> Would it be possible to handle some escaping character (eg. \), so
>> that $ can be expressed without being interpolated?
>> (There is still the comment in ap_proxy_strcmp_ematch(), but not the
>> code anymore).
>> AFAICT, $ is a legitimate URL character that need not be %-escaped.

Done in attached patch.

> I forgot proxysection(), why not handle the
> ap_proxy_define_match_worker() case there too?
>

I'm not sure I see what you mean. There's match_worker created in 
proxysection(), or do you mean something else?

Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Jul 9, 2014 at 4:14 PM, Yann Ylavic <yl...@gmail.com> wrote:
> On Wed, Jul 9, 2014 at 3:03 PM, Jan Kaluža <jk...@redhat.com> wrote:
>> Hi,
>>
>> could you please check the patch I've attached to this email?
>
> Looks good to me.
>
>>
>> It changes following parts of Yann's patch:
>>
>> 1. keep only single name of the worker stored in shared memory.
>>
>> 2. when ProxyPassMatch is used, wshared->is_name_matchable = 1.
>>
>> 3. if is_name_matchable == 1, ap_proxy_get_worker() uses
>> ap_proxy_strcmp_ematch() which treats "$N" as '*'.
>
> Much simpler, no need to store the pattern.
>
> Would it be possible to handle some escaping character (eg. \), so
> that $ can be expressed without being interpolated?
> (There is still the comment in ap_proxy_strcmp_ematch(), but not the
> code anymore).
> AFAICT, $ is a legitimate URL character that need not be %-escaped.

I forgot proxysection(), why not handle the
ap_proxy_define_match_worker() case there too?

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Jul 9, 2014 at 3:03 PM, Jan Kaluža <jk...@redhat.com> wrote:
> Hi,
>
> could you please check the patch I've attached to this email?

Looks good to me.

>
> It changes following parts of Yann's patch:
>
> 1. keep only single name of the worker stored in shared memory.
>
> 2. when ProxyPassMatch is used, wshared->is_name_matchable = 1.
>
> 3. if is_name_matchable == 1, ap_proxy_get_worker() uses
> ap_proxy_strcmp_ematch() which treats "$N" as '*'.

Much simpler, no need to store the pattern.

Would it be possible to handle some escaping character (eg. \), so
that $ can be expressed without being interpolated?
(There is still the comment in ap_proxy_strcmp_ematch(), but not the
code anymore).
AFAICT, $ is a legitimate URL character that need not be %-escaped.

Regards,
Yann.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 04/29/2014 03:51 PM, Jim Jagielski wrote:
>
> On Apr 29, 2014, at 8:41 AM, Jan Kaluža <jk...@redhat.com> wrote:
>>
>> Because later we have to match the URL of request with some proxy_worker.
>>
>> If you configure ProxyPassMatch like this:
>> ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>>
>> Then the proxy_worker name would be "http://x/$1/foo.jpg".
>>
>> If you receive request with URL "http://x/something/foo.jpg", ap_proxy_get_worker() will have to find out the worker with name "http://x/$1/foo.jpg". The question here is how it would do that?
>>
>> The answer used in the patch is "we change the worker name to http://x/*/foo.jpg" and check if the URL ("http://x/something/foo.jpg" in our case) matches that worker.
>>
>> If we store the original name with $N, we will have to find out different way how to match the worker (probably emulating wildcard pattern matching)
>>
>> It would be possible to store only the original name (with "$N" variables), store the flag that the proxy worker is using regex and change ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I don't see any real advantage here.
>>
>
> In Yann's suggested patch we don't store match_name where it
> belongs; so we'd need to put it in shm, which means more
> memory. Instead, we store as is and add a simple char flag
> which sez if the stored name is a regex. Much savings.

Hi,

could you please check the patch I've attached to this email?

It changes following parts of Yann's patch:

1. keep only single name of the worker stored in shared memory.

2. when ProxyPassMatch is used, wshared->is_name_matchable = 1.

3. if is_name_matchable == 1, ap_proxy_get_worker() uses 
ap_proxy_strcmp_ematch() which treats "$N" as '*'.

I've tested this patch a bit and it seems to work for me.

Regards,
Jan Kaluza

> And I have no idea why storing with $1 -> * somehow makes
> things easier or implies a "different way how to match the worker".
>
> Finally, let's think about this deeper...
>
> Assume we do have
>
> 	ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
> 	ProxyPassMatch ^/zippy/(\d+)/bar.jpg http://x/$1/omar/propjoe.gif
>
> is the intent/desire to have 2 workers or 1? A worker is, in
> some ways, simply a nickname for the socket related to a host and port.
> Maybe, in the interests of efficiency and speed, since regexes
> are slow as it is, a condition could be specified (a limitation,
> as it were), that when using PPM, only everything up to
> the 1st potential substitution is considered a unique worker.
>


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Apr 30, 2014 at 12:53 AM, Yann Ylavic <yl...@gmail.com> wrote:
> Still another solution for these workers would be to reuse the
> ap_regmatch_t vector from proxy_trans() to exact match the worker's
> name (with its zero or more $N replaced with strings offsets from
> vector[N], like ap_expr_str_exec_re() does).

This solution is complete nonsense, I should have though about it more
deeply :-/
There is no such thing as an exact match with a wildcard name.

The place where the regexp (worker's alias) is exectuted and the
resulting ap_regmatch_t vector is used to build the requested URL is
proxy_trans(). Why would mod_proxy have to find (again) the worker
later?
I guess things may have been rewritten in between, or the like, but it
looks like a duplicate search for some (marginal/too simple?) cases.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
On Tue, Apr 29, 2014 at 3:51 PM, Jim Jagielski <ji...@jagunet.com> wrote:
> On Apr 29, 2014, at 8:41 AM, Jan Kaluža <jk...@redhat.com> wrote:
>>
>> Because later we have to match the URL of request with some proxy_worker.
>>
>> If you configure ProxyPassMatch like this:
>> ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>>
>> Then the proxy_worker name would be "http://x/$1/foo.jpg".
>>
>> If you receive request with URL "http://x/something/foo.jpg", ap_proxy_get_worker() will have to find out the worker with name "http://x/$1/foo.jpg". The question here is how it would do that?
>>
>> The answer used in the patch is "we change the worker name to http://x/*/foo.jpg" and check if the URL ("http://x/something/foo.jpg" in our case) matches that worker.
>>
>> If we store the original name with $N, we will have to find out different way how to match the worker (probably emulating wildcard pattern matching)
>>
>> It would be possible to store only the original name (with "$N" variables), store the flag that the proxy worker is using regex and change ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I don't see any real advantage here.
>>
>
> In Yann's suggested patch we don't store match_name where it
> belongs; so we'd need to put it in shm, which means more
> memory.

Agreed, plus this is not balancer-manager aware.

BTW, what's the difference between alias_match() used by proxy_trans()
and ap_proxy_get_worker()? Longest match?
Can an entry matched by proxy_trans() *not* belong to the worker
got(ten) later from ap_proxy_get_worker()?
If no, another solution would be to backref the worker in (all) its
struct proxy_alias(es) entries.
That way the worker would be already known at proxy_trans() time (when
the entry is matched), and a new ap_proxy_get_worker_for_request(r)
could do the association later.
AFAICT, we don't use ap_proxy_get_worker() at runtime without a
request_rec available.

At least that could work for the *Match workers, for which the only
relevent requested-URL's match is from proxy_trans(), imo.

Still another solution for these workers would be to reuse the
ap_regmatch_t vector from proxy_trans() to exact match the worker's
name (with its zero or more $N replaced with strings offsets from
vector[N], like ap_expr_str_exec_re() does).
That would also require a request_rec available at
ap_proxy_get_worker()'s (run)time though.

> Instead, we store as is and add a simple char flag
> which sez if the stored name is a regex. Much savings.
>
> And I have no idea why storing with $1 -> * somehow makes
> things easier or implies a "different way how to match the worker".

Do we need to provide a way to escape (application/legitimate) $N in
the worker name or simply document on the limitation?
In the latter case this is indeed much simpler.

>
> Finally, let's think about this deeper...
>
> Assume we do have
>
>         ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>         ProxyPassMatch ^/zippy/(\d+)/bar.jpg http://x/$1/omar/propjoe.gif
>
> is the intent/desire to have 2 workers or 1? A worker is, in
> some ways, simply a nickname for the socket related to a host and port.

For which connections can be reused, different parameters apply...

> Maybe, in the interests of efficiency and speed, since regexes
> are slow as it is, a condition could be specified (a limitation,
> as it were), that when using PPM, only everything up to
> the 1st potential substitution is considered a unique worker.

That could be (another) limitation.
But one may want to apply different parameters to these somehow
different URLs, since they may be different backends/applications too.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Apr 29, 2014, at 8:41 AM, Jan Kaluža <jk...@redhat.com> wrote:
> 
> Because later we have to match the URL of request with some proxy_worker.
> 
> If you configure ProxyPassMatch like this:
> ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
> 
> Then the proxy_worker name would be "http://x/$1/foo.jpg".
> 
> If you receive request with URL "http://x/something/foo.jpg", ap_proxy_get_worker() will have to find out the worker with name "http://x/$1/foo.jpg". The question here is how it would do that?
> 
> The answer used in the patch is "we change the worker name to http://x/*/foo.jpg" and check if the URL ("http://x/something/foo.jpg" in our case) matches that worker.
> 
> If we store the original name with $N, we will have to find out different way how to match the worker (probably emulating wildcard pattern matching)
> 
> It would be possible to store only the original name (with "$N" variables), store the flag that the proxy worker is using regex and change ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I don't see any real advantage here.
> 

In Yann's suggested patch we don't store match_name where it
belongs; so we'd need to put it in shm, which means more
memory. Instead, we store as is and add a simple char flag
which sez if the stored name is a regex. Much savings.

And I have no idea why storing with $1 -> * somehow makes
things easier or implies a "different way how to match the worker".

Finally, let's think about this deeper...

Assume we do have

	ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
	ProxyPassMatch ^/zippy/(\d+)/bar.jpg http://x/$1/omar/propjoe.gif

is the intent/desire to have 2 workers or 1? A worker is, in
some ways, simply a nickname for the socket related to a host and port.
Maybe, in the interests of efficiency and speed, since regexes
are slow as it is, a condition could be specified (a limitation,
as it were), that when using PPM, only everything up to
the 1st potential substitution is considered a unique worker.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 04/29/2014 02:22 PM, Jim Jagielski wrote:
>
> On Apr 29, 2014, at 7:41 AM, Jan Kaluža <jk...@redhat.com> wrote:
>>
>> That's what we do with current patch I think, don't we? In the patch, we create "char *match_name" which is NULL when the worker_name is not regex and contains the escaped name if regex is used (with "$N" replaced by '*').
>>
>> ap_proxy_get_worker() later checks the match_name and if it's not NULL, it tries regex matching using ap_proxy_strcmp_ematch().
>>
>> The actual problem is that ap_proxy_get_worker() cannot find out proper proxy_worker, because this method is not regex aware. To make it regex aware, we have to distinguish regex workers during creation,
>
> right, hence the flag.
>
>> replace the regex variables ($N -> *) in some safe manner and allow ap_proxy_get_worker() to find the right worker using this new info. And that's what the patch does.
>
> Why replace? Why can't we store the name as is?

Because later we have to match the URL of request with some proxy_worker.

If you configure ProxyPassMatch like this:
ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg

Then the proxy_worker name would be "http://x/$1/foo.jpg".

If you receive request with URL "http://x/something/foo.jpg", 
ap_proxy_get_worker() will have to find out the worker with name 
"http://x/$1/foo.jpg". The question here is how it would do that?

The answer used in the patch is "we change the worker name to 
http://x/*/foo.jpg" and check if the URL ("http://x/something/foo.jpg" 
in our case) matches that worker.

If we store the original name with $N, we will have to find out 
different way how to match the worker (probably emulating wildcard 
pattern matching)

It would be possible to store only the original name (with "$N" 
variables), store the flag that the proxy worker is using regex and 
change ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I 
don't see any real advantage here.

> The version I looked at had some 'use_regex' logic outside of
> ap_proxy_get_worker(), right before we call it, in fact. Maybe
> I'm seeing an older version of the patch?
>

I think that's the correct patch. It has use_regex only for creating the 
proxy_worker to distinguish between "normal" worker and "regex" worker.

Note that "use_regex" does not influence the call of 
"ap_proxy_get_worker()". As I said, it's used only to distinguish 
whether the newly created worker is "normal" worker or "regex" worker.

Regards,
Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Apr 29, 2014, at 7:41 AM, Jan Kaluža <jk...@redhat.com> wrote:
> 
> That's what we do with current patch I think, don't we? In the patch, we create "char *match_name" which is NULL when the worker_name is not regex and contains the escaped name if regex is used (with "$N" replaced by '*').
> 
> ap_proxy_get_worker() later checks the match_name and if it's not NULL, it tries regex matching using ap_proxy_strcmp_ematch().
> 
> The actual problem is that ap_proxy_get_worker() cannot find out proper proxy_worker, because this method is not regex aware. To make it regex aware, we have to distinguish regex workers during creation,

right, hence the flag.

> replace the regex variables ($N -> *) in some safe manner and allow ap_proxy_get_worker() to find the right worker using this new info. And that's what the patch does.

Why replace? Why can't we store the name as is?

The version I looked at had some 'use_regex' logic outside of
ap_proxy_get_worker(), right before we call it, in fact. Maybe
I'm seeing an older version of the patch?

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 04/29/2014 01:04 PM, Jim Jagielski wrote:
>
> On Apr 24, 2014, at 8:57 PM, Yann Ylavic <yl...@gmail.com> wrote:
>
>> Hi Jan,
>>
>> sorry for the late.
>>
>> On Tue, Apr 22, 2014 at 3:39 PM, Jan Kaluža <jk...@redhat.com> wrote:
>>> Hi again,
>>>
>>> the patch has been here for some time already. I hesitate to commit it to
>>> trunk without any review, because it changes the core code in mod_proxy and
>>> I'm afraid that there could exist more corner-cases I'm not aware of.
>>>
>>> On the other side (and to motivate someone with deeper knowledge of
>>> mod_proxy :) ), it would be great to have UDS support also for
>>> ProxyPassMatch and fix some old bugs related to setting options together
>>> with ProxyPassMatch (like PR 43513 and PR 54973).
>>>
>>> So, anyone who would have time to review this patch, please? :)
>>
>> I slightly modified your patch with the following changes :
>> 1. Use \-escaping instead of %-escaping for wildcard_match so to
>> consume less space and unescape quickly.
>
> Wouldn't this cause confusion/issue with normal escaping?
>
>> 2. Fix the recursion in ap_proxy_wildcard_match(), which was calling
>> the legacy ap_strcmp_match(), and unescape according to 1.
>
> +1
>
>> 2'. Shouldn't this function be made iterative (it is not final
>> recursive currently)?
>
> +1
>
>> 3. Always try to get the worker with its de_socketfy()ed name, and
>> only if not found use ap_proxy_define[_wildcard]_worker() with the
>> configured URL (add_pass() and proxysection() were not consistent
>> about ap_proxy_get_worker()'s given name/URL).
>
> Makes sense.
>
>> 4. Don't match both the "normal" name with strcnmp() and the wildcard
>> name with strcmp_match() in ap_proxy_get_worker(), since the worker is
>> either wildcard or not (this enforcement is also added to add_pass()
>> and proxysection() so that it is not possible to declare the same
>> worker name once with a Match, again without, or vice versa).
>> 4'. Does it makes sense to use the longest match_name or should we go
>> for the first matching wildcard worker (if any) wins?
>
> Longest match name, imo.
>
>> 5. Rename ap_proxy_wildcard_match() as ap_proxy_strcmp_ematch() to
>> suggest it could go to server/util.ch (if/when the associated
>> ap_matchexp_escape() and ap_proxy_strcasecmp_ematch() are coded).
>
> Why?
>
>> 6. Move proxy_worker_shared { char
>> wildcard_name[PROXY_WORKER_MAX_NAME_SIZE]; } to proxy_worker { char
>> *match_name; } to save shm space and avoid length issues (I don't see
>> how this could be updated at runtime, for now, balancer-manager can't
>> declare Match workers).
>
> Future effort, so -1 on the moving.
>
>> 6. Keep ap_proxy_escape_wildcard_url() static (private) since it is
>> not used outside proxy_util (now).
>
> Sounds good.
>>
>> This is not very tested (yet), I'll come back if I detect issues, but
>> first I'd like your feedbacks on this modifications...
>>
>> Patch attached.
>>
>
> Just a comment: I am really concerned that with such a major change,
> we will likely break things in ways we are unaware of... We should
> focus on fixing the actual problem w/o also combining it with
> other "fixes" that are tangentially involved. For example, maybe
> a simply flag when creating the worker that sez "worker name is a regex"
> is the way to approach it, and thus ap_proxy_get_worker() could
> add additional checks for finding the "right" worker when it
> knows it needs to allow for regex substitutions. That creates
> a additional logic path which is self-contained and isolated, instead
> of monkeying around with changing current paths...
>

That's what we do with current patch I think, don't we? In the patch, we 
create "char *match_name" which is NULL when the worker_name is not 
regex and contains the escaped name if regex is used (with "$N" replaced 
by '*').

ap_proxy_get_worker() later checks the match_name and if it's not NULL, 
it tries regex matching using ap_proxy_strcmp_ematch().

The actual problem is that ap_proxy_get_worker() cannot find out proper 
proxy_worker, because this method is not regex aware. To make it regex 
aware, we have to distinguish regex workers during creation, replace the 
regex variables ($N -> *) in some safe manner and allow 
ap_proxy_get_worker() to find the right worker using this new info. And 
that's what the patch does.

Regards,
Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Apr 24, 2014, at 8:57 PM, Yann Ylavic <yl...@gmail.com> wrote:

> Hi Jan,
> 
> sorry for the late.
> 
> On Tue, Apr 22, 2014 at 3:39 PM, Jan Kaluža <jk...@redhat.com> wrote:
>> Hi again,
>> 
>> the patch has been here for some time already. I hesitate to commit it to
>> trunk without any review, because it changes the core code in mod_proxy and
>> I'm afraid that there could exist more corner-cases I'm not aware of.
>> 
>> On the other side (and to motivate someone with deeper knowledge of
>> mod_proxy :) ), it would be great to have UDS support also for
>> ProxyPassMatch and fix some old bugs related to setting options together
>> with ProxyPassMatch (like PR 43513 and PR 54973).
>> 
>> So, anyone who would have time to review this patch, please? :)
> 
> I slightly modified your patch with the following changes :
> 1. Use \-escaping instead of %-escaping for wildcard_match so to
> consume less space and unescape quickly.

Wouldn't this cause confusion/issue with normal escaping?

> 2. Fix the recursion in ap_proxy_wildcard_match(), which was calling
> the legacy ap_strcmp_match(), and unescape according to 1.

+1

> 2'. Shouldn't this function be made iterative (it is not final
> recursive currently)?

+1

> 3. Always try to get the worker with its de_socketfy()ed name, and
> only if not found use ap_proxy_define[_wildcard]_worker() with the
> configured URL (add_pass() and proxysection() were not consistent
> about ap_proxy_get_worker()'s given name/URL).

Makes sense.

> 4. Don't match both the "normal" name with strcnmp() and the wildcard
> name with strcmp_match() in ap_proxy_get_worker(), since the worker is
> either wildcard or not (this enforcement is also added to add_pass()
> and proxysection() so that it is not possible to declare the same
> worker name once with a Match, again without, or vice versa).
> 4'. Does it makes sense to use the longest match_name or should we go
> for the first matching wildcard worker (if any) wins?

Longest match name, imo.

> 5. Rename ap_proxy_wildcard_match() as ap_proxy_strcmp_ematch() to
> suggest it could go to server/util.ch (if/when the associated
> ap_matchexp_escape() and ap_proxy_strcasecmp_ematch() are coded).

Why?

> 6. Move proxy_worker_shared { char
> wildcard_name[PROXY_WORKER_MAX_NAME_SIZE]; } to proxy_worker { char
> *match_name; } to save shm space and avoid length issues (I don't see
> how this could be updated at runtime, for now, balancer-manager can't
> declare Match workers).

Future effort, so -1 on the moving.

> 6. Keep ap_proxy_escape_wildcard_url() static (private) since it is
> not used outside proxy_util (now).

Sounds good.
> 
> This is not very tested (yet), I'll come back if I detect issues, but
> first I'd like your feedbacks on this modifications...
> 
> Patch attached.
> 

Just a comment: I am really concerned that with such a major change,
we will likely break things in ways we are unaware of... We should
focus on fixing the actual problem w/o also combining it with
other "fixes" that are tangentially involved. For example, maybe
a simply flag when creating the worker that sez "worker name is a regex"
is the way to approach it, and thus ap_proxy_get_worker() could
add additional checks for finding the "right" worker when it
knows it needs to allow for regex substitutions. That creates
a additional logic path which is self-contained and isolated, instead
of monkeying around with changing current paths...


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
Hi Jan,

sorry for the late.

On Tue, Apr 22, 2014 at 3:39 PM, Jan Kaluža <jk...@redhat.com> wrote:
> Hi again,
>
> the patch has been here for some time already. I hesitate to commit it to
> trunk without any review, because it changes the core code in mod_proxy and
> I'm afraid that there could exist more corner-cases I'm not aware of.
>
> On the other side (and to motivate someone with deeper knowledge of
> mod_proxy :) ), it would be great to have UDS support also for
> ProxyPassMatch and fix some old bugs related to setting options together
> with ProxyPassMatch (like PR 43513 and PR 54973).
>
> So, anyone who would have time to review this patch, please? :)

I slightly modified your patch with the following changes :
1. Use \-escaping instead of %-escaping for wildcard_match so to
consume less space and unescape quickly.
2. Fix the recursion in ap_proxy_wildcard_match(), which was calling
the legacy ap_strcmp_match(), and unescape according to 1.
2'. Shouldn't this function be made iterative (it is not final
recursive currently)?
3. Always try to get the worker with its de_socketfy()ed name, and
only if not found use ap_proxy_define[_wildcard]_worker() with the
configured URL (add_pass() and proxysection() were not consistent
about ap_proxy_get_worker()'s given name/URL).
4. Don't match both the "normal" name with strcnmp() and the wildcard
name with strcmp_match() in ap_proxy_get_worker(), since the worker is
either wildcard or not (this enforcement is also added to add_pass()
and proxysection() so that it is not possible to declare the same
worker name once with a Match, again without, or vice versa).
4'. Does it makes sense to use the longest match_name or should we go
for the first matching wildcard worker (if any) wins?
5. Rename ap_proxy_wildcard_match() as ap_proxy_strcmp_ematch() to
suggest it could go to server/util.ch (if/when the associated
ap_matchexp_escape() and ap_proxy_strcasecmp_ematch() are coded).
6. Move proxy_worker_shared { char
wildcard_name[PROXY_WORKER_MAX_NAME_SIZE]; } to proxy_worker { char
*match_name; } to save shm space and avoid length issues (I don't see
how this could be updated at runtime, for now, balancer-manager can't
declare Match workers).
6. Keep ap_proxy_escape_wildcard_url() static (private) since it is
not used outside proxy_util (now).

This is not very tested (yet), I'll come back if I detect issues, but
first I'd like your feedbacks on this modifications...

Patch attached.

Regards,
Yann.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
Hi again,

the patch has been here for some time already. I hesitate to commit it 
to trunk without any review, because it changes the core code in 
mod_proxy and I'm afraid that there could exist more corner-cases I'm 
not aware of.

On the other side (and to motivate someone with deeper knowledge of 
mod_proxy :) ), it would be great to have UDS support also for 
ProxyPassMatch and fix some old bugs related to setting options together 
with ProxyPassMatch (like PR 43513 and PR 54973).

So, anyone who would have time to review this patch, please? :)

Thanks,
Jan Kaluza

On 04/10/2014 10:25 AM, Jan Kaluža wrote:
> Hi,
>
> can you please check the attached patch? I think it should be closer to
> the proper way how to fix this than the previous one.
>
> It works properly for me even for URLs with '*' and '?'.
>
> Regards,
> Jan Kaluza
>
> On 03/21/2014 03:27 PM, Yann Ylavic wrote:
>> Yes, selecting the right worker is all in ap_proxy_get_worker(), but
>> probably also add_pass() and proxysection() would need something like
>> ap_proxy_define_wildcard_worker() to register this kind of worker
>> (save the original name, ...).
>>
>> On Fri, Mar 21, 2014 at 3:15 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>>> Just a thought, but wouldn't the better place to "fix" this
>>> be in ap_proxy_get_worker()??
>>>
>>> On Mar 21, 2014, at 9:13 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>>
>>>> oups, sorry for the numbering.
>>>>
>>>> On Fri, Mar 21, 2014 at 2:11 PM, Yann Ylavic <yl...@gmail.com>
>>>> wrote:
>>>>> On Wed, Mar 19, 2014 at 9:59 AM, Jan Kaluža <jk...@redhat.com>
>>>>> wrote:
>>>>>> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>>>>>>>
>>>>>>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic
>>>>>>> <yl...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Wouldn't it be possible to define wildcard workers when the URL is
>>>>>>>> known to be a regexp substitution?
>>>>>>>> For these workers' URLs, the dollars (plus the following digit)
>>>>>>>> could
>>>>>>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker()
>>>>>>>> could then
>>>>>>>> use ap_strcasecmp_match() against the requested URL.
>>>>>>>
>>>>>>>
>>>>>>> I meant ap_strcmp_match(), this has to be case sensitive...
>>>>>>>
>>>>>>
>>>>>> I've implemented your idea. Can you check the attached patch
>>>>>> please? It
>>>>>> fixes the original PR and also ProxyPassMatch with UDS for me.
>>>>>>
>>>>>> If it's OK, I will commit it.
>>>>>>
>>>>>
>>>>> I took a deeper look into this but I'm afraid it is a more complex
>>>>> story actually...
>>>>>
>>>>> 1. Workers can be defined in <Proxy[Match] ...> sections too.
>>>>> The same should probably be done in proxysection().
>>>>>
>>>>> 2. Both '*' and '?' are legitimate URL chars.
>>>>> We need a way to escape the original ones in the configured worker's
>>>>> URL, but ap_strcmp_match doesn't handle (un)escaping.
>>>>> So maybe new ap_matchexp_[un]escape() and ap_str[case]cmp_ematch()
>>>>> functions should be implemented.
>>>>>
>>>>> 5. When no $ is used in the worker's URL, an implicit $1 is appended
>>>>> by proxy_trans().
>>>>> Hence
>>>>>    ProxyPassMatch /some/path/with(/capture) http://some.host
>>>>> is equivalent to
>>>>>    ProxyPassMatch /some/path/with(/capture) http://some.host$1
>>>>> So an implicit '*' should be appended to the worker's wildcard URL in
>>>>> the first case (maybe only when the match has a capture but this is
>>>>> just an optimization, * would match empty).
>>>>>
>>>>> 3. I think we should mark the worker as wildcard when it is defined by
>>>>> ProxyPassMatch or a <ProxyMatch> section.
>>>>> So that ap_proxy_get_worker() won't use the costly ap_strcmp_match()
>>>>> for "normal" workers, and won't either trigger false positives due to
>>>>> '*' or '?' in the configured URL (which is not escaped).
>>>>>
>>>>> 4. What about the same URL used both with ProxyPass and
>>>>> ProxyPassMatch, same worker or different ones?
>>>>> IMHO, they should be different workers, and by rewritting the URL to a
>>>>> wildcard one as done in the patch they will be, but we lose the
>>>>> configured name, and eg. the balancer-manager's param "w" would now
>>>>> work only with the rewritten name (which is, at least, non intuitive).
>>>>> A new balancer-manager's param could be added to access wildcard
>>>>> workers (eg. "ww"), since they are different they may be accessed
>>>>> separately, hence I think we need to save the original URL for such
>>>>> case(s).
>>>>> So maybe for point 2.'s mark we could use the configured (original)
>>>>> URL as a new member of the proxy_worker struct (I don't see any reason
>>>>> for it to be shared in proxy_worker_shared but one may object...).
>>>>> This new field would be NULL for "normal" workers.
>>>>>
>>>>> 5. What about wildcard balancers (ProxyPassMatch /foo(/.*)
>>>>> balancer://mycluster/bar$1)?
>>>>> Since the balancers are registered/selected solely by their names
>>>>> (path and everything after is ignored), and then their workers are
>>>>> selected according to the balancer's method (the path still does not
>>>>> matter), I think there is no matching issue here (the requested path,
>>>>> eventually rewritten by proxy_trans(), will finally be appended to the
>>>>> worker's URL as is).
>>>>> Hence the same balancer's URL used both in ProxyPass and
>>>>> ProxyPassMatch can refer to the same balancer (IMHO).
>>>>> Nothing to be done therefore, but I may be missing something...
>>>>>
>>>>> 6. ProxyPassReverseMatch would be welcome too, but that's probably
>>>>> another story...
>>>>>
>>>>> Regards,
>>>>> Yann.
>>>>
>>>
>


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
Hi,

can you please check the attached patch? I think it should be closer to 
the proper way how to fix this than the previous one.

It works properly for me even for URLs with '*' and '?'.

Regards,
Jan Kaluza

On 03/21/2014 03:27 PM, Yann Ylavic wrote:
> Yes, selecting the right worker is all in ap_proxy_get_worker(), but
> probably also add_pass() and proxysection() would need something like
> ap_proxy_define_wildcard_worker() to register this kind of worker
> (save the original name, ...).
>
> On Fri, Mar 21, 2014 at 3:15 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>> Just a thought, but wouldn't the better place to "fix" this
>> be in ap_proxy_get_worker()??
>>
>> On Mar 21, 2014, at 9:13 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>
>>> oups, sorry for the numbering.
>>>
>>> On Fri, Mar 21, 2014 at 2:11 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>>> On Wed, Mar 19, 2014 at 9:59 AM, Jan Kaluža <jk...@redhat.com> wrote:
>>>>> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>>>>>>
>>>>>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>>>>
>>>>>>> Wouldn't it be possible to define wildcard workers when the URL is
>>>>>>> known to be a regexp substitution?
>>>>>>> For these workers' URLs, the dollars (plus the following digit) could
>>>>>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>>>>>>> use ap_strcasecmp_match() against the requested URL.
>>>>>>
>>>>>>
>>>>>> I meant ap_strcmp_match(), this has to be case sensitive...
>>>>>>
>>>>>
>>>>> I've implemented your idea. Can you check the attached patch please? It
>>>>> fixes the original PR and also ProxyPassMatch with UDS for me.
>>>>>
>>>>> If it's OK, I will commit it.
>>>>>
>>>>
>>>> I took a deeper look into this but I'm afraid it is a more complex
>>>> story actually...
>>>>
>>>> 1. Workers can be defined in <Proxy[Match] ...> sections too.
>>>> The same should probably be done in proxysection().
>>>>
>>>> 2. Both '*' and '?' are legitimate URL chars.
>>>> We need a way to escape the original ones in the configured worker's
>>>> URL, but ap_strcmp_match doesn't handle (un)escaping.
>>>> So maybe new ap_matchexp_[un]escape() and ap_str[case]cmp_ematch()
>>>> functions should be implemented.
>>>>
>>>> 5. When no $ is used in the worker's URL, an implicit $1 is appended
>>>> by proxy_trans().
>>>> Hence
>>>>    ProxyPassMatch /some/path/with(/capture) http://some.host
>>>> is equivalent to
>>>>    ProxyPassMatch /some/path/with(/capture) http://some.host$1
>>>> So an implicit '*' should be appended to the worker's wildcard URL in
>>>> the first case (maybe only when the match has a capture but this is
>>>> just an optimization, * would match empty).
>>>>
>>>> 3. I think we should mark the worker as wildcard when it is defined by
>>>> ProxyPassMatch or a <ProxyMatch> section.
>>>> So that ap_proxy_get_worker() won't use the costly ap_strcmp_match()
>>>> for "normal" workers, and won't either trigger false positives due to
>>>> '*' or '?' in the configured URL (which is not escaped).
>>>>
>>>> 4. What about the same URL used both with ProxyPass and
>>>> ProxyPassMatch, same worker or different ones?
>>>> IMHO, they should be different workers, and by rewritting the URL to a
>>>> wildcard one as done in the patch they will be, but we lose the
>>>> configured name, and eg. the balancer-manager's param "w" would now
>>>> work only with the rewritten name (which is, at least, non intuitive).
>>>> A new balancer-manager's param could be added to access wildcard
>>>> workers (eg. "ww"), since they are different they may be accessed
>>>> separately, hence I think we need to save the original URL for such
>>>> case(s).
>>>> So maybe for point 2.'s mark we could use the configured (original)
>>>> URL as a new member of the proxy_worker struct (I don't see any reason
>>>> for it to be shared in proxy_worker_shared but one may object...).
>>>> This new field would be NULL for "normal" workers.
>>>>
>>>> 5. What about wildcard balancers (ProxyPassMatch /foo(/.*)
>>>> balancer://mycluster/bar$1)?
>>>> Since the balancers are registered/selected solely by their names
>>>> (path and everything after is ignored), and then their workers are
>>>> selected according to the balancer's method (the path still does not
>>>> matter), I think there is no matching issue here (the requested path,
>>>> eventually rewritten by proxy_trans(), will finally be appended to the
>>>> worker's URL as is).
>>>> Hence the same balancer's URL used both in ProxyPass and
>>>> ProxyPassMatch can refer to the same balancer (IMHO).
>>>> Nothing to be done therefore, but I may be missing something...
>>>>
>>>> 6. ProxyPassReverseMatch would be welcome too, but that's probably
>>>> another story...
>>>>
>>>> Regards,
>>>> Yann.
>>>
>>


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
Yes, selecting the right worker is all in ap_proxy_get_worker(), but
probably also add_pass() and proxysection() would need something like
ap_proxy_define_wildcard_worker() to register this kind of worker
(save the original name, ...).

On Fri, Mar 21, 2014 at 3:15 PM, Jim Jagielski <ji...@jagunet.com> wrote:
> Just a thought, but wouldn't the better place to "fix" this
> be in ap_proxy_get_worker()??
>
> On Mar 21, 2014, at 9:13 AM, Yann Ylavic <yl...@gmail.com> wrote:
>
>> oups, sorry for the numbering.
>>
>> On Fri, Mar 21, 2014 at 2:11 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>> On Wed, Mar 19, 2014 at 9:59 AM, Jan Kaluža <jk...@redhat.com> wrote:
>>>> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>>>>>
>>>>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>>>
>>>>>> Wouldn't it be possible to define wildcard workers when the URL is
>>>>>> known to be a regexp substitution?
>>>>>> For these workers' URLs, the dollars (plus the following digit) could
>>>>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>>>>>> use ap_strcasecmp_match() against the requested URL.
>>>>>
>>>>>
>>>>> I meant ap_strcmp_match(), this has to be case sensitive...
>>>>>
>>>>
>>>> I've implemented your idea. Can you check the attached patch please? It
>>>> fixes the original PR and also ProxyPassMatch with UDS for me.
>>>>
>>>> If it's OK, I will commit it.
>>>>
>>>
>>> I took a deeper look into this but I'm afraid it is a more complex
>>> story actually...
>>>
>>> 1. Workers can be defined in <Proxy[Match] ...> sections too.
>>> The same should probably be done in proxysection().
>>>
>>> 2. Both '*' and '?' are legitimate URL chars.
>>> We need a way to escape the original ones in the configured worker's
>>> URL, but ap_strcmp_match doesn't handle (un)escaping.
>>> So maybe new ap_matchexp_[un]escape() and ap_str[case]cmp_ematch()
>>> functions should be implemented.
>>>
>>> 5. When no $ is used in the worker's URL, an implicit $1 is appended
>>> by proxy_trans().
>>> Hence
>>>   ProxyPassMatch /some/path/with(/capture) http://some.host
>>> is equivalent to
>>>   ProxyPassMatch /some/path/with(/capture) http://some.host$1
>>> So an implicit '*' should be appended to the worker's wildcard URL in
>>> the first case (maybe only when the match has a capture but this is
>>> just an optimization, * would match empty).
>>>
>>> 3. I think we should mark the worker as wildcard when it is defined by
>>> ProxyPassMatch or a <ProxyMatch> section.
>>> So that ap_proxy_get_worker() won't use the costly ap_strcmp_match()
>>> for "normal" workers, and won't either trigger false positives due to
>>> '*' or '?' in the configured URL (which is not escaped).
>>>
>>> 4. What about the same URL used both with ProxyPass and
>>> ProxyPassMatch, same worker or different ones?
>>> IMHO, they should be different workers, and by rewritting the URL to a
>>> wildcard one as done in the patch they will be, but we lose the
>>> configured name, and eg. the balancer-manager's param "w" would now
>>> work only with the rewritten name (which is, at least, non intuitive).
>>> A new balancer-manager's param could be added to access wildcard
>>> workers (eg. "ww"), since they are different they may be accessed
>>> separately, hence I think we need to save the original URL for such
>>> case(s).
>>> So maybe for point 2.'s mark we could use the configured (original)
>>> URL as a new member of the proxy_worker struct (I don't see any reason
>>> for it to be shared in proxy_worker_shared but one may object...).
>>> This new field would be NULL for "normal" workers.
>>>
>>> 5. What about wildcard balancers (ProxyPassMatch /foo(/.*)
>>> balancer://mycluster/bar$1)?
>>> Since the balancers are registered/selected solely by their names
>>> (path and everything after is ignored), and then their workers are
>>> selected according to the balancer's method (the path still does not
>>> matter), I think there is no matching issue here (the requested path,
>>> eventually rewritten by proxy_trans(), will finally be appended to the
>>> worker's URL as is).
>>> Hence the same balancer's URL used both in ProxyPass and
>>> ProxyPassMatch can refer to the same balancer (IMHO).
>>> Nothing to be done therefore, but I may be missing something...
>>>
>>> 6. ProxyPassReverseMatch would be welcome too, but that's probably
>>> another story...
>>>
>>> Regards,
>>> Yann.
>>
>

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jim Jagielski <ji...@jaguNET.com>.
Just a thought, but wouldn't the better place to "fix" this
be in ap_proxy_get_worker()??

On Mar 21, 2014, at 9:13 AM, Yann Ylavic <yl...@gmail.com> wrote:

> oups, sorry for the numbering.
> 
> On Fri, Mar 21, 2014 at 2:11 PM, Yann Ylavic <yl...@gmail.com> wrote:
>> On Wed, Mar 19, 2014 at 9:59 AM, Jan Kaluža <jk...@redhat.com> wrote:
>>> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>>>> 
>>>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>> 
>>>>> Wouldn't it be possible to define wildcard workers when the URL is
>>>>> known to be a regexp substitution?
>>>>> For these workers' URLs, the dollars (plus the following digit) could
>>>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>>>>> use ap_strcasecmp_match() against the requested URL.
>>>> 
>>>> 
>>>> I meant ap_strcmp_match(), this has to be case sensitive...
>>>> 
>>> 
>>> I've implemented your idea. Can you check the attached patch please? It
>>> fixes the original PR and also ProxyPassMatch with UDS for me.
>>> 
>>> If it's OK, I will commit it.
>>> 
>> 
>> I took a deeper look into this but I'm afraid it is a more complex
>> story actually...
>> 
>> 1. Workers can be defined in <Proxy[Match] ...> sections too.
>> The same should probably be done in proxysection().
>> 
>> 2. Both '*' and '?' are legitimate URL chars.
>> We need a way to escape the original ones in the configured worker's
>> URL, but ap_strcmp_match doesn't handle (un)escaping.
>> So maybe new ap_matchexp_[un]escape() and ap_str[case]cmp_ematch()
>> functions should be implemented.
>> 
>> 5. When no $ is used in the worker's URL, an implicit $1 is appended
>> by proxy_trans().
>> Hence
>>   ProxyPassMatch /some/path/with(/capture) http://some.host
>> is equivalent to
>>   ProxyPassMatch /some/path/with(/capture) http://some.host$1
>> So an implicit '*' should be appended to the worker's wildcard URL in
>> the first case (maybe only when the match has a capture but this is
>> just an optimization, * would match empty).
>> 
>> 3. I think we should mark the worker as wildcard when it is defined by
>> ProxyPassMatch or a <ProxyMatch> section.
>> So that ap_proxy_get_worker() won't use the costly ap_strcmp_match()
>> for "normal" workers, and won't either trigger false positives due to
>> '*' or '?' in the configured URL (which is not escaped).
>> 
>> 4. What about the same URL used both with ProxyPass and
>> ProxyPassMatch, same worker or different ones?
>> IMHO, they should be different workers, and by rewritting the URL to a
>> wildcard one as done in the patch they will be, but we lose the
>> configured name, and eg. the balancer-manager's param "w" would now
>> work only with the rewritten name (which is, at least, non intuitive).
>> A new balancer-manager's param could be added to access wildcard
>> workers (eg. "ww"), since they are different they may be accessed
>> separately, hence I think we need to save the original URL for such
>> case(s).
>> So maybe for point 2.'s mark we could use the configured (original)
>> URL as a new member of the proxy_worker struct (I don't see any reason
>> for it to be shared in proxy_worker_shared but one may object...).
>> This new field would be NULL for "normal" workers.
>> 
>> 5. What about wildcard balancers (ProxyPassMatch /foo(/.*)
>> balancer://mycluster/bar$1)?
>> Since the balancers are registered/selected solely by their names
>> (path and everything after is ignored), and then their workers are
>> selected according to the balancer's method (the path still does not
>> matter), I think there is no matching issue here (the requested path,
>> eventually rewritten by proxy_trans(), will finally be appended to the
>> worker's URL as is).
>> Hence the same balancer's URL used both in ProxyPass and
>> ProxyPassMatch can refer to the same balancer (IMHO).
>> Nothing to be done therefore, but I may be missing something...
>> 
>> 6. ProxyPassReverseMatch would be welcome too, but that's probably
>> another story...
>> 
>> Regards,
>> Yann.
> 


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
oups, sorry for the numbering.

On Fri, Mar 21, 2014 at 2:11 PM, Yann Ylavic <yl...@gmail.com> wrote:
> On Wed, Mar 19, 2014 at 9:59 AM, Jan Kaluža <jk...@redhat.com> wrote:
>> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>>>
>>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>
>>>> Wouldn't it be possible to define wildcard workers when the URL is
>>>> known to be a regexp substitution?
>>>> For these workers' URLs, the dollars (plus the following digit) could
>>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>>>> use ap_strcasecmp_match() against the requested URL.
>>>
>>>
>>> I meant ap_strcmp_match(), this has to be case sensitive...
>>>
>>
>> I've implemented your idea. Can you check the attached patch please? It
>> fixes the original PR and also ProxyPassMatch with UDS for me.
>>
>> If it's OK, I will commit it.
>>
>
> I took a deeper look into this but I'm afraid it is a more complex
> story actually...
>
> 1. Workers can be defined in <Proxy[Match] ...> sections too.
> The same should probably be done in proxysection().
>
> 2. Both '*' and '?' are legitimate URL chars.
> We need a way to escape the original ones in the configured worker's
> URL, but ap_strcmp_match doesn't handle (un)escaping.
> So maybe new ap_matchexp_[un]escape() and ap_str[case]cmp_ematch()
> functions should be implemented.
>
> 5. When no $ is used in the worker's URL, an implicit $1 is appended
> by proxy_trans().
> Hence
>    ProxyPassMatch /some/path/with(/capture) http://some.host
> is equivalent to
>    ProxyPassMatch /some/path/with(/capture) http://some.host$1
> So an implicit '*' should be appended to the worker's wildcard URL in
> the first case (maybe only when the match has a capture but this is
> just an optimization, * would match empty).
>
> 3. I think we should mark the worker as wildcard when it is defined by
> ProxyPassMatch or a <ProxyMatch> section.
> So that ap_proxy_get_worker() won't use the costly ap_strcmp_match()
> for "normal" workers, and won't either trigger false positives due to
> '*' or '?' in the configured URL (which is not escaped).
>
> 4. What about the same URL used both with ProxyPass and
> ProxyPassMatch, same worker or different ones?
> IMHO, they should be different workers, and by rewritting the URL to a
> wildcard one as done in the patch they will be, but we lose the
> configured name, and eg. the balancer-manager's param "w" would now
> work only with the rewritten name (which is, at least, non intuitive).
> A new balancer-manager's param could be added to access wildcard
> workers (eg. "ww"), since they are different they may be accessed
> separately, hence I think we need to save the original URL for such
> case(s).
> So maybe for point 2.'s mark we could use the configured (original)
> URL as a new member of the proxy_worker struct (I don't see any reason
> for it to be shared in proxy_worker_shared but one may object...).
> This new field would be NULL for "normal" workers.
>
> 5. What about wildcard balancers (ProxyPassMatch /foo(/.*)
> balancer://mycluster/bar$1)?
> Since the balancers are registered/selected solely by their names
> (path and everything after is ignored), and then their workers are
> selected according to the balancer's method (the path still does not
> matter), I think there is no matching issue here (the requested path,
> eventually rewritten by proxy_trans(), will finally be appended to the
> worker's URL as is).
> Hence the same balancer's URL used both in ProxyPass and
> ProxyPassMatch can refer to the same balancer (IMHO).
> Nothing to be done therefore, but I may be missing something...
>
> 6. ProxyPassReverseMatch would be welcome too, but that's probably
> another story...
>
> Regards,
> Yann.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Mar 19, 2014 at 9:59 AM, Jan Kaluža <jk...@redhat.com> wrote:
> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>>
>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>>
>>> Wouldn't it be possible to define wildcard workers when the URL is
>>> known to be a regexp substitution?
>>> For these workers' URLs, the dollars (plus the following digit) could
>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>>> use ap_strcasecmp_match() against the requested URL.
>>
>>
>> I meant ap_strcmp_match(), this has to be case sensitive...
>>
>
> I've implemented your idea. Can you check the attached patch please? It
> fixes the original PR and also ProxyPassMatch with UDS for me.
>
> If it's OK, I will commit it.
>

I took a deeper look into this but I'm afraid it is a more complex
story actually...

1. Workers can be defined in <Proxy[Match] ...> sections too.
The same should probably be done in proxysection().

2. Both '*' and '?' are legitimate URL chars.
We need a way to escape the original ones in the configured worker's
URL, but ap_strcmp_match doesn't handle (un)escaping.
So maybe new ap_matchexp_[un]escape() and ap_str[case]cmp_ematch()
functions should be implemented.

5. When no $ is used in the worker's URL, an implicit $1 is appended
by proxy_trans().
Hence
   ProxyPassMatch /some/path/with(/capture) http://some.host
is equivalent to
   ProxyPassMatch /some/path/with(/capture) http://some.host$1
So an implicit '*' should be appended to the worker's wildcard URL in
the first case (maybe only when the match has a capture but this is
just an optimization, * would match empty).

3. I think we should mark the worker as wildcard when it is defined by
ProxyPassMatch or a <ProxyMatch> section.
So that ap_proxy_get_worker() won't use the costly ap_strcmp_match()
for "normal" workers, and won't either trigger false positives due to
'*' or '?' in the configured URL (which is not escaped).

4. What about the same URL used both with ProxyPass and
ProxyPassMatch, same worker or different ones?
IMHO, they should be different workers, and by rewritting the URL to a
wildcard one as done in the patch they will be, but we lose the
configured name, and eg. the balancer-manager's param "w" would now
work only with the rewritten name (which is, at least, non intuitive).
A new balancer-manager's param could be added to access wildcard
workers (eg. "ww"), since they are different they may be accessed
separately, hence I think we need to save the original URL for such
case(s).
So maybe for point 2.'s mark we could use the configured (original)
URL as a new member of the proxy_worker struct (I don't see any reason
for it to be shared in proxy_worker_shared but one may object...).
This new field would be NULL for "normal" workers.

5. What about wildcard balancers (ProxyPassMatch /foo(/.*)
balancer://mycluster/bar$1)?
Since the balancers are registered/selected solely by their names
(path and everything after is ignored), and then their workers are
selected according to the balancer's method (the path still does not
matter), I think there is no matching issue here (the requested path,
eventually rewritten by proxy_trans(), will finally be appended to the
worker's URL as is).
Hence the same balancer's URL used both in ProxyPass and
ProxyPassMatch can refer to the same balancer (IMHO).
Nothing to be done therefore, but I may be missing something...

6. ProxyPassReverseMatch would be welcome too, but that's probably
another story...

Regards,
Yann.

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 03/19/2014 10:13 AM, Jan Kaluža wrote:
> On 03/19/2014 09:59 AM, Jan Kaluža wrote:
>> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com>
>>> wrote:
>>>> Wouldn't it be possible to define wildcard workers when the URL is
>>>> known to be a regexp substitution?
>>>> For these workers' URLs, the dollars (plus the following digit) could
>>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>>>> use ap_strcasecmp_match() against the requested URL.
>>>
>>> I meant ap_strcmp_match(), this has to be case sensitive...
>>>
>>
>> I've implemented your idea. Can you check the attached patch please? It
>> fixes the original PR and also ProxyPassMatch with UDS for me.
>
> Ignore this patch. It breaks ProxyPass. Attached is better version.

Not my day today. s/apr_isxdigit/apr_isdigit.

>
>> If it's OK, I will commit it.
>>
>> Regards,
>> Jan Kaluza
>>
>
> Regards,
> Jan Kaluza
>


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 03/19/2014 09:59 AM, Jan Kaluža wrote:
> On 03/18/2014 02:46 PM, Yann Ylavic wrote:
>> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com>
>> wrote:
>>> Wouldn't it be possible to define wildcard workers when the URL is
>>> known to be a regexp substitution?
>>> For these workers' URLs, the dollars (plus the following digit) could
>>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>>> use ap_strcasecmp_match() against the requested URL.
>>
>> I meant ap_strcmp_match(), this has to be case sensitive...
>>
>
> I've implemented your idea. Can you check the attached patch please? It
> fixes the original PR and also ProxyPassMatch with UDS for me.

Ignore this patch. It breaks ProxyPass. Attached is better version.

> If it's OK, I will commit it.
>
> Regards,
> Jan Kaluza
>

Regards,
Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Jan Kaluža <jk...@redhat.com>.
On 03/18/2014 02:46 PM, Yann Ylavic wrote:
> On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
>> Wouldn't it be possible to define wildcard workers when the URL is
>> known to be a regexp substitution?
>> For these workers' URLs, the dollars (plus the following digit) could
>> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
>> use ap_strcasecmp_match() against the requested URL.
>
> I meant ap_strcmp_match(), this has to be case sensitive...
>

I've implemented your idea. Can you check the attached patch please? It 
fixes the original PR and also ProxyPassMatch with UDS for me.

If it's OK, I will commit it.

Regards,
Jan Kaluza


Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
On Tue, Mar 18, 2014 at 2:38 PM, Yann Ylavic <yl...@gmail.com> wrote:
> Wouldn't it be possible to define wildcard workers when the URL is
> known to be a regexp substitution?
> For these workers' URLs, the dollars (plus the following digit) could
> be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
> use ap_strcasecmp_match() against the requested URL.

I meant ap_strcmp_match(), this has to be case sensitive...

Re: [PATCH] Fix settings options with ProxyPassMatch

Posted by Yann Ylavic <yl...@gmail.com>.
Wouldn't it be possible to define wildcard workers when the URL is
known to be a regexp substitution?
For these workers' URLs, the dollars (plus the following digit) could
be replaced by a wildcard (ie. *) and ap_proxy_get_worker() could then
use ap_strcasecmp_match() against the requested URL.

Regards,
Yann.


On Tue, Mar 18, 2014 at 12:26 PM, Jan Kaluza <jk...@redhat.com> wrote:
> This is also needed to fix ProxyPassMatch with UDS. Without this patch
> following configuration does not work:
>
> ProxyPassMatch ^/web/(.*\.php)$
> unix:/run/php-fpm/fpm.sock|fcgi://127.0.0.1/var/www/html/$1
>
> If nobody is against, I will commit it to trunk later this week.
>
> Regards,
> Jan Kaluza
>
>
> On 09/16/2013 02:44 PM, Jan Kaluža wrote:
>>
>> Hi,
>>
>> I have found out that if you use ProxyPassMatch with regexp and "$1"
>> (see PR 43513 and PR 54973), the right proxy_worker is not used and
>> "reverse" proxy_worker is used instead. The result is situation where no
>> options like "timeout" can be set in this case.
>>
>> The real problem here is that proxy_worker is chosen according to its
>> name which contains "$1" sequence in our case.
>>
>> I have found out that there is already patch fixing ProxyPassMatch
>> behaviour with "$1" in PR 43513. I have rebased the patch for trunk and
>> changed its style a bit.
>>
>> The patch strips everything after "$", so it for example changes the
>> name of proxy_worker from "http://127.0.0.1/cgi-bin/$1" to
>> "http://127.0.0.1/cgi-bin/". Later when request arrives, proper
>> proxy_worker is chosen. Without this change, proxy_worker's name with
>> "$1" would be compared against real request URL, but real request URL
>> does not contain this "$1" sequence and therefore this correct
>> proxy_worker wouldn't be chosen.
>>
>> Do you see any problems with this patch?
>>
>> Regards,
>> Jan Kaluza
>
>