Posted to dev@httpd.apache.org by Mike Cardwell <ap...@lists.grepular.com> on 2007/03/07 11:28:22 UTC

mod_proxy doesn't proxy with %2F

Hi,

I'm using the standard Red Hat Enterprise Linux 4 Apache 2.0.52 RPMs here.
I have a CommuniGate Pro server. It runs its own HTTP daemon for the
administration interface and webmail. We needed to extend it in several
ways, so I put Apache with mod_proxy in front of it. Here's the config I
used, which works fine:

ProxyPass        / https://127.0.0.1:9100/
ProxyPassReverse / https://127.0.0.1:9100/

However, when using webmail, if you view an attachment, for example
"filename.txt" from message "message_id" in folder "INBOX", the
URL looks like:

https://the.domain/session/session_id/MessagePart/INBOX/message_id/filename.txt

That works absolutely fine. The problem is when the file is in a
subfolder, e.g. "INBOX/Archive". Then the URL becomes:

https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt

With the URL above, mod_proxy simply does *nothing*. I get an Apache 404
error because the path doesn't exist locally and mod_proxy hasn't
attempted to proxy the request. I've used tcpdump to verify that no
connection is made to port 9100 when I make the request.

Is this a bug, or am I doing something dumb?

Thanks,
Mike

Re: [users@httpd] Re: mod_proxy doesn't proxy with %2F

Posted by Mike Cardwell <ap...@lists.grepular.com>.
* on the Wed, Mar 07, 2007 at 01:50:18PM -0500, Jack Saunders wrote:

>>> This was on the dev list. I've brought it onto the users list as I no
>>> longer think it's a bug as such. Please see my original email above, and
>>> my update below for the issue.
>>>
>>> Right. I've made a *little* progress. Reading the core docs I found:
>>>
>>> "The AllowEncodedSlashes directive allows URLs which contain encoded path
>>> separators (%2F for /  and additionally %5C for \ on according systems)
>>> to be used. Normally such URLs are refused with a 404 (Not found)
>>> error."
>>>
>>> So these requests are being 404'd simply for containing %2F's in the
>>> path. When I turn on encoded slashes with:
>>>
>>> AllowEncodedSlashes On
>>>
>>> It starts to proxy. But it starts to proxy the wrong thing
>>>
>>> Requests for:
>>> 
>>> https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt
>>>
>>> Get proxied to:
>>>
>>> 
>>> https://127.0.0.1:9100/session/session_id/MessagePart/INBOX/Archive/message_id/filename.txt
>>>
>>> I need it to be proxied to:
>>>
>>> 
>>> https://127.0.0.1:9100/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt
>>>
>>> Where do I go from here?
>> I've sorted this now. Using a combination of AllowEncodedSlashes,
>> mod_rewrite and an external script called as a RewriteMap to change /'s
>> to %2F's. A horrible, unholy hack, but it works.
> I was running into the exact same issue with proxying to a lotus
> Workplace (Quickplace) application.  Can you give me more indepth
> information on how you resolved this.

Hi Jack. It's a horrible hack, but it works for me. You'll obviously need
to make some changes to get it to work for you, but hopefully this will
point you in the right direction...

The original request:

https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt

What I wanted that to proxy to:

https://127.0.0.1:9100/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt

Now, you'd think that would be fairly easy wouldn't you? ;) Apache returned
404 before it even got to mod_proxy because the URL contained %2F. So I
turned on the AllowEncodedSlashes directive. See
http://httpd.apache.org/docs/2.0/mod/core.html#allowencodedslashes for a
description of the default %2F behaviour, and how AllowEncodedSlashes
changes that.
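
As a rough model of that default (an illustrative sketch only, not
Apache's actual code; the function name default_status is made up for
this example, and the %5C case for backslash systems is left out), the
check amounts to refusing any path that contains an encoded slash
before any other module gets a look at the request:

```python
from typing import Optional

def default_status(raw_path: str, allow_encoded_slashes: bool = False) -> Optional[int]:
    """Return 404 if the request would be refused up front, else None.

    Hypothetical helper: it only models the documented behaviour of
    AllowEncodedSlashes Off/On for "%2F" (matched case-insensitively).
    """
    if not allow_encoded_slashes and "%2f" in raw_path.lower():
        return 404
    return None  # request carries on to the other modules, e.g. mod_proxy

subfolder = "/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt"
print(default_status(subfolder))                              # refused
print(default_status(subfolder, allow_encoded_slashes=True))  # passed on
```

This matches what tcpdump showed: with the directive off, the request is
answered with a 404 before anything is sent to port 9100.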

Enabling that option meant that the request now reached mod_proxy.
However, by then the %2F had already been decoded, so mod_proxy was seeing
"/session/session_id/MessagePart/INBOX/Archive/message_id/filename.txt"
rather than
"/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt".

I needed to preserve the encoded forward slashes. So instead of letting
ProxyPass handle these particular requests directly, I set up a
mod_rewrite rule. I needed to split the path into three parts, encode the
slashes in the middle part, then join the parts back together before
proxying with mod_rewrite's [P] flag:

Part1: /session/session_id/MessagePart/
Part2: the folder path whose slashes need encoding, e.g. INBOX/Archive
Part3: /message_id/filename.txt

I wrote an external script to be used with RewriteMap to encode the
middle part:

=======================================================================
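#!/usr/bin/perl
# RewriteMap helper: reads one lookup key per line on STDIN and prints
# it back with every "/" replaced by "%2F".
use strict;
use warnings;
$| = 1;    # unbuffered output, so httpd gets each answer immediately
while (<STDIN>) {
   s{/}{%2F}g;
   print;
}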
=======================================================================

I then defined the RewriteMap for the script:

     RewriteMap encode_slashes prg:/etc/httpd/conf.d/rewrite_slashes.pl

I then added the following rewrite rules:

     RewriteCond %{REQUEST_URI} ^/session/[^/]+/MessagePart/
     RewriteRule ^proxy:(https://127.0.0.1:9100/session/[^/]+/MessagePart/)(.+)(/[^/]+/[^/]+)$ $1${encode_slashes:$2}$3 [P]
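
The rule can be traced by hand. The following is only a sketch in Python,
assuming (from the "^proxy:" prefix in the pattern) that the string the
rule sees is the already-decoded "proxy:" target. The names RULE,
encode_slashes and rewrite are made up for this illustration, and
encode_slashes is a stand-in for the external Perl map program:

```python
import re
from typing import Optional

# Same pattern as the RewriteRule, with the dots in the IP address escaped.
RULE = re.compile(
    r"^proxy:(https://127\.0\.0\.1:9100/session/[^/]+/MessagePart/)"
    r"(.+)(/[^/]+/[^/]+)$"
)

def encode_slashes(key: str) -> str:
    """Stand-in for the RewriteMap program: '/' becomes '%2F'."""
    return key.replace("/", "%2F")

def rewrite(target: str) -> Optional[str]:
    """Return $1${encode_slashes:$2}$3, or None if the rule does not match."""
    m = RULE.match(target)
    if m is None:
        return None
    prefix, folder, tail = m.groups()
    return prefix + encode_slashes(folder) + tail

print(rewrite(
    "proxy:https://127.0.0.1:9100/session/session_id/MessagePart/"
    "INBOX/Archive/message_id/filename.txt"
))
```

Because (.+) is greedy and the last group must match exactly two path
segments, everything between "MessagePart/" and "/message_id/filename.txt"
ends up in the middle group, however deeply the folder is nested. A plain
"INBOX" contains no slash and passes through unchanged.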

Hope this helps,

Mike

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] Re: mod_proxy doesn't proxy with %2F

Posted by Jack Saunders <ja...@gmail.com>.
Mike,

I was running into the exact same issue when proxying to a Lotus
Workplace (QuickPlace) application. Can you give me more in-depth
information on how you resolved this?

Thanks a bunch!
Jack




Re: [users@httpd] Re: mod_proxy doesn't proxy with %2F

Posted by Mike Cardwell <ap...@lists.grepular.com>.
* on the Wed, Mar 07, 2007 at 02:48:16PM +0000, Mike Cardwell wrote:

> This was on the dev list. I've brought it onto the users list as I no
> longer think it's a bug as such. Please see my original email above, and
> my update below for the issue.
> 
> Right. I've made a *little* progress. Reading the core docs I found:
> 
> "The AllowEncodedSlashes directive allows URLs which contain encoded path
> separators (%2F for /  and additionally %5C for \ on according systems)
> to be used. Normally such URLs are refused with a 404 (Not found)
> error."
> 
> So these requests are being 404'd simply for containing %2F's in the
> path. When I turn on encoded slashes with:
> 
> AllowEncodedSlashes On
> 
> It starts to proxy. But it starts to proxy the wrong thing
> 
> Requests for:
> https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt
> 
> Get proxied to:
> 
> https://127.0.0.1:9100/session/session_id/MessagePart/INBOX/Archive/message_id/filename.txt
> 
> I need it to be proxied to:
> 
> https://127.0.0.1:9100/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt
> 
> Where do I go from here?

I've sorted this now. Using a combination of AllowEncodedSlashes,
mod_rewrite and an external script called as a RewriteMap to change /'s
to %2F's. A horrible, unholy hack, but it works.

Mike



[users@httpd] Re: mod_proxy doesn't proxy with %2F

Posted by Mike Cardwell <ap...@lists.grepular.com>.
* on the Wed, Mar 07, 2007 at 10:28:22AM +0000, Mike Cardwell wrote:

> Using the standard Redhat Enterprise 4, Apache 2.0.52 RPMs here. I have
> a CommunigatePro server. It runs its own http daemon for the
> administration interface, and webmail. We needed to extend it in several
> ways, so I stuck an Apache mod_proxy in front of it. Here's the config I
> used which works fine:
> 
> ProxyPass        / https://127.0.0.1:9100/
> ProxyPassReverse / https://127.0.0.1:9100/
> 
> However. When using webmail, if you go to view an attachment like for
> example "filename.txt" from message "message_id" in folder "INBOX" the
> url would look like:
> 
> https://the.domain/session/session_id/MessagePart/INBOX/message_id/filename.txt
> 
> That works absolutely fine. The problem is when the file is in a
> subfolder, eg "INBOX/Archive". Then the url becomes:
> 
> https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt
> 
> With the url above, mod_proxy simply does *nothing*, I get an Apache 404
> error message because the path doesn't exist locally because mod_proxy
> hasn't attempted to do the proxying. I've used tcpdump to verify that
> there is no connection to port 9100 when I make the request.
> 
> Is this a bug, or am I doing something dumb?

This was on the dev list. I've brought it onto the users list as I no
longer think it's a bug as such. Please see my original email above, and
my update below for the issue.

Right. I've made a *little* progress. Reading the core docs I found:

"The AllowEncodedSlashes directive allows URLs which contain encoded path
separators (%2F for /  and additionally %5C for \ on according systems)
to be used. Normally such URLs are refused with a 404 (Not found)
error."

So these requests are being 404'd simply for containing %2F's in the
path. When I turn on encoded slashes with:

AllowEncodedSlashes On

It starts to proxy. But it starts to proxy the wrong thing

Requests for:
https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt

Get proxied to:

https://127.0.0.1:9100/session/session_id/MessagePart/INBOX/Archive/message_id/filename.txt

I need it to be proxied to:

https://127.0.0.1:9100/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt

Where do I go from here?

Mike



Re: mod_proxy doesn't proxy with %2F

Posted by Mike Cardwell <ap...@lists.grepular.com>.
* on the Wed, Mar 07, 2007 at 04:41:08PM +0000, Mike Cardwell wrote:

>> Using the standard Redhat Enterprise 4, Apache 2.0.52 RPMs here. I have
>> a CommunigatePro server. It runs its own http daemon for the
>> administration interface, and webmail. We needed to extend it in several
>> ways, so I stuck an Apache mod_proxy in front of it. Here's the config I
>> used which works fine:
>> 
>> ProxyPass        / https://127.0.0.1:9100/
>> ProxyPassReverse / https://127.0.0.1:9100/
>> 
>> However. When using webmail, if you go to view an attachment like for
>> example "filename.txt" from message "message_id" in folder "INBOX" the
>> url would look like:
>> 
>> https://the.domain/session/session_id/MessagePart/INBOX/message_id/filename.txt
>> 
>> That works absolutely fine. The problem is when the file is in a
>> subfolder, eg "INBOX/Archive". Then the url becomes:
>> 
>> https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt
>> 
>> With the url above, mod_proxy simply does *nothing*, I get an Apache 404
>> error message because the path doesn't exist locally because mod_proxy
>> hasn't attempted to do the proxying. I've used tcpdump to verify that
>> there is no connection to port 9100 when I make the request.
>> 
>> Is this a bug, or am I doing something dumb? 
> I found a solution to this problem. It's not a bug, just unexpected
> default behaviour I guess. I solved it over on the users list anyway.
> 
> Sorry for wasting your time.

OK, I'm back. I've come to the conclusion that mod_proxy is actually
behaving incorrectly. After turning on AllowEncodedSlashes,
Apache lets us use percent-encoded forward slashes in the path. E.g.:

"http://foo/hello%2Fworld"

When using "ProxyPass / http://bar/"

mod_proxy makes a request for:

"http://bar/hello/world"

That is wrong as I understand it. The forward slash between "hello" and
"world" should still be encoded.
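
To make the difference concrete, here is a small sketch in Python. It is
not mod_proxy's code; it only assumes that the observed behaviour is
equivalent to decoding the path and then re-escaping it without touching
"/". The names BACKEND, observed and wanted are made up for this example:

```python
from urllib.parse import quote, unquote, urlsplit

BACKEND = "http://bar"  # the target of "ProxyPass / http://bar/"

def observed(request_url: str) -> str:
    """Decode the path, then re-escape it; '/' is never re-escaped."""
    path = urlsplit(request_url).path
    return BACKEND + quote(unquote(path), safe="/")

def wanted(request_url: str) -> str:
    """Forward the path exactly as the client sent it."""
    return BACKEND + urlsplit(request_url).path

print(observed("http://foo/hello%2Fworld"))
print(wanted("http://foo/hello%2Fworld"))
```

Once the path has been decoded there is no way to tell which slashes were
originally written as %2F, so the escape cannot be restored afterwards.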

RFC-1738, section 3.3 regarding HTTP URLs:

"Within the <path> and <searchpart> components, "/", ";", "?" are
reserved. The "/" character may be used within HTTP to designate a
hierarchical structure."

So... The forward slash is a reserved character. Section 2.2 says the
following about reserved characters:

"If the character corresponding to an octet is reserved in a scheme, the
octet must be encoded."

So as far as I can see, http://foo/hello%2Fworld and
http://foo/hello/world are distinctly different HTTP URLs that should be
allowed to behave differently. Just because Apache treats them the same
when serving local content doesn't mean that other servers do when you're
proxying the request to them, so I don't think mod_proxy should rewrite
the path the way it does...
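
One way to see why a backend might treat the two URLs differently is to
split the raw path on "/" first and decode each segment afterwards. This
is only an illustrative sketch of how a server could interpret paths; the
function name path_segments is made up, and nothing here claims that
CommuniGate Pro works this way internally:

```python
from typing import List
from urllib.parse import unquote

def path_segments(raw_path: str) -> List[str]:
    """Split on literal '/' first, then percent-decode each segment."""
    return [unquote(segment) for segment in raw_path.split("/") if segment]

print(path_segments("/hello%2Fworld"))  # one segment containing a slash
print(path_segments("/hello/world"))    # two separate segments
```

A server that parses paths like this sees a single name, "hello/world",
in the first case and a two-level hierarchy in the second.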

Does anyone agree or disagree? Is this something that can and should be
fixed, or are there backwards compatibility issues? If there are, could
an option be added to mod_proxy for preserving the path encoding,
e.g. ProxyPreservePathEncoding?

Thanks,

Mike

Re: mod_proxy doesn't proxy with %2F

Posted by Mike Cardwell <ap...@lists.grepular.com>.
* on the Wed, Mar 07, 2007 at 10:28:22AM +0000, Mike Cardwell wrote:

> Using the standard Redhat Enterprise 4, Apache 2.0.52 RPMs here. I have
> a CommunigatePro server. It runs its own http daemon for the
> administration interface, and webmail. We needed to extend it in several
> ways, so I stuck an Apache mod_proxy in front of it. Here's the config I
> used which works fine:
> 
> ProxyPass        / https://127.0.0.1:9100/
> ProxyPassReverse / https://127.0.0.1:9100/
> 
> However. When using webmail, if you go to view an attachment like for
> example "filename.txt" from message "message_id" in folder "INBOX" the
> url would look like:
> 
> https://the.domain/session/session_id/MessagePart/INBOX/message_id/filename.txt
> 
> That works absolutely fine. The problem is when the file is in a
> subfolder, eg "INBOX/Archive". Then the url becomes:
> 
> https://the.domain/session/session_id/MessagePart/INBOX%2FArchive/message_id/filename.txt
> 
> With the url above, mod_proxy simply does *nothing*, I get an Apache 404
> error message because the path doesn't exist locally because mod_proxy
> hasn't attempted to do the proxying. I've used tcpdump to verify that
> there is no connection to port 9100 when I make the request.
> 
> Is this a bug, or am I doing something dumb?

I found a solution to this problem. It's not a bug, just unexpected
default behaviour I guess. I solved it over on the users list anyway.

Sorry for wasting your time.

Mike