Posted to users@httpd.apache.org by Robert Schenck <ro...@gmail.com> on 2009/12/02 12:02:27 UTC
[users@httpd] Reverse proxying is problematic
*I know this is a long read...but I really need help, and felt the best way
for anyone to help me remotely is to explain the issues in their entirety.*
Hello,
I'm trying to set up a reverse proxy, but first, some context:
My office is subscribed to a few academic journals. These journals verify the
subscription via IP, such that anyone connected to the internet through our
connection can access the journals. However, some individuals would like to
access the journals away from the office as well. We have a VPN, but it only
connects them to our intranet. Therefore, we want to create a reverse proxy
such that the users will connect to the VPN, then to our intranet, and then
to the proxy server, and then, ultimately, to the journal at hand. This
works because the proxy server will be within our intranet, which they have
access to through the VPN. So it will look like so:
Client --> VPN --> Our Intranet --> Reverse Proxy --> Journal
Note that I'm an intern and have had *very* little experience with Apache
and networking in general (and Linux!)...so please explain things fully.
I have attempted to follow this guide:
http://www.apachetutor.org/admin/reverseproxies
I'm running SUSE Linux Enterprise 11 and installed Apache through zypper.
I compiled and installed the mod_proxy_html and mod_xml2enc modules myself
(mod_proxy_html is there to rewrite links). They are fully functional.
In the examples below I'm attempting to reverse proxy both http://aip.org and
http://apl.aip.org. So basically what I want is for anything that is
http://aip.org/somepage.html to be http://proxysrv1/aip/somepage.html and
anything that is http://apl.aip.org/somepage.html to be
http://proxysrv1/apl/somepage.html.
All of the content on the page must go through the proxy (note: I know that
many of the links lead to other sub-domains, I will include those as
well...but later, I figured I should get these two working first). *Please
do not suggest a different server application like Squid; I'm required to
use Apache.*
So far, I have the following modifications to the httpd.conf file:
----------------------------------------------------------------------------------------------------------------------------
Include /etc/apache2/vhosts.d/*.conf
ProxyHTMLEnable On
ProxyHTMLExtended On
ProxyHTMLLinks a href
ProxyHTMLLinks area href
ProxyHTMLLinks link href
ProxyHTMLLinks img src longdesc usemap
ProxyHTMLLinks object classid codebase data usemap
ProxyHTMLLinks q cite
ProxyHTMLLinks blockquote cite
ProxyHTMLLinks ins cite
ProxyHTMLLinks del cite
ProxyHTMLLinks form action
ProxyHTMLLinks input src usemap
ProxyHTMLLinks head profile
ProxyHTMLLinks base href
ProxyHTMLLinks script src for
ProxyHTMLLinks iframe src
ProxyHTMLEvents onclick ondblclick onmousedown onmouseup \
    onmouseover onmousemove onmouseout onkeypress \
    onkeydown onkeyup onfocus onblur onload \
    onunload onsubmit onreset onselect onchange
ProxyRequests Off
ProxyPass /aip/ http://aip.org/
ProxyPassReverse /aip/ http://aip.org/
ProxyHTMLURLMap http://www.aip.org http://proxysrv1/aip
ProxyPass /apl/ http://apl.aip.org/
ProxyPassReverse /apl/ http://apl.aip.org/
ProxyHTMLURLMap http://apl.aip.org http://proxysrv1/apl
<Location /aip/>
    ProxyHTMLEnable On
    ProxyHTMLExtended On
    ProxyPassReverse /
    ProxyHTMLURLMap / /
    RequestHeader unset Accept-Encoding
</Location>
<Location /apl/>
    ProxyHTMLEnable On
    ProxyHTMLExtended On
    ProxyPassreverse /
    ProxyHTMLURLMap / /
    RequestHeader unset Accept-Encoding
</Location>
ProxyHTMLLogVerbose On
LogLevel Info
----------------------------------------------------------------------------------------------------------------------------
And the following modifications to the vhost.conf file:
----------------------------------------------------------------------------------------------------------------------------
NameVirtualHost *:80
<VirtualHost *:80>
    ServerName proxysrv1
    DocumentRoot /srv/www/htdocs
    HostnameLookups Off
    UseCanonicalName On
    ServerSignature On
    <Directory "/srv/www/htdocs">
        Options Indexes All
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>
<VirtualHost *:80>
    Documentroot /srv/www/htdocs/aip
    Servername proxysrv1/aip
    HostnameLookups Off
    UseCanonicalName On
    ServerSignature On
    <Directory "/srv/www/htdocs/aip">
        Options Indexes All
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>
<VirtualHost *:80>
    Documentroot /srv/www/htdocs/apl
    Servername proxysrv1/apl
    HostnameLookups Off
    UseCanonicalName On
    ServerSignature On
    <Directory "/srv/www/htdocs/apl">
        Options Indexes All
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>
-------------------------------------------------------------------------------------------
*The mass of issues:*
1) http://proxysrv1/aip/ looks like this: http://imgur.com/n6m0L.png
The page source: http://paste.ubuntu.com/333007/
2) http://proxysrv1/apl/ looks like this: http://proxysrv1/apl/
The page source: http://paste.ubuntu.com/333009/
3) I created a virtual host & proxy at http://proxysrv1/apl/, yet
links like http://apl.aip.org/about/about_the_journal
redirect to http://proxysrv1/about/about_the_journal rather than
http://proxysrv1/apl/about/about_the_journal
4) All the pages look like crap. I had aip.org working previously, but
only if I set its directory to / (so that going to http://proxysrv1/ took
you to aip.org/) and had no virtual hosts.
5) That's actually all I can think of. But the pages are pretty darn broken.
*Please explain any fixes in a step-by-step process. Again, I'm new to this.*
Re: [users@httpd] Reverse proxying is problematic
Posted by Devraj Mukherjee <de...@gmail.com>.
Also look at mod_substitute and mod_headers
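As a rough sketch of what that could look like (assuming Apache 2.2.7 or
later with mod_substitute and mod_headers loaded, and reusing the /aip/
prefix and the proxysrv1 hostname from the original post), a Substitute rule
can rewrite absolute links in HTML responses as they pass through the proxy.
The pattern is illustrative only and has not been tested against the
publisher's pages:

```apache
<Location /aip/>
    # Run HTML responses through the mod_substitute output filter
    AddOutputFilterByType SUBSTITUTE text/html
    # Rewrite absolute links back to the proxy prefix
    # (n = treat the pattern as a fixed string, i = case-insensitive)
    Substitute "s|http://aip.org/|http://proxysrv1/aip/|ni"
    # Ask the origin for uncompressed content so the filter sees plain text
    RequestHeader unset Accept-Encoding
</Location>
```

This only touches response bodies; redirects and cookies still need
ProxyPassReverse and related directives.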
On Wed, Dec 2, 2009 at 10:45 PM, Robert Schenck <ro...@gmail.com> wrote:
> Peter,
>
> I have to use Apache, I don't have a choice (says my employer).
>
> On Wed, Dec 2, 2009 at 12:13 PM, Peter Schober <pe...@univie.ac.at> wrote:
>>
>> * Robert Schenck <ro...@gmail.com> [2009-12-02 12:03]:
>> > My office is subscribed to few academic journals. These journals verify the
>> > subscription via IP, such that anyone connected to the internet through our
>> > connection can access the journals.
>>
>> You might also want to look at EZproxy
>> http://en.wikipedia.org/wiki/EZproxy
>> (besides getting the publisher to dump IP-addresses for authorization).
>> -peter
>>
>> ---------------------------------------------------------------------
>> The official User-To-User support forum of the Apache HTTP Server Project.
>> See <URL:http://httpd.apache.org/userslist.html> for more info.
>> To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
>> " from the digest: users-digest-unsubscribe@httpd.apache.org
>> For additional commands, e-mail: users-help@httpd.apache.org
>>
>
>
--
"The secret impresses no-one, the trick you use it for is everything"
- Alfred Borden (The Prestige)
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
" from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org
Re: [users@httpd] Reverse proxying is problematic
Posted by Robert Schenck <ro...@gmail.com>.
Peter: Well, I'm an intern so I'm supposed to be "learning"...or something
like that.
On Wed, Dec 2, 2009 at 1:00 PM, Peter Schober <pe...@univie.ac.at> wrote:
> * Robert Schenck <ro...@gmail.com> [2009-12-02 12:46]:
> > I have to use Apache, I don't have a choice (says my employer).
>
> This was just meant as a heads up: depending on the publisher you
> might have to rewrite most everything (URLs, HTML content, Cookies,
> JavaScript, etc.), and every publisher does things differently.
> If your employer really thinks reinventing this poorly is time and
> money well spent (vs. using something that is known to just work),
> then so be it.
> (Not that I actually promote the use of aforementioned product, since
> that will only prolong the misuse of IP-addresses for authorization
> purposes. SAML is the standard way of accessing publisher resources
> online, of course.)
> -peter
Re: [users@httpd] Reverse proxying is problematic
Posted by Peter Schober <pe...@univie.ac.at>.
* Robert Schenck <ro...@gmail.com> [2009-12-02 12:46]:
> I have to use Apache, I don't have a choice (says my employer).
This was just meant as a heads up: depending on the publisher you
might have to rewrite most everything (URLs, HTML content, Cookies,
JavaScript, etc.), and every publisher does things differently.
If your employer really thinks reinventing this poorly is time and
money well spent (vs. using something that is known to just work),
then so be it.
(Not that I actually promote the use of aforementioned product, since
that will only prolong the misuse of IP-addresses for authorization
purposes. SAML is the standard way of accessing publisher resources
online, of course.)
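For the cookie part of that list, mod_proxy has dedicated directives. A
minimal sketch, assuming Apache 2.2 and the /aip/ prefix and proxysrv1
hostname from the original post (whether the publisher's cookies actually use
this domain and path is an assumption):

```apache
# Rewrite the Domain attribute of Set-Cookie headers coming from the journal site
ProxyPassReverseCookieDomain aip.org proxysrv1
# Rewrite the Path attribute so cookies are sent back under the /aip/ prefix
ProxyPassReverseCookiePath / /aip/
```

These only adjust Set-Cookie response headers; URLs built inside JavaScript
are not covered by them.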
-peter
Re: [users@httpd] Reverse proxying is problematic
Posted by Robert Schenck <ro...@gmail.com>.
Peter,
I have to use Apache, I don't have a choice (says my employer).
On Wed, Dec 2, 2009 at 12:13 PM, Peter Schober <pe...@univie.ac.at> wrote:
> * Robert Schenck <ro...@gmail.com> [2009-12-02 12:03]:
> > My office is subscribed to few academic journals. These journals verify the
> > subscription via IP, such that anyone connected to the internet through our
> > connection can access the journals.
>
> You might also want to look at EZproxy
> http://en.wikipedia.org/wiki/EZproxy
> (besides getting the publisher to dump IP-addresses for authorization).
> -peter
Re: [users@httpd] Reverse proxying is problematic
Posted by Peter Schober <pe...@univie.ac.at>.
* Robert Schenck <ro...@gmail.com> [2009-12-02 12:03]:
> My office is subscribed to few academic journals. These journals verify the
> subscription via IP, such that anyone connected to the internet through our
> connection can access the journals.
You might also want to look at EZproxy
http://en.wikipedia.org/wiki/EZproxy
(besides getting the publisher to dump IP-addresses for authorization).
-peter
Re: [users@httpd] Reverse proxying is problematic
Posted by André Warnier <aw...@ice-sa.com>.
Robert Schenck wrote:
> *I know this is a long read...but I really need help, and felt the best way
> for anyone to help me remotely is to explain the issues in their entirety. *
>
> Hello,
>
> I'm trying to set a reverse proxy, but first, some context:
>
> My office is subscribed to few academic journals. These journals verify the
> subscription via IP, such that anyone connected to the internet through our
> connection can access the journals. However, some individuals would like to
> access the journals away from the office as well.
Hi.
I know that there is already a long list of answers to this, at the
technical level. And you were right to provide some background like you
did above.
Before solving the problem at the technical level, I would /strongly/
recommend getting in touch with the publishers of these journals, and
talk to them about your idea (or your boss' idea) first.
This is just in case one of them would object, and consider that by
doing this you are violating the commercial agreement your office has
with them, and your office thus becomes a target for a copyright
infringement lawsuit.
Publishers, who live from these copyright fees, tend to not joke about
such matters.
Background :
A publisher made a contract with your office, whereby a certain number
of people have access to a certain number of published journal articles,
against a flat fee. That flat fee replaces, under certain
circumstances, a per-article, per-person fee which would normally have
to be paid. The number of people to which this arrangement applies, and
the corresponding fee, is estimated by the supplier on the base of some
reasonable number of users. This number of users is limited,
approximately, by the number of people which the supplier roughly
calculated would be accessing these articles from within your corporate
network, and would thus look like originating from the IP address of
your firewall/proxy.
Your scheme would basically break the assumptions of the supplier, by
potentially providing access to an uncontrolled number of people from
outside of the network for which these assumptions were calculated.
The supplier may get very unhappy about this.
On the other hand, a case such as you describe is not that uncommon, and
I am sure that the suppliers of these articles have other solutions
available, which do not contravene the commercial agreements.
Re: [users@httpd] Reverse proxying is problematic
Posted by Eric Covener <co...@gmail.com>.
On Wed, Dec 2, 2009 at 7:31 AM, Robert Schenck <ro...@gmail.com> wrote:
>> >
>> > http://paste.ubuntu.com/333080/
>> >
The operative message is:
[Wed Dec 02 13:21:43 2009] [error] [client 9.4.69.54] Directory index
forbidden by Options directive: /srv/www/htdocs/apl/
Which would have been nice to include in-line. If you're serving a
mod_autoindex directory index on purpose, allow it with Options
+Indexes in the <Directory> block that covers whatever this URL maps
to.
If you meant for this to be proxied, it isn't.
If you meant for this to show some default file, see DirectoryIndex.
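A minimal sketch of those two local options, assuming Apache 2.2 access
control syntax and the directory named in the error message (the index.html
file name is only an example):

```apache
<Directory "/srv/www/htdocs/apl">
    # Let mod_autoindex generate a listing when no index file exists
    Options +Indexes
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

# Or serve a default file instead of a generated listing
DirectoryIndex index.html
```

Neither of these makes the URL proxied; that needs a ProxyPass that actually
matches the request.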
--
Eric Covener
covener@gmail.com
Re: [users@httpd] Reverse proxying is problematic
Posted by Robert Schenck <ro...@gmail.com>.
Here's a snippet: http://paste.ubuntu.com/333084/
On Wed, Dec 2, 2009 at 1:29 PM, Tom Evans <te...@googlemail.com> wrote:
> On Wed, Dec 2, 2009 at 12:23 PM, Robert Schenck <ro...@gmail.com> wrote:
> > I get "Access Forbidden" when trying to access proxysrv1/aip and
> > proxysrv1/apl
> >
> > This is my updated vhost file:
> >
> > http://paste.ubuntu.com/333080/
> >
>
> Your ServerName directives are not valid.
>
> When you get an 'Access Forbidden' message, apache will _always_
> explain why in the error log. What did it say?
>
> Cheers
>
> Tom
Re: [users@httpd] Reverse proxying is problematic
Posted by Tom Evans <te...@googlemail.com>.
On Wed, Dec 2, 2009 at 12:23 PM, Robert Schenck <ro...@gmail.com> wrote:
> I get "Access Forbidden" when trying to access proxysrv1/aip and
> proxysrv1/apl
>
> This is my updated vhost file:
>
> http://paste.ubuntu.com/333080/
>
Your ServerName directives are not valid.
When you get an 'Access Forbidden' message, apache will _always_
explain why in the error log. What did it say?
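For reference, ServerName takes a hostname (optionally with a scheme and
port), not a URL path, so a value like proxysrv1/aip cannot select a vhost.
A minimal sketch, assuming Apache 2.2 and the hostname and prefixes from the
original post, keeps a single vhost and lets ProxyPass handle the path
prefixes:

```apache
NameVirtualHost *:80

<VirtualHost *:80>
    # Hostname only; the /aip/ and /apl/ prefixes are handled below
    ServerName proxysrv1
    ProxyRequests Off

    ProxyPass        /aip/ http://aip.org/
    ProxyPassReverse /aip/ http://aip.org/

    ProxyPass        /apl/ http://apl.aip.org/
    ProxyPassReverse /apl/ http://apl.aip.org/
</VirtualHost>
```

This sketch does not rewrite links inside the returned HTML; it only shows
where the hostname and the path prefixes belong.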
Cheers
Tom
Re: [users@httpd] Reverse proxying is problematic
Posted by Robert Schenck <ro...@gmail.com>.
I get "Access Forbidden" when trying to access proxysrv1/aip and
proxysrv1/apl
This is my updated vhost file:
http://paste.ubuntu.com/333080/
On Wed, Dec 2, 2009 at 1:09 PM, Tom Evans <te...@googlemail.com> wrote:
> On Wed, Dec 2, 2009 at 11:02 AM, Robert Schenck <ro...@gmail.com> wrote:
> > I know this is a long read...but I really need help, and felt the best way
> > for anyone to help me remotely is to explain the issues in their entirety.
>
> tl;dr
>
> >
> > Please explain any fixes in a step-by-step process. Again, I'm new to this.
> >
>
> Part of the problem is that you are rewriting HTML. Messy isn't it?
> Now do it again, but don't bother with rewriting the HTML.
>
> Remove all the Proxy directives from the main apache server config, it
> makes no sense when you then define vhosts later to use.
>
> Define a vhost for each site you wish to proxy. Set it up like so:
>
> <VirtualHost *:80>
>     ServerName proxyaip
>     ProxyRequests Off
>     DocumentRoot /var/empty
>
>     <Directory /var/empty>
>         Order allow,deny
>         Allow from all
>     </Directory>
>
>     <Location />
>         ProxyPass http://aip.org/
>         ProxyPassReverse http://aip.org/
>     </Location>
>
> </VirtualHost>
>
> Accessing http://proxyaip/ should now be just like accessing
> http://aip.org/ . If you want to proxy more sites, define more vhosts.
>
> Cheers
>
> Tom
Re: [users@httpd] Reverse proxying is problematic
Posted by Tom Evans <te...@googlemail.com>.
On Wed, Dec 2, 2009 at 11:02 AM, Robert Schenck <ro...@gmail.com> wrote:
> I know this is a long read...but I really need help, and felt the best way
> for anyone to help me remotely is to explain the issues in their entirety.
tl;dr
>
> Please explain any fixes in a step-by-step process. Again, I'm new to this.
>
Part of the problem is that you are rewriting HTML. Messy isn't it?
Now do it again, but don't bother with rewriting the HTML.
Remove all the Proxy directives from the main Apache server config; they
make no sense there when you then define vhosts to use instead.
Define a vhost for each site you wish to proxy. Set it up like so:
<VirtualHost *:80>
    # One hostname per proxied site; no path prefix involved
    ServerName proxyaip
    ProxyRequests Off
    DocumentRoot /var/empty
    <Directory /var/empty>
        Order allow,deny
        Allow from all
    </Directory>
    # Forward every request to the journal site and fix up its redirects
    <Location />
        ProxyPass http://aip.org/
        ProxyPassReverse http://aip.org/
    </Location>
</VirtualHost>
Accessing http://proxyaip/ should now be just like accessing
http://aip.org/ . If you want to proxy more sites, define more vhosts.
Cheers
Tom
Re: [users@httpd] Reverse proxying is problematic
Posted by Eric Covener <co...@gmail.com>.
On 12/2/09, Robert Schenck <ro...@gmail.com> wrote:
> I disabled the mod_proxy_html module and the page still looked the same,
> albeit without the little boxes signifying non-existent images.
>
> However, I also looked at the error log for the virtual host, and I found
> the following:
>
> http://paste.ubuntu.com/333064/
I didn't expect removing it to help, since you don't account for the
/css/ at all. I just couldn't tell if that mod_proxy_html magic was
translating the /css/ into something you handled.
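For illustration, the posted config maps / to / inside each <Location>, which
leaves root-relative links such as /css/... unchanged, so the browser requests
them outside the /aip/ prefix. A minimal sketch of a mapping that would move
them under the prefix, assuming mod_proxy_html 3.1 as in the original post
(untested against the actual site):

```apache
<Location /aip/>
    ProxyHTMLEnable On
    # Rewrite root-relative links, e.g. /css/... becomes /aip/css/...
    ProxyHTMLURLMap / /aip/
    # Ask the origin for uncompressed HTML so the filter can parse it
    RequestHeader unset Accept-Encoding
</Location>
```

The same idea would apply to /apl/ with its own prefix. URLs referenced from
inside stylesheets or built in JavaScript may still need separate handling.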
--
Eric Covener
covener@gmail.com
Re: [users@httpd] Reverse proxying is problematic
Posted by Robert Schenck <ro...@gmail.com>.
I disabled the mod_proxy_html module and the page still looked the same,
albeit without the little boxes signifying non-existent images.
However, I also looked at the error log for the virtual host, and I found
the following:
http://paste.ubuntu.com/333064/
On Wed, Dec 2, 2009 at 12:55 PM, Eric Covener <co...@gmail.com> wrote:
> Is mod_proxy_html supposed to be changing those /css/ links into
> something else that would actually be handled by your ProxyPass? You
> can tell if it is by saving the source when you're actually going
> through the proxy.
>
> Also, 404's in your access log would be a big hint about what you're
> missing, but due to the rendering issue it's likely the css.
>
> --
> Eric Covener
> covener@gmail.com
Re: [users@httpd] Reverse proxying is problematic
Posted by Eric Covener <co...@gmail.com>.
Is mod_proxy_html supposed to be changing those /css/ links into
something else that would actually be handled by your ProxyPass? You
can tell if it is by saving the source when you're actually going
through the proxy.
Also, 404's in your access log would be a big hint about what you're
missing, but due to the rendering issue it's likely the css.
--
Eric Covener
covener@gmail.com