Posted to users@httpd.apache.org by MM KP <sb...@gmail.com> on 2014/01/27 04:08:59 UTC

[users@httpd] Need help with reverse proxying and image loading

Hello all

I am new to Apache and the Apache mailing list, so PLEASE forgive me for my
long message:


I am trying to configure a nice reverse proxy using Apache. Basically this
is what I want: I want to be able to browse to something like
testproxy.myproxy.com and have it proxy to www.cnn.com. I want to be able
to see images, and I want JavaScript, CSS and all that good stuff loaded as
well. I already created a DNS record for testproxy.myproxy.com, and this is
the configuration I'm using for the virtual host:


<VirtualHost [::]:80>
   ServerName testproxy.myproxy.com
   ProxyRequests off
   ProxyPass / http://www.cnn.com/
   ProxyPassReverse / http://www.cnn.com/
</VirtualHost>


Now when I restart the httpd service (by the way, I am using RHEL 6.5), I
can browse to testproxy.myproxy.com, but all that appears in the browser is
text and links. No images are loaded, nor any CSS/JavaScript. What am I
missing in my virtual host configuration that's preventing me from loading
images? I've noticed that some of the images on cnn.com are hosted on a
different site, such as:

http://i2.cdn.turner.com/cnn/dam/assets/

I'm guessing that since the images are hosted in the /cnn/dam/assets/ folder
on i2.cdn.turner.com, and the virtual host/reverse proxy is only set up to
proxy pass to www.cnn.com, it is not loading images and scripts that are
hosted on http://i2.cdn.turner.com/cnn/dam/assets/. I don't know if I am
even close to being accurate with my assumptions. Apache is very new to
me.


My question is: how do I go about configuring my virtual hosts properly so
that the URL of every image and script on www.cnn.com is rewritten to start
with testproxy.myproxy.com/ as opposed to i2.cdn.turner.com/? For example,
one of the images on CNN's homepage is:

http://i2.cdn.turner.com/cnn/dam/assets/140123154723-07-super-bowl-prep-bin-tease.jpg

I want to be able to go to a browser, type testproxy.myproxy.com in the
address bar, proxy to www.cnn.com, and when I right-click on the image, I
want the URL of the image to be something like
http://testproxy.myproxy.com/images/super-bowl-prep-bin-tease.jpg.
Basically I want all URLs to be rewritten as
http://testproxy.myproxy.com/... and so on.

All help is GREATLY appreciated because, well, I am totally lost here lol.
I've done research on using mod_proxy_html and whatnot, but I'm still
confused as to how to go about doing this in my situation.
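
For reference, a rough sketch of how mod_proxy_html is usually combined with
ProxyPass for this kind of setup. It is an untested illustration, not a
known-good configuration. Assumptions: the /cdn-i2/ prefix is a made-up local
path; mod_proxy_html and mod_headers are installed and loaded (on httpd 2.2,
as shipped with RHEL 6, mod_proxy_html is a separate third-party module); and
the ProxyHTMLLinks definitions from the module's sample proxy-html.conf are
included somewhere in the server configuration.

```apache
<VirtualHost [::]:80>
   ServerName testproxy.myproxy.com
   ProxyRequests off

   # Hypothetical local prefix for the CDN host. It is listed before the
   # catch-all "/" mapping so that it is matched first.
   ProxyPass /cdn-i2/ http://i2.cdn.turner.com/
   ProxyPassReverse /cdn-i2/ http://i2.cdn.turner.com/

   ProxyPass / http://www.cnn.com/
   ProxyPassReverse / http://www.cnn.com/

   # Ask the backends for uncompressed responses so the filter sees plain
   # HTML (requires mod_headers).
   RequestHeader unset Accept-Encoding

   <Location />
      # Run HTML responses through mod_proxy_html and rewrite absolute links
      # so they point back at this virtual host.
      SetOutputFilter proxy-html
      ProxyHTMLURLMap http://i2.cdn.turner.com/ /cdn-i2/
      ProxyHTMLURLMap http://www.cnn.com/ /
   </Location>
</VirtualHost>
```

With a mapping like this, the image above would be requested as
http://testproxy.myproxy.com/cdn-i2/cnn/dam/assets/140123154723-07-super-bowl-prep-bin-tease.jpg.
Every additional external host the page uses would need its own ProxyPass and
ProxyHTMLURLMap pair, and URLs assembled by JavaScript at run time generally
won't be caught by this kind of rewriting.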

Please assist me!


Thanks!!

SBC

Re: [users@httpd] Need help with reverse proxying and image loading

Posted by MM KP <sb...@gmail.com>.
Hello!

Thanks for your response, but is this the same case for a reverse proxy? I
am trying to build a reverse proxy, not a forward one.

Thanks though!

Any other responses are greatly welcome!

SBC


On Mon, Jan 27, 2014 at 12:57 PM, Mark Brodis <ma...@colorado4x4.net> wrote:


Re: [users@httpd] Need help with reverse proxying and image loading

Posted by Mark Brodis <ma...@colorado4x4.net>.
I am a n00b with Apache also but I'll take a stab at this.

What you are wanting is actually two things: a fully functional (for at
least one website) forward HTTP proxy, and also a domain name change.  In my
opinion you will never get a functional webpage (at least not something as
complex and interconnected as a CNN site) with static mappings.  A static
mapping such as mysite.com being translated to cnn.com could work... but as
you pointed out, what about the rest of the items on the CNN page?  There
will be images from Facebook, Twitter, 4space, Yahoo, Google... and every one
of those could have 50 different hosts the images could come from.  The
hostnames that you will pull content from will vary throughout the day and
by region.

So, for that to work you are going to need to use a real outbound forward
HTTP proxy which your workstation/browser will know how to use (read up on
forward proxies versus reverse proxies; the same software can be used in
very different ways).  Now, using that method, in theory you could still try
to change the domain names of the site, though I'm not exactly sure how you
would do that and I don't think it would work right.  Here's why: when a
browser requests an item from a server, it sends the hostname in the HTTP
Host header.  This seems redundant usually, as the CNN servers know they are
CNN, so why send "cnn" in the header?  This is because the server can serve
up different content based on the header value (look up virtual hosts, and
this is not virtual machine stuff).  So while some web servers will serve
up the same content whether you request it by IP or by hostname, others
will serve up something different.  There is also the issue of SSL
certificates.  The SSL cert has to match the site that the browser is going
to by name.  SSL certs are normally issued for hostnames rather than IP
addresses, and if you try to forward an SSL cert through a
domain-name-changed proxy service, then the name the browser has for a site
will not match the CN (common name) value in the SSL cert itself... and thus
the browser will throw its arms up, complain, warn, etc.
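
On the Host header point, a minimal sketch of how this plays out with
mod_proxy in a reverse-proxy setup, reusing the hostnames from the original
post. By default mod_proxy sends the hostname from the ProxyPass target to the
backend, not the name the browser used; the directive is only spelled out here
to make that visible.

```apache
# ProxyPreserveHost Off is the default. With it off, the request to the
# backend carries "Host: www.cnn.com" (taken from the ProxyPass target)
# rather than "Host: testproxy.myproxy.com".
ProxyPreserveHost Off
ProxyPass / http://www.cnn.com/
ProxyPassReverse / http://www.cnn.com/
```

Setting it to On would instead pass the browser's Host header through to the
backend, which is where the virtual-host mismatch described above could show
up.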

For a normal forward HTTP proxy there is a way to set it up as a secure
proxy which will handle the SSL certs correctly, but that is because there
is no domain-name changing happening in the process.

So, I'm not sure if what you're trying to do will work for a site as
complex as CNN.  Could you do a domain-name change on a buddy's site with
very little interconnecting?  Sure... but it would still be a very
statically defined setup.

Good Luck...
-Mark

