Posted to modproxy-dev@apache.org by g....@ieee.org on 2002/01/24 19:27:14 UTC

patch: proxy-preserve-host

In the current code you can't reverse proxy to virtual hosts or any other
host that relies on a host header because the original host header is
deleted and a new host header is created using the proxy's idea of the
hostname. This patch creates a config option called ProxyPreserveHost which
allows you to tell apache to send the original host header instead of a new
one.
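
A minimal sketch of how such an option might be used in a reverse-proxy
virtual host. The On/Off argument form, the host name and the backend
address are assumptions for illustration, not taken from the patch:

    <VirtualHost *>
      ServerName          www.example.com
      # Hypothetical backend address
      ProxyPass           /   http://192.168.0.10/
      ProxyPassReverse    /   http://192.168.0.10/
      # Forward the client's original Host header (www.example.com)
      # to the backend instead of one built from 192.168.0.10
      ProxyPreserveHost   On
    </VirtualHost>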

Might anyone else be interested in this patch or is my case unique?

It's really only useful during reverse proxying and not during regular
proxying so do you think that I should only enable the option if
ProxyRequests is not on? Or, better yet, should I detect if a certain
proxied request is reverse and not regular and only then preserve the
original host header? If so, then what's a good way to distinguish the
reverse from the regular proxied request?

Additionally, I may be using this in a production environment, so I'd
really appreciate knowing of any potential bugs that anyone might see.
Anyone have any other suggestions?

- Gabriel Russell

(See attached file: proxy-preserve-host.diff)

Re: patch: proxy-preserve-host

Posted by Kwindla Hultman Kramer <kw...@allafrica.com>.
[ ob-executive-summary (as this message somehow got far longer than I
  meant for it to be):

  ProxyPreserveHost is a great addition to the mod_proxy directive
  set, but a more general ability to set/manipulate arbitrary headers
  on proxy requests is also extremely useful. I have a patch that
  allows this, and which has been used in production (on several sites
  and with very heavy traffic) for several months, but it's for
  1.3.19. I will try to explain why this functionality is so important
  to our shop and attempt to convince other folks that the 1.3 tree
  would benefit from the inclusion of a "ProxyRequestHeader"
  directive.

  We're still deploying on my patched 1.3.19 because we rely so
  heavily on this ProxyRequestHeader functionality. I give
  configuration examples from our currently-in-roll-out project
  http://beta.democrats.org ]


Hi,

It's good to see the proxy-preserve-host patch being worked on. One of
our "typical" configurations involves reverse proxying through
multiple virtual hosts to a single backend server. It's often
important to know which virtual host the request came in for, so that
the backend server can adjust its content generation accordingly.

I'd like to give an example, because we use this kind of setup all the
time, and there's still a piece missing from the solution even with
the ProxyPreserveHost directive available...

At the moment, we're "live testing" http://beta.democrats.org/ (which
will, barring any unforeseen circumstances, become
http://www.democrats.org on Thursday night). The beta.democrats.org
domain is served by a pair of machines in a classic reverse-proxying
configuration -- the proxy server sits on the network fielding all
comers, doing lots of rewrite-rule stuff, and caching as aggressively
as possible. Another machine sits behind the proxy server doing all
the content generation.

Here's the first problem, which ProxyPreserveHost could solve quite
nicely:

In a few hours, we'll be bringing up Spanish-language content on
es.beta.democrats.org. The new 'es' domain will be served in exactly
the same way, by the same two machines, and the backend server needs
to know whether it's supposed to be thinking 'english' or 'spanish.'
Here's how we've been doing that, with our patched 1.3.19.

[ stripped down example for clarity ]

  ---------------------------------
    
    ProxyPassReverse      /            http://192.168.1.243/
    RewriteRule          ^/(.*)$       http://192.168.1.243/$1        [P] 

    NameVirtualHost 0.0.0.0
    # the en host should come first -- it's the default
    <VirtualHost 0.0.0.0>
      ServerName          beta.democrats.org
      CacheRoot           "/usr/local/apache/proxy-cache-en"
      RewriteEngine       On
      RewriteOptions      inherit
      ProxyRequestHeader  set   Req-Language   "en"
      ProxyRequestHeader  set   Req-Host       "beta.democrats.org"
    </VirtualHost>
    
    <VirtualHost 0.0.0.0>
      ServerName          es.beta.democrats.org
      CacheRoot           "/usr/local/apache/proxy-cache-es"
      RewriteEngine       On
      RewriteOptions      inherit
      ProxyRequestHeader  set   Req-Language   "es"
      ProxyRequestHeader  set   Req-Host       "es.beta.democrats.org"
    </VirtualHost>

  ---------------------------------

So we use a new configuration directive, ProxyRequestHeader, to pass
information to the back-end server.

Here's why we need to be able to set other, arbitrary headers (problem
two, which ProxyPreserveHost doesn't quite solve, at least not in a
generalizable way):

Both domains -- 'beta..' and 'es.beta..' -- need to serve SSL
connections as well as normal page requests. Again, the back-end
server does all the "real" work, but the proxy must accept the SSL
connection, handle the encryption-related issues, and reverse proxy
the request through to the backend as a normal request.

Here's how we do that:

[ again, stripped down and with SSL stuff elided ]
    
    NameVirtualHost 0.0.0.0:443
    # again, english comes first
    <VirtualHost 0.0.0.0:443>
     ServerName          beta.democrats.org
    
     [ mod_ssl stuff ]
    
     ProxyRequestHeader  set   Req-Language   "en"
     ProxyRequestHeader  set   Req-Host       "beta.democrats.org"
     ProxyRequestHeader  set   Req-HTTPS      "on"
     RewriteEngine       On
     RewriteOptions      inherit
    </VirtualHost>                                  
    
    <VirtualHost 0.0.0.0:443>
     ServerName          es.beta.democrats.org
    
     [ mod_ssl stuff ]
    
     ProxyRequestHeader  set   Req-Language   "es"
     ProxyRequestHeader  set   Req-Host       "es.beta.democrats.org"
     ProxyRequestHeader  set   Req-HTTPS      "on"
     RewriteEngine       On
     RewriteOptions      inherit
    </VirtualHost>        

  ---------------------------------

And this is a moderately simple example, mostly because we've just
built this site from scratch and it hasn't had time to grow various
appendages and mutate into a rewrite-rule-laden beast. On some of our
sites we have a mixture of virtual-host and directory-based proxy
configurations, which is why the ProxyRequestHeader directive takes an
optional extra final argument, a pattern to match against the URL.

So you can do something like this, for example:

  RewriteRule         ^/image/(.*)$   http://192.168.1.243/img/$1    [P] 
  ProxyRequestHeader  set             Req-Old-Image  "yes"     "^/image"

My patch that included the ProxyRequestHeader functionality was for
1.3.19. It also included some downstream cache control facilities (two
more configuration directives) and, for symmetry, a
ProxyResponseHeader directive that works the same way as
ProxyRequestHeader. But by far the most important of those features,
for us, is the ability to set arbitrary proxy request headers. The
other features are in the patch because we do use them occasionally
and they involved touching many of the same parts of the code. (It is
clear to me in retrospect that I should have submitted *separate*
patches for the various features -- I apologize for not doing that.)

(A full description of my 1.3.19 patch can be found at:
http://allafrica.com/tools/apache/mod_proxy/ )

I guess I have a couple of questions. Is there any chance that this
ProxyRequestHeader functionality could be deemed useful enough that it
would make it into the next 1.3 release? It's certainly important to
us -- we're still deploying all of our large sites on my patched
1.3.19 because we so often need to set these headers. I haven't ported
the patch to later versions partly because of a lack of time, and
partly because of an if-it's-not-broken-don't-fix-it conservatism. If
there is a chance that ProxyRequestHeader (or something very like it)
could become part of the 1.3 tree, what can I do to help that happen?
I'm very happy to rewrite and resubmit the ProxyRequestHeader part of
the patch (and, of course, just as happy if someone who knows the
codebase better than I do would prefer to do so.)

If you made it this far, I feel obliged to thank you for taking
several hours out of your day to read my missive...

Kwindla


Re: patch: proxy-preserve-host

Posted by Kwindla Hultman Kramer <kw...@allafrica.com>.
Hi,

I know this is not quite what you're looking for, but you might find
it useful for what you're doing...

We have a somewhat similar setup, and needed to be able to set
arbitrary headers on our front-end servers so that the application
servers could figure out what they should be doing with any given
request. We patched (to 1.3.19) to add four new configuration
directives that together allow finer control over header
setting/unsetting/modifying:

    CacheFreshenDate
    ProxyResponseExpiresVector
    ProxyRequestHeader
    ProxyResponseHeader

We've been running this code in production (roughly 300,000 page views
per day) for several months now. The documentation and patch file are
available from:

  http://allafrica.com/tools/apache/mod_proxy/

The "ProxyRequestHeader" directive is probably the most useful of the
group, here is its documentation (from
http://allafrica.com/tools/apache/mod_proxy/mod_proxy.html#proxyrequestheader):

----

ProxyRequestHeader directive
Syntax: 
  ProxyRequestHeader set | unset | add | append header [string] [match-pattern]
Default: Off
Context: server config, virtual host
Override: Not applicable
Status: Experimental
Module: mod_proxy
Compatibility: 
  Patch, available from http://allafrica.com/tools/apache/mod_proxy/

The ProxyRequestHeader directive sets headers on upstream requests. The
first argument, one of set | unset | add | append, indicates the action
to be taken for the second argument, the header. (The trailing colon
should not be included in the header argument.) The third argument is
the string that should be used for a set, add or append action (for an
unset, the string argument should be omitted). The optional fourth
argument, match-pattern, controls the application of the directive. If
a match-pattern is present, the request uri must match the pattern for
the directive to be applied.

ProxyRequestHeader directives can appear in server and virtual server
configuration sections. For each request, the directives are processed
in declaration order -- base server directives processed first. For
each match found, the specified action is taken.

ProxyRequestHeader      set      Language         "unknown"
ProxyRequestHeader      append   Passed-Through   this.host
<VirtualHost english.stuff.org>
  ProxyRequestHeader    set      Language         "en"
  ProxyRequestHeader    set      Root-Request     "yes"    "^/$"
</VirtualHost>

----
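
As a further sketch of the unset action and the match-pattern argument
described above (the header name and the /secure/ path are hypothetical,
and this assumes the patched 1.3.19 syntax exactly as documented):

    # Remove any existing copy of the header from the upstream request ...
    ProxyRequestHeader    unset    Req-HTTPS
    # ... then set it only for requests whose URI matches the pattern
    ProxyRequestHeader    set      Req-HTTPS    "on"    "^/secure/"

If the directives behave as documented, they are processed in
declaration order, so the unset is applied before the set.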

Kwin


Morten Bjørhus writes:
 > Your case is definitely not unique --- we would also be very interested
 > in such a patch since we are running Apache 3.20 in a production
 > environment as a reverse proxy with several virtual hosts, all having
 > some ProxyPass directives to the same web application (which selects a
 > branding based on the virtual host name used, essentially). And we have
 > of course experienced the same problem of figuring out how to get the
 > proxied webapp to recognize which virtual host name is currently used.
 > Your patch is just what we would need to get a clean and simple solution
 > to this.
 > 


Re: patch: proxy-preserve-host

Posted by Morten Bjørhus <mo...@ttyl.com>.
Your case is definitely not unique --- we would also be very interested
in such a patch since we are running Apache 3.20 in a production
environment as a reverse proxy with several virtual hosts, all having
some ProxyPass directives to the same web application (which selects a
branding based on the virtual host name used, essentially). And we have
of course experienced the same problem of figuring out how to get the
proxied webapp to recognize which virtual host name is currently used.
Your patch is just what we would need to get a clean and simple solution
to this.

Morten

-
----------------------------------------------
 Morten Bjørhus, TTYL
 http://www.ttyl.com
 ----------------------------------------------




Re: patch: proxy-preserve-host

Posted by Graham Leggett <mi...@sharp.fm>.
g.russell@ieee.org wrote:

> In the current code you can't reverse proxy to virtual hosts or any other
> host that relies on a host header because the original host header is
> deleted and a new host header is created using the proxy's idea of the
> hostname. This patch creates a config option called ProxyPreserveHost which
> allows you to tell apache to send the original host header instead of a new
> one.
> 
> Might anyone else be interested in this patch or is my case unique?
> 
> It's really only useful during reverse proxying and not during regular
> proxying so do you think that I should only enable the option if
> ProxyRequests is not on?

This would be useful during regular proxying - as this option would be
needed for transparent proxying to work.

This patch looks quite useful, as it answers Eli Marmor's transparent
proxy question as well.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."

Re: patch: proxy-preserve-host

Posted by Chuck Murcko <ch...@topsail.org>.
I hate to have to ask, but I assume this is for 2.0?

Chuck
