Posted to dev@httpd.apache.org by Graham Leggett <mi...@sharp.fm> on 2011/11/18 14:24:38 UTC

Effective IP address / real IP address

Hi all,

Right now, we only keep track of the real IP address of the incoming  
connection within conn_rec, and with a simple webserver that's fine.

In a world containing load balancers, we now have the real IP address  
(the load balancer) and the effective IP address (the IP that  
connected to the load balancer) for the request. And in restful  
service architectures, you might have requests that have passed  
through a few load balancers on their way, making the "effective IP  
address" even more murky.

Right now, modules that handle this attempt to overwrite the contents  
of conn_rec, which is really ugly - requests shouldn't be fiddling  
with the parent connection.

Ideally, what I'd like is a way for httpd to keep track of both the  
real IP address (the one in conn_rec) and an optional effective IP  
address, and use each appropriately. It will then be up to module  
authors to write modules to set the effective IP address as their  
needs dictate.

Most specifically, what I have in mind is this:

- Add a hook ap_get_effective_ip() (or similar).
- With a default APR_HOOK_LAST implementation that just returns the IP  
from conn_rec.
- Update the authz modules to use this hook to get the IP instead of  
reading conn_rec directly.
- Add the ability to log the effective IP address in addition to the  
existing real IP address to the logging code.
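
As a rough illustration of the hook idea in the list above, here is a standalone C sketch. It does not use httpd's real hook macros or conn_rec; demo_conn, register_effective_ip(), from_forwarded_header() and get_effective_ip() are hypothetical names invented for this example, and the "forwarded_for" argument stands in for whatever a module would read from the request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for conn_rec: only the field discussed here. */
typedef struct {
    const char *remote_ip;   /* address of the TCP peer */
} demo_conn;

/* A provider returns an effective IP, or NULL to decline. */
typedef const char *(*effective_ip_fn)(const demo_conn *c,
                                       const char *forwarded_for);

#define MAX_PROVIDERS 8
static effective_ip_fn providers[MAX_PROVIDERS];
static size_t provider_count = 0;

/* Register a provider; earlier registrations are consulted first. */
static int register_effective_ip(effective_ip_fn fn)
{
    if (provider_count == MAX_PROVIDERS)
        return -1;
    providers[provider_count++] = fn;
    return 0;
}

/* Example provider: use a forwarded address when one is present. */
static const char *from_forwarded_header(const demo_conn *c,
                                         const char *forwarded_for)
{
    (void)c;
    return (forwarded_for && *forwarded_for) ? forwarded_for : NULL;
}

/* Run providers in order; if all decline, fall back to the connection's
 * own address, which plays the role of the default "last" implementation. */
static const char *get_effective_ip(const demo_conn *c,
                                    const char *forwarded_for)
{
    assert(c != NULL);
    for (size_t i = 0; i < provider_count; i++) {
        const char *ip = providers[i](c, forwarded_for);
        if (ip)
            return ip;
    }
    return c->remote_ip;
}
```

With no provider registered, get_effective_ip() simply returns the peer address, which matches the "no cost unless a module opts in" intent of the proposal.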

This should in theory be really simple to implement, and opens the  
door for future people to choose an effective IP as they see fit.

Sensible?

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.

On 12/13/2011 9:06 PM, Roy T. Fielding wrote:
> On Dec 13, 2011, at 5:33 PM, Graham Leggett wrote:
>> On 14 Dec 2011, at 12:50 AM, Graham Leggett wrote:
>>> On 12 Dec 2011, at 11:25 PM, William A. Rowe Jr. wrote:
>>>
>>>> I have a frustrating update, which we need to take into consideration for
>>>> the whole remote_ip-related resolution.  From the httpd-ng workgroup...
>>>
>>> This makes sense, we're an HTTP server, let's stick to RFC-related terms.
>>
>> Done in r 1214022.
> 
> Huh, I was wondering what Bill was talking about ... I couldn't remember any
> discussion that had reversed the meaning, and now I know why.
> 
> The IP address received by the server interface when we are acting as a
> reverse proxy is not necessarily the IP address of the user agent.  It could
> just as easily be the IP address of an ISP proxy, a corporate firewall,
> or a dozen other client-side intermediaries that are not the user agent.
> Hence, it is just a client.

That is EXACTLY our understanding.  The immediate connection comes from
"a client".  Not the end node asking for the data, but our immediate client
which is a BigIP balancer or a telco cell browser proxy or anything else
which you mention.

"the user agent" is not "just a client", but the originator of the request.

Are we on the same page now?

Re: Effective IP address / real IP address

Posted by "Roy T. Fielding" <fi...@gbiv.com>.

On Dec 13, 2011, at 5:33 PM, Graham Leggett wrote:
> On 14 Dec 2011, at 12:50 AM, Graham Leggett wrote:
>> On 12 Dec 2011, at 11:25 PM, William A. Rowe Jr. wrote:
>> 
>>> I have a frustrating update, which we need to take into consideration for
>>> the whole remote_ip-related resolution.  From the httpd-ng workgroup...
>> 
>> This makes sense, we're an HTTP server, let's stick to RFC-related terms.
> 
> Done in r 1214022.

Huh, I was wondering what Bill was talking about ... I couldn't remember any
discussion that had reversed the meaning, and now I know why.

The IP address received by the server interface when we are acting as a
reverse proxy is not necessarily the IP address of the user agent.  It could
just as easily be the IP address of an ISP proxy, a corporate firewall,
or a dozen other client-side intermediaries that are not the user agent.
Hence, it is just a client.

I can hear Graham screaming now.

No worries -- I don't care what the variable is called in request_rec
as long as the intent is clear enough, so feel free to leave it as is.

....Roy

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 14 Dec 2011, at 12:50 AM, Graham Leggett wrote:

> On 12 Dec 2011, at 11:25 PM, William A. Rowe Jr. wrote:
> 
>> I have a frustrating update, which we need to take into consideration for
>> the whole remote_ip-related resolution.  From the httpd-ng workgroup...
> 
> This makes sense, we're an HTTP server, let's stick to RFC-related terms.

Done in r 1214022.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 12 Dec 2011, at 11:25 PM, William A. Rowe Jr. wrote:

> I have a frustrating update, which we need to take into consideration for
> the whole remote_ip-related resolution.  From the httpd-ng workgroup...

This makes sense, we're an HTTP server, let's stick to RFC-related terms.

> Mark Nottingham <mn...@mnot.net> response to my observation below;
> 
> That's exactly backwards from how we have always used the terms in HTTP -
> 
> 1945:
> 
>>   client
>> 
>>       An application program that establishes connections for the
>>       purpose of sending requests.
>> 
>>   user agent
>> 
>>       The client which initiates a request. These are often browsers,
>>       editors, spiders (web-traversing robots), or other end user
>>       tools.
> 
> 2068:
> 
>>   client
>>      A program that establishes connections for the purpose of sending
>>      requests.
>> 
>>   user agent
>>      The client which initiates a request. These are often browsers,
>>      editors, spiders (web-traversing robots), or other end user tools.
> 
> 2616:
> 
>>   client
>>      A program that establishes connections for the purpose of sending
>>      requests.
>> 
>>   user agent
>>      The client which initiates a request. These are often browsers,
>>      editors, spiders (web-traversing robots), or other end user tools.

+1.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.

I have a frustrating update, which we need to take into consideration for
the whole remote_ip-related resolution.  From the httpd-ng workgroup...

On 09/12/2011, at 9:27 AM, William A. Rowe Jr. wrote to http-ng;

> On 12/8/2011 12:33 PM, Karl Dubost wrote:
>> On 8 Dec 2011, at 14:55, Larry Masinter wrote:
>>> I think Karl's rewording is worse. The point I really wanted to make was
>>> that documents that follow HTTP terminology often make the mistake of
>>> assuming a "user agent" has a "user".
>> Ahah! I didn't have the initial context. :)
>>
>>> But if "client" means the same thing as "user agent", then why have a separate term?
>>
>> I would rather prefer client everywhere too.
>>
>> What wikipedia says:
>>
>> 	In computing, a user agent is a client application
>> 	implementing a network protocol used in communications
>> 	within a client–server distributed computing system.
>> 	— http://en.wikipedia.org/wiki/User_agent
> We just had this discussion at the ASF httpd project.
>
> In a proxy chain, each proxy server is a user agent itself reaching
> out to the next server in the chain.  It is possible to describe
> these each as clients, but when you start looking at end-to-end
> definitions, "client" suggests the originating user agent (app, or
> browser, or service).
>
> So UA and client do have distinct connotations.

Mark Nottingham's <mn...@mnot.net> response to my observation is below:

That's exactly backwards from how we have always used the terms in HTTP -

1945:

>    client
>
>        An application program that establishes connections for the
>        purpose of sending requests.
>
>    user agent
>
>        The client which initiates a request. These are often browsers,
>        editors, spiders (web-traversing robots), or other end user
>        tools.

2068:

>    client
>       A program that establishes connections for the purpose of sending
>       requests.
>
>    user agent
>       The client which initiates a request. These are often browsers,
>       editors, spiders (web-traversing robots), or other end user tools.

2616:

>    client
>       A program that establishes connections for the purpose of sending
>       requests.
>
>    user agent
>       The client which initiates a request. These are often browsers,
>       editors, spiders (web-traversing robots), or other end user tools.

Re: Effective IP address / real IP address

Posted by Stefan Fritsch <sf...@sfritsch.de>.

On Friday 18 November 2011, Graham Leggett wrote:
> > besides the ugliness of updating conn_rec, are there known
> > functional drawbacks of the existing mechanism, assuming that
> > the module which sets the client also sets a note to allow
> > logging of the TCP peer if desired?

There is also the problem that with pipelined requests, it is not 
clear to which request c->remote_ip actually belongs. With the current 
code, the logging of the previous request (initiated by the 
destruction of the EOR bucket) will happen after the current request 
has already changed c->remote_ip.
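
The pipelining problem described above can be reproduced in a few lines of standalone C. This is only a sketch under simplifying assumptions: demo_conn and demo_req are hypothetical stand-ins for conn_rec and request_rec, and "logging" is reduced to reading a field after a later request has already been set up.

```c
#include <assert.h>
#include <string.h>

/* One connection is shared by every pipelined request on it. */
typedef struct { const char *remote_ip; } demo_conn;

typedef struct {
    demo_conn *conn;
    const char *remote_ip;   /* per-request value, as later proposed in the thread */
} demo_req;

/* Start a request: the per-request field defaults to the TCP peer. */
static void req_init(demo_req *r, demo_conn *c)
{
    assert(r != NULL && c != NULL);
    r->conn = c;
    r->remote_ip = c->remote_ip;
}

/* Old style: a module overwrites the connection-wide field. */
static void override_on_conn(demo_req *r, const char *ip)
{
    r->conn->remote_ip = ip;
}

/* Per-request style: the override stays with the request that carried it. */
static void override_on_req(demo_req *r, const char *ip)
{
    r->remote_ip = ip;
}

/* What a deferred log entry for r would record under each scheme. */
static const char *log_ip_from_conn(const demo_req *r) { return r->conn->remote_ip; }
static const char *log_ip_from_req(const demo_req *r)  { return r->remote_ip; }
```

If two requests on the same connection each call override_on_conn() before the first one is logged, log_ip_from_conn() reports the second request's address for the first request. With override_on_req() each request keeps its own value and the connection's address is left untouched.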

> Looking deeper into mod_remoteip, most specifically
> remoteip_modify_connection(), what I see is that we seem to be
> leaking memory from c->pool on each request, which on a server
> serving millions of requests an hour will start to add up.
> 
> In addition, it looks like we assign memory allocated from
> r->pool into the structure attached to c->pool without having
> registered any cleanups to reverse this when r->pool is destroyed.
> 
> I think we need to look closer at this.

+1

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 22 Nov 2011, at 11:47 PM, Stefan Fritsch wrote:

> I am not sure that we need that. When is it necessary versus the
> information in the Host header? But if others think that it would be a
> good idea, I am OK with it, too.

Looking at c->local_addr, it seems to be tied in with some other  
fields in conn_rec, which will need more time to do properly, let's  
not delay this thing any further. For v2.4 purposes, I think I am done.

I'll backport the remote_ip changes, which will resolve the  
mod_remoteip issue.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Stefan Fritsch <sf...@sfritsch.de>.

On Tuesday 22 November 2011, Graham Leggett wrote:
> On 21 Nov 2011, at 8:04 PM, Stefan Fritsch wrote:
> > Looks reasonable. Some comments:
> > 
> > The error log handler log_remote_address for %a needs to fall
> > back to c->remote_ip if r is not specified. Otherwise one would
> > need different logformats for per-conn and per-request log
> > messages. Also, I would prefer %{r}a and %{c}a to force logging
> > of r->remote_ip and c->remote_ip. Then we don't need a new format
> > letter and it would be more consistent with the %L and %{c}L
> > errorlog format.
> 
> What would we do with the error_log customisation in log.c?
> Currently the letter 'd' is used there, while "%{c}a" is used in
> mod_log_config.
> 
> Ideally log.c should support the same %{c} option, but right now it
> doesn't. Thoughts?

I think you resolved that now, didn't you?


> > I think there may be some confusion of addresses if mod_remoteip
> > is used for CONNECT requests. But I am OK with ignoring that
> > complication. It should always be possible to use the connection
> > log ids to correlate the different messages.
> > 
> > IMHO, commit to trunk and we can fix the remaining issues there.
> 
> Done in r1204968.
> 
> For symmetry, should we do the same for local_ip / local_addr? One
> of the things this allows us to do is provide a mechanism for a
> load balancer to reveal the frontend port being used, which helps
> mod_proxy_ajp for example to report the right frontend port to the
> application behind it (something that's always been painful in
> javaland). The patch should be significantly simpler.

I am not sure that we need that. When is it necessary versus the 
information in the Host header? But if others think that it would be a 
good idea, I am OK with it, too.

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 22 Nov 2011, at 6:18 PM, Graham Leggett wrote:

> I've noticed the "%{c}a" syntax isn't documented for either  
> error_log or access_log, should I update that or have I missed  
> something?

Oops, I meant the "%{c}L" syntax in mod_log_config. Will fix.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 21 Nov 2011, at 8:04 PM, Stefan Fritsch wrote:

> The error log handler log_remote_address for %a needs to fall back to
> c->remote_ip if r is not specified. Otherwise one would need different
> logformats for per-conn and per-request log messages. Also, I would
> prefer %{r}a and %{c}a to force logging of r->remote_ip and
> c->remote_ip. Then we don't need a new format letter and it would be
> more consistent with the %L and %{c}L errorlog format.

This is now the same with r1205061.

I've noticed the "%{c}a" syntax isn't documented for either error_log  
or access_log, should I update that or have I missed something?

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 21 Nov 2011, at 8:04 PM, Stefan Fritsch wrote:

> Looks reasonable. Some comments:
>
> The error log handler log_remote_address for %a needs to fall back to
> c->remote_ip if r is not specified. Otherwise one would need different
> logformats for per-conn and per-request log messages. Also, I would
> prefer %{r}a and %{c}a to force logging of r->remote_ip and
> c->remote_ip. Then we don't need a new format letter and it would be
> more consistent with the %L and %{c}L errorlog format.

What would we do with the error_log customisation in log.c? Currently  
the letter 'd' is used there, while "%{c}a" is used in mod_log_config.

Ideally log.c should support the same %{c} option, but right now it  
doesn't. Thoughts?

> We may also want a CONN_REMOTE_ADDR or PHYS_REMOTE_ADDR variable in
> ap_expr to still allow access to c->remote_addr.

I've added this in r1204990.

> Do we need special handling of the REMOTE_HOST script variable?
> Probably it does not make sense because we can't reliably do DNS
> lookups for addresses received via X-Forwarded-For.

Right now mod_remoteip does sanity checking on the IP address  
presented in the header, if you provide a bogus address it will be  
ignored. I think this should probably be the expectation of a module  
that fiddles with r->remote_ip rather than trying to sanity check it  
across the server.
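
To make the "ignore a bogus address" behaviour concrete, here is a minimal sketch of that kind of check. It is not mod_remoteip's actual code: looks_like_ip() and choose_client_ip() are hypothetical helpers, and the check relies on the POSIX inet_pton() call to decide whether the header value is an IPv4 or IPv6 literal.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>

/* Return 1 if text parses as an IPv4 or IPv6 literal, else 0. */
static int looks_like_ip(const char *text)
{
    unsigned char buf[sizeof(struct in6_addr)];  /* large enough for either family */

    if (text == NULL)
        return 0;
    return inet_pton(AF_INET, text, buf) == 1
        || inet_pton(AF_INET6, text, buf) == 1;
}

/* Use the header value only when it is a valid address;
 * otherwise keep the address of the TCP peer. */
static const char *choose_client_ip(const char *peer_ip,
                                    const char *header_value)
{
    assert(peer_ip != NULL);
    return looks_like_ip(header_value) ? header_value : peer_ip;
}
```

Keeping the validation inside the module that sets the per-request address means the rest of the server can treat the value as already vetted, which is the expectation argued for above.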

> I think there may be some confusion of addresses if mod_remoteip is
> used for CONNECT requests. But I am OK with ignoring that
> complication. It should always be possible to use the connection log
> ids to correlate the different messages.
>
> IMHO, commit to trunk and we can fix the remaining issues there.

Done in r1204968.

For symmetry, should we do the same for local_ip / local_addr? One of  
the things this allows us to do is provide a mechanism for a load  
balancer to reveal the frontend port being used, which helps  
mod_proxy_ajp for example to report the right frontend port to the  
application behind it (something that's always been painful in  
javaland). The patch should be significantly simpler.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Stefan Fritsch <sf...@sfritsch.de>.

On Sunday 20 November 2011, Graham Leggett wrote:
> On 20 Nov 2011, at 1:37 AM, Jeff Trawick wrote:
> > On Sat, Nov 19, 2011 at 2:46 PM, Stefan Fritsch <sf...@sfritsch.de> wrote:
> >> On Saturday 19 November 2011, Graham Leggett wrote:
> >>>> The correction is simple; promote the remote_ip up to the
> >>>> request rec and log/use for authentication that r->remote_ip
> >>>> throughout httpd.  Introduce a wire client logging tag for
> >>>> c->remote_ip.
> >>> 
> >>> This is a lot simpler and cleaner I think, let me come up with
> >>> an alternative patch.
> >> 
> >> I also think this is preferable. The hook approach adds unneeded
> >> complexity and users of mod_remoteip would also need to change
> >> their log formats.
> > 
> > Yeah, only needing to add a special .conf for LB configurations
> > would be nice (i.e., not touching/reconfiguring anything else)
> 
> This is the alternative I've come up with, again needing docs and
> in-principle. A logging option has been attached to log the raw
> IP address. Separately, I've attached a patch for mod_remoteip.
> Thoughts?

Looks reasonable. Some comments:

The error log handler log_remote_address for %a needs to fall back to 
c->remote_ip if r is not specified. Otherwise one would need different 
logformats for per-conn and per-request log messages. Also, I would 
prefer %{r}a and %{c}a to force logging of r->remote_ip and 
c->remote_ip. Then we don't need a new format letter and it would be 
more consistent with the %L and %{c}L errorlog format.
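
The fallback rule suggested in this paragraph can be sketched as a small selection function. This is an illustration only, not the code in log.c or mod_log_config: demo_conn, demo_req and log_remote_address_demo() are hypothetical, and returning NULL for %{r}a when there is no request is an assumption made here for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *remote_ip; } demo_conn;
typedef struct { demo_conn *conn; const char *remote_ip; } demo_req;

/* Resolve the address for an "a" format item.
 * modifier: NULL for plain %a, "c" for %{c}a, "r" for %{r}a.
 * r may be NULL for messages logged outside any request. */
static const char *log_remote_address_demo(const demo_conn *c,
                                           const demo_req *r,
                                           const char *modifier)
{
    assert(c != NULL);
    if (modifier && strcmp(modifier, "c") == 0)
        return c->remote_ip;        /* %{c}a: always the TCP peer */
    if (r)
        return r->remote_ip;        /* %a or %{r}a with a request available */
    if (modifier && strcmp(modifier, "r") == 0)
        return NULL;                /* %{r}a forced, but no request (assumption) */
    return c->remote_ip;            /* plain %a falls back to the connection */
}
```

With this rule a single log format works for both per-connection and per-request messages, since plain %a always yields some address.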

We may also want a CONN_REMOTE_ADDR or PHYS_REMOTE_ADDR variable in 
ap_expr to still allow access to c->remote_addr.

Do we need special handling of the REMOTE_HOST script variable? 
Probably it does not make sense because we can't reliably do DNS 
lookups for addresses received via X-Forwarded-For.

I think there may be some confusion of addresses if mod_remoteip is 
used for CONNECT requests. But I am OK with ignoring that 
complication. It should always be possible to use the connection log 
ids to correlate the different messages.

IMHO, commit to trunk and we can fix the remaining issues there.

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 20 Nov 2011, at 1:37 AM, Jeff Trawick wrote:

> On Sat, Nov 19, 2011 at 2:46 PM, Stefan Fritsch <sf...@sfritsch.de>  
> wrote:
>> On Saturday 19 November 2011, Graham Leggett wrote:
>>>> The correction is simple; promote the remote_ip up to the request
>>>> rec and log/use for authentication that r->remote_ip throughout
>>>> httpd.  Introduce a wire client logging tag for c->remote_ip.
>>>
>>> This is a lot simpler and cleaner I think, let me come up with an
>>> alternative patch.
>>
>> I also think this is preferable. The hook approach adds unneeded
>> complexity and users of mod_remoteip would also need to change their
>> log formats.
>
> Yeah, only needing to add a special .conf for LB configurations would
> be nice (i.e., not touching/reconfiguring anything else)

This is the alternative I've come up with, again needing docs and  
in-principle. A logging option has been attached to log the raw IP  
address. Separately, I've attached a patch for mod_remoteip. Thoughts?

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Jeff Trawick <tr...@gmail.com>.

On Sat, Nov 19, 2011 at 2:46 PM, Stefan Fritsch <sf...@sfritsch.de> wrote:
> On Saturday 19 November 2011, Graham Leggett wrote:
>> > The correction is simple; promote the remote_ip up to the request
>> > rec and log/use for authentication that r->remote_ip throughout
>> > httpd.  Introduce a wire client logging tag for c->remote_ip.
>>
>> This is a lot simpler and cleaner I think, let me come up with an
>> alternative patch.
>
> I also think this is preferable. The hook approach adds unneeded
> complexity and users of mod_remoteip would also need to change their
> log formats.

Yeah, only needing to add a special .conf for LB configurations would
be nice (i.e., not touching/reconfiguring anything else)

Re: Effective IP address / real IP address

Posted by Stefan Fritsch <sf...@sfritsch.de>.

On Saturday 19 November 2011, Graham Leggett wrote:
> > The correction is simple; promote the remote_ip up to the request
> > rec and log/use for authentication that r->remote_ip throughout
> > httpd.  Introduce a wire client logging tag for c->remote_ip.
> 
> This is a lot simpler and cleaner I think, let me come up with an  
> alternative patch.

I also think this is preferable. The hook approach adds unneeded 
complexity and users of mod_remoteip would also need to change their 
log formats.

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 19 Nov 2011, at 4:49 AM, William A. Rowe Jr. wrote:

> Never mind that you failed to be consistent in tag values between
> logging schemas...

Can you confirm in more detail what you're referring to? There is only  
one logging line in the patch, and this was based on existing logging  
lines in the same module. Patches should ideally change just one thing  
at a time, but if there are things to fix we should fix them.

> nothing in this proposal addresses the reason
> that I myself had implemented mod_remoteip, which was authn/authz
> control.  In the limited scenario you have considered, authn is
> pretty much a noop on the physical address (no public client would
> ever be routable to that server anyways) so access control is
> shifted to the consumer of the web resources.
>
> This patch would have a long way to go before being considered...
> and is certainly not a 2.4.x candidate.

The way I approached this was to hunt through the code for instances  
of c->remote_ip and c->remote_addr and address each part of the code  
they touched. In the aaa code I only found reference to them in  
mod_authz_host.c, there was no reference anywhere else. Can you be  
more specific about the authn functionality you believe was missed?

> The very design of mod_remoteip keeps the precious values that
> you are concerned about losing as request notes suitable for
> logging.  But Stefan also points out some flaws in the current
> approach.

Having built a module like this before, I fully appreciate just how  
hard it is to keep the overridden IP address alive long enough to be  
logged given that logging happens in a cleanup, and the module manages  
to do it, but the way it's done now is definitely wrong. We allow  
dangling pointers allocated from a destroyed r->pool to exist in  
c->pool structures in the hope that the next request will clean them  
up, and we hope that nobody tries to read c->remote_ip before we do. We  
also slowly leak from c->pool. I recognise this is done to work around  
restrictions in the existing v2.2 API, but as new code, we should fix  
the API and do this properly - none of this should be released in v2.4  
in its current form.

> The correction is simple; promote the remote_ip up to the request
> rec and log/use for authentication that r->remote_ip throughout
> httpd.  Introduce a wire client logging tag for c->remote_ip.

This is a lot simpler and cleaner I think, let me come up with an  
alternative patch.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.

Never mind that you failed to be consistent in tag values between
logging schemas... nothing in this proposal addresses the reason
that I myself had implemented mod_remoteip, which was authn/authz
control.  In the limited scenario you have considered, authn is
pretty much a noop on the physical address (no public client would
ever be routable to that server anyways) so access control is
shifted to the consumer of the web resources.

This patch would have a long way to go before being considered...
and is certainly not a 2.4.x candidate.

The very design of mod_remoteip keeps the precious values that
you are concerned about losing as request notes suitable for
logging.  But Stefan also points out some flaws in the current
approach.

The correction is simple; promote the remote_ip up to the request
rec and log/use for authentication that r->remote_ip throughout
httpd.  Introduce a wire client logging tag for c->remote_ip.

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 18 Nov 2011, at 4:23 PM, Graham Leggett wrote:

> The lines I was thinking along was that effective_ip was in addition  
> to the remote_ip, rather than instead of. The log format wouldn't  
> change, there would be a new value that would represent the  
> effective IP, in addition to the existing value that represented the  
> real IP.
>
> Existing modules can continue using conn_rec->remote_ip and it will  
> still work the same as before.

Far easier to express this as a patch:

- There is an explicit idea of a "real" ip address (belonging to the  
load balancer) and an "effective" IP address (upstream IP address) at  
the same time, no overloading of one for the other or mixing them up.
- By default, the effective IP is made equal to the real IP, until the  
admin adds a module to change this.
- Adds a hook called ap_effective_ip();
- Addition of an EFFECTIVE_ADDR environment variable in addition to  
REMOTE_ADDR, which remains unchanged;
- Addition of appropriate logging variables, leaving the current ones  
unchanged;
- Addition of "require effective-ip" in addition to the existing  
unchanged "require ip".
- Needs testing and documentation, but you get the idea.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Graham Leggett <mi...@sharp.fm>.

On 18 Nov 2011, at 4:05 PM, Jeff Trawick wrote:

> A. modules keep using conn_rec, core keeps track of TCP peer for
> logging, post-read-request or some other per-request hook used to set
> effective client in conn_rec
>
> ugly updates to conn_rec by some module; client is really per-request
> in some configurations
>
> B. modules switch to using new ap_get_client() API, something
> different is used to log the client
>
> inconsistencies among get-client mechanism used by different unbundled
> modules could be confusing, log format would change
>
> --/--
>
> either way, a module that thinks it knows the client across different
> requests is hosed

What I had in mind was that effective_ip would be in addition  
to remote_ip, rather than replacing it. The log format wouldn't  
change; there would be a new value representing the effective  
IP, alongside the existing value representing the real IP.

Existing modules can continue using conn_rec->remote_ip and it will  
still work the same as before.

The advantage of the hook approach is that people whose servers don't  
use authz won't ever see the hook being called, so the functionality  
is no-cost for those who don't need it.

> besides the ugliness of updating conn_rec, are there known functional
> drawbacks of the existing mechanism, assuming that the module which
> sets the client also sets a note to allow logging of the TCP peer if
> desired?

Looking deeper into mod_remoteip, most specifically  
remoteip_modify_connection(), what I see is that we seem to be leaking  
memory from c->pool on each request, which on a server serving  
millions of requests an hour will start to add up.

In addition, it looks like we assign memory allocated from r->pool  
into the structure attached to c->pool without having registered any  
cleanups to reverse this when r->pool is destroyed.

I think we need to look closer at this.

Regards,
Graham
--

Re: Effective IP address / real IP address

Posted by Jeff Trawick <tr...@gmail.com>.

On Fri, Nov 18, 2011 at 8:24 AM, Graham Leggett <mi...@sharp.fm> wrote:
> Hi all,
>
> Right now, we only keep track of the real IP address of the incoming
> connection within conn_rec, and with a simple webserver that's fine.
>
> In a world containing load balancers, we now have the real IP address (the
> load balancer) and the effective IP address (the IP that connected to the
> load balancer) for the request. And in restful service architectures, you
> might have requests that have passed through a few load balancers on their
> way, making the "effective IP address" even more murky.
>
> Right now, modules that handle this attempt to overwrite the contents of
> conn_rec, which is really ugly - requests shouldn't be fiddling with the
> parent connection.
>
> Ideally, what I'd like is a way for httpd to keep track of both the real IP
> address (the one in conn_rec) and an optional effective IP address, and use
> each appropriately. It will then be up to module authors to write modules to
> set the effective IP address as their needs dictate.
>
> Most specifically, what I have in mind is this:
>
> - Add a hook ap_get_effective_ip() (or similar).
> - With a default APR_HOOK_LAST implementation that just returns the IP from
> conn_rec.
> - Update the authz modules to use this hook to get the IP instead of reading
> conn_rec directly.
> - Add the ability to log the effective IP address in addition to the
> existing real IP address to the logging code.
>
> This should in theory be really simple to implement, and opens the door for
> future people to choose an effective IP as they see fit.
>
> Sensible?

A. modules keep using conn_rec, core keeps track of TCP peer for
logging, post-read-request or some other per-request hook used to set
effective client in conn_rec

ugly updates to conn_rec by some module; client is really per-request
in some configurations

B. modules switch to using new ap_get_client() API, something
different is used to log the client

inconsistencies among get-client mechanism used by different unbundled
modules could be confusing, log format would change

--/--

either way, a module that thinks it knows the client across different
requests is hosed

besides the ugliness of updating conn_rec, are there known functional
drawbacks of the existing mechanism, assuming that the module which
sets the client also sets a note to allow logging of the TCP peer if
desired?