Posted to dev@httpd.apache.org by Dean Gaudet <dg...@hotwired.com> on 1996/06/30 10:11:51 UTC

syntax questions

I'm in the process of merging our code to 1.1bwhatever.  I'm looking for
input for syntax for some of the changes.  I didn't pay enough attention
when all the 1.1 <VirtualHost> and Host:-header stuff was going around
on this list.

First of all, my additions to 1.0 syntax:

    (1) <VirtualHost 1.1.1.1 2.2.2.2 ...> ... </VirtualHost>
	defines a virtual host that responds to any of the listed addresses

    (2) <VirtualHost multihomed.domain.com> ... </VirtualHost>
	defines a virtual host that responds to any of the addresses resulting
	from a lookup of "multihomed.domain.com"

    (3) <VirtualHost 0.0.0.0> ... </VirtualHost>
	defines the "default" virtual host -- the host that answers when
	no others match.

(Mix the above as desired.)
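To make the three forms concrete, here is a sketch of a config that mixes them (addresses, names, and paths are invented for illustration, not taken from any real setup):

```apache
# (1) one vhost answering on several listed addresses
<VirtualHost 10.1.1.1 10.2.2.2>
ServerName www.example.com
DocumentRoot /www/example
</VirtualHost>

# (2) one vhost answering on every address that
#     multihomed.domain.com resolves to
<VirtualHost multihomed.domain.com>
ServerName multihomed.domain.com
DocumentRoot /www/multihomed
</VirtualHost>

# (3) the "default" vhost -- answers for any address not matched
#     above, e.g. to redirect all unlisted addresses somewhere safe
<VirtualHost 0.0.0.0>
Redirect / http://www.hotwired.com/
</VirtualHost>
```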

Some motivation.  The need to deal with multihomed machines is an obvious
reason that (2), and to a similar extent (1) is nice.  I use (1) to make
all my webserver config files identical.  (All www.hotwired.com machines
run the same config file.)  This has obvious maintenance benefits... and
can really help when one of the servers fails.  All of the "web serving"
addresses on my servers are aliases on the loopback interface.  If one
of the servers goes down, some routing magic directs the hits to a
live server... it requires each server to be able to serve any of the
addresses listed in the DNS.

(3) comes about partially because I'm paranoid, and partially because
of how I do that hit "stealing" for down machines.  The paranoid part
is easy to explain:  I don't want someone to be able to find an address
on my machines which I don't list in my config file and through that
get at stuff I don't want them to see.  Essentially I wanted to issue
a redirect for all these "unlisted" addresses to www.hotwired.com.
Is there another way to do this already in 1.0 or 1.1 syntax?

Sooo... now that 1.1 syntax is around and Host:-header parsing is in,
it looks like I need to rethink at least (2) above.  But I'm open
to suggestions for redoing the entire syntax, as long as I get the
functionality I need.  I can hack around it with m4 macros, but I'd
really rather keep the number of server_recs down.

I'll make the patch available as time permits.  (I actually sent a patch
for this against 1.0, but it got lost in the shuffle as I was too busy.)

Dean

P.S.  It's obvious that to support hundreds to thousands of busy virtual
hosts we need to use hashing... but non-ip virtualhosts with ServerAliases
pose some interesting hashing challenges.  You'd pretty much have to
hash-and-cache on the fly or something.
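The "hash-and-cache on the fly" idea can be sketched as follows (in Python rather than Apache's C, purely for illustration; the wildcard handling is an assumption about why ServerAliases defeat a precomputed hash): exact names go straight into a hash table, wildcard aliases are scanned linearly, and each wildcard hit is cached under the concrete hostname so repeat requests become plain hash lookups.

```python
import fnmatch

class VHostTable:
    """Sketch of an on-the-fly hashed vhost lookup.

    Exact ServerNames hash directly; wildcard ServerAliases are scanned
    linearly in config order, and each hit is cached under the concrete
    hostname so the next request for it is a single hash lookup.
    """

    def __init__(self):
        self.exact = {}      # hostname -> vhost config
        self.wildcards = []  # (pattern, vhost config), in config order

    def add(self, name, vhost):
        if "*" in name or "?" in name:
            self.wildcards.append((name, vhost))
        else:
            self.exact[name] = vhost

    def lookup(self, host):
        vhost = self.exact.get(host)
        if vhost is None:
            for pattern, candidate in self.wildcards:
                if fnmatch.fnmatch(host, pattern):
                    vhost = candidate
                    break
            if vhost is not None:
                self.exact[host] = vhost  # cache on the fly
        return vhost
```

The cache grows with the set of distinct hostnames actually requested, which is the trade-off Dean hints at: you pay the linear scan only once per concrete name.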

Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
In article <ho...@fully.organic.com>,
Brian Behlendorf  <ne...@hyperreal.com> wrote:
>Okay, I think I've added a pretty approachable and understandable example
>to the docs; check out http://www.apache.org/docs/host.html.  If that's
>alright then virtual-host.html should probably also be updated.

Yep that explains the name-vhost access to ip-vhost problem quite well.

>> D> Now, competitor changes their DNS so that www.fakename.com maps to
>> D> www.victim.com's address.  Take a boo at virtualhost_section() and
>> D> you'll see that it places www.fakename.com BEFORE www.victim.com in the
>> D> virtual host searchlist.  Bingo, any hit to www.victim.com's address will
>> D> be served out of the www.fakename.com configuration.  
>> 
>> B>*Well*, any hit which doesn't contain the Host: header.  If everyone used
>> B>the Host: header, how would www.fakename.com get a hit from
>> B>www.victim.com?
>> 
>> My example was with ip-vhosts, and is a hole in 1.0 and 1.1 (1.1 doesn't
>> care what Host: crap says on non-"main server address" vhosts -- it does
>> respect http:// though).  www.fakename.com is set to the same IP as
>> www.victim.com for the duration of the attack, and both are ip-vhosts.  The
>> order of the definitions in the config file lets www.fakename.com steal all
>> the traffic for www.victim.com.
>
>Ah, okay, I understand now.  So there is one bug here we should write
>down and clarify and make sure we fix:
>
>  If we're going to allow Host: headers to be used for ip-vhosts against 
>  the main server address, then we should be consistent and allow such
>  Host:  header functionality on the other ip addresses too.
>
>The ordering problem is no longer a problem if the above bug is fixed.

Hrm... yeah this would avoid most of the problems because the ip-vhosts
www.victim.com and www.fakename.com would fall under name-vhost rules.
It still is a (milder) denial of service because not all browsers
support name-vhosts yet.  The bug would still exist in 1.0, but we're
not committed to supporting that.

There's still the issue of explaining the problems associated with
putting DNS names in <virtualhost> directives.  I'll try to write the
DNS issues down clearly (and/or add patches) in a few weeks after I'm
done with a load of work.

>Actually, I was thinking about this the other day, when I was thinking
>about how crappy our config file syntax is sometimes. :)  We have just
>started using m4 here, thanks to Alexei.

m4 is nice.  I've been using it for several months to generate
<virtualhosts> for a half dozen similar but somewhat different servers
(at different stages of our production).
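For readers who haven't seen the trick, a minimal m4 sketch of generating <virtualhost> blocks (the macro name and paths are invented here; this is not Dean's or Alexei's actual setup):

```m4
dnl VHOST(address, name) expands to a full <VirtualHost> block;
dnl feed this file through `m4` to produce the server config.
define(`VHOST', `<VirtualHost $1>
ServerName $2
DocumentRoot /www/$2
</VirtualHost>')dnl
VHOST(10.1.1.1, www.example.com)
VHOST(10.1.1.2, www.example.org)
```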

My largest beef with the config file format is that the directory
merge/override behaviour is not intuitive.  For example I'd like to
be able to restrict access to an entire server by ip... and in some
cases I need to further restrict some subdirectories.  The only way
to do this now is to specify the restrictions in each <Directory>.

I'd like to be able to specify:

<Directory />
allow from 10.0.0.
deny from all
</Directory>

and have it die immediately if it's not from 10.0.0. instead of going
further down in the tree to see if it's really allowed somewhere
lower down.

But I can't imagine a nice syntax that allows us to both restrict
lower parts of the tree from overriding configuration options, and
allow them to override others.  At one extreme you could control all
access through perl routines that get passed all the relevant
structures and decide yay or nay on override/merge issues.  But that's
a little overkill.

Maybe it's good enough to define "soft" and "hard" directory
configurations.  Soft auth stuff can be overridden by later auth
directives.  Hard stuff causes immediate failure.
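A hypothetical rendering of that idea (the `hard` keyword is invented purely to illustrate; no such syntax exists):

```apache
# hypothetical syntax: "hard" restrictions fail immediately and may
# not be overridden by <Directory> or .htaccess sections lower down
<Directory / hard>
order deny,allow
deny from all
allow from 10.0.0.
</Directory>
```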

Dean

Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
In article <ho...@tees>,
Paul Richards  <ne...@hyperreal.com> wrote:
>A ttl of 1/2 day forces queries to hit the server at least twice a day. This
>is redundant since the secondary will not pick up changes more than once
>a day so queries that hit the secondary will always get the same response
>within a 24 hour period so the client may as well be allowed to cache the
>data for that long. I think your DNS is somewhat mis-configured in this
>regard.

The 1/2 day timeout is from before I discovered that Sprint won't respect
any refresh less than 1 day... and I suppose I could change it now, but
it hardly seems worth it when my (external) secondaries are changing in
a month anyhow.  Half day timeouts are long compared to many other
zones I was looking at... so I'm actually being nicer to folks than
many :)

>I assume when you talk about zone timeout you're talking about the
>refresh value?

Yup, was saying it that way for the folks that don't have to deal with
DNS day in and out.

Dean

Re: security holes and other fun stuff

Posted by Paul Richards <p....@elsevier.co.uk>.
Dean Gaudet writes:
 > In article <ho...@tees>,
 > Paul Richards  <ne...@hyperreal.com> wrote:
 > >Incidentally, your DNS settings are a little odd. Having a ttl of 1/2
 > >day when your zone transfers only happen daily seems redundant to me
 > >since you can't update your tables more than once a day so why force
 > >clients to expire the cache twice a day.
 > 
 > Well the time an old record should live is the zone timeout plus the
 > record timeout, so I'm at a 1.5 day theoretical record change time now.

How often do you make changes to a record? A 1/2 day ttl seems very low for
a "running" server.

A ttl of 1/2 day forces queries to hit the server at least twice a day. This
is redundant since the secondary will not pick up changes more than once
a day so queries that hit the secondary will always get the same response
within a 24 hour period so the client may as well be allowed to cache the
data for that long. I think your DNS is somewhat mis-configured in this
regard.

If you did make a change on the primary then clients will pick up that
change if they hit the primary within 12 hours but if they access the
secondary they'll get the out of date record since the secondary only
refreshes every 24 hours. The ttl should never be less than the refresh
value otherwise clients timeout more often than the secondaries refresh
and may then get old data from the secondaries. This is one type of
misconfiguration that can cause the problems you raised originally with
clients still trying to connect to old addresses. Not likely in your
case since your values are so low.

Of course, care when updating DNS (say by pushing the data by hand to
the secondaries) can overcome these problems but that makes the whole
thing redundant, you may as well run with no refresh and infinite
expire.
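The worst case behind this reasoning is simply refresh + ttl: the secondary can lag the primary by up to the refresh interval, and a client that cached the record from the secondary just before it refreshed holds the old data for one more ttl. A small sketch of the arithmetic, using the numbers from this thread:

```python
def worst_case_staleness(refresh_h, ttl_h):
    """Hours after a change on the primary during which some client may
    still hold the old record: the secondary lags by up to refresh_h,
    and a client that cached from it just beforehand adds ttl_h."""
    return refresh_h + ttl_h

# Dean's zone: 1 day refresh, 1/2 day ttl -> the "1.5 day theoretical
# record change time" he mentions elsewhere in the thread
print(worst_case_staleness(24, 12) / 24)  # -> 1.5
```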

I assume when you talk about zone timeout you're talking about the
refresh value?



Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
In article <ho...@tees>,
Paul Richards  <ne...@hyperreal.com> wrote:
>Incidentally, your DNS settings are a little odd. Having a ttl of 1/2
>day when your zone transfers only happen daily seems redundant to me
>since you can't update your tables more than once a day so why force
>clients to expire the cache twice a day.

Well the time an old record should live is the zone timeout plus the
record timeout, so I'm at a 1.5 day theoretical record change time now.
The difference between that and 2 days is minimal, so yeah I could
change it.  Incidentally, the 1 day zone timeout is for my external
secondaries -- sprintlink enforces a minimum of at least a 1 day zone
timeout.  My internal secondaries are force fed more regularly (in fact,
after each change, although I'm not using the bind update stuff yet).

> > This argument is pointless.  I really think there are security flaws with
> > Apache's advocacy of DNS in configuration files.  If the Apache Group
> > doesn't care about it (I've not received one response, even privately,
> > which supports my view) then I'll shut up.  It's not just an Apache
> > issue -- I'm certain that if we were to investigate Netscape or NCSA or
> > other servers that allow DNS in their configuration files then we'd find
> > the same problems.
>
>Some of the points you made I understood and was concerned about but I
>had trouble following your argument so I'm not entirely clear on what
>exactly your worries are regarding the DNS issues. I'm still
>rereading some of them.

You'll have to excuse me... I've been running on way too little ZZZzzz
lately, to the point I'm having a hard time focusing my eyes.  So, um,
I was probably a little stressed at something else when I replied.  My
apologies.

Dean

Re: security holes and other fun stuff

Posted by Paul Richards <p....@elsevier.co.uk>.
Dean Gaudet writes:
 > In article <ho...@originat.demon.co.uk>,
 > >Not true. 
 > 
 > What's not true?  That the API is working?  Gee the IRIX, BSDI, Solaris,

That netscape are closer to being able to do the right thing.

 > Ahh so now you're saying that HTTP is the culprit and it's bad and
 > everything because it opens a new TCP connection for each request.
 > So your solution to this problem is that a browser should do DNS on
 > every hit... or they have to write their own resolver routines (which

I don't understand the concerns of that last paragraph. Bind does all
this for you and it's a library so it's linked into the app anyway.  My
gripe with Netscape is that their DNS code circumvents functionality I have
>in my underlying OS and I can't control its behaviour. It has nothing to
do with passing back the timeout to the application. At some point you
have to call some function or another to get the current ip address for the
fqdn and that function has to check the ttl and decide whether to contact
the server or not. If you set up a caching server on your box then bind does
all this perfectly well, if you're not running a caching server then the
correct behaviour *IS* to contact a server. The Netscape solution isn't
anything more clever than this, they've just embedded all this DNS
functionality into the application itself.

 > No they won't drop.  Not unless you actually drop the old address.
 > Generally I do this (IRIX syntax):
 > 
 >     ifconfig ec0 NEW.A.B.C; ifconfig ec0 alias OLD.X.Y.Z

Yeah well, OK, you have to actually drop the old address, what you're doing
above is adding another not changing the old one :-)

 > I don't deny that it can't happen with "working" code.  I'm telling you
 > that working code is an ideal.  We're living in the real world, where
 > broken code is the norm.
 > 
 > Frankly I don't care about 4 accesses per day 1 month after a renumbering.
 > But I do care about the one to two week range where I saw hundreds to
 > thousands of accesses despite the fact that the DNS had fully propagated.

Well this is odd but I'd bet it's more likely configuration errors rather
than broken code. If others have seen this I'd be interested in looking
into the problem some more since it might change my mind about the way
I do some things.

Incidentally, your DNS settings are a little odd. Having a ttl of 1/2
day when your zone transfers only happen daily seems redundant to me
since you can't update your tables more than once a day so why force
clients to expire the cache twice a day.

 > This argument is pointless.  I really think there are security flaws with
 > Apache's advocacy of DNS in configuration files.  If the Apache Group
 > doesn't care about it (I've not received one response, even privately,
 > which supports my view) then I'll shut up.  It's not just an Apache
 > issue -- I'm certain that if we were to investigate Netscape or NCSA or
 > other servers that allow DNS in their configuration files then we'd find
 > the same problems.

Some of the points you made I understood and was concerned about but I
had trouble following your argument so I'm not entirely clear on what
exactly your worries are regarding the DNS issues. I'm still
rereading some of them.

Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
In article <ho...@originat.demon.co.uk>,
>Not true. 

What's not true?  That the API is working?  Gee the IRIX, BSDI, Solaris,
and SunOS4 man pages must all be broken.  The API has no timeout.
It is impossible to implement applications that respect the timeout
using the API.  There's also no mention in the man pages (I'm not about
to go dig any further for RFCs or the posix spec) about the timeout.
Hence applications programmers are doing just what they're supposed to:
calling the routine and using the result.
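Dean's point can be restated as a sketch (Python rather than the C resolver API, purely for illustration): because the lookup call hands back only addresses and never the record's TTL, any cache the application builds on top of it has to invent an expiry policy, and "cache forever" is the degenerate choice.

```python
import time

def gethostbyname_like(resolver, name):
    # Mirrors the classic API shape: the caller gets an address back,
    # never the TTL -- even though the resolver knew it.
    addr, _ttl = resolver(name)
    return addr

class AppCache:
    """An application-level cache over a TTL-less lookup: it cannot
    honour the real TTL, so it must pick an arbitrary lifetime (or keep
    entries forever, the worst case described in this thread)."""

    def __init__(self, resolver, lifetime=float("inf")):
        self.resolver = resolver
        self.lifetime = lifetime
        self.entries = {}  # name -> (address, time cached)

    def lookup(self, name, now=None):
        now = time.time() if now is None else now
        hit = self.entries.get(name)
        if hit is None or now - hit[1] > self.lifetime:
            self.entries[name] = (gethostbyname_like(self.resolver, name), now)
        return self.entries[name][0]
```

With the cache-forever default, a renumbered host keeps resolving to its old address indefinitely, which matches the stale traffic Dean describes.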

>This isn't how DNS works at all. An application only has to do a DNS lookup
>once, to map the domain name to an ip address. Once it's opened a connection
>then there's no need to check that the DNS map has changed. 

Ahh so now you're saying that HTTP is the culprit and it's bad and
everything because it opens a new TCP connection for each request.
So your solution to this problem is that a browser should do DNS on
every hit... or they have to write their own resolver routines (which
you've already said netscape shouldn't have done -- but they had to do
it to get multithreaded DNS).

>If you change an ip address on a box then all existing connections will
>drop anyway and at the next DNS lookup the new address will appear, there's
>absolutely no need for any API to pass back timeout values, they're
>not relevant to applications.

No they won't drop.  Not unless you actually drop the old address.
Generally I do this (IRIX syntax):

    ifconfig ec0 NEW.A.B.C; ifconfig ec0 alias OLD.X.Y.Z

>> A month after renumbering bianca.com I saw 4 requests per day to the old
>
>It just can't happen with working code.

I don't deny that it can't happen with "working" code.  I'm telling you
that working code is an ideal.  We're living in the real world, where
broken code is the norm.

Frankly I don't care about 4 accesses per day 1 month after a renumbering.
But I do care about the one to two week range where I saw hundreds to
thousands of accesses despite the fact that the DNS had fully propagated.

>> The length of that time is what's at issue here.  There's like 0 chance
>> you'll convince J. Random Admin to play with the magic values in their
>> zone's SOA.  The number of people that understand that the 2nd through
>> 4th values refer to the zone as a whole and not the individual records
>> is pretty small.  I know how to play with the values to attempt to renumber
>
>If the admin doesn't have a full understanding of DNS then the whole issue
>is irrelevant.

I wasn't clear enough, sorry.  Here's what I meant to say:

    The length of that time is what's at issue here.  There's like 0
    chance an ISP will convince J. Random Customer to play with the magic
    values in their zone's SOA.  The number of people that understand
    that the 2nd through 4th values refer to the zone as a whole and
    not the individual records is pretty small.  I know how to play with
    the values to attempt to renumber

The ISP clearly has to understand DNS.

This argument is pointless.  I really think there are security flaws with
Apache's advocacy of DNS in configuration files.  If the Apache Group
doesn't care about it (I've not received one response, even privately,
which supports my view) then I'll shut up.  It's not just an Apache
issue -- I'm certain that if we were to investigate Netscape or NCSA or
other servers that allow DNS in their configuration files then we'd find
the same problems.

Dean

Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
In article <ho...@cadair.elsevier.co.uk>,
Paul Richards  <ne...@hyperreal.com> wrote:
>The time it takes to change a DNS entry is dependent *entirely* on the timeout
>value. Anything doing DNS lookups that caches an entry for longer than that
>time is *FUNDAMENTALLY* broken.

Any DNS server that caches an entry for longer than that time is
fundamentally broken.  Take a look at the API though: gethostbyname and
gethostbyaddr DO NOT RETURN THE TIMEOUT.  Hence it is impossible for an
application "doing the right thing" by using the API to actually do
the right thing.  Netscape is actually closer to being able to do the
right thing by writing their own API.

Netscape is broken.  Apache (configured with DNS) is broken.  INN is
broken (the FAQ mentions this and gives a "fix").  Hell telnet, rlogin,
any news reader, they're all broken because they don't continually 
look up the names they're using.

A month after renumbering bianca.com I saw 4 requests per day to the old
address.  A month after renumbering www.suck.com I saw 3 requests per day.
I have two week zone timeouts, with 1 day updates, and 1/2 day ttls.
In two days all my secondaries should have updated (they had).  In three
days everyone should have been going to the new address.

Theory and practice don't meet.  I studied mathematics, I'm totally
into theory.  I'm just giving you a practical view from someone who
has actually gone through the effort of renumbering a bunch of servers.
Maybe Cliff can add something here -- he's gone through this as well.

Regardless of this fact of life, my point is still valid.  You always have
to run with two config entries long enough to let the old record time out.
Brian was claiming that there was some magic method of running with one
that required little co-ordination.

The length of that time is what's at issue here.  There's like 0 chance
you'll convince J. Random Admin to play with the magic values in their
zone's SOA.  The number of people that understand that the 2nd through
4th values refer to the zone as a whole and not the individual records
is pretty small.  I know how to play with the values to attempt to renumber

Dean

Re: security holes and other fun stuff

Posted by Paul Richards <p....@elsevier.co.uk>.
Dean Gaudet writes:

 > You can't change DNS in less than 2 weeks.  It doesn't matter what any of
 > your timeouts are set to -- you'll still see traffic on the old addresses
 > after two weeks.  I've done this dozens of times and seen it in action.  I
 > don't think that it's all broken DNS clients either -- netscape never looks
 > up anything twice, and many people leave their netscape (and everything
 > else) running when they "leave" work.  My netscape has been going for 3
 > days now, so I won't see any DNS changes until I restart it (i.e.  'cause
 > it crashed ;).  So... no matter what you do, you have to use two config
 > file entries for a while or you lose traffic.

The time it takes to change a DNS entry is dependent *entirely* on the timeout
value. Anything doing DNS lookups that caches an entry for longer than that
time is *FUNDAMENTALLY* broken.

That's the whole point of the thing, say your timeout is a week, when
you want to make a change you drop the timeout to say a day, wait a week,
make the change, wait a day, then bump the timeout back up to a week again.

After the first week everyone should be checking the primary every day at
which point it'll take no more than a day for the change to propagate.
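Paul's procedure can be written out as a timeline (a sketch using the numbers from his example: a one-week TTL lowered to one day around the change):

```python
def renumber_schedule(old_ttl_days=7, low_ttl_days=1):
    """Timeline for the TTL-lowering procedure: drop the TTL, wait out
    the old TTL so every cache has picked up the low one, change the
    record, then wait out the low TTL before restoring."""
    t = 0
    steps = [(t, f"drop TTL from {old_ttl_days}d to {low_ttl_days}d")]
    t += old_ttl_days  # old long-lived cached records expire
    steps.append((t, "change the address"))
    t += low_ttl_days  # last low-TTL caches expire
    steps.append((t, f"restore TTL to {old_ttl_days}d; change propagated"))
    return steps
```

By day 8 every cache honouring the TTL has the new address, which is exactly the "no more than a day to propagate" claim above; clients that ignore the TTL entirely are the case Dean keeps pointing at.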

Netscape may well be totally broken because I don't suppose they expected
people to leave it running permanently. Even so, it's totally
broken anyway if it doesn't honour the timeout and I've always felt it's
totally broken for trying to implement DNS internally in the first place.

Re: security holes and other fun stuff

Posted by Brian Behlendorf <br...@organic.com>.
Sorry for the delay in responding, I've been out of touch for the last
couple of days.  I'm also going to be out of touch until Thursday after
today....

On 15 Jul 1996, Dean Gaudet wrote:
> B = Brian, D = Dean
> 
> B>I can't believe that if you're clued enough to use private network
> B>addresses, you wouldn't also know that vhosts can be requested via the
> B>Host: header.
> 
> Because apache 1.0 didn't do this and it's not documented as a "feature"
> of upgrading?  The docs say [paraphrased] "if you configure it using
> CNAMEs to the main server address it'll use the Host: header"  they don't
> say "the server always looks at the Host: header on the main server
> address and dispatches to any virtualhost, regardless of whether that
> host is served by the main server address".

Okay, I think I've added a pretty approachable and understandable example
to the docs; check out http://www.apache.org/docs/host.html.  If that's
alright then virtual-host.html should probably also be updated.

> D> Now, competitor changes their DNS so that www.fakename.com maps to
> D> www.victim.com's address.  Take a boo at virtualhost_section() and
> D> you'll see that it places www.fakename.com BEFORE www.victim.com in the
> D> virtual host searchlist.  Bingo, any hit to www.victim.com's address will
> D> be served out of the www.fakename.com configuration.  
> 
> B>*Well*, any hit which doesn't contain the Host: header.  If everyone used
> B>the Host: header, how would www.fakename.com get a hit from
> B>www.victim.com?
> 
> My example was with ip-vhosts, and is a hole in 1.0 and 1.1 (1.1 doesn't
> care what Host: crap says on non-"main server address" vhosts -- it does
> respect http:// though).  www.fakename.com is set to the same IP as
> www.victim.com for the duration of the attack, and both are ip-vhosts.  The
> order of the definitions in the config file lets www.fakename.com steal all
> the traffic for www.victim.com.

Ah, okay, I understand now.  So there is one bug here we should write
down and clarify and make sure we fix:

  If we're going to allow Host: headers to be used for ip-vhosts against 
  the main server address, then we should be consistent and allow such
  Host:  header functionality on the other ip addresses too.

The ordering problem is no longer a problem if the above bug is fixed.

> There's nothing a little perl won't handle either.  We're not 100% locked
> into a particular config file syntax... we're only, say, 50% locked in.
> How many people use m4 or other macro sets to generate their configs?  I'd
> wager it's few.  The rest have plain configs that can be munched into a new
> format.  Not that we'd ever go that way, I would expect -1 on this fast.

Actually, I was thinking about this the other day, when I was thinking
about how crappy our config file syntax is sometimes. :)  We have just
started using m4 here, thanks to Alexei.  If we had a need we could
justify, I think asking people to modify their config files for Apache 2.0
(aided with perl, perhaps) would not be an impossible political feat.  But
we better have a good reason :)

And after thinking about your issues for awhile, I think the best solution
to your concerns and problems would be a directive like "NoBleedVHosts",
which means that Host: header requests would only be answered on those IP
addresses to which they were assigned.  That seems to basically be the
security issue here.
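No directive by this name ever existed; a hypothetical sketch of what the proposal might look like in a config (directive, address, and name are all invented):

```apache
# hypothetical: with this set, a request carrying
# "Host: www.private.example" is honoured only when it arrives on
# 10.1.1.1, never on the main server address or other vhost addresses
NoBleedVHosts on

<VirtualHost 10.1.1.1>
ServerName www.private.example
DocumentRoot /www/private
</VirtualHost>
```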

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  www.apache.org  hyperreal.com  http://www.organic.com/JOBS


Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
B = Brian, D = Dean

B>I can't believe that if you're clued enough to use private network
B>addresses, you wouldn't also know that vhosts can be requested via the
B>Host: header.

Because apache 1.0 didn't do this and it's not documented as a "feature"
of upgrading?  The docs say [paraphrased] "if you configure it using
CNAMEs to the main server address it'll use the Host: header"  they don't
say "the server always looks at the Host: header on the main server
address and dispatches to any virtualhost, regardless of whether that
host is served by the main server address".

D> Now, competitor changes their DNS so that www.fakename.com maps to
D> www.victim.com's address.  Take a boo at virtualhost_section() and
D> you'll see that it places www.fakename.com BEFORE www.victim.com in the
D> virtual host searchlist.  Bingo, any hit to www.victim.com's address will
D> be served out of the www.fakename.com configuration.  

B>*Well*, any hit which doesn't contain the Host: header.  If everyone used
B>the Host: header, how would www.fakename.com get a hit from
B>www.victim.com?

My example was with ip-vhosts, and is a hole in 1.0 and 1.1 (1.1 doesn't
care what Host: crap says on non-"main server address" vhosts -- it does
respect http:// though).  www.fakename.com is set to the same IP as
www.victim.com for the duration of the attack, and both are ip-vhosts.  The
order of the definitions in the config file lets www.fakename.com steal all
the traffic for www.victim.com.

D> The ISP probably
D> won't notice ('cause of the thing Cliff reported and I said was hard to
D> figure out in general with http/1.1 to consider).
B>This is another valid instance where a "No1.1VHostSupport" could be
B>used, so long as concern about supporting Host-less clients was important.
B>Secondly, I'd support making the chain created by virtualhost_section() go
B>in the opposite order - should be easy to do, no?  Hmm, perhaps we could
B>also have a "EnforceHTTP1.1" directive which rejects requests without a
B>full URI or Host: header.

Searching in the opposite order might be more intuitive... but doesn't
eliminate the exploit I gave.  EnforceHTTP1.1 would eliminate the
exploit.

D> - can only do HTTP/1.1 stuff with the "main" server address
B>
B>Eh?  I can do a telnet hyperreal.com 80 and say "GET
B>http://www.apache.org/" and it works fine...

hyperreal.com is the "main server address" of that machine... the address
returned by gethostname().  As I said, the docs aren't clear.

D> Best needs to know about the change anyhow -- since they'll want to know
D> that you've freed up the number.  It isn't hard for them to move your
D> vhost to another spot in the config file.
B>
B>But it requires exact timing (or at least, coordination of two events
B>within close proximity - adding a second entry to their config files for
B>you, and then after your switch removing your old entry).  Yes, eventually
B>I tell them the old IP address is gone, but only after I'm sure no one's
B>using it.

You can't change DNS in less than 2 weeks.  It doesn't matter what any of
your timeouts are set to -- you'll still see traffic on the old addresses
after two weeks.  I've done this dozens of times and seen it in action.  I
don't think that it's all broken DNS clients either -- netscape never looks
up anything twice, and many people leave their netscape (and everything
else) running when they "leave" work.  My netscape has been going for 3
days now, so I won't see any DNS changes until I restart it (i.e.  'cause
it crashed ;).  So... no matter what you do, you have to use two config
file entries for a while or you lose traffic.

D> [<Map> syntax]
B>
B>Hmm... personally, I'm not convinced.  The above is not necessarily more
B>readable or intuitive or less error-prone than the existing config files -
B>that's for me, and I admit I may be biased by being too close to the
B>problem.  I won't even mention the political cost of asking people to
B>totally change config files.  And while us computer scientists like adding
B>layers of indirection, it might be somewhat confusing for the average
B>webmaster. :)

Oh they don't have to convert, because the <Map> stuff can be generated
implicitly.  You're entirely right, the existing config files can be used
to generate hash tables and everything -- that is, AFTER we specify an
ordering for them explicitly (presently it's reverse order specified in the
config file).  The <Map> stuff is more how I think the mapping from request
-> virtualhost should be made... and given enough primitives the present
config can support it as well.

There's nothing a little perl won't handle either.  We're not 100% locked
into a particular config file syntax... we're only, say, 50% locked in.
How many people use m4 or other macro sets to generate their configs?  I'd
wager it's few.  The rest have plain configs that can be munched into a new
format.  Not that we'd ever go that way, I would expect -1 on this fast.

Dean

P.S. There is a DNS exploit involving multihoming that can't occur right
now because we don't allow multihomed addresses in <VirtualHost>
statements... but that's exactly what I want to add and why I started to
dig through this code.

Re: security holes and other fun stuff

Posted by Brian Behlendorf <br...@organic.com>.
On 15 Jul 1996, Dean Gaudet wrote:
> In article <ho...@fully.organic.com>,
> Brian Behlendorf  <ne...@hyperreal.com> wrote:
> >On 13 Jul 1996, Dean Gaudet wrote:
> >> While thinking on my VirtualHost syntax problem, and Cliff's request
> >> I think I've discovered several configuration-related security (and
> >> reliability) problems, one of which bypasses access controls.
> >> 
> >> First, a serious one.  Suppose the same webserver has an internal vhost,
> >> an external vhost, 
> >
> >(I'm presuming here you mean, two IP numbers, say X and Y?)
> 
> Actually I've tried to exploit it now that I've had sleep, and
> I can.  Suppose that you have an ip-vhost vhost.foobar.com, and
> that phys.foobar.com is the machine's physical address (i.e. the
> "main server address" in our documentation on vhosts), and doesn't
> appear in any <VirtualHost> statement.  Then sending a request with
> "Host: vhost.foobar.com" or "GET http://vhost.foobar.com/ HTTP/1.1"
> to the physical address will cause the webserver to serve data from
> vhost.foobar.com instead of phys.foobar.com.

As it should, in my opinion, and probably also in the opinion of other
server implementors on http-wg.  What it seems you object to is the
configurability of this presumption.  I would probably support the
creation of an obscure config directive like "No1.1VHostSupport" or
something to specifically disable this feature.

> Essentially what is happening right now is that a request can come into
> the "main" address(es) and can then be mapped to any vhost, regardless
> of whether that vhost is a name-vhost or an ip-vhost.  This violates
> the Principle of Least Astonishment I think.

I feel exactly the opposite, but I might be too "close" to the issue to
judge fairly.  

> You want a real world example of where this might be a problem?  Suppose
> you've created some forms-based mgmt software for your webserver, and
> suppose you don't trust Apache's access control (I don't -- I can get
> into this later -- but any good admin doesn't trust any access control and
> piles it all on).  Then a reasonable solution is to put that forms-based
> control stuff on addresses which the outside world "can't reach" (i.e. RFC
> 1597 private network addresses).  Of course if you're clued enough to
> make it that far you'd also slap password and group restrictions on it.
> But you'd still be astonished that suddenly your web server is happily
> "routing" stuff it shouldn't.

I can't believe that if you're clued enough to use private network
addresses, you wouldn't also know that vhosts can be requested via the
Host: header.

I support the idea of putting in the vhost documentation a warning
about this "gotcha".  I would also support the idea of describing a way to
do this using Listen and two separate pools of servers so that there's no
way that one could serve the tree from the other.  And as I said before,
as a last resort, I'd support an obscure directive turning off support
for the Host: header.  But I don't support changing the way the config
files work, for this purpose.
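
The two-pool setup mentioned above can be sketched roughly like this (a
sketch only -- the addresses, paths, and filenames are invented; the idea
is two separate httpd processes, each started with its own -f config, each
bound to exactly one address, so no Host: trick can cross between trees):

```apache
# public.conf -- started as: httpd -f conf/public.conf
# This process only ever holds a socket on the external address.
BindAddress 192.0.2.10
Port 80
DocumentRoot /www/public

# internal.conf -- started as: httpd -f conf/internal.conf
# This process only binds the RFC 1597 private address, so requests
# arriving at the public address can never be mapped into this tree.
BindAddress 10.1.1.1
Port 80
DocumentRoot /www/internal
```

Since each server_rec lives in a different process, there is no vhost list
shared between the two addresses for a Host: header to traverse.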

[Now on to a separate issue, which is denial of service attacks when the
DNS fails for a named vhost]

> >So, the policy here should be, if DNS names fail to resolve correctly to
> >an IP address on the machine, then that particular vhost setting fails
> >but the rest of the server comes up.
> 
> Ahh but there's more than that happening here.  I'm talking about
> configuration gotchas... and I'm hoping that we'll amend the
> documentation.  Let me give an explicit example.
> 
> Suppose www.victim.com is served from ISP.  Their competitor decides
> they want to steal web traffic.  So call up the ISP and ask for service
> for www.fakename.com (competitor controls fakename.com).  ISP gives
> them an IP address and tells them how to configure things, blah blah.
> ISP's httpd.conf looks like this:
> 
>     <VirtualHost www.victim.com>
>     blah blah
>     </VirtualHost>
>     ...
>     <VirtualHost www.fakename.com>
>     blah blah
>     </VirtualHost>
> 
> Now, competitor changes their DNS so that www.fakename.com maps to
> www.victim.com's address.  Take a boo at virtualhost_section() and
> you'll see that it places www.fakename.com BEFORE www.victim.com in the
> virtual host searchlist.  Bingo, any hit to www.victim.com's address will
> be served out of the www.fakename.com configuration.  

*Well*, any hit which doesn't contain the Host: header.  If everyone used
the Host: header, how would www.fakename.com get a hit from
www.victim.com?

> The ISP probably
> won't notice ('cause of the thing Cliff reported and I said was hard to
> figure out in general with http/1.1 to consider).

This is another valid instance where a "No1.1VHostSupport" could be
used, so long as concern about supporting Host-less clients was important.
Secondly, I'd support making the chain created by virtualhost_section() go
in the opposite order - should be easy to do, no?  Hmm, perhaps we could
also have a "EnforceHTTP1.1" directive which rejects requests without a
full URI or Host: header.

> Sure, it won't last long.  But maybe it will... maybe the competitor is
> also smart enough to clone most of victim.com's site and only change a
> few important details... like where the credit card numbers are mailed.
> Some sites are changed so infrequently this charade might be easy to
> keep up for ages... victim.com may just figure "there's really no money
> in the net" as all their traffic is stolen by competitor.

Heh.

> >I'd prefer the name the vhost responds to should be a union of the
> >ServerName and ServerAlias settings, not just ServerAlias.  Ugh, but then
> >you get things like
> >
> ><VirtualHost 204.76.138.65>
> >ServerName www.apache.org
> >....
> ></VirtualHost>
> >
> ><VirtualHost 204.76.138.65>
> >ServerName dev.apache.org
> >...
> ></VirtualHost>
> >
> >hmm... slight semantic change, I suppose.
> 
> I'm moving in this direction... I'm thinking that we should have
> ip addresses which serve sets of vhosts (even in the name-vhost
> world there are cases when you'll want to spread them over several
> addresses).  Within each set HTTP/1.1 methods are used to determine
> which vhost to use.  The administrator should be able to statically
> control set membership.  We're almost there now if we use my suggestion
> of <VirtualHost A.B.C.D>+ServerName "www.foobar.com" for all virtualhosts
> in the config... two things short:
> 
> - can only do HTTP/1.1 stuff with the "main" server address

Eh?  I can do a telnet hyperreal.com 80 and say "GET
http://www.apache.org/" and it works fine...

> - it doesn't respect "set" boundaries

I.e., if I make a request in set A for a host in set B... hmm, maybe that
magic directive should be "NoShareVHostsBetweenIPs" instead of
"No1.1VHostSupport".  

> But I think it's confusing...

Ya.  If only TimBL hadn't presumed there'd only be one server on port 80
per machine back when he was releasing HTTP.  But given his other
successes, I forgive him.  :)

> >> So... do we want to rethink the syntax?  At the moment it's actually a bad
> >> thing to be forced into name-vhost mode because not all clients support
> >> it.  In the future this may not be an issue.
> >
> >On the flip side, the current setup would allow ISP's who are currently
> >burning IP addresses with vhosts to allow those sites whose DNS is
> >controlled elsewhere to migrate without having to coordinate it with the
> >server authority; in other words, if I have a web site at best.com using
> >www.bong.com, and best currently gives me an IP number but want it back,
> >then I can change the IP mapping of "www.bong.com" at any time to the
> >"centralized" best.com vhost IP number, and I won't need to coordinate
> >that with best's sysadmins.  So, what to some people is a bug, is a
> >feature to others. :)
> 
> Best needs to know about the change anyhow -- since they'll want to know
> that you've freed up the number.  It isn't hard for them to move your
> vhost to another spot in the config file.

But it requires exact timing (or at least, coordination of two events
within close proximity - adding a second entry to their config files for
you, and then after your switch removing your old entry).  Yes, eventually
I tell them the old IP address is gone, but only after I'm sure no one's
using it.

> >> - Apache doesn't do double reverse lookup for DNS based authentication.
> >
> >At least when I wrote about this for my book, I did mention that
> >host-based access control was unsafe unless used with MAXIMUM_DNS (time
> >to make this a runtime config?) or unless one used IP numbers.
> 
> We should probably put this on the mod_access todo list:
> 
>     - allow double reverse lookup to be configured only for those hits
> 	requiring it so that you don't have to suffer from MAXIMUM_DNS
> 	for every hit

Sure.  In fact, I would be comfortable with that as the default - any hit
which is protected by a non-numeric hostname is subject to double
reverse DNS lookup.
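
The double reverse lookup being discussed works like this: take the
client's address, reverse-resolve it to a name, then forward-resolve that
name and trust the PTR result only if the original address appears among
the forward answers.  A minimal sketch, with a canned "DNS" table standing
in for real gethostbyaddr()/gethostbyname() calls (all names and addresses
here are invented):

```c
#include <string.h>

/* Hypothetical stand-ins for the resolver.  The "DNS" is a fixed table
 * so the logic can be shown without network access. */
struct fake_host { const char *name; const char *addrs[4]; };

static const struct fake_host fake_forward[] = {
    { "good.example.com", { "10.1.1.1", "10.1.1.2", 0 } },
    { "evil.example.com", { "10.9.9.9", 0 } },
    { 0, { 0 } }
};

/* reverse map: addr -> claimed PTR name */
static const char *fake_reverse(const char *addr)
{
    if (strcmp(addr, "10.1.1.1") == 0) return "good.example.com";
    if (strcmp(addr, "10.9.9.9") == 0) return "good.example.com"; /* lying PTR */
    return 0;
}

/* Double-reverse check: the PTR name is only trusted if a forward
 * lookup of that name yields the original address. */
int double_reverse_ok(const char *addr)
{
    const char *claimed = fake_reverse(addr);
    int i, j;
    if (!claimed) return 0;
    for (i = 0; fake_forward[i].name; i++) {
        if (strcmp(fake_forward[i].name, claimed) != 0) continue;
        for (j = 0; fake_forward[i].addrs[j]; j++)
            if (strcmp(fake_forward[i].addrs[j], addr) == 0)
                return 1;  /* address confirmed by forward lookup */
    }
    return 0;  /* PTR record lied, or no forward record exists */
}
```

Note that 10.9.9.9's PTR claims to be good.example.com, but the forward
lookup of that name doesn't include 10.9.9.9, so the check rejects it.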

>     - provide a global "IPAuthByNumberOnly" that will barf on any allow/deny
> 	that isn't numeric
> 	-- this is because some configs are built out of lots of small files
> 	    that are edited by less knowledgeable people and/or you use
> 	    .htaccess files
> 	-- the barf should be a warning and the result should be to deny
> 	    everything (a nice safe way to continue after the error in the
> 	    config file)

Sure.

> >Wouldn't it be saner to just set it to the IP number?
> 
> Actually, I like this approach:
> 
> *** http_main.c.orig    Sun Jul 14 19:35:11 1996
> --- http_main.c Sun Jul 14 19:36:53 1996
> ***************
> *** 1058,1065 ****
> 
>       /* Main host first */
> 
> !     if (!s->server_hostname)
>         s->server_hostname = get_local_host(pconf);
> 
>       def_hostname = s->server_hostname;
>       main = gethostbyname(def_hostname);
> --- 1058,1069 ----
> 
>       /* Main host first */
> 
> !     if (!s->server_hostname) {
> !       fprintf(stderr,"httpd: leaving ServerName unset is considered harmful to your health\n");
> !       fprintf(stderr,"httpd: but I'll let it slip for the moment, and attempt to use DNS\n");
> !       fprintf(stderr,"httpd: to find it... who knows what I'll do if your name server is down.\n");
>         s->server_hostname = get_local_host(pconf);
> +     }
> 
>       def_hostname = s->server_hostname;
>       main = gethostbyname(def_hostname);

What I'm suggesting is that if get_local_host fails, just use the IP
number as the ServerName.  I guess in addition to this, sure.
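
The suggested fallback could look something like this (a sketch only --
the function names and the exact precedence are assumptions for
illustration, not Apache's code):

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* If no ServerName was configured and the reverse lookup fails, fall
 * back to the dotted-quad form of the bound address instead of dying. */
const char *fallback_server_name(struct in_addr bound_addr,
                                 const char *configured,
                                 const char *dns_result)
{
    if (configured && *configured)
        return configured;          /* admin set ServerName: use it */
    if (dns_result && *dns_result)
        return dns_result;          /* reverse lookup worked */
    return inet_ntoa(bound_addr);   /* last resort: the IP number itself */
}

/* convenience wrapper: takes the address in dotted-quad form */
const char *fallback_server_name_s(const char *dotted,
                                   const char *configured,
                                   const char *dns_result)
{
    struct in_addr a;
    a.s_addr = inet_addr(dotted);
    return fallback_server_name(a, configured, dns_result);
}
```

This way a down name server degrades the ServerName to "10.1.1.1" rather
than preventing the server from booting.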

> Maybe I should just write a page "DNS Considered Harmful" and get you
> guys to link to it from the docs? :)
>
> Seriously... I have to get in there and dig around in order to implement
> my multi-ip vhost stuff.  Is there a better way we should do it?

As you've probably seen, I would prefer to go with the status quo, and
offer configuration directives for what I feel will be seldom-needed
restrictions on functionality.

> I figure I got no comments on my previous post about multi-ip syntax (i.e.
> <VirtualHost 10.1.1.1 10.1.1.2> responding to both addresses) because:
> 
>     (a) everyone liked it (or didn't care)
>     (b) I didn't include a patch and everyone was waiting for the patch
> 	to show up before arguing about how it should be done.
> 
> My experience with this list says (b) is the answer :)

Actually, I thought I remember some favorable comments... certainly, a
lack of negative comments does mean something good, but you need to at
least convince someone to commit it  :)  

> I see two ways to approach vhost mapping.  One is to define the mapping
> when the vhost is defined, and the other is to define all the mappings
> together and have them symbolically refer to vhosts.  The first is great
> for newbie admins because it does a lot of stuff implicitly (i.e. creates
> name-vhosts without telling you).  The second is great for the person
> writing the mapping code because it abstracts the mapping out of the
> server_rec structure (which is a Good Thing).  Consider:
> 
>     # syntax <VirtualHost SYMBOLIC-VHOST-NAME>
>     <VirtualHost vhost1>
>     ServerName www.vhost1.com
>     # other goo blah blah
>     </VirtualHost>
> 
>     <VirtualHost vhost2>
>     ServerName www.vhost2.com
>     # other goo blah blah
>     </VirtualHost>
>     ... and so on up to 4
> 
>     <VirtualHost go-away-server>
>     ServerName www.blah.com
>     # a server which sends back "go away" for any request
>     </VirtualHost>
>     
>     # We now map ip addresses to symbolic-vhost-names.
>     # Syntax: <Map ip-address*> ... </Map>
>     # Ordering within a Map is important.  Ordering of Map statements isn't,
>     # Map addresses can't overlap (we want to move towards hashing at least
>     # ips).  If you want to use DNS names in map statements then you have
>     # to define SHOOT_FOOT when compiling the server.
>     <Map 10.1.1.1>
>     # MatchHost SYMBOLIC-VHOST-NAME PATTERN+
>     MatchHost vhost1 www.vhost1.com
>     MatchHost vhost2 *.vhost2.com
>     # MatchPath SYMBOLIC-VHOST-NAME PATTERN+
>     MatchPath vhost1 /vhost1 /alpha
>     MatchPath vhost2 /vhost2
>     #
>     Default vhost1
>     </Map>
> 
>     # and now a mapping for another address, this is an ip-vhost only
>     <Map 10.1.1.2>
>     Default vhost3
>     </Map>
> 
>     # this one responds on multiple addresses
>     <Map 10.1.1.3 10.1.1.4>
>     Default vhost4
>     </Map>
> 
>     # and this one responds to any ip not present in the other rules
>     <Map>
>     Default go-away-server
>     </Map>
>     # an alternate option is to return 403 for anything that isn't matched
>     # by a map...
> 
> Notice that this way it's easy to see all the aliases and path mappings
> for a particular ip.  It'd also be possible to do some easy optimizations
> such as turning a sequence of "simple" MatchHost statements (those
> without wildcards) into a hashtable (a big win later on as name-vhosts
> become common and every ISP is happily giving them out).

Hmm... personally, I'm not convinced.  The above is not necessarily more
readable or intuitive or less error-prone than the existing config files -
that's for me, and I admit I may be biased by being too close to the
problem.  I won't even mention the political cost of asking people to
totally change config files.  And while us computer scientists like adding
layers of indirection, it might be somewhat confusing for the average
webmaster. :)

Question: why can't hash tables be used for existing config files for
non-wildcard vhosts?

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  www.apache.org  hyperreal.com  http://www.organic.com/JOBS


Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
In article <ho...@fully.organic.com>,
Brian Behlendorf  <ne...@hyperreal.com> wrote:
>On 13 Jul 1996, Dean Gaudet wrote:
>> While thinking on my VirtualHost syntax problem, and Cliff's request
>> I think I've discovered several configuration-related security (and
>> reliability) problems, one of which bypasses access controls.
>> 
>> First, a serious one.  Suppose the same webserver has an internal vhost,
>> an external vhost, 
>
>(I'm presuming here you mean, two IP numbers, say X and Y?)

Actually I've tried to exploit it now that I've had sleep, and
I can.  Suppose that you have an ip-vhost vhost.foobar.com, and
that phys.foobar.com is the machine's physical address (i.e. the
"main server address" in our documentation on vhosts), and doesn't
appear in any <VirtualHost> statement.  Then sending a request with
"Host: vhost.foobar.com" or "GET http://vhost.foobar.com/ HTTP/1.1"
to the physical address will cause the webserver to serve data from
vhost.foobar.com instead of phys.foobar.com.

That is, the config file does not specify HTTP/1.1 virtual hosts but
apache is allowing them to happen... in a case where I don't think it
should.

Here is an example config:

    ServerType standalone
    Port 80
    HostnameLookups on
    User nobody
    Group #-1
    ServerAdmin you@your.address
    ServerRoot /tmp/apache_1.1.1
    ErrorLog logs/error_log
    TransferLog logs/access_log
    PidFile logs/httpd.pid
    ScoreBoardFile logs/apache_status
    ResourceConfig /dev/null
    AccessConfig /dev/null
    DocumentRoot /tmp/apache_1.1.1/real-doc
    <VirtualHost vhost.foobar.com>
    DocumentRoot /tmp/apache_1.1.1/vhost-doc
    </VirtualHost>

real-doc/index.html and vhost-doc/index.html distinguish the
two servers.  Replace vhost.foobar.com with an alias on your box.
telnet to your box's main address and attempt to exploit it by issuing
"GET http://vhost.foobar.com/ HTTP/1.1" or "GET / HTTP/1.0\r\nHost:
vhost.foobar.com".  (Note if you do this exploit on another port make
sure to adjust your requests appropriately.)

Essentially what is happening right now is that a request can come into
the "main" address(es) and can then be mapped to any vhost, regardless
of whether that vhost is a name-vhost or an ip-vhost.  This violates
the Principle of Least Astonishment I think.

You want a real world example of where this might be a problem?  Suppose
you've created some forms-based mgmt software for your webserver, and
suppose you don't trust Apache's access control (I don't -- I can get
into this later -- but any good admin doesn't trust any access control and
piles it all on).  Then a reasonable solution is to put that forms-based
control stuff on addresses which the outside world "can't reach" (i.e. RFC
1597 private network addresses).  Of course if you're clued enough to
make it that far you'd also slap password and group restrictions on it.
But you'd still be astonished that suddenly your web server is happily
"routing" stuff it shouldn't.

I dunno, the whole thing feels like having ip source-routing enabled,
or "ip forwarding" enabled on hosts that really shouldn't act as
gateways.

Can the proxy folks comment on whether this would allow someone to exploit
the proxy into snarfing stuff off internal machines?

>So, the policy here should be, if DNS names fail to resolve correctly to
>an IP address on the machine, then that particular vhost setting fails
>but the rest of the server comes up.

Ahh but there's more than that happening here.  I'm talking about
configuration gotchas... and I'm hoping that we'll amend the
documentation.  Let me give an explicit example.

Suppose www.victim.com is served from ISP.  Their competitor decides
they want to steal web traffic.  So call up the ISP and ask for service
for www.fakename.com (competitor controls fakename.com).  ISP gives
them an IP address and tells them how to configure things, blah blah.
ISP's httpd.conf looks like this:

    <VirtualHost www.victim.com>
    blah blah
    </VirtualHost>
    ...
    <VirtualHost www.fakename.com>
    blah blah
    </VirtualHost>

Now, competitor changes their DNS so that www.fakename.com maps to
www.victim.com's address.  Take a boo at virtualhost_section() and
you'll see that it places www.fakename.com BEFORE www.victim.com in the
virtual host searchlist.  Bingo, any hit to www.victim.com's address will
be served out of the www.fakename.com configuration.  The ISP probably
won't notice ('cause of the thing Cliff reported and I said was hard to
figure out in general with http/1.1 to consider).
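
The ordering gotcha can be illustrated with a small sketch (this is not
Apache's actual virtualhost_section() code; the names and addresses are
invented): each parsed section is pushed onto the head of a linked list,
and lookup takes the first entry whose address matches, so the last
section in the config file wins.

```c
#include <stdlib.h>
#include <string.h>

struct vhost {
    const char *addr;       /* resolved <VirtualHost> address */
    const char *name;       /* ServerName */
    struct vhost *next;
};

static struct vhost *vhost_list = 0;

/* called once per <VirtualHost> section, in config-file order */
void add_vhost(const char *addr, const char *name)
{
    struct vhost *v = malloc(sizeof(*v));
    v->addr = addr;
    v->name = name;
    v->next = vhost_list;   /* prepend: reverses config-file order */
    vhost_list = v;
}

/* first match wins; fall back to the main server */
const char *find_vhost(const char *addr)
{
    struct vhost *v;
    for (v = vhost_list; v; v = v->next)
        if (strcmp(v->addr, addr) == 0)
            return v->name;
    return "main-server";
}

/* reproduce the scenario above: victim listed first in the config,
 * fakename second, both resolving to the same address */
const char *who_serves_victims_address(void)
{
    vhost_list = 0;
    add_vhost("10.1.1.5", "www.victim.com");
    add_vhost("10.1.1.5", "www.fakename.com");
    return find_vhost("10.1.1.5");
}
```

Because www.fakename.com was parsed later, it sits at the head of the
list and captures every hit to the shared address.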

Sure, it won't last long.  But maybe it will... maybe the competitor is
also smart enough to clone most of victim.com's site and only change a
few important details... like where the credit card numbers are mailed.
Some sites are changed so infrequently this charade might be easy to
keep up for ages... victim.com may just figure "there's really no money
in the net" as all their traffic is stolen by competitor.

>> Suppose abc.com is your "main server address" and maps to 10.1.1.1 and
>> you expect def.com to map to 10.1.1.2, but you don't control def.com.
>> Then the def.com controller (or Evil Person) can force you to use
>> name-vhosts just by mapping def.com to 10.1.1.1.
>
>How is this a problem?  Oh, I suppose for clients which don't support
>Host:, this might make def.com the default host for that IP address, but I
>don't know...

Yep, it does -- the "main" host is only chosen if none of the virtual
hosts match the ip (see find_virtual_server()).  (Hey didn't you say that
www.organic.com was your default host?  Hey, hope one of your clients
doesn't accidentally set their www.foobar.com DNS to www.organic.com's
address :)

>I'd prefer the name the vhost responds to should be a union of the
>ServerName and ServerAlias settings, not just ServerAlias.  Ugh, but then
>you get things like
>
><VirtualHost 204.76.138.65>
>ServerName www.apache.org
>....
></VirtualHost>
>
><VirtualHost 204.76.138.65>
>ServerName dev.apache.org
>...
></VirtualHost>
>
>hmm... slight semantic change, I suppose.

I'm moving in this direction... I'm thinking that we should have
ip addresses which serve sets of vhosts (even in the name-vhost
world there are cases when you'll want to spread them over several
addresses).  Within each set HTTP/1.1 methods are used to determine
which vhost to use.  The administrator should be able to statically
control set membership.  We're almost there now if we use my suggestion
of <VirtualHost A.B.C.D>+ServerName "www.foobar.com" for all virtualhosts
in the config... two things short:

- can only do HTTP/1.1 stuff with the "main" server address
- it doesn't respect "set" boundaries

But I think it's confusing...

>> So... do we want to rethink the syntax?  At the moment it's actually a bad
>> thing to be forced into name-vhost mode because not all clients support
>> it.  In the future this may not be an issue.
>
>On the flip side, the current setup would allow ISP's who are currently
>burning IP addresses with vhosts to allow those sites whose DNS is
>controlled elsewhere to migrate without having to coordinate it with the
>server authority; in other words, if I have a web site at best.com using
>www.bong.com, and best currently gives me an IP number but want it back,
>then I can change the IP mapping of "www.bong.com" at any time to the
>"centralized" best.com vhost IP number, and I won't need to coordinate
>that with best's sysadmins.  So, what to some people is a bug, is a
>feature to others. :)

Best needs to know about the change anyhow -- since they'll want to know
that you've freed up the number.  It isn't hard for them to move your
vhost to another spot in the config file.

>> - Apache doesn't do double reverse lookup for DNS based authentication.
>
>At least when I wrote about this for my book, I did mention that
>host-based access control was unsafe unless used with MAXIMUM_DNS (time
>to make this a runtime config?) or unless one used IP numbers.

We should probably put this on the mod_access todo list:

    - allow double reverse lookup to be configured only for those hits
	requiring it so that you don't have to suffer from MAXIMUM_DNS
	for every hit
    - provide a global "IPAuthByNumberOnly" that will barf on any allow/deny
	that isn't numeric
	-- this is because some configs are built out of lots of small files
	    that are edited by less knowledgeable people and/or you use
	    .htaccess files
	-- the barf should be a warning and the result should be to deny
	    everything (a nice safe way to continue after the error in the
	    config file)
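
The "barf on anything non-numeric" check would need to examine every
character, not just the first -- as noted elsewhere in this thread,
hostnames like "123.hotwired.com" are legal, so an isalpha() test on the
first character is not enough.  A sketch (the directive and function names
are hypothetical):

```c
#include <ctype.h>

/* Returns 1 if the allow/deny argument is purely numeric (an IP address
 * or network prefix such as "10.1.1"), 0 if it contains any letter and
 * must therefore be treated as a DNS name -- which a hypothetical
 * "IPAuthByNumberOnly" directive would reject with a warning. */
int is_numeric_netspec(const char *s)
{
    if (!*s)
        return 0;                   /* empty spec is not valid */
    for (; *s; s++)
        if (!isdigit((unsigned char)*s) && *s != '.')
            return 0;               /* a letter anywhere means it's a name */
    return 1;
}
```

Note "123.hotwired.com" correctly fails this test even though it starts
with a digit.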

>> - If ServerName isn't set, apache does a reverse lookup in
>>     default_server_hostnames to find it.  If DNS is down when the
>>     server boots this will cause ServerName to be defaulted incorrectly.
>> 
>>     Actually, default_server_hostnames will segfault if somewhere
>>     between get_local_host and "main = gethostbyname..." there is a DNS
>>     failure... or if the servername is bogus.  patch:
>...
>> +     if( main == NULL ) {
>> +         fprintf(stderr,"httpd: cannot resolve main ServerName.\n");
>> +         exit(1);
>> +     }
>
>Wouldn't it be saner to just set it to the IP number?

Actually, I like this approach:

*** http_main.c.orig    Sun Jul 14 19:35:11 1996
--- http_main.c Sun Jul 14 19:36:53 1996
***************
*** 1058,1065 ****

      /* Main host first */

!     if (!s->server_hostname)
        s->server_hostname = get_local_host(pconf);

      def_hostname = s->server_hostname;
      main = gethostbyname(def_hostname);
--- 1058,1069 ----

      /* Main host first */

!     if (!s->server_hostname) {
!       fprintf(stderr,"httpd: leaving ServerName unset is considered harmful to your health\n");
!       fprintf(stderr,"httpd: but I'll let it slip for the moment, and attempt to use DNS\n");
!       fprintf(stderr,"httpd: to find it... who knows what I'll do if your name server is down.\n");
        s->server_hostname = get_local_host(pconf);
+     }

      def_hostname = s->server_hostname;
      main = gethostbyname(def_hostname);


Maybe I should just write a page "DNS Considered Harmful" and get you
guys to link to it from the docs? :)

Seriously... I have to get in there and dig around in order to implement
my multi-ip vhost stuff.  Is there a better way we should do it?

I figure I got no comments on my previous post about multi-ip syntax (i.e.
<VirtualHost 10.1.1.1 10.1.1.2> responding to both addresses) because:

    (a) everyone liked it (or didn't care)
    (b) I didn't include a patch and everyone was waiting for the patch
	to show up before arguing about how it should be done.

My experience with this list says (b) is the answer :)

I see two ways to approach vhost mapping.  One is to define the mapping
when the vhost is defined, and the other is to define all the mappings
together and have them symbolically refer to vhosts.  The first is great
for newbie admins because it does a lot of stuff implicitly (i.e. creates
name-vhosts without telling you).  The second is great for the person
writing the mapping code because it abstracts the mapping out of the
server_rec structure (which is a Good Thing).  Consider:

    # syntax <VirtualHost SYMBOLIC-VHOST-NAME>
    <VirtualHost vhost1>
    ServerName www.vhost1.com
    # other goo blah blah
    </VirtualHost>

    <VirtualHost vhost2>
    ServerName www.vhost2.com
    # other goo blah blah
    </VirtualHost>
    ... and so on up to 4

    <VirtualHost go-away-server>
    ServerName www.blah.com
    # a server which sends back "go away" for any request
    </VirtualHost>
    
    # We now map ip addresses to symbolic-vhost-names.
    # Syntax: <Map ip-address*> ... </Map>
    # Ordering within a Map is important.  Ordering of Map statements isn't,
    # Map addresses can't overlap (we want to move towards hashing at least
    # ips).  If you want to use DNS names in map statements then you have
    # to define SHOOT_FOOT when compiling the server.
    <Map 10.1.1.1>
    # MatchHost SYMBOLIC-VHOST-NAME PATTERN+
    MatchHost vhost1 www.vhost1.com
    MatchHost vhost2 *.vhost2.com
    # MatchPath SYMBOLIC-VHOST-NAME PATTERN+
    MatchPath vhost1 /vhost1 /alpha
    MatchPath vhost2 /vhost2
    #
    Default vhost1
    </Map>

    # and now a mapping for another address, this is an ip-vhost only
    <Map 10.1.1.2>
    Default vhost3
    </Map>

    # this one responds on multiple addresses
    <Map 10.1.1.3 10.1.1.4>
    Default vhost4
    </Map>

    # and this one responds to any ip not present in the other rules
    <Map>
    Default go-away-server
    </Map>
    # an alternate option is to return 403 for anything that isn't matched
    # by a map...

Notice that this way it's easy to see all the aliases and path mappings
for a particular ip.  It'd also be possible to do some easy optimizations
such as turning a sequence of "simple" MatchHost statements (those
without wildcards) into a hashtable (a big win later on as name-vhosts
become common and every ISP is happily giving them out).
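
The hashing optimization might look something like this minimal sketch
(table size, hash function, and names are arbitrary choices for
illustration, not a proposed implementation): wildcard-free MatchHost
patterns go into a table keyed on the hostname, so the common case is a
constant-time lookup instead of a linear scan.

```c
#include <stdlib.h>
#include <string.h>

#define HOSTHASH_SIZE 64

struct hostmap {
    const char *host;           /* exact hostname pattern */
    const char *vhost;          /* symbolic vhost it maps to */
    struct hostmap *next;       /* chain for bucket collisions */
};

static struct hostmap *hosthash[HOSTHASH_SIZE];

static unsigned host_hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31 + (unsigned char)*s++;
    return h % HOSTHASH_SIZE;
}

/* called for each wildcard-free MatchHost at config-parse time */
void map_host(const char *host, const char *vhost)
{
    unsigned h = host_hash(host);
    struct hostmap *e = malloc(sizeof(*e));
    e->host = host;
    e->vhost = vhost;
    e->next = hosthash[h];
    hosthash[h] = e;
}

/* returns the symbolic vhost, or 0 to fall through to the wildcard
 * patterns and the Default rule */
const char *lookup_host(const char *host)
{
    struct hostmap *e;
    for (e = hosthash[host_hash(host)]; e; e = e->next)
        if (strcmp(e->host, host) == 0)
            return e->vhost;
    return 0;
}

/* demo: load the two exact-match rules from the example <Map> above */
const char *demo_lookup(void)
{
    map_host("www.vhost1.com", "vhost1");
    map_host("www.vhost2.com", "vhost2");
    return lookup_host("www.vhost2.com");
}
```

Patterns with wildcards (like *.vhost2.com) would still need a linear
scan after the hash lookup misses.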

Dean

Re: security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
In article <ho...@ace.nueva.pvt.k12.ca.us>,
Alexei Kosut  <ne...@hyperreal.com> wrote:
>Correct me if I'm wrong, but I thought I'd read somewhere that
>starting domain names with a number was in the "technically allowed,
>but we don't reccomend it, and we think it's a really bad idea, and if
>you do it, be warned that we're likely to make it illegal sometime
>soon" category.

I brought this up in february (you can check the mail archive) and this
is what happened:

Dean> mod_access shouldn't use isalpha(first_char) to distinguish ip vs.
Dean> name rules

Rob McCool> My understanding of Preferred Name Syntax (RFC 1034 or 1035
Rob McCool> I think) is that it is a valid assumption.
Rob McCool> Am I misreading something?

Dirk> :-( No it is not; one of the few exception areas where it is
Dirk> allowed to have a number as first; (quoting an explicit warning
Dirk> from the O'Reilly BIND/DNS book so not tooo authoritative :-()

David Robinson> Yes, you must be misreading something. This is not
David Robinson> a valid assumption.  The restriction is that the last
David Robinson> component of a domain name must start with an alphabetic
David Robinson> character; thus "4me.org" _is_ a valid domain name.

... I'm actually not going to worry about it.  I'd rather see two
different directives (for performance and robustness reasons) one that
takes DNS names and one that takes network/netmasks.  I'll submit this.
(backwards compat of course)

Dean

Re: security holes and other fun stuff

Posted by Alexei Kosut <ak...@nueva.pvt.k12.ca.us>.
On Sat, 13 Jul 1996, Brian Behlendorf wrote:

[...]

> Hmm, I'm not so sure I buy that.  You can't expect the application to know
> about security policies that sit at a layer below it - when your mapping
> between those two layers dissolves, then you need to find another mapping.
> 
> The only thing I can think of is to be a little more conservative with the
> vhost stuff and say, "If vhost Y maps to the primary IP address which I've
> bound myself to, then it's a Host:-based host - if it maps to a secondary
> IP address, then it's not Host:-based", but that would disallow certain
> other expected functionalities, like letting two web sites share the same
> secondary IP address. 

The current code doesn't do this *quite* right, but that's the theory,
at any rate... pretend you had ten servers, all Host-header based. And
your server sits on two networks, completely isolated, but they both
know about the same names. You want to give each of them different
documents. So you set up twenty virtual hosts, ten using one IP
address, ten using the other, and for each of the ten hostnames, you
give that ServerName to one vhost on each IP address. Should work...

At any rate, if you want, there's an undocumented "feature" that will
deactivate host-header based resolving for a virtual host: make the
ServerName end with a dot, e.g. "ServerName some.server.name." - it
should still be a valid name, but the Apache code that checks Host:
headers will never find a match for it, because it strips off trailing
dots.
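
A sketch of why the trailing-dot trick works (the function names are
invented for illustration, not the actual Apache matching code): the
Host: header is stripped of trailing dots before comparison, while the
configured ServerName is compared verbatim, so a ServerName ending in a
dot can never match.

```c
#include <string.h>

/* normalize a hostname by removing any trailing dots, as the Host:
 * header checker does */
static void strip_trailing_dots(char *host)
{
    size_t len = strlen(host);
    while (len > 0 && host[len - 1] == '.')
        host[--len] = '\0';
}

/* returns 1 if the request's Host: header matches the vhost's
 * configured ServerName */
int host_header_matches(const char *host_header, const char *server_name)
{
    char buf[256];
    if (strlen(host_header) >= sizeof(buf))
        return 0;
    strcpy(buf, host_header);
    strip_trailing_dots(buf);              /* header is normalized... */
    return strcmp(buf, server_name) == 0;  /* ...ServerName is not */
}
```

Even a client that sends "Host: some.server.name." fails to match, since
its trailing dot is stripped before the comparison against the
dot-terminated ServerName.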

[...]

> I'd prefer the name the vhost responds to should be a union of the
> ServerName and ServerAlias settings, not just ServerAlias.  Ugh, but then
> you get things like

It is. For each virtual host, it checks first to see if the Host:
header matches ServerName, then the entries in ServerAlias. What it
*doesn't* look at (unless you don't have a ServerName) is what you put
in the <VirtualHost> entry. You can put anything, it doesn't care, or
check.

[...]

> > P.S. People doing SSL for apache 1.1 should watch out for the "http://"
> > comparisons in check_hostalias().
> 
> Eeek!

I don't see an eeek. Apache supports plain, simple, HTTP. If people
hack it to support SSL, they need to change all this stuff. I suppose
it could go and look for a "://", but then you run into real problems
if you're operating as a proxy server and people request things like
"ftp://your.server.name.here/" - you'd need to have a list of what
protocols your server supports anyway.

> > P.P.S.  Hostnames such as "123.hotwired.com" are valid, yet find_allowdeny
> > does not properly handle them.  This should be put on Known Bugs.  Be
> > careful when fixing this because just removing the isalpha() check creates
> > a security hole, consider the DNS map "1.1.1.1.in-addr.arpa IN PTR 2.2.2."
> > if the user has a config line "allow from 2.2.2" it will allow 1.1.1.1 in
> > (unless -DMAXIMUM_DNS).  -- which is bad because it breaks people who
> > understand double reverse lookup and are trying to avoid it by using
> > only ip addresses on allow/deny statements.
> 
> Good one.  I've added it to the known_bugs page.

Correct me if I'm wrong, but I thought I'd read somewhere that
starting domain names with a number was in the "technically allowed,
but we don't recommend it, and we think it's a really bad idea, and if
you do it, be warned that we're likely to make it illegal sometime
soon" category.

-- 
________________________________________________________________________
Alexei Kosut <ak...@nueva.pvt.k12.ca.us>      The Apache HTTP Server
URL: http://www.nueva.pvt.k12.ca.us/~akosut/   http://www.apache.org/


Re: security holes and other fun stuff

Posted by Brian Behlendorf <br...@organic.com>.
On 13 Jul 1996, Dean Gaudet wrote:
> While thinking on my VirtualHost syntax problem, and Cliff's request
> I think I've discovered several configuration-related security (and
> reliability) problems, one of which bypasses access controls.
> 
> First, a serious one.  Suppose the same webserver has an internal vhost,
> an external vhost, 

(I'm presuming here you mean, two IP numbers, say X and Y?)

> and access to the internal vhost is restricted
> only by ip filtering (this is not unreasonable -- consider the use of
> private network numbers).  We now parse HTTP/1.1 requests using either
> "http://blah.com/asdf" URIs or Host: headers.  If you send a request to
> the public address with a URI or Host: header pointing at the private
> address the server will serve up the page.  I'm zoned at the moment,
> so someone else should follow the flow through read_request() and into
> read_request_line() and then into check_hostalias().  If I'm right,
> we've got a problem.

Hmm, I'm not so sure I buy that.  You can't expect the application to know
about security policies that sit at a layer below it - when the mapping
between those two layers dissolves, you need to find another mapping.

The only thing I can think of is to be a little more conservative with the
vhost stuff and say, "If vhost Y maps to the primary IP address which I've
bound myself to, then it's a Host:-based host - if it maps to a secondary
IP address, then it's not Host:-based", but that would disallow certain
other expected functionalities, like letting two web sites share the same
secondary IP address. 

> Now on to some more mundane concerns.  Let "name-vhosts" refer to HTTP/1.1
> virtual hosts, and "ip-vhosts" refer to ip-based virtual hosts.
> 
> Remember while reading this that it's common for ISPs to *not* be primary
> for some customers' DNS, while at the same time hosting websites for
> such customers.  Also remember that our docs all suggest the use of DNS
> names instead of IPs when configuring either name-vhosts or ip-vhosts.
> 
> There are several reliability/denial-of-service possibilities involving
> the use of DNS names in <VirtualHost> statements.  If name-vhosts (or
> ip-vhosts) are configured in the manner described in the documentation
> (i.e. using DNS names) then the server won't boot if DNS isn't working.
> Furthermore, if any of the names used are in uncontrolled domains (i.e.
> secondary or no local authority) then a third party can cause denial
> of service.

So, the policy here should be, if DNS names fail to resolve correctly to
an IP address on the machine, then that particular vhost setting fails
but the rest of the server comes up.  I think we talked about it a few
weeks ago as something we should do, in the vein of "more tolerant config
file parsing".

> Another possibility involving uncontrolled DNS:
> 
>     <VirtualHost abc.com>
>     ...
>     </VirtualHost>
> 
>     <VirtualHost def.com>
>     ...
>     </VirtualHost>
> 
> Suppose abc.com is your "main server address" and maps to 10.1.1.1 and
> you expect def.com to map to 10.1.1.2, but you don't control def.com.
> Then the def.com controller (or Evil Person) can force you to use
> name-vhosts just by mapping def.com to 10.1.1.1.

How is this a problem?  Oh, I suppose for clients which don't support
Host:, this might make def.com the default host for that IP address, but I
don't know...

> Essentially, using DNS names in <VirtualHost> statements should be
> entirely avoided.  
>
> This is true in pre-apache-1.1 syntax as well, since
> def.com could force denial of service for abc.com if their ip addresses
> match and def.com appears first (last?) in the config file.
> 
> Fortunately, I think we already have workarounds for all of the above.
> Always use ip addresses and ServerName, even when defining a name-vhost.
> For name-vhosts use ServerAlias to define the name your server responds
> to.

I'd prefer the name the vhost responds to should be a union of the
ServerName and ServerAlias settings, not just ServerAlias.  Ugh, but then
you get things like

<VirtualHost 204.76.138.65>
ServerName www.apache.org
....
</VirtualHost>

<VirtualHost 204.76.138.65>
ServerName dev.apache.org
...
</VirtualHost>

hmm... slight semantic change, I suppose.

> So... do we want to rethink the syntax?  At the moment it's actually a bad
> thing to be forced into name-vhost mode because not all clients support
> it.  In the future this may not be an issue.

On the flip side, the current setup would allow ISPs who are currently
burning IP addresses with vhosts to let those sites whose DNS is
controlled elsewhere migrate without having to coordinate it with the
server authority; in other words, if I have a web site at best.com using
www.bong.com, and best currently gives me an IP number but wants it back,
then I can change the IP mapping of "www.bong.com" at any time to the
"centralized" best.com vhost IP number, and I won't need to coordinate
that with best's sysadmins.  So, what to some people is a bug, is a
feature to others. :)

> The documentation should be updated to at least mention this concern,
> and maybe some other DNS concerns:
> 
> - Apache doesn't do double reverse lookup for DNS based authentication.
>     There's even a helpful example of how to do it wrong in the
>     access.conf-dist file for the status handler... which an "attacker"
>     can easily spoof by playing with their reverse map.  Naive users
>     (heck even HotWired's config had this problem when I arrived :)
>     will fall into this trap.
> 
>     (I know there's the mention of MAXIMUM_DNS in the Configuration file.)

At least when I wrote about this for my book, I did mention that
host-based access control was unsafe unless used with MAXIMUM_DNS (time
to make this a runtime config?) or unless one used IP numbers.

> - If ServerName isn't set, apache does a reverse lookup in
>     default_server_hostnames to find it.  If DNS is down when the
>     server boots this will cause ServerName to be defaulted incorrectly.
> 
>     Actually, default_server_hostnames will segfault if somewhere
>     between get_local_host and "main = gethostbyname..." there is a DNS
>     failure... or if the servername is bogus.  patch:
...
> +     if( main == NULL ) {
> +         fprintf(stderr,"httpd: cannot resolve main ServerName.\n");
> +         exit(1);
> +     }

Wouldn't it be saner to just set it to the IP number?
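
A minimal sketch of that fallback (names are hypothetical, not the actual http_main.c code):

```c
/* If the lookup of ServerName succeeded, use the canonical name;
 * otherwise fall back to the dotted-quad address instead of exiting. */
#include <arpa/inet.h>
#include <netdb.h>
#include <string.h>

static const char *pick_server_name(const struct hostent *h,
                                    struct in_addr addr)
{
    if (h != NULL && h->h_name != NULL)
        return h->h_name;       /* lookup succeeded: canonical name */
    return inet_ntoa(addr);     /* DNS down: default to the IP number */
}
```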

> P.S. People doing SSL for apache 1.1 should watch out for the "http://"
> comparisons in check_hostalias().

Eeek!

> P.P.S.  Hostnames such as "123.hotwired.com" are valid, yet find_allowdeny
> does not properly handle them.  This should be put on Known Bugs.  Be
> careful when fixing this, because just removing the isalpha() check creates
> a security hole: consider the DNS map "1.1.1.1.in-addr.arpa IN PTR 2.2.2."
> -- if a user has the config line "allow from 2.2.2" it will allow 1.1.1.1 in
> (unless -DMAXIMUM_DNS), which is bad because it breaks people who
> understand double reverse lookup and are trying to avoid it by using
> only ip addresses on allow/deny statements.

Good one.  I've added it to the known_bugs page.

> P.P.P.S.  A response to the original article I was following up on:
> 
> In article <ho...@bauhaus.organic.com>,
> Cliff Skolnick  <ne...@hyperreal.com> wrote:
>In your virtual host fixup, can you do one of the following:
>
>If a second virtual host directive is given for an existing defined
> >virtual host, then:
> >
> >1) A warning that the first directive's information is lost is
> >output to the error log and/or stderr.
> >
> >2) The new values are merged into the existing virtual host.
> 
> I think I'd rather see a warning... I'm not sure why merging would
> be useful -- unless you want to be able to define a virtualhost piecemeal,
> a bit here, a bit there spread all over the config file(s).

I'd agree.

> It's also a pain in the butt to solve considering you have to compare
> all possible methods of accessing each virtual server to even find it.
> Sure, pure ip comparison would have been easy... but now with ServerAlias
> and name-vhosts it's not fun.

Woohoo!

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  www.apache.org  hyperreal.com  http://www.organic.com/JOBS


security holes and other fun stuff

Posted by Dean Gaudet <dg...@hotwired.com>.
While thinking on my VirtualHost syntax problem, and Cliff's request
I think I've discovered several configuration-related security (and
reliability) problems, one of which bypasses access controls.

First, a serious one.  Suppose the same webserver has an internal vhost,
an external vhost, and access to the internal vhost is restricted
only by ip filtering (this is not unreasonable -- consider the use of
private network numbers).  We now parse HTTP/1.1 requests using either
"http://blah.com/asdf" URIs or Host: headers.  If you send a request to
the public address with a URI or Host: header pointing at the private
address the server will serve up the page.  I'm zoned at the moment,
so someone else should follow the flow through read_request() and into
read_request_line() and then into check_hostalias().  If I'm right,
we've got a problem.

Now on to some more mundane concerns.  Let "name-vhosts" refer to HTTP/1.1
virtual hosts, and "ip-vhosts" refer to ip-based virtual hosts.

Remember while reading this that it's common for ISPs to *not* be primary
for some customers' DNS, while at the same time hosting websites for
such customers.  Also remember that our docs all suggest the use of DNS
names instead of IPs when configuring either name-vhosts or ip-vhosts.

There are several reliability/denial-of-service possibilities involving
the use of DNS names in <VirtualHost> statements.  If name-vhosts (or
ip-vhosts) are configured in the manner described in the documentation
(i.e. using DNS names) then the server won't boot if DNS isn't working.
Furthermore, if any of the names used are in uncontrolled domains (i.e.
secondary or no local authority) then a third party can cause denial
of service.

Another possibility involving uncontrolled DNS:

    <VirtualHost abc.com>
    ...
    </VirtualHost>

    <VirtualHost def.com>
    ...
    </VirtualHost>

Suppose abc.com is your "main server address" and maps to 10.1.1.1 and
you expect def.com to map to 10.1.1.2, but you don't control def.com.
Then the def.com controller (or Evil Person) can force you to use
name-vhosts just by mapping def.com to 10.1.1.1.

Essentially, using DNS names in <VirtualHost> statements should be
entirely avoided.  This is true in pre-apache-1.1 syntax as well, since
def.com could force denial of service for abc.com if their ip addresses
match and def.com appears first (last?) in the config file.

Fortunately, I think we already have workarounds for all of the above.
Always use ip addresses and ServerName, even when defining a name-vhost.
For name-vhosts use ServerAlias to define the name your server responds
to.
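
A sketch of that workaround (addresses and names are made up):

```apache
# IP addresses in <VirtualHost>, an explicit ServerName, and
# ServerAlias for the names a name-vhost answers to -- so no DNS
# lookup is needed when the server boots, and no third party can
# change which vhost an address selects.
<VirtualHost 10.1.1.1>
ServerName www.abc.com
</VirtualHost>

<VirtualHost 10.1.1.1>
ServerName www.def.com
ServerAlias def.com
</VirtualHost>
```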

So... do we want to rethink the syntax?  At the moment it's actually a bad
thing to be forced into name-vhost mode because not all clients support
it.  In the future this may not be an issue.

The documentation should be updated to at least mention this concern,
and maybe some other DNS concerns:

- Apache doesn't do double reverse lookup for DNS based authentication.
    There's even a helpful example of how to do it wrong in the
    access.conf-dist file for the status handler... which an "attacker"
    can easily spoof by playing with their reverse map.  Naive users
    (heck even HotWired's config had this problem when I arrived :)
    will fall into this trap.

    (I know there's the mention of MAXIMUM_DNS in the Configuration file.)

- If ServerName isn't set, apache does a reverse lookup in
    default_server_hostnames to find it.  If DNS is down when the
    server boots this will cause ServerName to be defaulted incorrectly.

    Actually, default_server_hostnames will segfault if somewhere
    between get_local_host and "main = gethostbyname..." there is a DNS
    failure... or if the servername is bogus.  patch:

*** http_main.c.orig    Sat Jul 13 00:02:03 1996
--- http_main.c Sat Jul 13 00:03:29 1996
***************
*** 1063,1068 ****
--- 1063,1074 ----

      def_hostname = s->server_hostname;
      main = gethostbyname(def_hostname);
+     if( main == NULL ) {
+         fprintf(stderr,"httpd: cannot resolve main ServerName.\n");
+         exit(1);
+     }
+

      /* Then virtual hosts */


Dean

P.S. People doing SSL for apache 1.1 should watch out for the "http://"
comparisons in check_hostalias().

P.P.S.  Hostnames such as "123.hotwired.com" are valid, yet find_allowdeny
does not properly handle them.  This should be put on Known Bugs.  Be
careful when fixing this, because just removing the isalpha() check creates
a security hole: consider the DNS map "1.1.1.1.in-addr.arpa IN PTR 2.2.2."
-- if a user has the config line "allow from 2.2.2" it will allow 1.1.1.1 in
(unless -DMAXIMUM_DNS), which is bad because it breaks people who
understand double reverse lookup and are trying to avoid it by using
only ip addresses on allow/deny statements.
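
One hedged way a fix could classify tokens (an illustration, not the find_allowdeny code):

```c
/* A safer test than a bare isalpha() check on the first character:
 * treat an allow/deny token as an IP address (or IP prefix) only if
 * it consists entirely of digits and dots.  A name such as
 * "123.hotwired.com" then correctly falls through to domain-name
 * handling instead of being mistaken for an IP prefix. */
#include <ctype.h>

static int looks_like_ip(const char *tok)
{
    int saw_digit = 0;
    for (; *tok != '\0'; tok++) {
        if (isdigit((unsigned char)*tok))
            saw_digit = 1;
        else if (*tok != '.')
            return 0;   /* letters, hyphens, etc. => a domain name */
    }
    return saw_digit;
}
```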

P.P.P.S.  A response to the original article I was following up on:

In article <ho...@bauhaus.organic.com>,
Cliff Skolnick  <ne...@hyperreal.com> wrote:
>In your virtual host fixup, can you do one of the following:
>
>If a second virtual host directive is given for an existing defined
>virtual host, then:
>
>1) A warning that the first directive's information is lost is
>output to the error log and/or stderr.
>
>2) The new values are merged into the existing virtual host.

I think I'd rather see a warning... I'm not sure why merging would
be useful -- unless you want to be able to define a virtualhost piecemeal,
a bit here, a bit there spread all over the config file(s).

It's also a pain in the butt to solve considering you have to compare
all possible methods of accessing each virtual server to even find it.
Sure, pure ip comparison would have been easy... but now with ServerAlias
and name-vhosts it's not fun.

Re: syntax questions

Posted by Cliff Skolnick <cl...@organic.com>.

In your virtual host fixup, can you do one of the following:

If a second virtual host directive is given for an existing defined
virtual host, then:

1) A warning that the first directive's information is lost is
output to the error log and/or stderr.

2) The new values are merged into the existing virtual host.


I just handled a bug report where the guy had <VirtualHost a.b.c.d>
directives in httpd.conf and srm.conf :)  Of course once it saw the
directive in srm.conf, poof all httpd.conf stuff was silently forgotten.
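
A sketch of that pitfall (address and directives are hypothetical):

```apache
# httpd.conf
<VirtualHost 10.1.1.1>
ServerName www.abc.com
DocumentRoot /www/abc
</VirtualHost>

# srm.conf -- the same address again; with the current behavior this
# second block silently replaces everything from the httpd.conf block
<VirtualHost 10.1.1.1>
ServerAdmin webmaster@abc.com
</VirtualHost>
```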

Cliff

--
Cliff Skolnick, CIO      http://www.organic.com/     cliff@organic.com
Organic Online, Inc.       ** we're hiring **           (415) 278-5650
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety." -- Benjamin Franklin, 1759