Posted to dev@httpd.apache.org by Brian Behlendorf <br...@organic.com> on 1996/04/24 22:37:31 UTC
this host crap
is mighty interesting. Some of the bizarre Host: headers I've been sent:
HYPERREAL.COM
WWW.APACHE.ORG
(okay, not that bizarre, but a reminder that we need to account for
capitalization)
bong.com:80
hyperreal.com:80
taz.hyperreal.com:80
(a reminder that we need to deal with port #'s - in this case I'd
prefer to send back a 302 Location: (or is it 303?) with the redundant
:80 removed from the URL)
vrml.wired.com.
www.apache.org.
www.hyperreal.com.
(a reminder that it looks like we'll have to handle ending-periods too)
Truly fucked up:
fgw
(sent by "NCSA_Mosaic/2.7b3 (X11;meson 4_52 mips)" and
"NCSA_Mosaic/2.7b3 (X11;krypton 4_52 mips)" when accessing
ecstasy.org, all coming from relay4.oleane.net)
hyperreal.com:
(is this legal?)
hyperreal.com:70
(this is coming from a lot of different UA's and hosts, all the UA's
have "via proxy gateway CERN-HTTPD/3.0 libwww/2.17" attached, I
suppose the CERN proxy has a problem with "http://hyperreal.com:70",
huh?)
images.bianca.com
(previously noted, I'm working with Dean on this one.)
linux
linux.cis.nctu.edu.tw
www.ukweb.com
These are the most bizarre. The actual logfile entries claim they are
coming from "linux.cis.nctu.edu.tw" and "www.ukweb.com" respectively,
and they appear to be robotic in nature, yet they have the User-Agent
set to valid Mozilla user-agents (like Mozilla/3.0b2 (X11; I; Linux
1.3.94 i586), sometimes i486), sometimes going via a proxy server,
sometimes not. I'm almost wondering if this is a bug of some sort -
I've sent mail to Mark and to the .tw mirror maintainer (this is a
mirror I had not been informed of, and isn't on our pages yet) to see
if it's really from them, or if some sort of corruption from the
Referer: field is coming in somehow.
www-cache.funet.fi
Apparently the cache at www-cache.funet.fi (which doesn't appear to
identify itself in the user-agent header, maybe it does in the
Forwarded: header, I don't know) decided to add a "Host:
www-cache.funet.fi". The browser which sent this was NCSA_Mosaic/2.7b3
(X11;IRIX 5.3 IP19).
Maybe it was the browser, but I saw lots of other NCSA_Mosaic/2.7b3
X11's which appeared to handle proxies without a problem. No, wait -
this was also the cause of another bogus Host: header, "fgw". I
haven't seen any requests from NCSA_Mosaic/2.7b4 through a proxy yet, so
maybe this is a bug in XMosaic. You folks at NCSA want to look at this?
www.sandbox.net
Okay, so both "Mozilla/2.01Gold (Win95; I)" and "Mozilla/2.0
(Macintosh; I; 68K)" sent this erroneously - the URL in these cases was
(get ready)
http://www.sandbox.net/cyberhunt2/prot-bin/webfilter/www.lycos.com:80/cgi-bin/pursuit?query=faberge+eggs
and
http://www.sandbox.net/cyberhunt2/prot-bin/webfilter/www.lycos.com/cgi-bin/nph-randurl/cgi-bin/largehostpursuit1.html?query=relic&maxhits=20
This is a protected service so I can't see what type of response these
people really got - both requests came from "www.tracer.com". Maybe
Netscape 2.0 doesn't change the value of the Host: header after a 302
or 303 redirect?
www.webville.com
6 bogus requests were made, all from the same remote host and with the same
client (Mozilla/2.0 (Win16; I)), with the referer being
"http://www.webville.com/oak/Marco-25/archive.html". Looking at that
page, there are references to hyperreal in addition to lots of other
places, but I don't see anything that should explicitly trigger such a
bogus request.
From all of these, I get the feeling that handling bogus Host: headers is
going to be an interesting situation. Since the migration path will not
be smooth, one option I'd like to have is to be able to, in the absence
of a Host: header or the existence of a bogus one, return an error,
something like "Malformed Request". Roy will no doubt have opinions on
this. :)
Forward where appropriate.
Brian
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com | We're hiring! http://www.organic.com/Home/Info/Jobs/
Re: this host crap
Posted by Mark J Cox <ma...@ukweb.com>.
> www.ukweb.com
>
> These are the most bizarre. The actual logfile entries claim they are
> coming from "linux.cis.nctu.edu.tw" and "www.ukweb.com" respectively,
www.ukweb.com runs Apache 1.1b1 and has "ProxyPass" set up to map "/apache"
to "www.apache.org/".
It looks like the proxy module is passing on the "Host" header. I've not
checked the proxy code but it seems the most logical explanation.
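
If that is what's happening, the fix would presumably be for the proxy to replace the client's Host with the origin server's name before forwarding. A rough Python sketch of the idea follows; the function name and the dict-of-headers representation are illustrative assumptions only, and this is not the proxy module's actual code.

```python
def headers_for_upstream(client_headers, upstream_host):
    """Copy request headers, replacing Host with the upstream server's name.

    Hypothetical helper for illustration. Header names are matched
    case-insensitively, so "host" and "HOST" are dropped as well.
    """
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != "host"
    }
    # The origin server should see its own name, not the proxy's.
    forwarded["Host"] = upstream_host
    return forwarded
```

With "/apache" mapped to "www.apache.org/", a request that arrived with "Host: www.ukweb.com" would then go out with "Host: www.apache.org".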
Mark
Re: this host crap
Posted by Dean Gaudet <dg...@hotwired.com>.
In article <Pi...@fully.organic.com> you write:
>fgw
[...]
Not that this is necessarily the case for this one... but something to
worry about. Netscape doesn't ever send/use FQDNs when the user types
in an unqualified name. i.e. try "http://www/" and you'll just get
"Host: www". It doesn't treat cookies from "www" as if they come from
"www.fqdn.com". It doesn't share www authentication with www.fqdn.com.
The list is probably longer...
I haven't checked any other client's behaviour in this respect.
It's great trying to train people to use fqdns when they're sorta
comfortable with adding CNAMEs for single letter host names 'cause they
think it's a great timesaver. I don't recall how many "bug reports"
and "server problems" I've had to "debug" related to this.
Dean
Re: this host crap
Posted by Alexei Kosut <ak...@nueva.pvt.k12.ca.us>.
On Wed, 24 Apr 1996, Brian Behlendorf wrote:
> HYPERREAL.COM
> WWW.APACHE.ORG
> (okay, not that bizarre, but a reminder that we need to account for
> capitalization)
We do.
> bong.com:80
> hyperreal.com:80
> taz.hyperreal.com:80
> (a reminder that we need to deal with port #'s - I'd prefer to in this
We do.
> case send back a 302 Location: (or is it 303?) removing the redundant
> :80 URL)
Bad idea. According to the HTTP/1.1 spec, the Host header is not
entirely related to the URL the client typed in. Namely, it's
host:port, with port defaulting to 80. So a client would be perfectly
within its rights sending "Host: www.apache.org:80", even if the user
typed in or was linked to "http://www.apache.org/".
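
A minimal sketch of that reading, in Python rather than Apache's C, using a hypothetical helper named `split_host`: treat the value as host[:port], default the port to 80, and compare host names case-insensitively. Accepting an empty port ("hyperreal.com:") as the default and dropping one trailing dot are assumptions made for this sketch, not something the spec wording above settles.

```python
DEFAULT_PORT = 80  # HTTP default, per the host:port reading above


def split_host(value):
    """Split a Host header value into (hostname, port).

    Returns None when the value cannot be parsed. Hypothetical helper,
    not Apache's actual code.
    """
    host, _, port_text = value.strip().partition(":")
    if port_text == "":
        # Covers "name" and the odd "name:" form (the latter is an assumption).
        port = DEFAULT_PORT
    elif port_text.isascii() and port_text.isdigit():
        port = int(port_text)
        if not 0 < port < 65536:
            return None
    else:
        return None
    # Host names compare case-insensitively.
    host = host.lower()
    # Dropping a single trailing "." is an assumption of this sketch.
    if host.endswith("."):
        host = host[:-1]
    if not host:
        return None
    return host, port
```

Under these assumptions "WWW.APACHE.ORG", "www.apache.org:80" and "www.apache.org." all come out as the same (host, port) pair, so none of them needs a redirect.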
> vrml.wired.com.
> www.apache.org.
> www.hyperreal.com.
>
> (a reminder that it looks like we'll have to handle ending-periods too)
Rgph. We don't do that. I suppose we could. I'll think about it.
> hyperreal.com:
>
> (is this legal?)
I don't think so. But we handle it correctly.
> hyperreal.com:70
>
> (this is coming from a lot of different UA's and hosts, all the UA's
> have "via proxy gateway CERN-HTTPD/3.0 libwww/2.17" attached, I
> suppose the CERN proxy has a problem with "http://hyperreal.com:70",
> huh?)
But, we already knew that, yes... ?
> linux
> linux.cis.nctu.edu.tw
> www.ukweb.com
>
> These are the most bizarre. The actual logfile entries claim they are
> coming from "linux.cis.nctu.edu.tw" and "www.ukweb.com" respectively,
> and they appear to be robotic in nature, yet they have the User-Agent
> set to valid Mozilla user-agents (like Mozilla/3.0b2 (X11; I; Linux
> 1.3.94 i586), sometimes i486), sometimes going via a proxy server,
> sometimes not. I'm almost wondering if this is a bug of some sort -
> I've sent mail to Mark and to the .tw mirror maintainer (this is a
> mirror I had not been informed of, and isn't on our pages yet) to see
> if it's really from them, or if some sort of corruption from the
> Referer: field is coming in somehow.
I sure hope not... Could be someone was poring over a spec, came
across Host and misinterpreted it to mean the browser's hostname... I
hope not.
> www-cache.funet.fi
>
> Apparently the cache at www-cache.funet.fi (which doesn't appear to
> identify itself in the user-agent header, maybe it does in the
> Forwarded: header, I don't know) decided to add a "Host:
> www-cache.funet.fi". The browser which sent this was NCSA_Mosaic/2.7b3
> (X11;IRIX 5.3 IP19).
> Maybe it was the browser, but I saw lots of other NCSA_Mosaic/2.7b3
> X11's which appeared to handle proxies without a problem. No, wait -
> this was also the cause of another bogus Host: header, "fgw". I
> haven't seen any requests from NCSA_Mosaic/2.7b4 through a proxy yet, so
> maybe this is a bug in XMosaic. You folks at NCSA want to look at this?
Here's my bet: NCSA Mosaic 2.7b3, when talking to a proxy, sends a
Host header with the proxy's name. This could explain
www-cache.funet.fi, the .tw and ukweb ones, and even fgw - if it's an
internal name of a proxy.
> www.sandbox.net
>
> Okay, so both "Mozilla/2.01Gold (Win95; I)" and "Mozilla/2.0
> (Macintosh; I; 68K)" sent this erroneously - the URL in these cases was
> (get ready)
>
> http://www.sandbox.net/cyberhunt2/prot-bin/webfilter/www.lycos.com:80/cgi-bin/pursuit?query=faberge+eggs
Hmm.
> http://www.sandbox.net/cyberhunt2/prot-bin/webfilter/www.lycos.com/cgi-bin/nph-randurl/cgi-bin/largehostpursuit1.html?query=relic&maxhits=20
Hmm hmm.
> This is a protected service so I can't see what type of response these
> people really got - both requests came from "www.tracer.com". Maybe
> Netscape 2.0 doesn't change the value of the Host: header after a 302
> or 303 redirect?
Could be. Or it could be a Netscape clone...
> www.webville.com
>
> 6 bogus requests were made, all from the same remote host and with the same
> client (Mozilla/2.0 (Win16; I)), with the referer being
> "http://www.webville.com/oak/Marco-25/archive.html". Looking at that
> page, there are references to hyperreal in addition to lots of other
> places, but I don't see anything that should explicitly trigger such a
> bogus request.
Don't have a clue about that one.
> From all of these, I get the feeling that handling bogus Host: headers is
> going to be an interesting situation. Since the migration path will not
> be smooth, one option I'd like to have is to be able to, in the absence
> of a Host: header or the existence of a bogus one, return an error,
> something like "Malformed Request". Roy will no doubt have opinions on
> this. :)
This is not necessary. Malformed headers that don't pass muster are
treated like they didn't exist... just make all your servers
VirtualHosts, and make the "main" server just a page that says "hey,
you, get a browser that supports Host: correctly."
If you want them separately, that's something different.
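
Roughly, as a Python sketch (the host table, the paths, and the `pick_server` helper are all made up for illustration, and the real server is written in C): a Host value that is missing or matches nothing configured simply falls through to the main server.

```python
# Hypothetical configuration: lowercase host name -> document root.
VIRTUAL_HOSTS = {
    "www.apache.org": "/www/apache",
    "hyperreal.com": "/www/hyperreal",
}
# Hypothetical main server: a page telling users to get a better browser.
MAIN_SERVER = "/www/get-a-better-browser"


def pick_server(host_header):
    """Return the document root for a request's Host header (or None)."""
    if host_header:
        # Ignore any ":port" suffix and compare case-insensitively.
        name = host_header.strip().partition(":")[0].lower()
        if name in VIRTUAL_HOSTS:
            return VIRTUAL_HOSTS[name]
    # Absent or unrecognised Host: behave as if the header was never sent.
    return MAIN_SERVER
```

So a request with "Host: fgw", or with no Host at all, gets the main server's page instead of an error.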
--
________________________________________________________________________
Alexei Kosut <ak...@nueva.pvt.k12.ca.us>
URL: http://www.nueva.pvt.k12.ca.us/~akosut/
Lefler on IRC, DALnet <http://www.dal.net/>