Posted to dev@httpd.apache.org by Marc Slemko <ma...@worldgate.com> on 1998/01/01 23:17:00 UTC

r->hostname vs. s->server->server_hostname

Background: say you want to use mod_rewrite to implement large numbers of
pseudo vhosts (without using real vhosts with their overhead), e.g. one for
each user. It appears that this _almost_ works.  You need to make a custom
log using %{Host}i to get the server name logged right, but that isn't a
big deal.

The problem comes when Apache has to generate the server name.  This
happens in several places, but the one most visible externally is the
creation of redirects; clients get redirected to the ServerName name instead
of the pseudo-vhost name.

Why not use r->hostname, if it is set, instead of s->server_hostname for
generating the redirect URL (and possibly other things... but you have to be
very careful there...)?  This has the added advantage of eliminating all
the double auth problems when someone gets a redirect from http://site/foo
to http://site.domain.com/foo/, in addition to fixing the redirect problem
here.

The problem is that construct_url doesn't take the request_rec but only
the server_rec.  A new function could be added easily enough that took the
request_rec instead and did this, since all construct_url does is:

API_EXPORT(char *) construct_url(pool *p, const char *uri,
                                 const server_rec *s)
{
    return pstrcat(p, "http://",
                   construct_server(p, s->server_hostname, s->port),
                   uri, NULL);
}

The disadvantage of this idea is that it doesn't discourage URL variance.
It doesn't encourage it either, though; it just lessens the effect of the
name changing in midstream.  This would eliminate a lot of user problems,
since most clients send the Host: header.
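
To make the proposal concrete, here is a minimal standalone sketch of what a
request-aware variant could look like. It is not Apache code: the
sketch_server_rec and sketch_request_rec structs and the function name
sketch_construct_url_r are hypothetical stand-ins, malloc replaces Apache's
pool allocation, and leaving out the port when it is 80 is an assumption
about what construct_server does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for Apache's server_rec and
 * request_rec; only the fields this sketch needs are modelled. */
typedef struct {
    const char *server_hostname;      /* configured ServerName */
    unsigned short port;              /* configured Port */
} sketch_server_rec;

typedef struct {
    const char *hostname;             /* from the Host: header, or NULL */
    const sketch_server_rec *server;
} sketch_request_rec;

/* Build "http://host[:port]uri", preferring the client-supplied host name
 * and falling back to the configured one when no Host: header was sent.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *sketch_construct_url_r(const char *uri, const sketch_request_rec *r)
{
    assert(uri != NULL && r != NULL && r->server != NULL);

    const char *host = r->hostname ? r->hostname
                                   : r->server->server_hostname;
    unsigned port = r->server->port;

    /* Room for the scheme, the host, a worst-case ":65535", the URI
     * and the terminating NUL. */
    size_t len = strlen("http://") + strlen(host) + strlen(":65535")
                 + strlen(uri) + 1;
    char *url = malloc(len);
    if (url == NULL)
        return NULL;

    if (port == 80)                   /* assumption: default port is omitted */
        snprintf(url, len, "http://%s%s", host, uri);
    else
        snprintf(url, len, "http://%s:%u%s", host, port, uri);
    return url;
}
```

With a request whose hostname is set, the redirect target keeps the name the
client actually used; with hostname left NULL, the result is the same as
today's ServerName-based URL.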

Comments?


"fake" vhosts now working- Thanks!

Posted by Brian Atkins <br...@posthuman.com>.
I have completed hacking our 1.2.4 server to allow "fake" vhosts
for our subdomain users. I took it live about 30 minutes ago
so we will see if any incompatibilities crop up... I tested it
on everything I could think of including Frontpage and it all
seems to work ok. We are now serving over 3300 vhosts and our
Apache process sizes are around 1.5 megs (less than 1 meg
resident).

I would like to personally thank Marc Slemko for his advice
regarding this- couldn't have done it without you! And also I
would like to say, that as a person who has very little real
experience with C, the Apache code is extremely well laid
out and easy to work on- even for me!

Hypermart bows to your greatness :-)
-- 
The future has arrived; it's just not evenly distributed.
                                                       -William Gibson
______________________________________________________________________
Visit Hypermart at http://www.hypermart.net for free business hosting!

Re: r->hostname vs. s->server->server_hostname

Posted by Rob Hartill <ro...@imdb.com>.
This won't help unless you use mod_perl, but anyway...

On Wed, 7 Jan 1998, Brian Atkins wrote:

> Ok, some progress now that I finally have a bit of time. I modified
> the prototype of construct_url to include the request_rec and
> modified the function and all references to it (I only found one
> in mod_dir.c?) to use the r->hostname in place of the old
> server_hostname. It works great!
> 
> Next I dropped in:
> table_set (e, "HOSTNAME", r->hostname);
> into add_common_vars() in util_script.c and this is the custom
> log format we are using:
> LogFormat "\"%{HOSTNAME}e\" %h %l %u %t \"%r\" %s %b \"%{Referer}i\"
> \"%{User-agent}i\""
> But in the logs now we get "-" instead of the hostname. Any
> idea what could be wrong?

I wanted to do something similar here.
I found a solution using mod_perl's  <Perl> </Perl> config parsing system:

for my $h (@virtual_hosts) {
   my $fingerprint = $logas{$h} ? $logas{$h} : "-";
   $Configuration::PerlConfig .= qq|
<VirtualHost $h.imdb.com>
  ServerName  $h.imdb.com
  LogFormat "%h $fingerprint %u %t \\"%r\\" %s %b \\"\\" \\"%{User-agent}i\\""
</VirtualHost>
|;  
}



--
Rob Hartill                              Internet Movie Database (Ltd)
http://www.moviedatabase.com/   .. a site for sore eyes.


Re: r->hostname vs. s->server->server_hostname

Posted by Brian Atkins <br...@posthuman.com>.
Ok, some progress now that I finally have a bit of time. I modified
the prototype of construct_url to include the request_rec and
modified the function and all references to it (I only found one
in mod_dir.c?) to use the r->hostname in place of the old
server_hostname. It works great!

Next I dropped in:
table_set (e, "HOSTNAME", r->hostname);
into add_common_vars() in util_script.c and this is the custom
log format we are using:
LogFormat "\"%{HOSTNAME}e\" %h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\""
But in the logs now we get "-" instead of the hostname. Any
idea what could be wrong?

Assuming we get the logging working, the last incompatibility
we have run into with these "fake" vhosts is Frontpage. For
instance, right now we have a copy of this server running on
port 8000, and if you try to List Webs on brian5.hypermart.net:8000
in Frontpage you will get a listing of all of our webs instead of
the single <root web> of brian5 (a test account of mine). So
Frontpage obviously is still picking up www.hypermart.net instead
of brian5.hypermart.net somewhere... What else should I be changing
in the source to get this working? CGI vars?

Thanks for any help

Re: r->hostname vs. s->server->server_hostname

Posted by Dean Gaudet <dg...@arctic.org>.
I think Martin is on vacation still.  He was working on a proper API for
url parsing and construction.  So it's somewhat orthogonal to this. 

I'd say just start grepping the source for ->server_hostname and ->port
... against 1.3 of course :)

If you're new to apache development, take a peek at http://dev.apache.org/
for some details on where to get the latest code and whatnot. 

Dean

On Sat, 3 Jan 1998, Brian Atkins wrote:

> Hi, we were the ones who posted about this to the newsgroup...
> Before I dive into the code to try and hack it myself, I'd like
> to know if anyone out there is working on this or plans to work
> on it (you mention some patches in progress below...) ?
> 
> Thanks, and any tips as to what lines to look at in the code
> would be most helpful!
> 
> Dean Gaudet wrote:
> > 
> > There's like a half-dozen PRs mentioning the related issue of how the port
> > number is chosen for generating these "canonical" URLs.  My opinion on
> > this topic is that web servers have a single canonical name, and catering
> > to "http://www/" is not worth any effort in core code.  I'd rather just
> > advocate a mod_rewrite rule that sends a redirect to correct broken URLs.
> > Apache right now dictates that a server has a single canonical name.
> > 
> > That doesn't mean I'd veto a change that gets rid of the concept of a
> > canonical name, and generates URLs using (r->hostname ? r->hostname :
> > r->server->server_hostname) and ntohs(r->connection->local_addr.sin_port).
> > 
> > If we go that way, the Port directive should be disallowed inside vhosts,
> > and Port should be documented as only there for legacy compatibility with
> > NCSA, in the main server only.
> > 
> > Some of this is related to Martin's URL parsing patches that are in
> > progress still.
> > 
> > Dean
> > 
> > On Thu, 1 Jan 1998, Marc Slemko wrote:
> > 
> > > background: say you want to use mod_rewrite to implement large numbers of
> > > pseudo vhosts (without using real vhosts with their overhead), eg. one for
> > > each user. It appears like this _almost_ works.  You need to make a custom
> > > log using {Host}i to get the server name logged right, but that isn't a
> > > big deal.
> > >
> > > The problem comes with when Apache has to generate the servername.  This
> > > happens in several places, but one most visable externally is in the
> > > creation of redirects; they get redirected to the ServerName name instead
> > > of the pseudo-vhost name.
> > >
> > > Why not use r->hostname instead of s->server_hostname for generating the
> > > redirect URL (and possibly other things... but you have to be very careful
> > > there...) if it is set?  This has the added advantages of eliminating all
> > > the double auth problems when someone gets a redirect from http://site/foo
> > > to http://site.domain.com/foo/ in addition to fixing the redirect problem
> > > here.
> > >
> > > The problem is that construct_url doesn't take the request_rec but only
> > > the server_rec.  A new function could be added easily enough that took the
> > > request_rec instead and did this, since all construct_url does is:
> > >
> > > API_EXPORT(char *) construct_url(pool *p, const char *uri, const
> > > server_rec *s)
> > > {
> > >     return pstrcat(p, "http://",
> > >                    construct_server(p, s->server_hostname, s->port),
> > >                    uri, NULL);
> > > }
> > >
> > > The disadvantage of this idea is that it doesn't discourage URL-variance,
> > > however it isn't encouraging it; just lessening the effect of it changing
> > > in midstream.  This would eliminate a lot of user problems since most
> > > clients send the Host: header.
> > >
> > > Comments?
> > >
> > >
> 
> -- 
> The future has arrived; it's just not evenly distributed.
>                                                        -William Gibson
> ______________________________________________________________________
> Visit Hypermart at http://www.hypermart.net for free business hosting!
> 


Re: r->hostname vs. s->server->server_hostname

Posted by Brian Atkins <br...@posthuman.com>.
Hi, we were the ones who posted about this to the newsgroup...
Before I dive into the code to try and hack it myself, I'd like
to know if anyone out there is working on this or plans to work
on it (you mention some patches in progress below...) ?

Thanks, and any tips as to what lines to look at in the code
would be most helpful!

Dean Gaudet wrote:
> 
> There's like a half-dozen PRs mentioning the related issue of how the port
> number is chosen for generating these "canonical" URLs.  My opinion on
> this topic is that web servers have a single canonical name, and catering
> to "http://www/" is not worth any effort in core code.  I'd rather just
> advocate a mod_rewrite rule that sends a redirect to correct broken URLs.
> Apache right now dictates that a server has a single canonical name.
> 
> That doesn't mean I'd veto a change that gets rid of the concept of a
> canonical name, and generates URLs using (r->hostname ? r->hostname :
> r->server->server_hostname) and ntohs(r->connection->local_addr.sin_port).
> 
> If we go that way, the Port directive should be disallowed inside vhosts,
> and Port should be documented as only there for legacy compatibility with
> NCSA, in the main server only.
> 
> Some of this is related to Martin's URL parsing patches that are in
> progress still.
> 
> Dean
> 
> On Thu, 1 Jan 1998, Marc Slemko wrote:
> 
> > background: say you want to use mod_rewrite to implement large numbers of
> > pseudo vhosts (without using real vhosts with their overhead), eg. one for
> > each user. It appears like this _almost_ works.  You need to make a custom
> > log using {Host}i to get the server name logged right, but that isn't a
> > big deal.
> >
> > The problem comes with when Apache has to generate the servername.  This
> > happens in several places, but one most visable externally is in the
> > creation of redirects; they get redirected to the ServerName name instead
> > of the pseudo-vhost name.
> >
> > Why not use r->hostname instead of s->server_hostname for generating the
> > redirect URL (and possibly other things... but you have to be very careful
> > there...) if it is set?  This has the added advantages of eliminating all
> > the double auth problems when someone gets a redirect from http://site/foo
> > to http://site.domain.com/foo/ in addition to fixing the redirect problem
> > here.
> >
> > The problem is that construct_url doesn't take the request_rec but only
> > the server_rec.  A new function could be added easily enough that took the
> > request_rec instead and did this, since all construct_url does is:
> >
> > API_EXPORT(char *) construct_url(pool *p, const char *uri, const
> > server_rec *s)
> > {
> >     return pstrcat(p, "http://",
> >                    construct_server(p, s->server_hostname, s->port),
> >                    uri, NULL);
> > }
> >
> > The disadvantage of this idea is that it doesn't discourage URL-variance,
> > however it isn't encouraging it; just lessening the effect of it changing
> > in midstream.  This would eliminate a lot of user problems since most
> > clients send the Host: header.
> >
> > Comments?
> >
> >


Re: r->hostname vs. s->server->server_hostname

Posted by Dean Gaudet <dg...@arctic.org>.
There's like a half-dozen PRs mentioning the related issue of how the port
number is chosen for generating these "canonical" URLs.  My opinion on
this topic is that web servers have a single canonical name, and catering
to "http://www/" is not worth any effort in core code.  I'd rather just
advocate a mod_rewrite rule that sends a redirect to correct broken URLs.
Apache right now dictates that a server has a single canonical name. 

That doesn't mean I'd veto a change that gets rid of the concept of a
canonical name, and generates URLs using (r->hostname ? r->hostname :
r->server->server_hostname) and ntohs(r->connection->local_addr.sin_port).

If we go that way, the Port directive should be disallowed inside vhosts,
and Port should be documented as only there for legacy compatibility with
NCSA, in the main server only. 
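
As a rough illustration of the expression suggested above, the following
standalone sketch derives both the name and the port from the request itself
rather than from configuration. The ex_server, ex_conn and ex_request structs
and the function ex_host_and_port are hypothetical stand-ins for the real
Apache records, and omitting the port when it is 80 is an assumption. Only
ntohs and struct sockaddr_in are the standard socket API.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins modelling only the fields used below. */
typedef struct {
    const char *server_hostname;      /* configured ServerName */
} ex_server;

typedef struct {
    struct sockaddr_in local_addr;    /* address the connection arrived on */
} ex_conn;

typedef struct {
    const char *hostname;             /* from the Host: header, or NULL */
    const ex_server *server;
    const ex_conn *connection;
} ex_request;

/* Write "host[:port]" into buf. The host comes from the request when
 * available, the port from the local socket address (stored in network
 * byte order, hence ntohs). Returns the snprintf-style length. */
int ex_host_and_port(char *buf, size_t buflen, const ex_request *r)
{
    assert(buf != NULL && r != NULL);
    assert(r->server != NULL && r->connection != NULL);

    const char *host = r->hostname ? r->hostname
                                   : r->server->server_hostname;
    unsigned port = ntohs(r->connection->local_addr.sin_port);

    if (port == 80)                   /* assumption: default port is omitted */
        return snprintf(buf, buflen, "%s", host);
    return snprintf(buf, buflen, "%s:%u", host, port);
}
```

Because the port is read from the accepted socket, a configured Port value
plays no part in the result, which is why the Port directive would become
legacy-only under this approach.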

Some of this is related to Martin's URL parsing patches that are in
progress still. 

Dean

On Thu, 1 Jan 1998, Marc Slemko wrote:

> background: say you want to use mod_rewrite to implement large numbers of
> pseudo vhosts (without using real vhosts with their overhead), eg. one for
> each user. It appears like this _almost_ works.  You need to make a custom
> log using {Host}i to get the server name logged right, but that isn't a
> big deal. 
> 
> The problem comes with when Apache has to generate the servername.  This
> happens in several places, but one most visable externally is in the
> creation of redirects; they get redirected to the ServerName name instead
> of the pseudo-vhost name.
> 
> Why not use r->hostname instead of s->server_hostname for generating the
> redirect URL (and possibly other things... but you have to be very careful
> there...) if it is set?  This has the added advantages of eliminating all
> the double auth problems when someone gets a redirect from http://site/foo
> to http://site.domain.com/foo/ in addition to fixing the redirect problem
> here.
> 
> The problem is that construct_url doesn't take the request_rec but only
> the server_rec.  A new function could be added easily enough that took the
> request_rec instead and did this, since all construct_url does is:
> 
> API_EXPORT(char *) construct_url(pool *p, const char *uri, const
> server_rec *s)
> {
>     return pstrcat(p, "http://",
>                    construct_server(p, s->server_hostname, s->port),
>                    uri, NULL);
> }
> 
> The disadvantage of this idea is that it doesn't discourage URL-variance,
> however it isn't encouraging it; just lessening the effect of it changing
> in midstream.  This would eliminate a lot of user problems since most
> clients send the Host: header.
> 
> Comments?
> 
>