Posted to dev@httpd.apache.org by Andrew Finkenstadt <ka...@icon-stl.net> on 1998/11/03 06:52:53 UTC

Apache 2.0 ideas

Greetings.

Many of you have no idea who I am, so I thought I'd introduce myself and give
out a half-baked idea for Apache 2.0 development.

In my day job I'm a senior developer for an online game company whose
simultaneous interactive connection load runs in the thousands for one product,
and in the tens of thousands in aggregate.  While I haven't directly written
any code that talks at low levels to our user base, I have done quite a bit of
work with stateful expressions of HTML pages, both with IIS and Apache, with or
without mod_perl to help.

I've been reading new-httpd for several months and have been enlightened
several times about what the OSS process can be like.

I was doing some serious cogitating on how to distribute the load for a
generic application that has various parts communicating via a well-defined
messaging interface across potentially thousands of processors or processes,
and came across the LISTSERV method of disconnected virtual machines. (cf:
http://www.lsoft.com/listserv-hist.stm )  That said, a strong message passing
architecture (similar to the apache request_rec but designed to minimize
expensive memory-to-memory copies) would probably suffice, avoiding both the
multiple independent processes that might be used elsewhere and the monolithic
single-threaded (unix) process of LISTSERV.

In a fit of fancy I started sketching out on the back of a napkin at dinner
this evening just how you'd go about dividing the various portions of Apache
into DVMs, and basically came up with this model:

   USER sends 
     one or more (perhaps empty, perhaps lengthy) TRANSACTIONS to 
       a SERVER who eventually GENERATES 
         a (perhaps empty, perhaps lengthy, perhaps delayed) REPLY
   expected by the USER.

At heart, this describes just about any send-expect protocol, of which
HTTP/1.0 is one.  

HTTP/1.1 adds complexity by attempting to multiplex requests across one
expensive-to-bring-up connection, along with various add-ons for content
language negotiation and abilities to signal third-party waystations (cache
servers, proxy servers, etc) about the contents.  One must deal with this
complexity while still keeping each transaction stateless.  There's
no guarantee that a proxy server implementing /1.1 WOULDN'T intermingle
MULTIPLE users' requests across the same connection, and so one cannot assume
that requests on kept-open connections have anything to do with each other.

Thus we deal with HTTP/1.1 conceptually by generating multiple transactions
and combining output back to the requesting user.

Thus, we end up with layers that:

  Read in and gather an entire transaction (POST/PUT data, etc)
  Submit the transaction-message to the server black box.
  Magically deal with 1.1 multiple transactions, output chunking,
    etc.

The server black box:

  receives a message (the digested transaction, which doesn't necessarily have
to come from an HTTP processor; it could just as easily come from a Gopher
processor or a command-line exerciser),

  has various ways of knowing how to service the request through the various
phases (authorization, authentication, etc),

  has various back end methods of retrieving data (file, process, CGI,
mod_perl transaction handler, et.al.),

  and passes the result back to the originator as yet another message.

If at any point it becomes important to transfer the message to another
processor instead of something in local memory, then it is passed across
transparently with its accompanying environment necessary for processing
(magic, but we do it here), and the result re-inserted into the output chain
when the message has been processed.

I would think this sort of processing would work on multiple-processor boxes,
or in a single-shared-memory multi-threaded program, where the time-cost of
copying memory to toss messages around can be minimized.

I'm not sure how well this could be implemented using the current NSPR
implementation.

-A

Re: Apache 2.0 ideas

Posted by Rasmus Lerdorf <ra...@lerdorf.on.ca>.
> A thriving apache module ecology is very important.  Many interesting
> modules want to reside in their own process.  An I/O pipeline with
> useful semantic elements for modules lets Apache do what it should
> do: protocol, fast I/O, and module orchestration.

One of the most frequently asked questions I get from large ISPs with
respect to the PHP module is how they can set things up such that each
virtual host, or even each individual user, can run their PHP scripts in a
secure manner.

My usual response is to not run the Apache module version of PHP, but
instead run the CGI version through something like suExec.  This works
fine and is nice and secure for the most part, but the performance is
terrible.

Another response I give them is to tell them to run a separate pool of
httpd processes per virtual host.  If they are using name-based virtual
hosts this requires a bit of mod_rewrite trickery on port 80 of their main
server to redirect to whichever ip's and ports they have decided to run
the various pools on.  For IP-based virtual hosts it is actually a very
clean solution.

It would be nice if something could be done in 2.0 to allow a
multi-process and multi-threaded model where individual threaded processes
could run as different users.

-Rasmus


Re: Apache 2.0 ideas

Posted by Ben Hyde <bh...@pobox.com>.
I've fussed with the order here.

Dean Gaudet writes:
> ... We're not going to get anywhere further in
>the performance game without threads.  There's no point in even worrying
>about comparing the performance of unixes that lack threads... if they
>lack threads they probably also lack all the fundamental TCP/IP
>improvements necessary to even think about comparing HTTP performance.

Absolutely.

Dean Gaudet writes:
>On Tue, 3 Nov 1998, Andrew Finkenstadt wrote:
...
>> Yes, it would leave behind many flavors of Unix that don't have good support
>> for shared memory, but it would beat the pants out of Microsoft.
>
>Why worry about shared memory?  

A thriving apache module ecology is very important.  Many interesting
modules want to reside in their own process.  An I/O pipeline with
useful semantic elements for modules lets Apache do what it should
do: protocol, fast I/O, and module orchestration.

Two large unix applications can NEVER be linked together into a single
address space, because each has such interestingly baroque things
built atop the read/write/select/errno API.  I.e., they all have
different process models - more power to them.

The unix culture presumes that a pipeline is the right way to
do this.  A pipeline with shared memory offers the chance at
zero copy.

Shared memory is the pipeline advocates trying to sing a few verses
from the performance hymn book - oh happy day.  We should all sing
a few hymns from the modularity hymn book.

  - ben


Re: Apache 2.0 ideas

Posted by Marc Slemko <ma...@znep.com>.
On Tue, 3 Nov 1998, Jim Gettys wrote:

> No, from userland, the fastest server will be one which caches (small)
> objects in memory, and then does a single send() of the cached memory.
> 
> File opens are expensive.  Save sendfile() for big objects, where the
> open overhead isn't significant.

No, you have to keep a cache of descriptors.  Then you completely avoid
having two fighting caches and inherit all the work that is already done
for the VM system.

Then you just need a system that scales well with large numbers of
descriptors.  Sun says Solaris 7 will handle "15000 times more open
sockets".  Can't figure out what that is 15000 times more than, though; if
it is against 2.6, then a conservative estimate would say 7 would have to
handle at least 15 million.  Granted, that isn't necessarily per process.
Other systems are or will be making such things efficient as well.

I'm not too convinced that you are likely to get a huge win from merging
writes for multiple responses at the syscall level.  If you do, then I
would think that may show more that syscalls are too expensive on your
kernel.

I think there is a big advantage of sendfile() vs. mmap() in that the
implementation can pick the cheapest way to send small files depending on
the architecture.

> And yes, there was a crazy who thought putting X in the kernel was a win
> as well.  Didn't end up with better performance, and never got very
> stable (since a bug crashed your system, debugging was a pain).
> The CPU runs just as fast in user space as in kernel...

Yes, but a web server serving static content is far less CPU bound in
userland.  In fact, for the benchmark case, you can cut the processing CPU
down very very low.  Then the overhead for shipping the data can be
significant.


Re: Apache 2.0 ideas

Posted by Marc Slemko <ma...@znep.com>.
On Tue, 3 Nov 1998, Dean Gaudet wrote:

> 
> 
> On Tue, 3 Nov 1998, Jim Gettys wrote:
> 
> > No, from userland, the fastest server will be one which caches (small)
> > objects in memory, and then does a single send() of the cached memory.
> > 
> > File opens are expensive.  Save sendfile() for big objects, where the
> > open overhead isn't significant.
> 
> We can argue about it, but the best thing would be to measure ;) 
> 
> open()s aren't as expensive under linux as they are elsewhere... and
> sendfile() isn't "thread safe" in the sense that you can use a single fd
> with multiple threads (so caching open fds isn't worth it).  Linus keeps

Why not?

If the Linux one isn't, then it is broken and should be fixed AFAIK.


Re: Apache 2.0 ideas

Posted by Jim Gettys <jg...@pa.dec.com>.

> From: Dean Gaudet <dg...@arctic.org>
> Date: Tue, 3 Nov 1998 10:08:57 -0800 (PST)
> To: Jim Gettys <jg...@pa.dec.com>
> Cc: new-httpd@apache.org
> Subject: Re: Apache 2.0 ideas
> -----
> On Tue, 3 Nov 1998, Jim Gettys wrote:
> 
> > No, from userland, the fastest server will be one which caches (small)
> > objects in memory, and then does a single send() of the cached memory.
> >
> > File opens are expensive.  Save sendfile() for big objects, where the
> > open overhead isn't significant.
> 
> We can argue about it, but the best thing would be to measure ;)
> 

Yup.  Measurement is the only way.

> open()s aren't as expensive under linux as they are elsewhere... and
> sendfile() isn't "thread safe" in the sense that you can use a single fd
> with multiple threads (so caching open fds isn't worth it).  Linus keeps
> claiming that open() is the way to go, it'd be worthwhile to prove or
> disprove his claim.

Open on UNIX/Linux is relatively cheap; but that is cheap relative to
other operating systems (on a 1-MIPS VAX, a file open on VMS was 10x as
expensive as on UNIX, at 1/4 CPU second).  Things have changed somewhat
as systems have sped up, but I'd be amazed if Linux does too much better
than "conventional UNIX".

I think Linus is wrong here (from first hand experience).

And it doesn't solve your synchronization problem; a file can be updated
rather than replaced.

I agree with the attitude that measurement is best; but 2 system calls/request
(open(), sendfile()) relative to a fraction of one isn't even remotely
comparable.  When I get my home system up under Linux, though, I may make
some measurements.
 
> 
> To cache things in memory requires synchronization between threads... to
> use open() lets the kernel do its best job of synchronization... which is
> really where I prefer to let that happen.  If userland could do fancy
> spinlock tricks I wouldn't worry about it so much.  But those are
> extremely non-portable.  I'd rather give the kernel as many opportunities
> as possible to parallelize on SMP systems.  ('cause then it's the kernel
> folks' problems to make things go fast ;)

Userland can do fancy spin-lock tricks, on good systems; they make a system
call only if the lock remains contended for a significant period.
Acquiring an uncontended lock should cost only a trip to main memory,
on a good operating system.

Build for a good system here; those who don't measure up will profile,
find they are spending too much time in some part or the other, and
then fix their systems.  So long as Apache can be made to run on most
systems, you've won the portability game.  Vendors (and Linux) will
fix their systems as it becomes clear there is a win.

> 
> > Fundamentally, for a pipelined server with good buffering, you can end
> > up with much less than one system call/operation.  This is what makes
> 
> Yeah I showed this with apache 1.3 with a few small tweaks -- the main one
> required is to get rid of the calls to time() and use a word of shared
> memory for the time.  (This is functionality the kernel/libc folks should
> provide, either through shared mem or through the now ubiquitous time
> stamp counters on all modern processors.)  I showed 75 responses in 21
> syscalls.

Yup; X had the time problem too: events get timestamped.
The solution in X was either:
	o just do a time call on every batch of input
	events (crummy systems), or
	o put the time into a shared memory segment, updated by the
	device driver as each event occurs (smart systems).
Unfortunately, there are lots of dumb X ports out there (haven't seen
what XFree86 does).

> 
> > load (when cycles are scarcest); this is the ideal situation.  A web
> > server probably can't be as simpleminded, but you get the idea anyway.
> 
> In theory it can -- if you're doing userland threads and they're
> multiplexed with select() then you get much of the benefit of how X works.
> That's why I find the userland and userland/kernel hybrid approaches to
> threading so much more interesting than pure kernel threads.
> 
> (Note:  I know we could write a webserver without threads, much like
> squid, but it couldn't be apache then -- it's too hard to do general
> module support without threads or processes.)
> 
> > The problem this model faces for a Web server is how the server gets
> > informed that its underlying database is different, so that it can't
> > trust its in memory copy.  I leave this as an exercise to the readers :-).
> 
> The web server has one other thing going for it in kernel land -- intense
> usage of cached disk data.  X doesn't have that.  For example, a cached
> 1Mb file requires 256 4K pages.  If you've got an intelligent network card
> you can completely avoid 256 TLB misses on each response doing the work in
> the kernel -- or by providing a sendfile()-style interface... anything to
> avoid the need for v->p mappings.

Yup; my message said that sendfile() would likely be a win for large objects;
here the system call overhead is not significant.  But most web objects
are small; system overhead dominates most of the time.

Even this isn't necessarily true; on some systems, you can have large
TLB entries (megabytes in size).  The question I have no data on is
how many systems actually do anything with vadvise() to get the
system to "do the right thing".  But I believe a compromise between
all-in-memory and all-sendfile is a portable, high performance solution.
Exactly where the object-size cutoff lies would be worth some performance
analysis once one has running code.

> 
> I really have to put a caveat on all of this:  I'm just blowing hot air, I
> haven't measured any of this, and I'm not likely to do it soon.
> 

I measured all of this for X, way back when (in other words, I don't believe 
I'm blowing ANY smoke).  We had to beat the competition (of the day), 
which was kernel-based systems.

It is truly a win to have the ease of debugging in user space.

I guarantee that with cleverness and care one will be able to always
beat kernel systems in userland.

I used to characterize X as "X is an exercise in avoiding system calls".
It is a good mantra.
				- Jim


Re: Apache 2.0 ideas

Posted by Dean Gaudet <dg...@arctic.org>.

On Tue, 3 Nov 1998, Jim Gettys wrote:

> No, from userland, the fastest server will be one which caches (small)
> objects in memory, and then does a single send() of the cached memory.
> 
> File opens are expensive.  Save sendfile() for big objects, where the
> open overhead isn't significant.

We can argue about it, but the best thing would be to measure ;) 

open()s aren't as expensive under linux as they are elsewhere... and
sendfile() isn't "thread safe" in the sense that you can use a single fd
with multiple threads (so caching open fds isn't worth it).  Linus keeps
claiming that open() is the way to go, it'd be worthwhile to prove or
disprove his claim. 

To cache things in memory requires synchronization between threads... to
use open() lets the kernel do its best job of synchronization... which is
really where I prefer to let that happen.  If userland could do fancy
spinlock tricks I wouldn't worry about it so much.  But those are
extremely non-portable.  I'd rather give the kernel as many opportunities
as possible to parallelize on SMP systems.  ('cause then it's the kernel
folks' problems to make things go fast ;)

> Fundamentally, for a pipelined server with good buffering, you can end 
> up with much less than one system call/operation.  This is what makes 

Yeah I showed this with apache 1.3 with a few small tweaks -- the main one
required is to get rid of the calls to time() and use a word of shared
memory for the time.  (This is functionality the kernel/libc folks should
provide, either through shared mem or through the now ubiquitous time
stamp counters on all modern processors.)  I showed 75 responses in 21
syscalls.

> load (when cycles are scarcest); this is the ideal situation.  A web
> server probably can't be as simpleminded, but you get the idea anyway.

In theory it can -- if you're doing userland threads and they're
multiplexed with select() then you get much of the benefit of how X works. 
That's why I find the userland and userland/kernel hybrid approaches to
threading so much more interesting than pure kernel threads. 

(Note:  I know we could write a webserver without threads, much like
squid, but it couldn't be apache then -- it's too hard to do general
module support without threads or processes.) 

> The problem this model faces for a Web server is how the server gets
> informed that its underlying database is different, so that it can't
> trust its in memory copy.  I leave this as an exercise to the readers :-).

The web server has one other thing going for it in kernel land -- intense
usage of cached disk data.  X doesn't have that.  For example, a cached
1Mb file requires 256 4K pages.  If you've got an intelligent network card
you can completely avoid 256 TLB misses on each response doing the work in
the kernel -- or by providing a sendfile()-style interface... anything to
avoid the need for v->p mappings.

I really have to put a caveat on all of this:  I'm just blowing hot air, I
haven't measured any of this, and I'm not likely to do it soon. 

Dean



Re: Apache 2.0 ideas

Posted by Jim Gettys <jg...@pa.dec.com>.
> Sender: new-httpd-owner@apache.org
> From: Dean Gaudet <dg...@arctic.org>
> Date: Mon, 2 Nov 1998 23:47:28 -0800 (PST)
> To: new-httpd@apache.org
> Subject: Re: Apache 2.0 ideas
> -----
> On Tue, 3 Nov 1998, Andrew Finkenstadt wrote:
> 
> > On further reflection and after reading the "Halloween Document" (
> > http://www.tuxedo.org/~esr/halloween.html ) and Microsoft's alleged desire
> to
> > more tightly integrate IIS into the kernel, ...
> 
> IBM and Sun have already done it.
> 
> > Yes, it would leave behind many flavors of Unix that don't have good support
> > for shared memory, but it would beat the pants out of Microsoft.
> 
> Why worry about shared memory?  We're not going to get anywhere further in
> the performance game without threads.  There's no point in even worrying
> about comparing the performance of unixes that lack threads... if they
> lack threads they probably also lack all the fundamental TCP/IP
> improvements necessary to even think about comparing HTTP performance.
> 
> > We should take a page from Oracle's book on semaphores and enqueues, by
> making
> > the critical sections as small as possible, and as fine-grained as possible,
> > allowing multiple processes access to the data without road-blocking.
> 
> There's essentially no userland synchronization required in a static
> content web server (i.e. a benchmark web server).  For example on linux
> open()/sendfile() should produce the fastest web server possible from
> userland... and there's nothing in there which requires userland to
> synchronize (you have to do a little magic with memory allocation).  So
> this is easy.
> 

No, from userland, the fastest server will be one which caches (small)
objects in memory, and then does a single send() of the cached memory.

File opens are expensive.  Save sendfile() for big objects, where the
open overhead isn't significant.

Let's take a page with embedded objects, most of which are small enough
to be cached.  If you do the server right and are caching in main memory
(rather than always sending from files), you can potentially do multiple
objects in a single writev() system call.

This beats a sequence of open()/sendfile()'s all to h*** and gone.

Fundamentally, for a pipelined server with good buffering, you can end 
up with much less than one system call/operation.  This is what makes 
the X Window System fast (when well implemented). One system call reads
a bunch of requests into a buffer, another writes the results into an
output buffer.  If there are a bunch of requests in a batch, you can
get well under 1 system call/operation.  In the X case, which has a relatively
compact protocol (though not as compact as I'd do if we had it to do over
again), you are way under one system call/request.

The basic scheduling loop in the X server is to do a select(), which
tells you all the connections that have work to do; it then round robins
among those connections, and handles a buffer full of requests before
moving on; it only does another select when it has done all the work
it can on all connections.  This means the select overhead drops as
load on the server goes up, so that it runs at best performance at
load (when cycles are scarcest); this is the ideal situation.  A web
server probably can't be as simpleminded, but you get the idea anyway.

And yes, there was a crazy who thought putting X in the kernel was a win
as well.  It didn't end up with better performance, and never got very
stable (since a bug crashed your whole system, debugging was a pain).
The CPU runs just as fast in user space as in kernel...

The problem this model faces for a Web server is how the server gets
informed that its underlying database is different, so that it can't
trust its in memory copy.  I leave this as an exercise to the readers :-).
					- Jim

Re: Apache 2.0 ideas

Posted by Simon Spero <se...@tipper.oit.unc.edu>.
There are some places where you need locks in user space, but not many, and
most can be moved out of the critical path.  The most obvious (i.e. the ones I
can remember before coffee) are logs and status tables.  However, if you have
light-weight threads, you can set up an extra thread to do the status tasks
that might block, leaving the critical-path thread to keep on going.

BTW, Dean is right about layered I/O being something to be leery of IFF a full
chain of up and down calls is needed for every send.  One of the main ideas of
the IHOP flat stacks is that in the static case, the back end should be able
to pass a file name into the middle end, and the middle end queues up a
transmetafile in the front end.  You do need back-end layering if you want to
properly support on-the-fly encoding/format conversion, lazy MD5 caching, etc;
layering in the front end is a big win as well, as long as the middle end can
pick ILP'ed versions wherever possible.

Simon

Dean Gaudet wrote:

> On Tue, 3 Nov 1998, Marc Slemko wrote:
>
> > On Mon, 2 Nov 1998, Dean Gaudet wrote:
> >
> > > There's essentially no userland synchronization required in a static
> > > content web server (i.e. a benchmark web server).  For example on linux
> > > open()/sendfile() should produce the fastest web server possible from
> > > userland... and there's nothing in there which requires userland to
> > > synchronize (you have to do a little magic with memory allocation).  So
> > > this is easy.
> >
> > Header caching?
>
> Why bother?
>
> You can use a few rw locks to do the Date: header without much
> synchronization at all... caching headers may be worth it single cpu, but
> in a four way I bet you'll be a lot happier without it.
>
> Dean


Re: Apache 2.0 ideas

Posted by Dean Gaudet <dg...@arctic.org>.

On Tue, 3 Nov 1998, Marc Slemko wrote:

> You aren't caching open descriptors?

You can't use them on linux.  Linux sendfile doesn't have an offset
parameter, so you can't use the same fd from two threads at once... and
you'd have to pay an lseek() regardless.

Dean


Re: Apache 2.0 ideas

Posted by Marc Slemko <ma...@znep.com>.
On Tue, 3 Nov 1998, Dean Gaudet wrote:

> On Tue, 3 Nov 1998, Marc Slemko wrote:
> 
> > On Mon, 2 Nov 1998, Dean Gaudet wrote:
> > 
> > > There's essentially no userland synchronization required in a static
> > > content web server (i.e. a benchmark web server).  For example on linux
> > > open()/sendfile() should produce the fastest web server possible from
> > > userland... and there's nothing in there which requires userland to
> > > synchronize (you have to do a little magic with memory allocation).  So
> > > this is easy.

You aren't caching open descriptors?

> > 
> > Header caching?
> 
> Why bother?

Well you need metadata caching of some sort, no, even if you don't want
to cache generated headers?  

Even if you didn't for filesystem "backed" documents, other backings may
make getting metadata a lot more expensive.

> You can use a few rw locks to do the Date: header without much
> synchronization at all... caching headers may be worth it single cpu, but
> in a four way I bet you'll be a lot happier without it. 

Mmm.  Maybe.


Re: Apache 2.0 ideas

Posted by Dean Gaudet <dg...@arctic.org>.

On Tue, 3 Nov 1998, Marc Slemko wrote:

> On Mon, 2 Nov 1998, Dean Gaudet wrote:
> 
> > There's essentially no userland synchronization required in a static
> > content web server (i.e. a benchmark web server).  For example on linux
> > open()/sendfile() should produce the fastest web server possible from
> > userland... and there's nothing in there which requires userland to
> > synchronize (you have to do a little magic with memory allocation).  So
> > this is easy.
> 
> Header caching?

Why bother?

You can use a few rw locks to do the Date: header without much
synchronization at all... caching headers may be worth it single cpu, but
in a four way I bet you'll be a lot happier without it. 

Dean


Re: Apache 2.0 ideas

Posted by Marc Slemko <ma...@worldgate.com>.
On Mon, 2 Nov 1998, Dean Gaudet wrote:

> There's essentially no userland synchronization required in a static
> content web server (i.e. a benchmark web server).  For example on linux
> open()/sendfile() should produce the fastest web server possible from
> userland... and there's nothing in there which requires userland to
> synchronize (you have to do a little magic with memory allocation).  So
> this is easy.

Header caching?


Re: Apache 2.0 ideas

Posted by Dean Gaudet <dg...@arctic.org>.

On Tue, 3 Nov 1998, Andrew Finkenstadt wrote:

> On further reflection and after reading the "Halloween Document" (
> http://www.tuxedo.org/~esr/halloween.html ) and Microsoft's alleged desire to
> more tightly integrate IIS into the kernel, ...

IBM and Sun have already done it. 

> Yes, it would leave behind many flavors of Unix that don't have good support
> for shared memory, but it would beat the pants out of Microsoft.

Why worry about shared memory?  We're not going to get anywhere further in
the performance game without threads.  There's no point in even worrying
about comparing the performance of unixes that lack threads... if they
lack threads they probably also lack all the fundamental TCP/IP
improvements necessary to even think about comparing HTTP performance.

> We should take a page from Oracle's book on semaphores and enqueues, by making
> the critical sections as small as possible, and as fine-grained as possible,
> allowing multiple processes access to the data without road-blocking.

There's essentially no userland synchronization required in a static
content web server (i.e. a benchmark web server).  For example on linux
open()/sendfile() should produce the fastest web server possible from
userland... and there's nothing in there which requires userland to
synchronize (you have to do a little magic with memory allocation).  So
this is easy.

Dean



Re: Apache 2.0 ideas

Posted by Andrew Finkenstadt <ka...@icon-stl.net>.
On further reflection and after reading the "Halloween Document" (
http://www.tuxedo.org/~esr/halloween.html ) and Microsoft's alleged desire to
more tightly integrate IIS into the kernel, ...

(yes, it's been a long evening of thinking)

if you posit the presence of a large pool of shared memory between processes
of varying flavors (kernel-space and user-space) and the existence of a fast
semaphore capability in the underlying kernel to gate access to critical
segments of that shared memory (which is essentially how Windows NT operates),

then you can gain the zero memory copy capability very easily, by leaving the
various chunks of your data in the pool of shared memory, and sending them
directly from kernel space with a write-multi type system call aka message.

Yes, it would leave behind many flavors of Unix that don't have good support
for shared memory, but it would beat the pants out of Microsoft.

We should take a page from Oracle's book on semaphores and enqueues, by making
the critical sections as small as possible, and as fine-grained as possible,
allowing multiple processes access to the data without road-blocking.

-andy





Andrew Finkenstadt wrote:
> a strong message passing
> architecture (similar to the apache request_rec but designed to minimize
> expensive memory-to-memory copies) would probably suffice to avoid multiple
> independent processes like could be used elsewhere, or the monolithic
> single-threaded (unix) process of LISTSERV.
>

Re: Apache 2.0 ideas

Posted by Dean Gaudet <dg...@arctic.org>.

On Mon, 2 Nov 1998, Andrew Finkenstadt wrote:

> HTTP/1.1 adds complexity by attempting to multiplex connections across one
> expensive-to-bring-up connection, along with various add-ons for content

I think you're confusing HTTP/1.1 with HTTP/ng.  HTTP/1.1 adds only
pipelining. 

> Thus, we end up with layers that:
> 
>   Read in and gather an entire transaction (POST/PUT data, etc)
>   Submit the transaction-message to the server black box.
>   Magically deal with 1.1 multiple transactions, output chunking,
>     etc.

I wouldn't say "gather".  For MUX all you need is a layer which does
packet assembly and disassembly.  It should present an interface similar
to accept()/read()/write() to the rest of the server. 

Dean