Posted to dev@httpd.apache.org by un...@riverstyx.net on 1999/04/18 22:46:46 UTC

select vs. threads vs. fork

I was just wondering whether anyone's put any thought into making Apache
into a select-based multiplexing web server instead of the concurrent
process model that it currently is?  Looking around, I've seen a couple
servers (thttpd, mathopd, boa) that are way higher in performance... and
they look like they'd be really easy to code modules for.  I don't know
how difficult it would be to port Apache though...

---
tani hosokawa
river styx internet



Re: select vs. threads vs. fork

Posted by Ryan Mooney <ry...@area51.verge.net>.
I've read through all the archives on this (well, the last year or so :),
and I've yet to see anyone propose the following model.  Let me prefix
this by stating I've seen this model handle over 2000 simultaneous users
in a transaction processing environment on a pretty weenie machine with
good response time (although all connections were persistent and long
lasting, which could significantly change the model).  This is very similar
to a lot of the existing models, but is quite different in the actual
implementation.  I know a lot of people feel that select() has a lot of
overhead; that may be, but I haven't found good profiling stats that show
exactly how onerous it is (if someone has them I would be very interested).

Anyway...

Front End 
  [Preforking select-based daemon; hands the listen socket off to another
   front end when full]

Backends 
  [Connect to the front end through message queues {we used shared mem,
   but I'm not sure of the portability there - could be a big argument}.
   Actually process the transactions and hand them back through the
   front ends when done]
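
A hypothetical sketch of the backend half of that model, using System V
message queues as the hand-off channel (our system used shared memory; the
message layout and queue key here are illustrative assumptions, not code
from that system):

    #include <sys/ipc.h>
    #include <sys/msg.h>

    struct txn_msg {                /* hypothetical message layout */
        long mtype;                 /* 1 = request, 2 = response */
        char body[512];
    };

    void backend_loop(key_t key)    /* illustrative only */
    {
        int q = msgget(key, 0666 | IPC_CREAT);
        struct txn_msg m;
        for (;;) {
            msgrcv(q, &m, sizeof m.body, 1, 0);   /* wait for a request */
            /* ... actually process the transaction into m.body ... */
            m.mtype = 2;
            msgsnd(q, &m, sizeof m.body, 0);      /* hand the result back
                                                     to the front end */
        }
    }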

Benefits:
  - Abstraction of the callback "mess" into a relatively simple frontend
  - Abstraction of backend processes with a standardized message queuing
    system reaps "other" benefits in that it can easily be reused elsewhere.
  - Quite fast
  - Possibility of creating more tightly integrated server farms
  - Simplified programming for the backends (and only gurus hit the
    front end part, which is basically what happens now).
  - You can abstract the "frontend/backend" into either processes or
    threads (ok, not really sure how that's a benefit).

Drawbacks:
  - May actually be slower?  We won't know unless someone tries it, but
    given the way web traffic works the overhead MAY outweigh the gains;
    this would require careful profiling/benchmarking.
  - Slightly more complex than the current model.

I'm not saying that this is the "way to go" or the fastest way or even the
best way.  I am saying I've seen it perform very well in a slightly different
environment that had certain analogies to the model at hand, and it's worth
exploring.

> And before someone says "use multiple i/o threads like squid does": gee,
> why don't we just use threads? 
> 
> And before someone says "use multiple processes like zeus does": gee,
> we're already planning on that.  If you think a single process select
> based threading model is better, then you can build with mit-pthreads.
> 
> It all comes down to simplicity.  Callback/event-loop programming is not
> immediately obvious to all programmers.  If that's the model we used, it
> would become much more difficult for third party folks to integrate.
> Callback programming is such a different model that many third party
> libraries which do i/o can't be used.
> 
> yadda yadda. 
> 
> This is all in the archives somewhere.  It repeats about once every other
> month or so. 
> 
> Repeat the mantra:  correct first, fast second.  If you think you can do
> better I'd love to be proved wrong.  But if all you want is to max out a
> benchmark, then I think you're taking the wrong approach.

Re: select vs. threads vs. fork

Posted by Dean Gaudet <dg...@arctic.org>.

On Mon, 19 Apr 1999, Nitin Sonawane wrote:

> example of the open system call.  Wouldn't the kernel block you while each
> directory in the path is being read, searched, the inode looked up, and
> the inode read in?

Yup. 

And then imagine it running on NFS.  Which does occur in the real world, a
lot.

And before someone says "cache it all": get your head out of the
benchmarks. 

And before someone says "async i/o": gee, I don't see an aio_open. 

And before someone says "use multiple i/o threads like squid does": gee,
why don't we just use threads? 

And before someone says "use multiple processes like zeus does": gee,
we're already planning on that.  If you think a single process select
based threading model is better, then you can build with mit-pthreads.

It all comes down to simplicity.  Callback/event-loop programming is not
immediately obvious to all programmers.  If that's the model we used, it
would become much more difficult for third party folks to integrate.
Callback programming is such a different model that many third party
libraries which do i/o can't be used.

yadda yadda. 

This is all in the archives somewhere.  It repeats about once every other
month or so. 

Repeat the mantra:  correct first, fast second.  If you think you can do
better I'd love to be proved wrong.  But if all you want is to max out a
benchmark, then I think you're taking the wrong approach.

Dean



Re: select vs. threads vs. fork

Posted by Marc Slemko <ma...@znep.com>.
On Mon, 19 Apr 1999, Nitin Sonawane wrote:

> Hi,
>     This 'great performance' argument has always puzzled me.  While
> running in a single process, every time you make a system call, you'd get
> context switched or BLOCKED at the kernel's discretion.  Take a simple
> example of the open system call.  Wouldn't the kernel block you while each
> directory in the path is being read, searched, the inode looked up, and
> the inode read in?  Where does the high performance come from?

Caching the whole document tree and only having to deal with network IO.
That is, after all, what most current benchmarks are about.


Re: select vs. threads vs. fork

Posted by Nitin Sonawane <ns...@infolibria.com>.
> 
> > I don't mean to get into a flame war but rather a brainstorming of 'why
> > would event-driven servers ever perform better than a
> > multiprocess/threaded server'.
> 
> The argument, once you've realised the above, is that of where do you
> store the context.  You need context -- it's either implicit in the call
> stack (threads), or explicit in the data structures
> (callback/event-driven).
> 
> Both camps have compelling stories on both sides of that argument (you
> have to start arguing microarchitectural details about caches, registers,
> and how the compiler interacts).  But the threads camp wins the "ease of
> implementation" argument in my books.

	I would readily agree about ease of use.  I realised that the hard way a
few years ago when an initial event-driven model (on an embedded server)
got so far out of hand that it was much easier to write a few cpu
context switching routines and threadise the whole thing.  Anyway, if
that issue were set aside, should we simply accept Marc's assertion
that these so-called fast servers 'cache the whole document tree and
only deal with network IO'?  So when someone gives Zeus or other examples
of being 'fast by design', is there an element of truth in that?  If so,
what is it?

Cheers,
Nitin.

Re: select vs. threads vs. fork

Posted by Dean Gaudet <dg...@arctic.org>.

On Mon, 19 Apr 1999, Nitin Sonawane wrote:

> The second issue I don't understand is 'unnecessary context switches'.

The goal is to maximize the use of every timeslice the kernel hands your
thread/task/process/lwp/whatever.  Kernel context switches tend to be more
heavyweight than userland switches (which are about as expensive as a
function call).
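
To make "userland switch" concrete, here is a minimal sketch of two
contexts trading control with swapcontext() -- roughly the mechanism a
userland threads package relies on; the stack size and task body are
illustrative assumptions, not code from any package discussed here:

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, task_ctx;
    static char task_stack[16384];          /* assumed stack size */

    static void task(void)
    {
        printf("in task\n");
        swapcontext(&task_ctx, &main_ctx);  /* userland switch back */
    }

    int main(void)
    {
        getcontext(&task_ctx);
        task_ctx.uc_stack.ss_sp   = task_stack;
        task_ctx.uc_stack.ss_size = sizeof task_stack;
        task_ctx.uc_link          = &main_ctx;
        makecontext(&task_ctx, task, 0);
        swapcontext(&main_ctx, &task_ctx);  /* switch into the task */
        printf("back in main\n");
        return 0;
    }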

> Isn't it the case that the kernel is incessantly servicing peripheral
> interrupts?  If so, wouldn't context switches be almost unavoidable? 

In some sense yeah, but you're talking about a different context.  Also,
high speed networking is polled rather than interrupt driven -- because
it's cheaper to poll than it is to take interrupts. 

> Speaking of mit-pthreads, barring some overhead wouldn't such a server
> internally behave in the same manner as a select/poll based system?  If
> so, then we should be able to compile apache-apr with mit-pthreads and
> see how that performs in terms of sheer throughput.

Yes.  (Which is exactly what gets said in this thread every time it arises
:) 

Some of this is in the apache-2.0/docs archive as well.  You should find
Sun's document describing their hybrid user/kernel pthreads
implementation.  You can also look at NSPR's hybrid user/kernel
implementation that lives on IRIX -- and which has the potential to live
on any kernel pthreads implementation (e.g., Linux).

> I don't mean to get into a flame war but rather a brainstorming of 'why
> would event-driven servers ever perform better than a
> multiprocess/threaded server'. 

The argument, once you've realised the above, is that of where do you
store the context.  You need context -- it's either implicit in the call
stack (threads), or explicit in the data structures
(callback/event-driven).

Both camps have compelling stories on both sides of that argument (you
have to start arguing microarchitectural details about caches, registers,
and how the compiler interacts).  But the threads camp wins the "ease of
implementation" argument in my books.

Dean


Re: select vs. threads vs. fork

Posted by un...@riverstyx.net.


---
tani hosokawa
river styx internet


On Mon, 19 Apr 1999, Dean Gaudet wrote:

> 
> 
> On Mon, 19 Apr 1999 unknown@riverstyx.net wrote:
> 
> > My argument is, they don't matter.  For normal static content servers,
> > most data is cached since it's used over and over.  Checking vmstat, it's
> > normal to see at most 40 blocks being read in per second.
> 
> localdisk.  Try it on NFS. 
> 
> > Like I said before, you're not likely to be in a situation where you never
> > serve the same file twice.  I can easily serve more than 100 static files
> > per second with Apache.  I can easily serve more with a different server.
> 
> I've had specweb results for apache on linux in the 1500 range quoted to
> me.
> 
> 100/s is trivial, I can't imagine how you're not serving more than that.
> 
> At any rate -- does your 100/s saturate your CPU?  Does it saturate your
> internet pipe?
> 
> I find it hard to get excited about benchmark numbers on locally attached
> gigabit networks when the internet pipe is still so small.  It doesn't
> take much to saturate the bandwidth a server has available. 

Actually, it generally just runs out of memory.  But beyond that, the load
average also gets way up there.  Like, 12-15.  According to top, the CPU
is only at 20-25% utilization at this point... network bandwidth probably
isn't an issue, since it's on ethernet and the server's nowhere near
filling that.  Switched ethernet, with burstable 100Mb available (minus
the other servers).  The most I've gotten it up to is about 0.8 MB/sec.

> > This isn't exactly an event driven server model.
> 
> I think the piece you're missing is an understanding of the difference
> between a userland threads package (such as mit-pthreads) and a kernel
> threads package (such as linuxthreads), and hybrids of the two (such as
> solaris pthreads, or NSPR).
> 
> Maybe this can sum it up:  assume that you can wrap all the socket library
> calls, socket(), connect(), read(), write(), ...  For each socket created
> you place it in non-blocking mode.  Then when a read() or a write() would
> block, you can stop doing what the caller asked, and you can start doing
> something else (longjmp() somewhere else).  Somewhere along the way you do
> a select() and find out that the socket is ready for more i/o.  Then you
> can switch back to the fellow blocked in read() and complete the
> operation. 
> 
> That's userland threads:  behind the scenes all your i/o operations are
> translated into select() operations. 
> 
> Sounds a lot like select/event doesn't it? 
> 
> The difference is in where you keep track of context.  In
> select/event/callback you keep track of it explicitly in data structures.
> In userland threads you keep track of it on a stack.

Hmm.  I've only worked with Linuxthreads, so I'm probably missing a good
chunk of the picture.


Re: select vs. threads vs. fork

Posted by Dean Gaudet <dg...@arctic.org>.

On Mon, 19 Apr 1999 unknown@riverstyx.net wrote:

> My argument is, they don't matter.  For normal static content servers,
> most data is cached since it's used over and over.  Checking vmstat, it's
> normal to see at most 40 blocks being read in per second.

localdisk.  Try it on NFS. 

> Like I said before, you're not likely to be in a situation where you never
> serve the same file twice.  I can easily serve more than 100 static files
> per second with Apache.  I can easily serve more with a different server.

I've had specweb results for apache on linux in the 1500 range quoted to
me.

100/s is trivial, I can't imagine how you're not serving more than that.

At any rate -- does your 100/s saturate your CPU?  Does it saturate your
internet pipe?

I find it hard to get excited about benchmark numbers on locally attached
gigabit networks when the internet pipe is still so small.  It doesn't
take much to saturate the bandwidth a server has available. 

> This isn't exactly an event driven server model.

I think the piece you're missing is an understanding of the difference
between a userland threads package (such as mit-pthreads) and a kernel
threads package (such as linuxthreads), and hybrids of the two (such as
solaris pthreads, or NSPR).

Maybe this can sum it up:  assume that you can wrap all the socket library
calls, socket(), connect(), read(), write(), ...  For each socket created
you place it in non-blocking mode.  Then when a read() or a write() would
block, you can stop doing what the caller asked, and you can start doing
something else (longjmp() somewhere else).  Somewhere along the way you do
a select() and find out that the socket is ready for more i/o.  Then you
can switch back to the fellow blocked in read() and complete the
operation. 

That's userland threads:  behind the scenes all your i/o operations are
translated into select() operations. 

Sounds a lot like select/event doesn't it? 

The difference is in where you keep track of context.  In
select/event/callback you keep track of it explicitly in data structures.
In userland threads you keep track of it on a stack.
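
A minimal sketch of that wrapping idea (illustrative only -- a real
userland threads package would longjmp() to another runnable thread at the
would-block point instead of select()ing on this one descriptor):

    #include <errno.h>
    #include <fcntl.h>
    #include <sys/select.h>
    #include <unistd.h>

    ssize_t wrapped_read(int fd, void *buf, size_t len)
    {
        fd_set rfds;
        ssize_t n;

        /* every socket is placed in non-blocking mode */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
        for (;;) {
            n = read(fd, buf, len);
            if (n >= 0 || errno != EWOULDBLOCK)
                return n;               /* data, EOF, or a real error */
            /* the caller would block here: do a select() and resume
               when the socket is ready for more i/o */
            FD_ZERO(&rfds);
            FD_SET(fd, &rfds);
            select(fd + 1, &rfds, NULL, NULL, NULL);
        }
    }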

Dean


Re: select vs. threads vs. fork

Posted by un...@riverstyx.net.
On Mon, 19 Apr 1999, Nitin Sonawane wrote:

> unknown@riverstyx.net wrote:
> > 
> > Well, when it's select based you can do your own "task management" meaning
> > no unnecessary context switches.  Not that there's much task management to
> > do.  Two selects and two for loops to feed all the outputs and read all
> > the inputs.  Even in a pre-forked or threaded server, system calls will
> > block you.  However, that shouldn't impact much on the throughput, because
> > all your I/O is buffered.  It's not as if when you're busy looking for a
> > file your TCP output queues are empty and waiting for input.  CGI's are
> > forked, so that shouldn't be a problem.
> 
> You're right that network/socket i/o is implicitly non-blocking (thank
> TCP for its window buffers).  The issue I'm unconvinced of is file system
> calls.  Those can indeterminately block inside the kernel (unless your
> htdocs tree sits in your buffer cache).  Inodes can be written
> asynchronously (e.g., last access time stamps) but cannot be read
> asynchronously.  Consequently all file system calls could get serialized
> inside the kernel.  It's these calls that would end up throttling
> performance.

My argument is, they don't matter.  For normal static content servers,
most data is cached since it's used over and over.  Checking vmstat, it's
normal to see at most 40 blocks being read in per second.  That's not much
data moving, and not much opportunity for I/O blocking.  What you seem to
be missing here is that it *doesn't* throttle performance.  I think
everyone's aware that thttpd performs much better as a static content
server.  I think anyone arguing differently hasn't actually used it.  What
I want to know is how difficult it would be to use a similar model in
Apache.

> As a ballpark estimate, consider a 10ms delay for every file open (very
> conservatively spread across inode reads and/or directory reads); you
> couldn't possibly serve more than 100 static files per second.

Like I said before, you're not likely to be in a situation where you never
serve the same file twice.  I can easily serve more than 100 static files
per second with Apache.  I can easily serve more with a different server.

> The second issue I don't understand is 'unnecessary context switches'.
> Isn't it the case that the kernel is incessantly servicing peripheral
> interrupts?  If so, wouldn't context switches be almost unavoidable? 

Obviously context switches are unavoidable.  However, requiring the server
to switch between all 600 processes/threads has more overhead than
switching between one webserver process and various kernel processes.  On
most webservers (and certainly any that are relevant to this discussion),
there won't be word processors and rendering engines running in the
background.

> Speaking of mit-pthreads, barring some overhead wouldn't such a server
> internally behave in the same manner as a select/poll based system?  If
> so, then we should be able to compile apache-apr with mit-pthreads and
> see how that performs in terms of sheer throughput.

I've never used mit-pthreads, so I can't comment.

> I don't mean to get into a flame war but rather a brainstorming of 'why
> would event-driven servers ever perform better than a
> multiprocess/threaded server'. 

This isn't exactly an event-driven server model.  What the server's doing
is checking its queue over and over, waiting for something to be done.  As
soon as there is something, it goes and does it.  That's quite different
from the multiprocess model, where the kernel is busy flipping through the
different processes, using its normal process scheduler, which is quite
suited to handling generic processes but obviously can't be as well
optimized as a purpose-written model.

---
tani hosokawa
river styx internet





Re: select vs. threads vs. fork

Posted by Nitin Sonawane <ns...@infolibria.com>.
unknown@riverstyx.net wrote:
> 
> Well, when it's select based you can do your own "task management" meaning
> no unnecessary context switches.  Not that there's much task management to
> do.  Two selects and two for loops to feed all the outputs and read all
> the inputs.  Even in a pre-forked or threaded server, system calls will
> block you.  However, that shouldn't impact much on the throughput, because
> all your I/O is buffered.  It's not as if when you're busy looking for a
> file your TCP output queues are empty and waiting for input.  CGI's are
> forked, so that shouldn't be a problem.


You're right that network/socket i/o is implicitly non-blocking (thank
TCP for its window buffers).  The issue I'm unconvinced of is file system
calls.  Those can indeterminately block inside the kernel (unless your
htdocs tree sits in your buffer cache).  Inodes can be written
asynchronously (e.g., last access time stamps) but cannot be read
asynchronously.  Consequently all file system calls could get serialized
inside the kernel.  It's these calls that would end up throttling
performance.

As a ballpark estimate, consider a 10ms delay for every file open (very
conservatively spread across inode reads and/or directory reads); you
couldn't possibly serve more than 100 static files per second.

The second issue I don't understand is 'unnecessary context switches'.
Isn't it the case that the kernel is incessantly servicing peripheral
interrupts?  If so, wouldn't context switches be almost unavoidable? 

Speaking of mit-pthreads, barring some overhead wouldn't such a server
internally behave in the same manner as a select/poll based system?  If
so, then we should be able to compile apache-apr with mit-pthreads and
see how that performs in terms of sheer throughput.

I don't mean to get into a flame war but rather a brainstorming of 'why
would event-driven servers ever perform better than a
multiprocess/threaded server'. 

Cheers,
Nitin.

Re: select vs. threads vs. fork

Posted by Dean Gaudet <dg...@arctic.org>.
On Mon, 19 Apr 1999 unknown@riverstyx.net wrote:

> On Mon, 19 Apr 1999, Dean Gaudet wrote:
> 
> > POSIX defines asynchronous i/o -- aio_read() and such.  But it's not
> > portable.  And in some places (linux for example) it's implemented via
> > threads anyhow.
> 
> That's the I/O subsystem.  Kernel threads shouldn't affect this much...

No, I'm not talking about kernel i/o threads.  I'm talking about the
aio_foo() implementation -- it uses i/o handler threads launched in
userland.  The kernel has no idea that aio is implemented.  There are no
aio system calls. 
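
For reference, a minimal sketch of what using the aio_read() interface
looks like (illustrative only; per the point above, on Linux of this era
the whole thing is serviced by userland handler threads, and there is no
aio_open() at all):

    #include <aio.h>
    #include <errno.h>
    #include <string.h>

    ssize_t read_async(int fd, char *buf, size_t len, off_t off)
    {
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = len;
        cb.aio_offset = off;
        if (aio_read(&cb) < 0)              /* queue the read */
            return -1;
        while (aio_error(&cb) == EINPROGRESS)
            ;                               /* real code would do other
                                               work here, not spin */
        return aio_return(&cb);             /* bytes read, or -1 */
    }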

> Well, that's 52 megs then, with 600 children... I'd be interested in
> finding out exactly how you got it down that low.

Pretty much just follow the perf-tuning guidelines. 

> I do need to do some
> CGI on these servers, so I can't ditch mod_cgi

that's fine, it's cheap. 

> , and I need mod_rewrite for
> stopping hotlinking of images.

You could write a small module that does it.  You don't save much though
-- code is free, it's completely shared. 

> Also, I need mod_alias.  That alone seems
> to make for hefty children.  When I check the output of free I see this:

Nope, code is free. 

As is most everything you put into your httpd.conf file. 

What costs you are directives that generate run-time memory needs. 

Such as a bunch of <Directory>/etc containers.  Minimize the number of
those which can apply to each URL -- factor everything down to the lowest
level. 

>              total       used       free     shared    buffers     cached
> Mem:        386368     366936      19432     270896      45704     137808
> -/+ buffers/cache:     183424     202944
> Swap:       130748        408     130340

Look at "ps amx".  The system-wide numbers aren't decipherable. 

> Well, enlighten me... I'm just looking at empirical results here.  Would
> it be possible to have a select server process that handles only static
> content, and has the file descriptors for static content thrown over to it
> by the rest of the children?  That would get the best of both worlds, I
> think.

Descriptor passing is expensive. 
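
"Descriptor passing" here means handing an open fd to another process over
a Unix domain socket with SCM_RIGHTS; a rough sketch of the sending side
shows the per-handoff syscall and cmsg bookkeeping involved (illustrative,
not Apache code):

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    int send_fd(int unix_sock, int fd_to_pass)
    {
        char byte = 0;
        struct iovec iov;
        struct msghdr msg;
        char ctrl[CMSG_SPACE(sizeof(int))];
        struct cmsghdr *cm;

        iov.iov_base = &byte;               /* must send at least 1 byte */
        iov.iov_len  = 1;
        memset(&msg, 0, sizeof msg);
        msg.msg_iov        = &iov;
        msg.msg_iovlen     = 1;
        msg.msg_control    = ctrl;
        msg.msg_controllen = sizeof ctrl;

        cm = CMSG_FIRSTHDR(&msg);
        cm->cmsg_level = SOL_SOCKET;
        cm->cmsg_type  = SCM_RIGHTS;        /* the fd rides in the cmsg */
        cm->cmsg_len   = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cm), &fd_to_pass, sizeof(int));

        return sendmsg(unix_sock, &msg, 0) < 0 ? -1 : 0;
    }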

Make it multithreaded, and have a single thread which does the select
thing for static responses.  I posted about this last week or so... it's
an older idea than that, I forget whose it is originally.

(I suppose I sound like a broken record... which is a sign I should either
shut up, or get back to writing code for apache or something.) 

Dean


Re: select vs. threads vs. fork

Posted by un...@riverstyx.net.
On Mon, 19 Apr 1999, Dean Gaudet wrote:

> 
> 
> On Mon, 19 Apr 1999 unknown@riverstyx.net wrote:
> 
> > Well, when it's select based you can do your own "task management" meaning
> > no unnecessary context switches.  Not that there's much task management to
> > do.  Two selects and two for loops to feed all the outputs and read all
> > the inputs.  Even in a pre-forked or threaded server, system calls will
> > block you.  However, that shouldn't impact much on the throughput, because
> > all your I/O is buffered.  It's not as if when you're busy looking for a
> > file your TCP output queues are empty and waiting for input.  CGI's are
> > forked, so that shouldn't be a problem.
> 
> There's a difference between socket i/o and disk i/o.  Socket i/o has
> portable non-blocking interfaces which are available end to end -- from
> socket creation to close.  Disk i/o has no such portable interface.  When
> you call open() you (might) block.  It's even worse on NFS where an open() 
> could result in a half dozen network round trips.  When you call read() 
> you block (round trips on NFS). 
> 
> POSIX defines asynchronous i/o -- aio_read() and such.  But it's not
> portable.  And in some places (linux for example) it's implemented via
> threads anyhow.

That's the I/O subsystem.  Kernel threads shouldn't affect this much...

> > Plus, your memory footprint will be much smaller in a single process
> > (I'm looking at footprints of 18 megs, including all the mmap'd files
> > that most of the select'ing servers I've seen automatically mmap).
> > Compare that to an equivalently loaded forked or threaded server: still
> > in the 200+ meg range on a forked server, and I haven't tried out a
> > threaded server on quite that much traffic.  Even still, on a moderately
> > loaded threaded server (a snapshot of apache-apr) it's using 90+ megs.
> 
> Uh are you just totalling up the SZ column?  That's wrong.  Find the
> amount of shared pages.  I've tuned simple static content servers on linux
> which consume 80kb *per child* plus some minimum 5 or 6Mb overhead. That's
> nothing.  We're not going to do a lot better with threads. 

Well, that's 52 megs then, with 600 children... I'd be interested in
finding out exactly how you got it down that low.  I do need to do some
CGI on these servers, so I can't ditch mod_cgi, and I need mod_rewrite for
stopping hotlinking of images.  Also, I need mod_alias.  That alone seems
to make for hefty children.  When I check the output of free I see this:

             total       used       free     shared    buffers     cached
Mem:        386368     366936      19432     270896      45704     137808
-/+ buffers/cache:     183424     202944
Swap:       130748        408     130340

I don't have output for 600 processes because the server's not at peak
time right now.

             total       used       free     shared    buffers     cached
Mem:        386464     374348      12116      13532     200344     109352
-/+ buffers:            64652     321812
Swap:       130748         48     130700

That's on a thttpd server doing more traffic than the first.  The first
server was a PII-450, the second was a PII-333, same RAM, same hard
drives.

> > That alone is going to harm performance by reducing the total amount of
> > memory available for disk caching.
> No, you see you're getting into the callback versus thread debate.  The
> callback group jumps up and shouts "you've got too many stacks in threaded
> programming".  The threaded group jumps up and shouts "you have too many
> state structures, they're just as expensive as stacks!  And worse, your
> callbacks mess up compiler optimizations!"  They're both right, they're
> both wrong. 

Well, enlighten me... I'm just looking at empirical results here.  Would
it be possible to have a select server process that handles only static
content, and has the file descriptors for static content thrown over to it
by the rest of the children?  That would get the best of both worlds, I
think.

> All I know is that it's easier for people to think and code linearly, and
> the threaded model is far closer to linear than the callback model. 

---
tani hosokawa
river styx internet


Re: select vs. threads vs. fork

Posted by Dean Gaudet <dg...@arctic.org>.

On Mon, 19 Apr 1999 unknown@riverstyx.net wrote:

> Well, when it's select based you can do your own "task management" meaning
> no unnecessary context switches.  Not that there's much task management to
> do.  Two selects and two for loops to feed all the outputs and read all
> the inputs.  Even in a pre-forked or threaded server, system calls will
> block you.  However, that shouldn't impact much on the throughput, because
> all your I/O is buffered.  It's not as if when you're busy looking for a
> file your TCP output queues are empty and waiting for input.  CGI's are
> forked, so that shouldn't be a problem.

There's a difference between socket i/o and disk i/o.  Socket i/o has
portable non-blocking interfaces which are available end to end -- from
socket creation to close.  Disk i/o has no such portable interface.  When
you call open() you (might) block.  It's even worse on NFS where an open() 
could result in a half dozen network round trips.  When you call read() 
you block (round trips on NFS). 

POSIX defines asynchronous i/o -- aio_read() and such.  But it's not
portable.  And in some places (linux for example) it's implemented via
threads anyhow.

> Plus, your memory footprint will be much smaller in a single process
> (I'm looking at footprints of 18 megs, including all the mmap'd files
> that most of the select'ing servers I've seen automatically mmap).
> Compare that to an equivalently loaded forked or threaded server: still
> in the 200+ meg range on a forked server, and I haven't tried out a
> threaded server on quite that much traffic.  Even still, on a moderately
> loaded threaded server (a snapshot of apache-apr) it's using 90+ megs.

Uh are you just totalling up the SZ column?  That's wrong.  Find the
amount of shared pages.  I've tuned simple static content servers on linux
which consume 80kb *per child* plus some minimum 5 or 6Mb overhead. That's
nothing.  We're not going to do a lot better with threads. 

> That alone is going to harm performance by reducing the total amount of
> memory available for disk caching.

No, you see you're getting into the callback versus thread debate.  The
callback group jumps up and shouts "you've got too many stacks in threaded
programming".  The threaded group jumps up and shouts "you have too many
state structures, they're just as expensive as stacks!  And worse, your
callbacks mess up compiler optimizations!"  They're both right, they're
both wrong. 

All I know is that it's easier for people to think and code linearly, and
the threaded model is far closer to linear than the callback model. 

Dean


Re: select vs. threads vs. fork

Posted by un...@riverstyx.net.
Well, when it's select based you can do your own "task management" meaning
no unnecessary context switches.  Not that there's much task management to
do.  Two selects and two for loops to feed all the outputs and read all
the inputs.  Even in a pre-forked or threaded server, system calls will
block you.  However, that shouldn't impact much on the throughput, because
all your I/O is buffered.  It's not as if when you're busy looking for a
file your TCP output queues are empty and waiting for input.  CGI's are
forked, so that shouldn't be a problem.
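
A rough skeleton of that loop, folded here into a single select() carrying
both the read and write sets (the conn table and the handle_*() helpers
are hypothetical, just to show the shape -- this is not code from any of
the servers named):

    #include <sys/select.h>

    struct conn { int fd; int want_write; };        /* hypothetical */
    extern void handle_input(struct conn *c);       /* hypothetical */
    extern void handle_output(struct conn *c);      /* hypothetical */
    extern void accept_new_client(int listen_fd);   /* hypothetical */

    void event_loop(int listen_fd, struct conn *conn, int nconn)
    {
        fd_set rfds, wfds;
        int i, maxfd;

        for (;;) {
            FD_ZERO(&rfds);
            FD_ZERO(&wfds);
            FD_SET(listen_fd, &rfds);
            maxfd = listen_fd;
            for (i = 0; i < nconn; i++) {           /* register interest */
                FD_SET(conn[i].fd, conn[i].want_write ? &wfds : &rfds);
                if (conn[i].fd > maxfd)
                    maxfd = conn[i].fd;
            }
            select(maxfd + 1, &rfds, &wfds, NULL, NULL);
            for (i = 0; i < nconn; i++) {           /* feed the ready fds */
                if (FD_ISSET(conn[i].fd, &wfds))
                    handle_output(&conn[i]);
                else if (FD_ISSET(conn[i].fd, &rfds))
                    handle_input(&conn[i]);
            }
            if (FD_ISSET(listen_fd, &rfds))
                accept_new_client(listen_fd);
        }
    }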

Plus, your memory footprint will be much smaller in a single process
(I'm looking at footprints of 18 megs, including all the mmap'd files
that most of the select'ing servers I've seen automatically mmap).
Compare that to an equivalently loaded forked or threaded server: still
in the 200+ meg range on a forked server, and I haven't tried out a
threaded server on quite that much traffic.  Even still, on a moderately
loaded threaded server (a snapshot of apache-apr) it's using 90+ megs.
That alone is going to harm performance by reducing the total amount of
memory available for disk caching.
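
The mmap trick those servers use is roughly this (a sketch, assuming a
blocking one-shot write for brevity; a real select-based server would send
the mapping out in non-blocking pieces):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    ssize_t send_static_file(int client, const char *path)
    {
        int fd;
        struct stat st;
        void *p;
        ssize_t n;

        fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        fstat(fd, &st);
        p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);                  /* the mapping stays valid */
        if (p == MAP_FAILED)
            return -1;
        /* pages are shared with the buffer cache, not copied per process */
        n = write(client, p, st.st_size);
        munmap(p, st.st_size);
        return n;
    }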

It's not like I write webservers for a living or anything, so feel free to
correct me on whatever.  However, I'm seeing large performance gains on
all the servers I've switched so far.

---
tani hosokawa
river styx internet


On Mon, 19 Apr 1999, Nitin Sonawane wrote:

> Hi,
>     This 'great performance' argument has always puzzled me.  While
> running in a single process, every time you make a system call, you'd get
> context switched or BLOCKED at the kernel's discretion.  Take a simple
> example of the open system call.  Wouldn't the kernel block you while each
> directory in the path is being read, searched, the inode looked up, and
> the inode read in?  Where does the high performance come from?
> 
> Cheers,
> Nitin.
> 
> PS: The other compelling argument for having more than one server
> process is that if a single process seg faults, the server keeps
> running. - NS
> 


Re: select vs. threads vs. fork

Posted by Nitin Sonawane <ns...@infolibria.com>.
unknown@riverstyx.net wrote:
> 
> I was just wondering whether anyone's put any thought into making Apache
> into a select-based multiplexing web server instead of the concurrent
> process model that it currently is?  Looking around, I've seen a couple
> servers (thttpd, mathopd, boa) that are way higher in performance... and
> they look like they'd be really easy to code modules for.  I don't know
> how difficult it would be to port Apache though...
> 
> ---
> tani hosokawa
> river styx internet

Hi,
    This 'great performance' argument has always puzzled me.  While
running in a single process, every time you make a system call, you'd get
context switched or BLOCKED at the kernel's discretion.  Take a simple
example of the open system call.  Wouldn't the kernel block you while each
directory in the path is being read, searched, the inode looked up, and
the inode read in?  Where does the high performance come from?

Cheers,
Nitin.

PS: The other compelling argument for having more than one server
process is that if a single process seg faults, the server keeps
running. - NS

Re: select vs. threads vs. fork

Posted by Marc Slemko <ma...@znep.com>.
On Mon, 19 Apr 1999, Vincent Janelle wrote:

> What about Roxen (http://www.roxen.com/)?  Got any bad things to say
> about it? =)

Yes, quite a number.  But this isn't the place or time...


Re: select vs. threads vs. fork

Posted by Vincent Janelle <vj...@home.com>.
What about Roxen (http://www.roxen.com/)?  Got any bad things to say
about it? =)

(a threads-based select() or poll() implementation, written in Pike)

A lot of us aren't running SMP systems for servers.

Marc Slemko wrote:
> 
> On Sun, 18 Apr 1999 unknown@riverstyx.net wrote:
> 
> > I was just wondering whether anyone's put any thought into making Apache
> > into a select-based multiplexing web server instead of the concurrent
> > process model that it currently is?  Looking around, I've seen a couple
> > servers (thttpd, mathopd, boa) that are way higher in performance... and
> > they look like they'd be really easy to code modules for.  I don't know
> 
> No.
> 
> If you look at nearly all those servers, you will see that they lack very
> important functionality, they have scalability issues on SMP systems,
> they have very restricted module functionality, can quickly fall over if
> you accidentally do the wrong thing causing them to block, etc.
> 
> That doesn't exclude the possibility of having some special "fast path"
> type code for static content where you can have a single thread handling a
> bunch of static content being sent, etc.  But the general idea has to be a
> thread-based server.  That doesn't necessarily mean a thread for every
> concurrent connection, but trying to avoid having each bit of general
> functionality be generic threaded code isn't worthwhile.

-- 
------------
"I can get you anything! Money, power, women... Men?"  
--http://random.gimp.org --mailto:random@gimp.org --UIN 23939474

Re: select vs. threads vs. fork

Posted by un...@riverstyx.net.
Well, how about Zeus?  It's got bigtime performance, scales impressively,
appears to work well on SMP...

I was looking at thttpd, and I didn't think it would be difficult to add
a module-type scheme to it.  It only took me an hour to incorporate
the HSREGEX package and implement an anti-hotlinking handler.  In fact,
the code is very tiny and straightforward.  Something like
12 functions to implement everything, and really obvious places to put in
various handlers... I don't know how Zeus does it, but I figured it
wouldn't be too difficult to just spawn an extra thread for each processor
in the box, and have a separate client queue for each thread.  All the
info about each connection is held in one data structure, which in a
threaded implementation would be globally accessible... anyhow, just my
thoughts.
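
A hypothetical sketch of that layout with pthreads (the processor count
and the per-queue event loop are assumptions, not anything from thttpd or
Zeus):

    #include <pthread.h>

    #define NCPU 2                          /* assumed processor count */

    /* hypothetical: the per-thread select() loop over one client queue */
    extern void select_loop(int queue_id);

    static void *worker(void *arg)
    {
        select_loop((int)(long)arg);        /* service only our own queue */
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NCPU];
        long i;

        for (i = 0; i < NCPU; i++)          /* one thread per processor */
            pthread_create(&tid[i], NULL, worker, (void *)i);
        for (i = 0; i < NCPU; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }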

---
tani hosokawa
river styx internet


On Sun, 18 Apr 1999, Marc Slemko wrote:

> On Sun, 18 Apr 1999 unknown@riverstyx.net wrote:
> 
> > I was just wondering whether anyone's put any thought into making Apache
> > into a select-based multiplexing web server instead of the concurrent
> > process model that it currently is?  Looking around, I've seen a couple
> > servers (thttpd, mathopd, boa) that are way higher in performance... and
> > they look like they'd be really easy to code modules for.  I don't know
> 
> No.
> 
> If you look at nearly all those servers, you will see that they lack very
> important functionality, they have scalability issues on SMP systems,
> they have very restricted module functionality, can quickly fall over if
> you accidentally do the wrong thing causing them to block, etc.
> 
> That doesn't exclude the possibility of having some special "fast path"
> type code for static content where you can have a single thread handling a
> bunch of static content being sent, etc.  But the general idea has to be a
> thread-based server.  That doesn't necessarily mean a thread for every
> concurrent connection, but trying to avoid having each bit of general
> functionality be generic threaded code isn't worthwhile.
> 


Re: select vs. threads vs. fork

Posted by Marc Slemko <ma...@znep.com>.
On Sun, 18 Apr 1999 unknown@riverstyx.net wrote:

> I was just wondering whether anyone's put any thought into making Apache
> into a select-based multiplexing web server instead of the concurrent
> process model that it currently is?  Looking around, I've seen a couple
> servers (thttpd, mathopd, boa) that are way higher in performance... and
> they look like they'd be really easy to code modules for.  I don't know

No.

If you look at nearly all those servers, you will see that they lack very
important functionality, they have scalability issues on SMP systems,
they have very restricted module functionality, can quickly fall over if
you accidentally do the wrong thing causing them to block, etc.

That doesn't exclude the possibility of having some special "fast path"
type code for static content where you can have a single thread handling a
bunch of static content being sent, etc.  But the general idea has to be a
thread-based server.  That doesn't necessarily mean a thread for every
concurrent connection, but trying to avoid having each bit of general
functionality be generic threaded code isn't worthwhile.