Posted to dev@httpd.apache.org by Ryan Bloom <rb...@raleigh.ibm.com> on 1999/01/28 22:36:22 UTC

Thread/Process model discussion.

1)  Restarts - 
	Non-graceful, same as Process Apache:  issue a HUP; if still up
after x time, issue a TERM; if still up after x time, issue a KILL; if
still up after x time, mark it as a zombie and move on.
	Graceful:  Issue a signal to each Process; threads in the process
die if not currently serving a page.  The Process dies when its last
thread dies, and the Monitor Process brings Processes back up.
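
As a rough sketch of the non-graceful escalation described above (Python for brevity, not Apache code): `escalate`, `send_signal`, `is_alive` and `grace` (the "x time") are hypothetical names, and the callables are passed in so the logic can be tried without real processes.

```python
import signal
import time

# Escalation order from the text: HUP, then TERM, then KILL (POSIX signals).
ESCALATION = (signal.SIGHUP, signal.SIGTERM, signal.SIGKILL)


def escalate(pid, send_signal, is_alive, grace=1.0, sleep=time.sleep):
    """Try each signal in turn; return the one that worked, or "zombie".

    send_signal(pid, sig) and is_alive(pid) are hypothetical callables
    supplied by the caller, e.g. thin wrappers around os.kill and waitpid.
    """
    for sig in ESCALATION:
        send_signal(pid, sig)
        sleep(grace)                  # the "x time" wait from the text
        if not is_alive(pid):
            return sig
    return "zombie"                   # still up after KILL: mark it, move on
```

A monitor could call something like `escalate(pid, os.kill, still_running)` for each child, where `still_running` is whatever liveness check the platform offers.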

2)  Max Servers -	
	MaxServerProcess:  How many servers we are allowed to run at any
time, same as Process Apache's MaxServers.
	MaxActiveServerProcess:  How many Active servers can be running
at any time.  A server is inactive if it has been issued a signal but
still has threads serving requests.  This will default to Max
Servers, but users who want to tweak their server could set it to 95%
of MaxServerProcess, and they would still have 5% server power when
issuing a signal.

3)  Process management -
	Each process takes care of managing itself.  If it is supposed to
die (for example, MaxRemainingRequests == 0), and it isn't dead, the
monitor process will kill it.  Same as Process Apache.
	Processes are created when all current processes have threads that
are not serving requests >= MinSpareThreads.
	Processes are destroyed when # of threads serving requests <=
MinSpareThreads, and # of processes is > MinSpareProcesses.
	The monitor process is responsible for determining when a new
process is needed, and it will fork that process.  New Processes are
created when all current processes have x% of their threads serving
requests.  x is user configurable, and will default to something like 90%
(open for discussion).
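
The monitor's "x% busy" rule could look roughly like this. It is only a sketch under assumptions: `needs_new_process`, `busy_counts` (threads serving requests, one entry per process) and `max_processes` are hypothetical names, and 90 is the tentative default mentioned above.

```python
def needs_new_process(busy_counts, threads_per_process, busy_percent=90,
                      max_processes=None):
    """Return True if every running process is at or above the busy threshold.

    busy_counts is a list with one entry per process: how many of its
    threads are currently serving requests.
    """
    if not busy_counts:
        return True                   # nothing running yet: start a process
    if max_processes is not None and len(busy_counts) >= max_processes:
        return False                  # already at the configured ceiling
    threshold = threads_per_process * busy_percent / 100.0
    return all(busy >= threshold for busy in busy_counts)
```

With 10 threads per process and the 90% default, `[9, 10]` would trigger a fork while `[9, 8]` would not, since one process still has spare threads.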

4)  Thread Management -
	Threads are created at process creation.  They do not die.  If a
thread kills itself under abnormal conditions, the process will not create
a new one. If all threads in a process die, the process will die, and we
will fork a new process. Threads per process is user configurable.

5)  Specialized threads -
	Timing is done with a timing thread.  Each thread tells the timer
it wants notification when x seconds have gone by.  The timer will send a
notification to that thread at the correct time.  Do we want the timer
thread to wake up once a second to keep track of time, or should all
threads ignore SIGALRM except for the timer thread, and let the OS/APR
provide timer capabilities?

	Logging.  Do we want a logging thread?  What benefits does it give
us?  Drawbacks?

If I missed anything let me know.

Ryan	

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	




Re: Thread/Process model discussion.

Posted by Chris Tacy <ch...@enginered.com>.

Ryan Bloom wrote:
> 
>
>         Logging.  Do we want a logging thread?  What benefits does it give
> us?  Drawbacks?

i guess it would allow for queuing for the log (helpful for logging to an
RDBMS on a high peak utilization site, for example).

-c

-- 
#################################################
chris tacy		 president and co-founder
fire engine red		http://www.enginered.com/

Re: Thread/Process model discussion.

Posted by Ben Hyde <bh...@pobox.com>.
Tony Finch writes:
>I have a question about the basic architecture you describe: Why are
>you using both threads and processes? If threads by themselves aren't
>good enough I'd take that as an indication that they probably
>shouldn't be used on that platform.

The multiprocess model is so extremely robust in the face of
random memory corruption and memory leaks that it is a good
thing to keep even if you use threads (or fibers) for most
per-connection processing.  It also is a nice way to hold
onto the accept Q over failures.  - ben

Re: Thread/Process model discussion.

Posted by Ben Hyde <bh...@pobox.com>.
Ryan Bloom writes:
>> 	Logging.  Do we want a logging thread?  What benefits does it give
>> us?  Drawbacks?
...
>Any thoughts?

http_log.c will dribble to syslog, processes, and files.  So it's got
three stream implementations so far.  A fourth that uses a thread -
sounds like fun.  But - in heaven I'm sure they have http_log designed so
plug-in modules can implement this output stream in whatever way
catches their fantasy that day.

The fflush in 'static void log_error_core' means that the
serialization and Qing of log output is left entirely to the OS.
Maybe that's a performance issue.

I'd like to see the 10 or so logging routines that http_log exports
consolidated a little and put in APR, with the log_error_core
bottleneck calling a hook.  The default for that hook can just go to
stderr.  A standard plug-in module should replace that output with
the variations we currently support.  Then some volunteers can build
random experiments in logging - for example using a thread to do the
Q.
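
To make the hook idea concrete, here is a small sketch (Python, purely illustrative). Only `log_error_core` is a name from http_log.c; `set_log_hook`, `stderr_hook` and `QueuedHook` are invented for the example. The default hook writes to stderr, and the replacement hands lines to a logging thread through a queue.

```python
import queue
import sys
import threading


def stderr_hook(line):
    """Default hook: write the line straight to stderr."""
    sys.stderr.write(line + "\n")


_log_hook = stderr_hook               # the currently installed hook


def set_log_hook(hook):
    """Install a replacement output hook (hypothetical API)."""
    global _log_hook
    _log_hook = hook


def log_error_core(message):
    """Single bottleneck: format once, then call whatever hook is installed."""
    _log_hook("[error] " + message)


class QueuedHook:
    """Hook that queues lines for a background logging thread."""

    def __init__(self, sink):
        self._sink = sink             # callable that does the real write
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def __call__(self, line):
        self._queue.put(line)         # the calling thread returns right away

    def _drain(self):
        while True:
            line = self._queue.get()
            if line is None:          # sentinel: stop the logging thread
                break
            self._sink(line)

    def close(self):
        """Flush everything queued so far and stop the thread."""
        self._queue.put(None)
        self._thread.join()
```

Installing it would be something like `set_log_hook(QueuedHook(stderr_hook))`; a slow sink such as a database writer would then only delay the logging thread, not the threads serving requests.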

More callback hooks in A2 will enable fantasy and give volunteers
a place to party.

 - ben

ps. Anybody actually use the syslog version?

Re: Thread/Process model discussion.

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Rodent of Unusual Size wrote:
> 
> Maybe I'm being think
                  thick
D'oh!
-- 
#ken	P-)}

Ken Coar                    <http://Web.Golux.Com/coar/>
Apache Group member         <http://www.apache.org/>
"Apache Server for Dummies" <http://Web.Golux.Com/coar/ASFD/>

Re: Thread/Process model discussion.

Posted by Dean Gaudet <dg...@arctic.org>.

On Fri, 29 Jan 1999, Rodent of Unusual Size wrote:

> Dean Gaudet wrote:
> > 
> > Instead what you want to use is non-blocking sockets (and pipes[1]) and
> > implement a send/recv with timeout by using select/poll (or completion
> > ports on win32).  This gives you synchronous notification of timeouts,
> > which is portable.
> 
> Maybe I'm being think, but aren't you assuming that the
> only thing for which we want timers is I/O?

I've brought this up before and this is the only use that has been
presented.  Nobody has come up with another use for timers that asynch
notification could help. 

To cut off one debate before it happens -- timers can't control broken 3rd
party modules.  Broken 3rd party modules, by definition, don't behave
correctly.  There's no reason that we should assume just because they're
spending a lot of time churning doing nothing that we can interrupt them
*and clean up nicely*.  Nope.  We could get away with it in process model
apache because we can just toss the process (we don't do that though, we
just make this blind jump back to the main loop and hope the world still
functions). 

Dean


Re: Thread/Process model discussion.

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Dean Gaudet wrote:
> 
> Instead what you want to use is non-blocking sockets (and pipes[1]) and
> implement a send/recv with timeout by using select/poll (or completion
> ports on win32).  This gives you synchronous notification of timeouts,
> which is portable.

Maybe I'm being think, but aren't you assuming that the
only thing for which we want timers is I/O?
-- 
#ken	P-)}

Re: Thread/Process model discussion.

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 28 Jan 1999, Ryan Bloom wrote:

> 5)  Specialized threads -
> 	Timing is done with a timing thread.  Each thread tells the timer
> it wants notification when x seconds have gone by.  The timer will send a
> notification to that thread at the correct time.  Do we want the timer
> thread to wake up once a second to keep track of time, or should all
> threads ignore SIGALRM except for the timer thread, and let the OS/APR
> provide timer capabilities?

Asynchronous notification is non-portable, and error prone.  The timer
thread isn't required... we only do this sort of thing in 1.x for legacy
reasons -- that's just how it was always implemented.  I've got a few past
rants in the archive about this -- the main problem with async
notification is that 3rd party libraries (and I suspect many libc
implementations) don't expect it.

Instead what you want to use is non-blocking sockets (and pipes[1]) and
implement a send/recv with timeout by using select/poll (or completion
ports on win32).  This gives you synchronous notification of timeouts,
which is portable. 
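
A minimal sketch of that approach, in Python rather than C and not taken from Apache: `recv_with_timeout` is a hypothetical helper that waits in select() on a non-blocking socket, so the timeout arrives as an ordinary return value and no signal is involved.

```python
import select
import socket


def recv_with_timeout(sock, nbytes, timeout):
    """Return received bytes, or None if nothing arrived within `timeout`.

    An empty bytes object means the peer closed the connection.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None                   # synchronous timeout notification
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        return None                   # spurious wakeup on a non-blocking socket


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.setblocking(False)
    print(recv_with_timeout(a, 16, 0.1))   # nothing sent yet
    b.sendall(b"ping")
    print(recv_with_timeout(a, 16, 1.0))
    a.close()
    b.close()
```

A send with timeout would follow the same pattern, passing the socket in select's write list instead of its read list.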

Dean

[1] Did we ever figure out how to do CGI with timeouts in win32?  Last I
remember there is a method to do IPC with timeouts in win32 but it's not
the IPC method used to implement CGI... and nobody knew how to get
timeouts for a read/write on a pipe. 



Re: Thread/Process model discussion.

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Tony Finch wrote:
> 
> I have a question about the basic architecture you describe: Why are
> you using both threads and processes? If threads by themselves aren't
> good enough I'd take that as an indication that they probably
> shouldn't be used on that platform.

Because if a single thread encounters a fatal process-killing
condition, you've lost your entire server.  A hybrid model of
N threads in M processes means you'll only lose 1/Mth of your
capacity in such a case, rather than 100% of it.
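
The arithmetic, as a quick sketch (the function name and the numbers in the usage note are made up for illustration):

```python
def capacity_after_crash(processes, threads_per_process, crashed=1):
    """Fraction of worker threads still available after `crashed` processes die."""
    total = processes * threads_per_process
    remaining = (processes - crashed) * threads_per_process
    return remaining / total
```

With 8 processes of 50 threads each, losing one process leaves 7/8 of the capacity; a single process holding all 400 threads leaves nothing.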
-- 
#ken	P-)}

Re: Thread/Process model discussion.

Posted by Bill Stoddard <st...@raleigh.ibm.com>.
Bill Stoddard wrote:

> > I like the hybrid process/thread architecture.  In a purely threaded
> > environment a fatal error in a loaded module will take down your server.
> > In the hybrid model only the requests served up by the threads in
> > the process that died will be lost and further requests will be handled by
> > threads in the other active processes.  It combines the robustness of the
> > multi-process model with the performance and memory efficiency of the
> > threaded model.
> 
> Rasmus makes the best argument for a hybrid process/thread architecture.
> Another benefit is to overcome max threads per process limits in some
> Unix variants. Earlier versions of AIX had a max thread per process
> limit of 512 (this limit was removed in later versions of AIX).
>

If Unix had asynchronous I/O and completion ports, the thread per
process limit wouldn't be such a problem. :-)

-- 
Bill Stoddard
stoddard@raleigh.ibm.com

Re: Thread/Process model discussion.

Posted by Bill Stoddard <st...@raleigh.ibm.com>.
Rasmus Lerdorf wrote:
> 
> > I have a question about the basic architecture you describe: Why are
> > you using both threads and processes? If threads by themselves aren't
> > good enough I'd take that as an indication that they probably
> > shouldn't be used on that platform.
> 
> I like the hybrid process/thread architecture.  In a purely threaded
> environment a fatal error in a loaded module will take down your server.
> In the hybrid model only the requests served up by the threads in
> the process that died will be lost and further requests will be handled by
> threads in the other active processes.  It combines the robustness of the
> multi-process model with the performance and memory efficiency of the
> threaded model.

Rasmus makes the best argument for a hybrid process/thread architecture.
Another benefit is to overcome max threads per process limits in some
Unix variants. Earlier versions of AIX had a max thread per process
limit of 512 (this limit was removed in later versions of AIX). 
 
-- 
Bill Stoddard
stoddard@raleigh.ibm.com

RE: Thread/Process model discussion.

Posted by Eric Anderson <et...@iname.com>.
gcc and VC support templates/STL just fine.  No need to punt.

=Eric



> And rule 1 is don't use templates which punts STL as well. Unfortunate
> but true if portability is important to you.
> 
> -- 
> Bill Stoddard
> stoddard@raleigh.ibm.com
> 

Re: Thread/Process model discussion.

Posted by Bill Stoddard <st...@raleigh.ibm.com>.
Ben Hyde wrote:
> 
> Eric Anderson writes:
> ...
> >exception handling.
> 
>   http://www.mozilla.org/docs/tplist/catBuild/portable-cpp.html#dont_use_exceptions
> 
>  -  ben

And rule 1 is don't use templates which punts STL as well. Unfortunate
but true if portability is important to you.

-- 
Bill Stoddard
stoddard@raleigh.ibm.com

RE: C++

Posted by Eric Anderson <et...@iname.com>.
> The list is, but there ain't no-one talking :-)

Really?  How does one get on it?

> I've yet to understand how people can think C++ is bad and yet still
> think Java is good. If you promise to only ever use pointers in C++ you
> can hardly see the difference.

If you only use references (instead of pointers), lots of exception
handling, and STL, it practically IS Java, except with better IDE/debugging
support.  You lose the WORA of Java, and the nice standard libraries, but
you get better performance and the ability to customize the code in a
platform-specific way.

> Except you don't get templates in Java so you have all the crappy runtime
> type-checking.

Templates are cool.  I'm still not clear on why Java interfaces are
(allegedly) better than C++ MI, but that's a non-Apache discussion for sure.

> Don't get me wrong. I like Java. But I like C++, too. Horses for courses.

Java rocks, for some things, but in my mind it still has a couple of major
problems:
  - poor IDE support
  - (relatively, compared to native code) poor performance
  - deployment issues (does your VM support JNI, RMI?  Are your class libs
    up to date (with JFC for example)?)

Besides, we already have Java web servers coming out of our ears.

C++ might not be an OOP purist's choice, but for a reasonable OO model,
excellent performance, and very good tool support, it's hard to beat.

-Eric



-----------------
ETA Associates, Inc.
http://www.ultracode.com/


Re: C++

Posted by Ben Laurie <be...@algroup.co.uk>.
Brian Behlendorf wrote:
> 
> My guess would be that taking a mature C program and trying to revamp it to
> work in a C++ model is more work and more heartache/madness than simply
> writing a new one in C++ from scratch, borrowing liberally perhaps from
> other sources, but still a separate development effort.
> 
> In fact, I seem to recall Ben Laurie starting up a mailing list to focus on
> a C++ rewrite of Apache - Ben, is that list still around?

The list is, but there ain't no-one talking :-)

> My personal opinion is that C++ is a frankenstein of a language, and if I
> were going to take the time to write a new web server with OO in mind, I'd
> do it first in Java, and then code up native methods for any
> performance-critical components, such as I/O or regex comparisons, that a
> profiling tool shows to be bottlenecks.

I've yet to understand how people can think C++ is bad and yet still
think Java is good. If you promise to only ever use pointers in C++ you
can hardly see the difference. Except you don't get templates in Java so
you have all the crappy runtime type-checking.

Don't get me wrong. I like Java. But I like C++, too. Horses for
courses.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: C++

Posted by Renaud Bruyeron <re...@w3.org>.
Brian Behlendorf wrote:
> 
> My guess would be that taking a mature C program and trying to revamp it to
> work in a C++ model is more work and more heartache/madness than simply
> writing a new one in C++ from scratch, borrowing liberally perhaps from
> other sources, but still a separate development effort.
> 
> In fact, I seem to recall Ben Laurie starting up a mailing list to focus on
> a C++ rewrite of Apache - Ben, is that list still around?
> 
> My personal opinion is that C++ is a frankenstein of a language, and if I
> were going to take the time to write a new web server with OO in mind, I'd
> do it first in Java, and then code up native methods for any
> performance-critical components, such as I/O or regex comparisons, that a
> profiling tool shows to be bottlenecks.

Then, if you are thinking Java, check this out:

http://www.w3.org/Jigsaw/

=)

 - Renaud

Re: C++

Posted by Bill Stoddard <st...@raleigh.ibm.com>.
Brian Behlendorf wrote:

> My personal opinion is that C++ is a frankenstein of a language, and if I
> were going to take the time to write a new web server with OO in mind, I'd
> do it first in Java, and then code up native methods for any
> performance-critical components, such as I/O or regex comparisons, that a
> profiling tool shows to be bottlenecks.
> 
APR and jApache :-)

-- 
Bill Stoddard
stoddard@raleigh.ibm.com

C++

Posted by Brian Behlendorf <br...@hyperreal.org>.
My guess would be that taking a mature C program and trying to revamp it to
work in a C++ model is more work and more heartache/madness than simply
writing a new one in C++ from scratch, borrowing liberally perhaps from
other sources, but still a separate development effort.

In fact, I seem to recall Ben Laurie starting up a mailing list to focus on
a C++ rewrite of Apache - Ben, is that list still around?

My personal opinion is that C++ is a frankenstein of a language, and if I
were going to take the time to write a new web server with OO in mind, I'd
do it first in Java, and then code up native methods for any
performance-critical components, such as I/O or regex comparisons, that a
profiling tool shows to be bottlenecks.  

	Brian


--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
History is made at night;                         brian@hyperreal.org
  character is what you are in the dark.

RE: Thread/Process model discussion.

Posted by Marc Slemko <ma...@worldgate.com>.
On Fri, 29 Jan 1999, Rasmus Lerdorf wrote:

> > That's a benefit to the end user (more goodies to choose from).  I think
> > that an OO Apache would be much easier to extend (for programmers) and this
> > would benefit end users because the supply of extensions (modules,
> > customized versions of the core...) would increase.
> 
> I'd argue that one.  And this has been discussed to death here in the past
> and it is generally accepted, I think, that there are a lot more
> programmers out there who are completely comfortable in ANSI C as opposed
> to C++.  I know for sure that the PHP module would never have come to be
> if it had needed to be written in C++.  I agree that there would be some
> technical benefits to using C++, but I don't agree that it would make the
> server more robust.  It may be easier to make it robust in C++, but that
> is a separate issue.

And, of course, nothing prevents people from using C++ for their modules
and nothing prevents us from adding a C++ module API which wraps a
lot of the API in a more OOPy way.

> 
> > Functionality: it's easier to do complex things with "higher-level"
> > languages.  Thus functionality could increase (with the same programmer
> > effort) if a higher-level language were used.
> 
> Assuming the same number of programmers are comfortable in this
> higher-level language.  Just about every C++ programmer can write C
> whereas the opposite is not true.  When it comes to Open Source projects
> where contributed code is essential, not limiting yourself to a smaller
> subset of available talent is a factor that needs to be considered.
> 
> There really is no point to this argument.  You are either an OO (or at
> least the twisted C++ variety thereof) freak or you aren't.  For someone

I really don't see it that way.  I see the biggest argument against
C++ as C++ itself, not OOP.  I have done C and C++ for a number of 
years now.  I have always really hated C++ because it is just such a 
PITA.  I used to think that perhaps I was just rebelling against OOP
and it wasn't really C++, but learning Java cured me of that.  C++ just
sucks.  That is a darn good argument against using it to me.  

I would pick Java over C++ or C as a language to write Apache in, but
for numerous reasons (mostly unrelated to the syntactic form of the 
language itself) Java really isn't appropriate and wouldn't work too well.


Re: Thread/Process model discussion.

Posted by Ben Laurie <be...@algroup.co.uk>.
Eric Anderson wrote:
> I AM curious though: do you think that there are really a lot more open
> source C people than C++ people?  Are they even teaching ANSI C in college
> these days?

Regrettably (IMO), far more. I don't know what they teach these days,
and I'm not entirely convinced it makes any difference, anyway.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

RE: Thread/Process model discussion.

Posted by Eric Anderson <et...@iname.com>.
> it usually devolves into useless bickering.

Oh sure, ruin my chance to make important and solid contributions to useless
bickering!

> eventually everyone agrees that in the pragmatic world we live in C is
> the solution. i'm sure that these discussions can be found off
> http://dev.apache.org/ in the archived mailing list stuff.

I'll have a look

Tnx,
Eric


Re: Thread/Process model discussion.

Posted by Chris Tacy <ch...@enginered.com>.
> > > That's a benefit to the end user (more goodies to choose from).  I think
> > > that an OO Apache would be much easier to extend (for programmers) and this
> > > would benefit end users because the supply of extensions (modules,
> > > customized versions of the core...) would increase.
> >
> > I'd argue that one.  And this has been discussed to death here in the past
> 
> Has it?  I was wondering about that.  I've only been on the list for a
> couple of months and have no doubt missed lots of good stuff.

yes it has.
it usually devolves into useless bickering.
eventually everyone agrees that in the pragmatic world we live in C is
the solution. i'm sure that these discussions can be found off
http://dev.apache.org/ in the archived mailing list stuff.

-c

-- 
#################################################
chris tacy		 president and co-founder
fire engine red		http://www.enginered.com/

RE: Thread/Process model discussion.

Posted by Eric Anderson <et...@iname.com>.
> > That's a benefit to the end user (more goodies to choose from).  I think
> > that an OO Apache would be much easier to extend (for programmers) and this
> > would benefit end users because the supply of extensions (modules,
> > customized versions of the core...) would increase.
>
> I'd argue that one.  And this has been discussed to death here in the past

Has it?  I was wondering about that.  I've only been on the list for a
couple of months and have no doubt missed lots of good stuff.

> I agree that there would be some technical benefits to using C++, but I
> don't agree that it would make the server more robust.  It may be easier
> to make it robust in C++, but that is a separate issue.

Of course, the mere choice of a language does not yield robustness.  As you
say, it simply makes it easier in some ways (and of course introduces
potential pitfalls).

> Assuming the same number of programmers are comfortable in this
> higher-level language.  Just about every C++ programmer can write C
> whereas the opposite is not true.  When it comes to Open Source projects
> where contributed code is essential, not limiting yourself to a smaller
> subset of available talent is a factor that needs to be considered.

Good point.  I've been a C++ programmer for about four years now, and
everyone around me seems to be either a C++ or Java programmer, so I just
automatically assumed that that is the prevailing state of the industry.
FWIW, I've also been in PC-land, where C++ is perhaps more common than
elsewhere.

I AM curious though: do you think that there are really a lot more open
source C people than C++ people?  Are they even teaching ANSI C in college
these days?

> There really is no point to this argument.  You are either an OO (or at
> least the twisted C++ variety thereof) freak or you aren't.  For someone
> who thinks in OO terms when writing code the things you say make perfect
> sense and it seems like such an obvious approach.  For people who have
> been writing non-OO code for 25 years, the switch is not so automatic and
> to many what was once a very clean and logical structure to a piece of
> code now becomes a tangled mess.  I don't think anybody is ever going to
> convince me that multiple inheritance is a good idea, for example.

Heh heh, I rather like MI (of course).  A pity it was left out of Java.  But
anyway, your point is well taken.  If this is truly the defining argument
w/regard to language choice for Apache, and if the fact is that people who
know C++ are a substantial minority, then I'd concede the point.  Obviously,
a C Apache is far superior to no Apache at all.  I'd rather thought I'd hear
arguments about the "efficiency" of one language versus another, or about
why OO would be the ruin of us all.  :)

-Eric




RE: Thread/Process model discussion.

Posted by Rasmus Lerdorf <ra...@lerdorf.on.ca>.
> That's a benefit to the end user (more goodies to choose from).  I think
> that an OO Apache would be much easier to extend (for programmers) and this
> would benefit end users because the supply of extensions (modules,
> customized versions of the core...) would increase.

I'd argue that one.  And this has been discussed to death here in the past
and it is generally accepted, I think, that there are a lot more
programmers out there who are completely comfortable in ANSI C as opposed
to C++.  I know for sure that the PHP module would never have come to be
if it had needed to be written in C++.  I agree that there would be some
technical benefits to using C++, but I don't agree that it would make the
server more robust.  It may be easier to make it robust in C++, but that
is a separate issue.

> Functionality: it's easier to do complex things with "higher-level"
> languages.  Thus functionality could increase (with the same programmer
> effort) if a higher-level language were used.

Assuming the same number of programmers are comfortable in this
higher-level language.  Just about every C++ programmer can write C
whereas the opposite is not true.  When it comes to Open Source projects
where contributed code is essential, not limiting yourself to a smaller
subset of available talent is a factor that needs to be considered.

There really is no point to this argument.  You are either an OO (or at
least the twisted C++ variety thereof) freak or you aren't.  For someone
who thinks in OO terms when writing code the things you say make perfect
sense and it seems like such an obvious approach.  For people who have
been writing non-OO code for 25 years, the switch is not so automatic and
to many what was once a very clean and logical structure to a piece of
code now becomes a tangled mess.  I don't think anybody is ever going to
convince me that multiple inheritance is a good idea, for example.

-Rasmus


RE: Thread/Process model discussion.

Posted by Eric Anderson <et...@iname.com>.
> In the case of the Apache Web server, the LCD is "ANSI C compliance."
> That's really not that low.

Not super low, but you don't get any of the "goodness" of an object-oriented
approach (yes, some of this is IMO) ... a better conceptual model, easy
code-reuse (STL is one example), better resource tracking with object
constructors/destructors.  You also don't get the very nice feature of
exception handling.  Exception handling helps to crash-proof the program,
and in some ways provides a nicer model than return code checking (exception
handling is used almost exclusively in Java, for example...though there's no
reason that you *couldn't* do regular return code checking in that
language).

> How does the language in which something is written affect its
> value to the end-user?

It doesn't *directly* affect its value (hell, some Perl hacker could
probably do Apache/P ... and that would suit a lot of users just fine),
outside of robustness gains.  However, providing a platform that's very easy
to extend means that it will get extended more frequently and in more ways.
That's a benefit to the end user (more goodies to choose from).  I think
that an OO Apache would be much easier to extend (for programmers) and this
would benefit end users because the supply of extensions (modules,
customized versions of the core...) would increase.

> > I just think that there's a trade-off to be made between portability and
> > functionality, robustness, and providing a highly (read: easily, cheaply)
> > extensible platform.

> I'm not sure how any but the last relate to the implementation
> language.

Functionality: it's easier to do complex things with "higher-level"
languages.  Thus functionality could increase (with the same programmer
effort) if a higher-level language were used.

Robustness: better resource tracking with C++ objects, exception handling
for graceful handling of "bad" events.

Easily Extensible: an OO model (again, think Java servlets) would make it
much easier to write Apache extensions, among other things.  The main reason
that the servlet environment is such a high-productivity environment (in my
experience) is that the OO model makes it easy to write your own extensions.
You subclass the base servlet class, implement the functions that you care
about, and you've got your own servlet - like magic.  This same ability
*inside* the Apache core would make it easy to abstract certain things about
the core (the threading model, for instance) and make them easily
replaceable.
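
As a rough picture of that subclassing model (Python for brevity; `BaseHandler`, `HelloHandler` and the `do_get` naming are hypothetical and come from neither Apache nor the servlet API): the base class owns the dispatch, and an extension overrides only the methods it cares about.

```python
class BaseHandler:
    """Base class: dispatches a request to a do_<method> hook if one exists."""

    def handle(self, request):
        hook = getattr(self, "do_" + request["method"].lower(), None)
        if hook is None:
            return 501, "Not Implemented"
        return hook(request)


class HelloHandler(BaseHandler):
    """An extension implements only the hook it cares about."""

    def do_get(self, request):
        return 200, "hello " + request["path"]
```

`HelloHandler().handle({"method": "GET", "path": "/x"})` goes through the subclass hook, while any other method falls back to the base class response.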

> As for the last, what percentage of C++ compilers in use are free or
> OS-bundled?  How about ANSI C compilers?

I may be wrong, but I was under the impression that gcc came with (or is
available for) most Unix and PC platforms.  You're right that M$VC is _not_
very free at all.

-Eric


Re: Thread/Process model discussion.

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Eric Anderson wrote:
> 
> Interesting reading.  I'd take issue with the "portability at all costs"
> stance, however, because this forces you to code/design to the lowest common
> (across all compilers, no matter how cheesy) language denominator.

As I understand it, this is one of the prominent reasons C++ isn't
the base language for Apache: the C++ implementations vary more
widely than the C ones do.

In the case of the Apache Web server, the LCD is "ANSI C compliance."
That's really not that low.

>                                                                     In my
> mind this hurts the "product" just as much as having code that not every
> fringe compiler can accept.

How does the language in which something is written affect its
value to the end-user?

> I realize that portability is a huge feature of Apache.  I just think that
> there's a trade-off to be made between portability and functionality,
> robustness, and providing a highly (read: easily, cheaply) extensible
> platform.

I'm not sure how any but the last relate to the implementation
language.  As for the last, what percentage of C++ compilers in
use are free or OS-bundled?  How about ANSI C compilers?
-- 
#ken	P-)}

RE: Thread/Process model discussion.

Posted by Eric Anderson <et...@iname.com>.
> http://www.mozilla.org/docs/tplist/catBuild/portable-cpp.html#dont_use_exceptions
>
>  -  ben
>

Interesting reading.  I'd take issue with the "portability at all costs"
stance, however, because this forces you to code/design to the lowest common
(across all compilers, no matter how cheesy) language denominator.  In my
mind this hurts the "product" just as much as having code that not every
fringe compiler can accept.  W/regard to Apache, I'd think that sticking to
the features supported by gcc and VC would cover 99% of the platforms in the
world.

I realize that portability is a huge feature of Apache.  I just think that
there's a trade-off to be made between portability and functionality,
robustness, and providing a highly (read: easily, cheaply) extensible
platform.


-Eric


RE: Thread/Process model discussion.

Posted by Ben Hyde <bh...@pobox.com>.
Eric Anderson writes:
...
>exception handling.
 
  http://www.mozilla.org/docs/tplist/catBuild/portable-cpp.html#dont_use_exceptions

 -  ben

RE: Thread/Process model discussion.

Posted by Eric Anderson <et...@iname.com>.
> If a module goes nuts and steps all over process storage, I'm not sure
> why you would WANT to keep the process up. This has nothing to do with
> the language the app is written in.

True enough.  I was thinking of process-terminating exceptions (where the OS
would normally kill the process), not internal corruption.

-Eric


Re: Thread/Process model discussion.

Posted by Bill Stoddard <st...@raleigh.ibm.com>.
Eric Anderson wrote:
> 
> > I like the hybrid process/thread architecture.  In a purely threaded
> > environment a fatal error in a loaded module will take down your server.
> 
> Not if the server's written in C++ (as Apache should be) and you use
> exception handling.

If a module goes nuts and steps all over process storage, I'm not sure
why you would WANT to keep the process up. This has nothing to do with
the language the app is written in.

-- 
Bill Stoddard
stoddard@raleigh.ibm.com

RE: Thread/Process model discussion.

Posted by Cliff Skolnick <cl...@steam.com>.
I'll only touch the flame bait here, indeed, it would be a rock of a server.
:)

We need to consider 3 types of multi-process threads.

The first is a pure userland thread, which shares a single interface into the
kernel with all other user threads.  In other words, when one thread does
blocking I/O, all threads stop and wait for the kernel to complete the
request.  If you are lucky you can multiplex m user threads over n kernel
threads.  This type of thread is really cheap to create and run, and pretty
good for compute-only functions.

The second level of thread is a userland thread with a corresponding kernel
thread.  When it goes into the kernel it will only block itself.  These
threads require more data structures so they are a little more expensive, but
they are great for I/O.

The last level is the process level, you know fork(), get your own address
space which may or may not share address space with your parent.  This model
is expensive.

If any OS only provides the first and third type of threads, we must go to
the multiprocess model or our server will only be able to have one
outstanding blocking I/O request.  This would be very bad!

If the OS provides the second type of thread, sure we could cram all threads
into the same process and be happy.  But as mentioned before, it would still
be more robust to also use multiple processes and more importantly it may
get you beyond per process thread and file descriptor limits.

Cliff

> > I like the hybrid process/thread architecture.  In a purely threaded
> > environment a fatal error in a loaded module will take down your server.
>
> Not if the server's written in C++ (as Apache should be) and you use
> exception handling.
>
> Yes, I know this resembles flame-bait, but it's really not.  I'd
> love to see
> a pure C++ implementation, including a new module API (that resembles the
> Java servlet API perhaps).  If the overhead/baggage/resource
> consumption of
> the multiple-process model is acceptable, then a bit of C++ language
> overhead ought to be acceptable too.  You'd get a more robust server
> (exception handling, better resource leak prevention (via object
> destructors)), and IMO it would be easier to extend the platform
> (imagine a
> "module" base class ... just derive from it, override a few functions, and
> tada, a new module).  Yes, all this *can* be accomplished (except for
> exception handling) in straight C, it's just easier (to me anyway) in C++
> w/the OO model.
>
> C++, OO model, lots of STL, and maybe some good platform specific
> performance tweaks (like asynchronous I/O w/completion ports on NT) - now
> that would rock!  :)


RE: Thread/Process model discussion.

Posted by Eric Anderson <et...@iname.com>.
> I like the hybrid process/thread architecture.  In a purely threaded
> environment a fatal error in a loaded module will take down your server.

Not if the server's written in C++ (as Apache should be) and you use
exception handling.

Yes, I know this resembles flame-bait, but it's really not.  I'd love to see
a pure C++ implementation, including a new module API (that resembles the
Java servlet API perhaps).  If the overhead/baggage/resource consumption of
the multiple-process model is acceptable, then a bit of C++ language
overhead ought to be acceptable too.  You'd get a more robust server
(exception handling, better resource leak prevention (via object
destructors)), and IMO it would be easier to extend the platform (imagine a
"module" base class ... just derive from it, override a few functions, and
tada, a new module).  Yes, all this *can* be accomplished (except for
exception handling) in straight C, it's just easier (to me anyway) in C++
w/the OO model.

C++, OO model, lots of STL, and maybe some good platform specific
performance tweaks (like asynchronous I/O w/completion ports on NT) - now
that would rock!  :)

-Eric


---------------------------------------
ETA Associates, Inc.
http://www.ultracode.com/


Re: Thread/Process model discussion.

Posted by Rasmus Lerdorf <ra...@lerdorf.on.ca>.
> I have a question about the basic architecture you describe: Why are
> you using both threads and processes? If threads by themselves aren't
> good enough I'd take that as an indication that they probably
> shouldn't be used on that platform.

I like the hybrid process/thread architecture.  In a purely threaded
environment a fatal error in a loaded module will take down your server.
In the hybrid model only the requests served up by the threads in
the process that died will be lost and further requests will be handled by
threads in the other active processes.  It combines the robustness of the
multi-process model with the performance and memory efficiency of the
threaded model.
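
The containment property described above can be sketched in C.  This is only
an illustration under stated assumptions, not how any Apache release does it:
the names run_hybrid, run_child and worker, and the NUM_PROCS and
THREADS_PER_PROC constants, are hypothetical.  A monitor forks a few
processes, each process starts a few threads, and one process is made to
abort to stand in for a fatal error in a module.  Only that process is lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_PROCS 3          /* hypothetical number of server processes */
#define THREADS_PER_PROC 4   /* hypothetical number of threads per process */

/* Stand-in for a request-serving thread; a real one would accept and serve. */
static void *worker(void *arg) {
    (void)arg;
    return NULL;
}

/* Child process: start the threads, optionally simulate a fatal error. */
static void run_child(int crash_me) {
    pthread_t tids[THREADS_PER_PROC];
    for (int i = 0; i < THREADS_PER_PROC; i++)
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0)
            _exit(1);
    if (crash_me)
        abort();             /* takes down every thread in THIS process only */
    for (int i = 0; i < THREADS_PER_PROC; i++)
        pthread_join(tids[i], NULL);
    _exit(0);
}

/* Monitor: fork the children, crash child number `victim` (-1 for none),
 * and return how many children exited cleanly. */
int run_hybrid(int victim) {
    pid_t pids[NUM_PROCS];
    int clean = 0;
    assert(victim < NUM_PROCS);
    for (int i = 0; i < NUM_PROCS; i++) {
        pids[i] = fork();
        if (pids[i] == 0)
            run_child(i == victim);
    }
    for (int i = 0; i < NUM_PROCS; i++) {
        int status;
        if (pids[i] > 0 && waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

A real monitor would also fork a replacement for the process that died; that
step is left out here to keep the sketch short.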

-Rasmus


Re: Thread/Process model discussion.

Posted by Tony Finch <do...@dotat.at>.
I have a question about the basic architecture you describe: Why are
you using both threads and processes? If threads by themselves aren't
good enough I'd take that as an indication that they probably
shouldn't be used on that platform.

Tony.
-- 
f.a.n.finch  dot@dotat.at  fanf@demon.net

Re: Thread/Process model discussion.

Posted by "Michael H. Voase" <mv...@midcoast.com.au>.
Dean Gaudet wrote:

> On Fri, 29 Jan 1999, Rodent of Unusual Size wrote:
>
> > Dean Gaudet wrote:
> > >
> > > > The advantage to this, is that the disk I/O is going to be a killer, and I
> > > > don't think we want multiple threads blocking on disk I/O.

>
> > >
> > > Why not?
> >
> > Because it keeps them from moving on to the next request.

>
> Somewhere you have to pay for the logging.  You can't hide it under a rug.
>
> So what if a thread doesn't move to the next request?  There are other
> threads to handle other requests.
>
> Dean

Gday ,
    I note that the topic is on threads and syslogs here so I thought
I'd throw in some findings on what happens when Apache is hit
with a couple of thousand requests and syslogd craps out .

    Down here in the dungeon of the Castle we have been subjecting
Apache and mod_cgisock to numerous tortures on slow limited
hardware . This afternoon after a particularly serious
hammering I noted that Apache's response time to hits had dropped
to about 20ms with 100 clients banging away at it whilst it was
running on my krufty old 486 . Impressive thinks I until a quick scan
of the returned info revealed that there was no data being returned .

    Further checks revealed the situation above . Once the log file had
grown to 4mb , Apache stopped serving hits but kept returning
responses which VeloMeter ( I use it cause I need  the pretty
output , on the Pentium it can give me old 486 plenty of grief :- )
interpreted as legitimate serves .

    What had happened is that syslogd , after hitting the ceiling ,
commenced restarting . After each restart it would get a request
from apache to log another line , find the log file maxxed out and
promptly restart again . All the hits were bounced but apache
never hung a single connection .

    To me , that's fair enough behaviour . Better than hanging lots
of requests until the networking layer exploded .

    Just me 2c worth anyways .

Cheers Mik Voase.

<QUICK PLUG>
PS. If you're curious about the results they're at :

    http://www.midcoast.com.au/~mvoase/cgisockperf.html

I managed to get a CGI server to serve hits faster than Apache
could serve files this afternoon . It was kruft but it beat it . Just
as an aside .... ;-)

</QUICK PLUG>

--
----------------------------------------------------------------------------
 /~\     /~\            CASTLE INDUSTRIES PTY. LTD.
 | |_____| |            Incorporated 1969. in N.S.W., Australia
 |         |            Phone +612 6562 1345 Fax +612 6567 1449
 |   /~\   |            Web http://www.midcoast.com.au/~mvoase
 |   [ ]   |            Michael H. Voase.  Director.
~~~~~~~~~~~~~~          Cause Linux Flies and Windoze Dies ... 'nuf said.
----------------------------------------------------------------------------




Re: Thread/Process model discussion.

Posted by Dean Gaudet <dg...@arctic.org>.

On Fri, 29 Jan 1999, Rodent of Unusual Size wrote:

> Dean Gaudet wrote:
> > 
> > > The advantage to this, is that the disk I/O is going to be a killer, and I
> > > don't think we want multiple threads blocking on disk I/O.
> > 
> > Why not?
> 
> Because it keeps them from moving on to the next request.

Somewhere you have to pay for the logging.  You can't hide it under a rug. 

So what if a thread doesn't move to the next request?  There are other
threads to handle other requests.

Dean


Re: Thread/Process model discussion.

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Dean Gaudet wrote:
> 
> > The advantage to this, is that the disk I/O is going to be a killer, and I
> > don't think we want multiple threads blocking on disk I/O.
> 
> Why not?

Because it keeps them from moving on to the next request.
-- 
#ken	P-)}

Ken Coar                    <http://Web.Golux.Com/coar/>
Apache Group member         <http://www.apache.org/>
"Apache Server for Dummies" <http://Web.Golux.Com/coar/ASFD/>

Re: Thread/Process model discussion.

Posted by Dean Gaudet <dg...@arctic.org>.
On Fri, 29 Jan 1999, Ryan Bloom wrote:

> > 	Logging.  Do we want a logging thread?  What benefits does it give
> > us?  Drawbacks?
> 
> Okay, I will now suggest a possible solution.
> 
> Each process has its own low priority logging thread, which grabs the
> logging data  from a queue.

i.e. you're re-inventing append only files. 

> writing logs.  This could cause the queue to fill, and then all the
> threads are wasted, because they can't put their logging information
> anywhere.

Yeah that's a terrible situation, which only exists in the model you
propose. 

> The logging thread wakes up every few seconds (or once a second,
> performance tests could tell us how often is required), and writes a burst
> of messages to the log file.  When the thread wakes up, it creates a new
> queue, and replaces the old queue with the new one.  This allows the other
> threads to continue as if nothing has changed.  It then dumps the log
> messages to the appropriate log file.

If your server is so busy that logging is a CPU issue for you, then I
suggest: 

- you use unix where append only files have sane semantics
- you use BUFFERED_LOGS (see mod_log_config.c)
- you sort your logs offline, on a CPU/disk that's not in your web farm

> The advantage to this, is that the disk I/O is going to be a killer, and I
> don't think we want multiple threads blocking on disk I/O.

Why not? 

That's the best possible thing an application can do: give the kernel all
the information it needs to optimize a solution.  Multiple threads blocked
in an append-only write on a file can be ordered in any damn way the
kernel pleases (as long as it maintains the atomicity of each individual
write).  And if they're all blocked, then clearly there's a lot of data to
be logged -- you have to pay for it somewhere.
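
A minimal sketch of that idea, assuming a POSIX system: several threads share
one descriptor opened with O_APPEND and each issues one write() per log
entry, leaving the ordering to the kernel.  The function names
(append_log_demo, log_worker), the constants and the log line format are
made up for the illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NUM_THREADS 4        /* hypothetical worker count */
#define LINES_PER_THREAD 50  /* hypothetical entries per worker */

static int log_fd = -1;

/* Each worker formats its own entry and hands it to the kernel in a single
 * write(); with O_APPEND the seek-to-end and the write are one step. */
static void *log_worker(void *arg) {
    long id = (long)arg;
    char line[64];
    for (int i = 0; i < LINES_PER_THREAD; i++) {
        int len = snprintf(line, sizeof line, "thread %ld request %d\n", id, i);
        assert(len > 0 && (size_t)len < sizeof line);
        if (write(log_fd, line, (size_t)len) != len)
            return (void *)1;
    }
    return NULL;
}

/* Runs the workers against `path` and returns the number of intact lines
 * found in the file afterwards, or -1 on error. */
int append_log_demo(const char *path) {
    pthread_t tids[NUM_THREADS];
    char line[64];
    int complete = 0;

    log_fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0644);
    if (log_fd < 0)
        return -1;
    for (long i = 0; i < NUM_THREADS; i++)
        if (pthread_create(&tids[i], NULL, log_worker, (void *)i) != 0)
            return -1;
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tids[i], NULL);
    close(log_fd);

    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;
    while (fgets(line, sizeof line, in) != NULL)
        if (strncmp(line, "thread ", 7) == 0 && strchr(line, '\n') != NULL)
            complete++;
    fclose(in);
    return complete;
}
```

The entries from different threads end up interleaved in whatever order the
kernel chose, but each line should arrive whole.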

> There were two messages about rotating logs, and I think this solution
> also provides us a nice way to rotate logs.

There is already a nice way to rotate logs.  Study how the pipe() is used
in apache.  Notice that it provides a seamless way to rotate logs, and has
absolutely no problems in a multiprocess or multithreaded environment. 

Too bad it only works on unix. 

How's that quote go?  "those who don't understand unix are doomed to
re-invent it"  ? 

:) 

OK, since you have to re-invent unix I suggest you do not have a dedicated
logging thread.  Instead I suggest that, since you already need to
synchronize your threads to add log entries to the queue, have the thread
which notices the queue is full flush it to disk... it's already in the
critical section when it notices, and so you save yourself two context
switches.  This simplifies the implementation... and in fact is somewhat
better than what unix can give.  But we're down in the 1 or 2% range I'm
guessing. 
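
One way to read the suggestion above, as a hedged sketch rather than actual
httpd code: a shared buffer guarded by a mutex, where whichever thread finds
the buffer full flushes it while it already holds the lock.  The log_buffer
type, the function names and LOG_BUF_SIZE are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define LOG_BUF_SIZE 4096    /* hypothetical buffer size */

typedef struct {
    pthread_mutex_t lock;
    int fd;                  /* log file descriptor */
    size_t used;             /* bytes currently buffered */
    char buf[LOG_BUF_SIZE];
} log_buffer;

/* Write out everything buffered.  Caller must hold lb->lock. */
static int flush_locked(log_buffer *lb) {
    size_t off = 0;
    assert(lb->used <= LOG_BUF_SIZE);
    while (off < lb->used) {
        ssize_t n = write(lb->fd, lb->buf + off, lb->used - off);
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    lb->used = 0;
    return 0;
}

int log_buffer_init(log_buffer *lb, int fd) {
    lb->fd = fd;
    lb->used = 0;
    return pthread_mutex_init(&lb->lock, NULL);
}

/* Append one formatted entry.  The thread that notices the buffer cannot
 * take the entry does the flush itself, inside the critical section, so no
 * separate logging thread (and no extra context switch) is needed. */
int log_append(log_buffer *lb, const char *entry, size_t len) {
    int rc = 0;
    if (len > LOG_BUF_SIZE)
        return -1;
    pthread_mutex_lock(&lb->lock);
    if (lb->used + len > LOG_BUF_SIZE)
        rc = flush_locked(lb);
    if (rc == 0) {
        memcpy(lb->buf + lb->used, entry, len);
        lb->used += len;
    }
    pthread_mutex_unlock(&lb->lock);
    return rc;
}

/* Explicit flush, e.g. at shutdown. */
int log_flush(log_buffer *lb) {
    pthread_mutex_lock(&lb->lock);
    int rc = flush_locked(lb);
    pthread_mutex_unlock(&lb->lock);
    return rc;
}
```

The cost is that the unlucky thread holds the lock for the duration of the
write, so other threads wanting to log wait behind it.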

I still believe log rotation, along with log splitting for massive numbers
of vhosts, belongs outside httpd.  Search the archives for the many
previous discussions of this.  KISS. 

Dean



Re: Thread/Process model discussion.

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.
> 	Logging.  Do we want a logging thread?  What benefits does it give
> us?  Drawbacks?

Okay, I will now suggest a possible solution.

Each process has its own low priority logging thread, which grabs the
logging data from a queue.  The queue stores fully formatted strings.
This is done because, even though formatting the string is expensive, it
makes more sense for the worker thread to do it.  If the logging thread
does it, the logger spends most of its time doing formatting, and not
writing logs.  This could cause the queue to fill, and then all the
threads are wasted, because they can't put their logging information
anywhere.

The logging thread wakes up every few seconds (or once a second,
performance tests could tell us how often is required), and writes a burst
of messages to the log file.  When the thread wakes up, it creates a new
queue, and replaces the old queue with the new one.  This allows the other
threads to continue as if nothing has changed.  It then dumps the log
messages to the appropriate log file.
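
The queue swap described here could look roughly like the following C sketch.
It is an assumption-laden illustration, not a proposed patch: log_queue,
log_entry, log_enqueue and log_drain are hypothetical names.  Worker threads
enqueue fully formatted strings; the logging thread detaches the whole list
under the lock, leaving an empty queue behind, and then writes the batch
without holding the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

typedef struct log_entry {
    struct log_entry *next;
    size_t len;
    char text[];             /* fully formatted log line */
} log_entry;

typedef struct {
    pthread_mutex_t lock;
    log_entry *head;
    log_entry *tail;
} log_queue;

int log_queue_init(log_queue *q) {
    q->head = q->tail = NULL;
    return pthread_mutex_init(&q->lock, NULL);
}

/* Called by worker threads: the worker has already formatted the string. */
int log_enqueue(log_queue *q, const char *text) {
    size_t len = strlen(text);
    log_entry *e = malloc(sizeof *e + len);
    if (e == NULL)
        return -1;
    e->next = NULL;
    e->len = len;
    memcpy(e->text, text, len);
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Called by the logging thread each time it wakes up.  Returns the number
 * of entries written, or -1 if a write failed. */
int log_drain(log_queue *q, int fd) {
    assert(q != NULL);
    pthread_mutex_lock(&q->lock);
    log_entry *batch = q->head;      /* take the old queue ...          */
    q->head = q->tail = NULL;        /* ... and leave a fresh empty one */
    pthread_mutex_unlock(&q->lock);

    int written = 0, failed = 0;
    while (batch != NULL) {
        log_entry *next = batch->next;
        size_t off = 0;
        while (!failed && off < batch->len) {
            ssize_t n = write(fd, batch->text + off, batch->len - off);
            if (n < 0)
                failed = 1;
            else
                off += (size_t)n;
        }
        if (!failed)
            written++;
        free(batch);
        batch = next;
    }
    return failed ? -1 : written;
}
```

The logging thread itself would be a loop around sleep() and log_drain().
Because the list is unbounded in this sketch, the "queue fills up" case from
the earlier paragraph turns into memory growth instead of blocked workers.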

The advantage to this, is that the disk I/O is going to be a killer, and I
don't think we want multiple threads blocking on disk I/O.

There were two messages about rotating logs, and I think this solution
also provides us a nice way to rotate logs.  If our logging module has a
directive for how often to rotate logs, we can do it in the module itself.
When the logging thread wakes up, it checks the time, and if we should
rotate logs, it moves one log file out of the way and opens the new one,
and continues as it should.  Or, it could check the date/time stamp on the
log string, and determine based on that string if the log should go to a
new file or not.  Of course, that causes a hairy mess when we add multiple
processes, but it is still possible with some minor trickery.

Any thoughts?


_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.