Posted to cvs@httpd.apache.org by dg...@locus.apache.org on 2000/06/09 22:50:20 UTC
cvs commit: apache-2.0/src/docs mpm-design.txt
dgaudet 00/06/09 13:50:19
Added: src/docs mpm-design.txt
Log:
one of the things i had hoped for the doc directory would be that useful
threads would get saved here... to save digging around in the mail
archives. anyhow, i thought some of these messages deserved saving.
Revision Changes Path
1.1 apache-2.0/src/docs/mpm-design.txt
Index: mpm-design.txt
===================================================================
From wrowe@lnd.com Fri Jun 9 13:49:27 2000
Reply-To: new-httpd@apache.org
From: "William A. Rowe, Jr." <wr...@lnd.com>
To: new-httpd@apache.org
Subject: RE: Windows 2.0 MPM design issues
Date: Wed, 31 May 2000 12:58:40 -0500
> From: jlwpc1 [mailto:jlwpc1@earthlink.net]
> Sent: Wednesday, May 31, 2000 11:21 AM
>
> I see I am not the only one confused. :)
>
> Okay from the last few days' mail it is _clear_ no one knows
> just what is the Windows MPM.
I would strongly disagree with that; I'd say a majority
understand the principles of the MPM (not specifically Win32,
but the entire Apache-wide concept).
> So please explain to us the Windows 2.0 MPM in words (no code
> if possible).
Fair question. apache-2.0/htdocs/manual/new_features_2_0.html#core
is not very complete :-) apache-2.0/htdocs/manual/misc/API.html
misses the issue entirely, apache-2.0/htdocs/manual/mpm.html is just
a pretty list, and apache-2.0/htdocs/manual/developer/modules.html
doesn't even touch it with a 10 foot pole. So first, some history,
and what the MPM (especially Win32) has evolved into:
It used to be we deployed just about every sort of platform
specific process, thread, parent and child logic in the main()
function of http_main.c. It was a nightmare of tangled code,
aka. #ifdef MY_OS, #ifndef THOSE_OSES, and although it could
be followed, it was very ugly.
APR was to be (and still might become) the know-all, tell-all of
how to launch processes and threads on any platform, and make the
user believe the platform has them even if they are not supported. But that
doesn't fit into the Apache philosophy of "run this server by the
fastest means this OS will allow". Developers want control over
just how the server runs on a specific OS, perhaps even to the
specific CPU.
While waiting for APR (Apache Portable Runtime) to offer all the
support, the new-httpd coders attacked http_main.c and gutted it.
What is left over is (I believe) sitting in modules/mpm/prefork.
They stripped every aspect of launching processes, establishing
the listen ports, and cleaning up afterwards into the MPM.
The MPM is the 'engine' around the server. Its job is to create
and manage every process, thread and socket of the running server.
It is implemented as a 'module', but that's a pretty obtuse
description of what it is expected to do :-) That's the reason
for Greg's recent posts.
Win32 was never a process-oriented engine. The http_main.c code
did its own threading thing (the Win32 port was almost ahead of its
time; I believe it derived first from the OS2 port). Bill Stoddard
has reproduced all those funky exceptions into the winnt MPM. You
will find that implementation in modules/mpm/winnt/winnt.c with
both master_main() and child_main() engine loops.
Now, when Win32 and the "Service Control Manager" came along back
in 1.3.x, the world changed. We couldn't play our games 'inside'
of http_main.c's main() function, we needed a head start to set
up the service. That's the #ifdef'ed apache_main() you see in
the http_main.c source. We set up main() in main_win32.c, to do
nothing more than play with extra args and set up service control.
It's not something that other platforms won't need. I'm betting
it's a good fit for extra args on NetWare and other non-unix
platforms. So I created an extra hook for pre-argv processing, so
the MPM now has the following chances to set up its OS's preferred
environment:
register_hooks  The very first chance to set up hooks and environ.
rewrite_args    Before the common main() touches the cmd line args.
pre_config      After cmd line args, but before we load httpd.conf.
post_config     After we've loaded httpd.conf.
ap_mpm_run      The process is ours to serve. If the MPM returns
                false, the config pools are cleared, all hooks are
                reset, and main() calls our pre_config, reloads
                httpd.conf, calls our post_config and ap_mpm_run
                all over again to restart the server.
That's it. Apache is just a function library, config processor,
and module manager. The gruntwork of handling processes, threads
and the ip sockets all falls on the MPM. That's why we will see
many different MPM's, each for a specific family of OS's, or just
a single OS, or even CPU. Everything that is OS dependent should
be localized in the MPM, or in the APR. The rest of the server
and modules should work (as applicable) across platforms.
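The hook order listed above can be modeled in a few lines. This is only a toy sketch, not Apache code: ToyMPM, server_main and the "--service" argument filter are hypothetical names invented for illustration, and whether register_hooks runs again after a restart is not stated in the mail, so the model calls it once.

```python
# Toy model only: ToyMPM and server_main are hypothetical names, not Apache APIs.

class ToyMPM:
    """Records the order in which the (modeled) server core calls the MPM."""

    def __init__(self, runs_before_shutdown):
        self.calls = []
        self.remaining = runs_before_shutdown

    def register_hooks(self):
        self.calls.append("register_hooks")

    def rewrite_args(self, argv):
        # Assumed example: drop service-control style arguments before
        # the common main() parses the command line.
        self.calls.append("rewrite_args")
        return [arg for arg in argv if not arg.startswith("--service")]

    def pre_config(self):
        self.calls.append("pre_config")

    def post_config(self, config):
        self.calls.append("post_config")

    def run(self):
        self.calls.append("ap_mpm_run")
        self.remaining -= 1
        # True: shut down for good. False: ask main() for a restart.
        return self.remaining == 0


def server_main(mpm, argv):
    mpm.register_hooks()
    argv = mpm.rewrite_args(argv)
    while True:
        mpm.pre_config()
        config = {"argv": argv}  # stands in for loading httpd.conf
        mpm.post_config(config)
        if mpm.run():
            return argv
        # A false return lands here: the real server would clear the
        # config pools and reset hooks before going around again.
```

With runs_before_shutdown=2 the recorded call list shows one restart: pre_config, post_config and ap_mpm_run each appear twice, after a single register_hooks and rewrite_args.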
Warning: you might see right here that Win9x and WinNT would make
great, separate MPM's. That's only 1/2 right. Because they share
so much common code, we would be asking for bugs to be corrected
in only one MPM and forgotten in the other. So, right now, they will
live in one MPM.
Now, since an MPM now gets a chance to hook in and diddle with the
command line args, the Win32 MPM can handle the service control
within the MPM, instead of wrapping around http_main.c.
Now going back over my earlier comments on service.c, they should
make a lot more sense :-)
More reading is worthwhile. Please read Dean's master plans:
apache-2.0/src/docs/goals.txt (almost a year old, and as true as ever.)
apache-2.0/src/docs/initial-blurb.txt
There is more reading there, but I don't know how current tls.txt and
buff.txt still are, and whatever is in new_features_2_0.html should
be moved into htdocs, CHANGES, and then knocked off.
From rbb@covalent.net Fri Jun 9 13:49:27 2000
Reply-To: new-httpd@apache.org
Date: Wed, 31 May 2000 17:05:25 -0700 (PDT)
From: rbb@covalent.net
To: new-httpd@apache.org
Subject: Re: Windows 2.0 MPM design issues
MPM's are how the server maps a request to an execution
primitive. Basically, the server starts, configures itself, and calls the
MPM code.
The MPM is responsible for starting child processes and monitoring
them. It is also responsible for starting threads within the child
process and having those threads (whether there be 1 or more) accept on
the socket. Once a request is made, the MPM is responsible for having an
execution primitive handle the request. For all of the current MPM's the
same primitive that accepted the connection also handles the request; however,
it would be possible to have one thread accept all connections and hand
the requests off to other threads to actually serve the request.
Lastly, the MPM is responsible for managing the threads and other child
processes, and for killing them when a restart/shutdown is requested.
I believe this is everything an MPM is responsible for. If I missed
anything, somebody correct me.
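The two dispatch styles mentioned above (the same primitive accepts and serves, versus one acceptor handing requests to other threads) can be contrasted with a small sketch. It is a hedged model only: a queue.Queue stands in for the listen socket, strings stand in for connections, and every function name here is hypothetical rather than taken from any MPM.

```python
import queue
import threading


def handle(conn):
    return "served " + conn


def accept_and_handle(listen_q, results, n_workers):
    """Every worker accepts a connection and serves it itself."""
    def worker():
        while True:
            conn = listen_q.get()      # stands in for accept()
            if conn is None:           # shutdown marker
                return
            results.append(handle(conn))
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    return threads, n_workers          # one shutdown marker per worker


def accept_then_hand_off(listen_q, results, n_workers):
    """One acceptor thread hands connections to separate worker threads."""
    work_q = queue.Queue()

    def acceptor():
        while True:
            conn = listen_q.get()
            if conn is None:
                for _ in range(n_workers):
                    work_q.put(None)   # pass the shutdown on to the workers
                return
            work_q.put(conn)

    def worker():
        while True:
            conn = work_q.get()
            if conn is None:
                return
            results.append(handle(conn))

    threads = [threading.Thread(target=acceptor)]
    threads += [threading.Thread(target=worker) for _ in range(n_workers)]
    return threads, 1                  # only the acceptor reads listen_q


def serve(design, conns, n_workers=3):
    listen_q, results = queue.Queue(), []
    threads, markers = design(listen_q, results, n_workers)
    for t in threads:
        t.start()
    for conn in conns:
        listen_q.put(conn)
    for _ in range(markers):
        listen_q.put(None)
    for t in threads:
        t.join()
    return sorted(results)
```

Both designs serve the same set of connections; the difference is only in which thread picks a connection off the listen queue.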
Ryan
On Wed, 31 May 2000, jlwpc1 wrote:
> From: <rb...@covalent.net>
>
> >
> > What do you mean? OtherBill is working hard to clean up the Windows
> > MPM. Currently, it is a mix of code in the modules/mpm/winnt and os/win32
> > directories. This is changing (but it is really just moving code around
> > and less new code) to be just the modules/mpm/winnt directory.
> >
> > What exactly do you want to know?
> >
>
> Just in words what it is?
>
> Single process.
>
> Process after process?
>
> Multi-threaded?
>
> Process A starts then what happens in order to "serve up pages"?
>
> Thanks,
> JLW
>
>
>
_______________________________________________________________________________
Ryan Bloom rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------
From reddrum@attglobal.net Fri Jun 9 13:49:27 2000
Reply-To: new-httpd@apache.org
From: Bill Stoddard <re...@attglobal.net>
To: new-httpd@apache.org
Subject: Re: Windows 2.0 MPM design issues
Date: Wed, 31 May 2000 09:52:14 -0400
Organization: Apache Software Foundation
> MPM's are how the server maps a request to an execution
> primitive. Basically, the server starts, configures itself, and calls the
> MPM code.
>
> The MPM is responsible for starting child processes and monitoring
> them. It is also responsible for starting threads within the child
> process and having those threads (whether there be 1 or more) accept on
> the socket. Once a request is made, the MPM is responsible for having an
> execution primitive handle the request. For all of the current MPM's the
> same primitive that accepted the connection also handles the request; however,
> it would be possible to have one thread accept all connections and hand
> the requests off to other threads to actually serve the request.
>
> Lastly, the MPM is responsible for managing the threads and other child
> processes, and for killing them when a restart/shutdown is requested.
>
Very good MPM overview. Continue with Win32 specific MPM details...
The WIN32 MPM uses two processes: the parent process, which in turn creates a child
process. The child process is multithreaded and is responsible for processing all HTTP requests.
The parent process:
1. Creates the AcceptEx IO Completion Port (IOCP for short) (I won't attempt to explain this. It'd take
too many words)
2. Opens all the listen sockets
3. Creates the child process
4. Passes a duplicated IOCP handle to the child (parent communicates to the child via pipe, NOT
shared memory)
5. Passes duplicated listen socket handles to the child (via a pipe)
6. Waits for a restart or shutdown event (from an external process) or a child exit (normal or
abnormal)
7. When one of the above events is signaled, the parent does the right thing: it restarts a failed
child process, or signals the child process to die gracefully and, on a restart, starts a new
child process to take the place of the old child process.
8. That's it! The parent process's job is very simple.
Steps 1, 2, 4 & 5 could be done directly in the child process. Doing them in the parent process
allows the sockets (and pending connections in the listen queue) to be maintained across a server
restart. Doing these steps in the child process would cause all pending connections to be dropped
across a restart.
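The point about pending connections surviving a restart can be seen with ordinary loopback sockets. The sketch below is an assumption-laden model, not the WIN32 MPM: socket.dup() stands in for the duplicated handle passed over the pipe, a function call stands in for each child process, and all names are hypothetical.

```python
import socket


def parent_open_listener():
    # The "parent" owns the listen socket for the whole server lifetime.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    return listener


def child_generation(listener, serve):
    # dup() stands in for the duplicated socket handle given to the child.
    handle = listener.dup()
    try:
        if not serve:
            return None            # this child dies before accepting anything
        handle.settimeout(5)
        conn, _addr = handle.accept()
        with conn:
            return conn.recv(16)
    finally:
        handle.close()             # closes only the child's duplicate


def demo():
    listener = parent_open_listener()
    client = socket.create_connection(listener.getsockname(), timeout=5)
    client.sendall(b"GET /")       # queued in the listen backlog
    first = child_generation(listener, serve=False)
    second = child_generation(listener, serve=True)
    client.close()
    listener.close()
    return first, second
```

Because the parent's socket stays open while the first child goes away, the connection that was already queued is still there for the replacement child to accept.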
The main thread in the child process:
1. Receives the IOCP and duplicated sockets
2. Does initialization required to begin accepting connections
3. Creates a pool of worker threads which accept requests off the listen sockets (the details differ
a bit depending on whether you are on NT or not). When a connection is received, a thread accepts
the connection and processes the request that comes in on that connection.
4. Starts accepting requests
On 95/98, requests are accepted on a separate thread. NT uses an IOCP.
5. The main child thread then waits for a shutdown event (or on NT, a server maintenance event).
If the main thread receives a server maintenance event, it does some IOCP magic to increase the
number of connections that can be handled. When it receives a shutdown event (either generated by a
worker thread or generated by the parent process), it shuts down the worker threads gracefully and
eventually exits.
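The shape of the child's main thread (start a worker pool, wait for a shutdown event, then let the workers finish) can be sketched as below. This is a simplified model under stated assumptions: a queue replaces the listen sockets and IOCP, and max_requests is a hypothetical knob used only to show a worker raising the shutdown event itself.

```python
import queue
import threading


def child_main(requests, shutdown, n_workers=4, max_requests=None):
    work = queue.Queue()
    for req in requests:
        work.put(req)
    served = []
    lock = threading.Lock()

    def worker():
        while not shutdown.is_set():
            try:
                req = work.get(timeout=0.05)
            except queue.Empty:
                continue
            with lock:
                served.append(req)   # "process" the request
                if max_requests is not None and len(served) >= max_requests:
                    shutdown.set()   # shutdown generated by a worker thread

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in workers:
        t.start()
    shutdown.wait()                  # main thread sleeps until told to stop
    for t in workers:
        t.join()                     # graceful: in-flight requests finish first
    return served
```

The same event object can be set from outside child_main, which models the shutdown being generated by the parent process instead of by a worker.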
There is a start_mutex which prevents more than one child process from accepting requests at once.
The WIN32 MPM uses threads as the "execution primitive". Apache 1.3 on Windows used threads as well.
The NT specific code uses IO Completion ports and does accepts asynchronously. Worker threads are
dispatched off the IOCP in LIFO order, which is pretty cool. This is the first step to getting to a
fully asynchronous server (which is my goal). What does this mean? Today the server is not fully
asynchronous, which means that you require 1 thread per concurrently connected client. 2000
concurrent clients implies the need for 2000 threads. Now threads are lighter weight than processes,
but they still cost resources. A fully async server could handle those 2000 concurrent clients with
1 thread (or realistically, a few threads) because that thread would NEVER block on network I/O.
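To make the "one thread, many clients" idea concrete, here is a minimal readiness-based sketch. It is not IOCP and not how the WIN32 MPM works; it uses Python's selectors module as a stand-in, serve_all is a hypothetical name, and it assumes replies small enough to fit in the socket buffer so that the send never has to wait.

```python
import selectors
import socket


def serve_all(server_ends):
    """Serve every connected socket from the calling thread alone."""
    sel = selectors.DefaultSelector()
    for sock in server_ends:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)
    open_count = len(server_ends)
    handled = 0
    while open_count:
        # Only sockets that already have data (or EOF) are returned,
        # so the recv() below never waits on a slow client.
        for key, _events in sel.select(timeout=1):
            sock = key.fileobj
            data = sock.recv(1024)
            if data:
                sock.sendall(data.upper())   # small reply, assumed to fit the buffer
                handled += 1
            else:                            # client finished sending
                sel.unregister(sock)
                sock.close()
                open_count -= 1
    sel.close()
    return handled
```

A usage example would create a number of socket.socketpair() pairs, write a request on each client end, and call serve_all once with all the server ends; no extra thread is started per client.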
Hope this helps.
Bill Stoddard
From wrowe@lnd.com Fri Jun 9 13:49:27 2000
Reply-To: new-httpd@apache.org
From: "William A. Rowe, Jr." <wr...@lnd.com>
To: new-httpd@apache.org
Subject: RE: Windows 2.0 MPM design issues
Date: Wed, 31 May 2000 20:53:30 -0500
> From: jlwpc1 [mailto:jlwpc1@earthlink.net]
> Sent: Wednesday, May 31, 2000 5:47 PM
>
> Windows starts process A (Windows or Console) and main thread
> Aa (process A thread a) and then what happens?
>
> Just in words what it is?
In the new scheme? We start in http_main() and start calling
the MPM's hooks, in the order I just documented for you.
Under Winnt's MPM, we are dual process:
  Primary process
    Primary thread of master process creates sockets,
    spawns the child and pipes the handles to the child.
    Please skim mpm_winnt.c master_main() for the flow.
    Second thread (if needed) handles the 9x windows message
    pump for shutdown or NT service control manager handler.
    It is known to NT or 9x as the 'service' process.
    See mpm_service_to_start() in the new service.c for how
    that hooks in to the server.
  Second process
    Many worker threads spewing off web pages, I think they
    are coordinated by a master thread.
    Please skim mpm_winnt.c child_main() for the flow.
Just a general thought - Apache is really only understood by
walking the code - no, there is no master blueprint of the
application. If you want to understand the code, jump into
it, break it, rework it, and spend some time in the debugger
walking it.
Bill
From rbb@covalent.net Fri Jun 9 13:49:27 2000
Reply-To: new-httpd@apache.org
Date: Fri, 2 Jun 2000 09:31:22 -0700 (PDT)
From: rbb@covalent.net
To: new-httpd@apache.org
Subject: Re: Windows 2.0 MPM design issues
> Yes but there is no written down reason _why_ Apache is dual process in
> the Windows version. Why?
>
> Windows is a multi-threaded OS not process - it looks like what Apache
> is doing would be cleaner and quicker using just threads. So why is
> there another process used as if it were a thread?
Because the problem with threads is that when one thread causes a GPF, the
whole process goes away. The nice part about having that first process
around to monitor the second, is that if/when a thread dies for some
unknown reason, we can restart and the User doesn't need to know about it.
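The monitoring role of the first process can be sketched as a small supervisor loop. This is a hypothetical model rather than the Apache parent: the child is a one-line Python program that exits with an error for its first two generations (standing in for a crash), and supervise and CHILD are names invented for the example.

```python
import subprocess
import sys

# Stand-in child: "crashes" (exit code 3) until its generation number reaches 2.
CHILD = "import sys; sys.exit(0 if int(sys.argv[1]) >= 2 else 3)"


def supervise(max_restarts=5):
    """Start a child, wait for it, and start a replacement if it failed."""
    exit_codes = []
    for generation in range(max_restarts + 1):
        result = subprocess.run([sys.executable, "-c", CHILD, str(generation)])
        exit_codes.append(result.returncode)
        if result.returncode == 0:   # clean exit: nothing left to restart
            break
    return exit_codes
```

The supervisor itself never runs request-handling code, so a fault in the child cannot take it down; it only observes the exit status and launches the next generation.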
Ryan
_______________________________________________________________________________
Ryan Bloom rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------