Posted to cvs@httpd.apache.org by dg...@locus.apache.org on 2000/06/09 22:50:20 UTC

cvs commit: apache-2.0/src/docs mpm-design.txt

dgaudet     00/06/09 13:50:19

  Added:       src/docs mpm-design.txt
  Log:
  one of the things i had hoped for the doc directory would be that useful
  threads would get saved here... to save digging around in the mail
  archives.  anyhow, i thought some of these messages deserved saving.
  
  Revision  Changes    Path
  1.1                  apache-2.0/src/docs/mpm-design.txt
  
  Index: mpm-design.txt
  ===================================================================
  From wrowe@lnd.com Fri Jun  9 13:49:27 2000
  Reply-To: new-httpd@apache.org
  From: "William A. Rowe, Jr." <wr...@lnd.com>
  To: new-httpd@apache.org
  Subject: RE: Windows 2.0  MPM design issues
  Date: Wed, 31 May 2000 12:58:40 -0500
  
  > From: jlwpc1 [mailto:jlwpc1@earthlink.net]
  > Sent: Wednesday, May 31, 2000 11:21 AM
  > 
  > I see I am not the only one confused. :)
  > 
  > Okay from the last few days mail it is _clear_ no one knows 
  > just what is the Windows MPM.
  
  I would strongly disagree with that.  I'd say a majority 
  understand the principles of the MPM (not specifically Win32,
  but the entire Apache-wide concept.)
  
  > So please explain to us the Windows 2.0 MPM in words (no code 
  > if possible).
  
  Fair question.  apache-2.0/htdocs/manual/new_features_2_0.html#core
  is not very complete :-)  apache-2.0/htdocs/manual/misc/API.html
  misses the issue entirely, apache-2.0/htdocs/manual/mpm.html is just
  a pretty list, and apache-2.0/htdocs/manual/developer/modules.html
  doesn't even touch it with a 10 foot pole.  So first, some history
  and what the MPM (especially Win32) has evolved into:
  
  
  It used to be that we deployed just about every sort of platform
  specific process, thread, parent and child logic in the main()
  function of http_main.c.  It was a nightmare of tangled code,
  i.e. #ifdef MY_OS, #ifndef THOSE_OSES, and although it could 
  be followed, it was very ugly.
  
  APR was to be (and still might become) the know-all, tell-all of
  how to launch processes and threads on any platform, and make the
  user believe they exist even where they are not supported.  But that
  doesn't fit into the Apache philosophy of "run this server by the 
  fastest means this OS will allow".  Developers want control over 
  just how the server runs on a specific OS, perhaps even to the
  specific CPU.
  
  While waiting for APR (Apache Portable Runtime) to offer all the
  support, the new-httpd coders attacked http_main.c and gutted it.
  What is left over is (I believe) sitting in modules/mpm/prefork.
  They moved every aspect of launching processes, establishing
  the listen ports, and cleaning up afterwards into the MPM.
  
  The MPM is the 'engine' around the server.  Its job is to create
  and manage every process, thread and socket of the running server.
  It is implemented as a 'module', but that's a pretty obtuse
  description of what it is expected to do :-)  That's the reason
  for Greg's recent posts.
  
  Win32 was never a process-oriented engine.  The http_main.c code
  did its own threading thing (the Win32 port was almost ahead of its
  time; I believe it derived first from the OS2 port.)  Bill Stoddard 
  has reproduced all those funky exceptions into the winnt MPM.  You 
  will find that implementation in modules/mpm/winnt/winnt.c with
  both master_main() and child_main() engine loops.  
  
  
  Now, when Win32 and the "Service Control Manager" came along back
  in 1.3.x, the world changed.  We couldn't play our games 'inside'
  of http_main.c's main() function, we needed a head start to set
  up the service.  That's the #ifdef'ed apache_main() you see in
  the http_main.c source.  We set up main() in main_win32.c, to do
  nothing more than play with extra args and set up service control.
  
  It's not something that other platforms won't need.  I'm betting
  it's a good fit for extra args on NetWare and other non-unix
  platforms.  So I created an extra hook for pre-argv processing, so
  the MPM now has the following chances to set up its OS's preferred
  environment:
  
    register_hooks  The very first chance to set up hooks and environ.
    rewrite_args    Before the common main() touches the cmd line args
  
    pre_config      After cmd line args, but before we load httpd.conf
    post_config     After we've loaded httpd.conf
    ap_mpm_run      The process is ours to serve.  If the MPM returns
                    false, the config pools are cleared, all hooks are 
                    reset, and main() calls our pre_config, reloads
                    httpd.conf, calls our post_config and ap_mpm_run 
                    all over again to restart the server.
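
The call order above can be modelled in a few lines.  This is a toy
sketch in Python (not the server's C), and every name in it (ToyMPM,
toy_main, the --toy-service switch) is hypothetical.  Only the hook
names and the "false means restart" convention come from the
description above.

```python
# Toy model of the startup/restart sequence described above.
# All class and function names are hypothetical; this is not Apache code.

class ToyMPM:
    """Records the order in which main() calls into the MPM."""

    def __init__(self, restarts):
        self.calls = []           # log of hook invocations, in order
        self.restarts = restarts  # how many times run() asks for a restart

    def register_hooks(self):
        self.calls.append("register_hooks")

    def rewrite_args(self, argv):
        self.calls.append("rewrite_args")
        # Strip a made-up platform switch before main() parses argv.
        return [a for a in argv if a != "--toy-service"]

    def pre_config(self):
        self.calls.append("pre_config")

    def post_config(self):
        self.calls.append("post_config")

    def run(self):
        self.calls.append("ap_mpm_run")
        if self.restarts > 0:
            self.restarts -= 1
            return False   # false: main() should restart the server
        return True        # true: shut down for good


def toy_main(mpm, argv):
    mpm.register_hooks()
    argv = mpm.rewrite_args(argv)
    while True:
        mpm.pre_config()
        mpm.calls.append("load httpd.conf")  # stand-in for config parsing
        mpm.post_config()
        if mpm.run():
            return argv
        # Restart requested: the real server clears the config pools
        # and resets hooks at this point before going around again.
```

Calling toy_main(ToyMPM(restarts=1), argv) logs register_hooks and
rewrite_args once, then the pre_config / load / post_config /
ap_mpm_run group twice.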
  
  
  That's it.  Apache is just a function library, config processor, 
  and module manager.  The gruntwork of handling processes, threads 
  and the IP sockets all falls on the MPM.  That's why we will see
  many different MPMs, each for a specific family of OSes, or just
  a single OS, or even CPU.  Everything that is OS dependent should
  be localized in the MPM, or in the APR.  The rest of the server
  and modules should work (as applicable) across platforms.
  
  Warning: you might see right here that Win9x and WinNT would make
  great, separate MPMs.  That's only half right.  Because they share
  so much common code, splitting them would invite bugs being fixed
  in only one MPM and forgotten in the other.  So, right now, they will 
  live in one MPM.
  
  Since an MPM now gets a chance to hook in and diddle with the
  command line args, the Win32 MPM can handle the service control
  within the MPM, instead of wrapping around http_main.c.
  
  Now going back over my earlier comments on service.c, they should
  make a lot more sense :-)
  
  
  More reading is worthwhile.  Please read Dean's master plans:
  
  apache-2.0/src/docs/goals.txt (almost a year old, and as true as ever.)
  apache-2.0/src/docs/initial-blurb.txt
  
  There is more reading there, but I don't know how current tls.txt and 
  buff.txt still are, and whatever is in new_features_2_0.html should
  be moved into htdocs, CHANGES, and then knocked off.
  
  
  From rbb@covalent.net Fri Jun  9 13:49:27 2000
  Reply-To: new-httpd@apache.org
  Date: Wed, 31 May 2000 17:05:25 -0700 (PDT)
  From: rbb@covalent.net
  To: new-httpd@apache.org
  Subject: Re: Windows 2.0  MPM design issues
  
  
  MPMs are how the server maps a request to an execution
  primitive.  Basically, the server starts, configures itself, and calls the
  MPM code.
  
  The MPM is responsible for starting child processes and monitoring
  them.  It is also responsible for starting threads within the child
  process and having those threads (whether there be 1 or more) accept on
  the socket.  Once a request is made, the MPM is responsible for having an
  execution primitive handle the request.  For all of the current MPMs the
  same primitive that accepted the connection also handles the request; however,
  it would be possible to have one thread accept all connections and hand
  the requests off to other threads to actually serve the request.
  
  Lastly, the MPM is responsible for managing the threads and other child
  processes, and killing them when a restart/shutdown is requested.
  
  I believe this is everything an MPM is responsible for.  If I missed
  anything, somebody correct me.
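
The handoff variant mentioned above (one thread accepts, other threads
serve) might look roughly like this.  It is a Python sketch, not Apache
code; serve(), the echo reply and the None sentinel are assumptions
made for illustration.

```python
import queue
import socket
import threading


def serve(listener, n_workers, n_connections):
    """Accept n_connections on one thread; worker threads answer them."""
    handoff = queue.Queue()

    def worker():
        while True:
            conn = handoff.get()
            if conn is None:          # sentinel: no more work
                return
            with conn:
                data = conn.recv(1024)
                conn.sendall(b"echo:" + data)

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in workers:
        t.start()

    # The calling thread is the single accepting primitive.
    for _ in range(n_connections):
        conn, _addr = listener.accept()
        handoff.put(conn)

    for _ in workers:                 # one sentinel per worker
        handoff.put(None)
    for t in workers:
        t.join()
```

In the model the current MPMs use, the accept() call would instead sit
inside worker(), so the thread that accepts is also the one that
replies.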
  
  Ryan
  
  On Wed, 31 May 2000, jlwpc1 wrote:
  
  > From: <rb...@covalent.net>
  > 
  > > 
  > > What do you mean?  OtherBill is working hard to clean up the Windows
  > > MPM.  Currently, it is a mix of code in the modules/mpm/winnt and os/win32
  > > directories.  This is changing (but it is really just moving code around
  > > and less new code) to be just the modules/mpm/winnt directory.
  > > 
  > > What exactly do you want to know?
  > > 
  > 
  > Just in words what it is?
  > 
  > Single process.
  > 
  > Process after process?
  > 
  > Multi-threaded?
  > 
  > Process A starts then what happens in order to "serve up pages"?
  > 
  > Thanks,
  > JLW
  > 
  > 
  > 
  
  
  _______________________________________________________________________________
  Ryan Bloom                        	rbb@apache.org
  406 29th St.
  San Francisco, CA 94131
  -------------------------------------------------------------------------------
  
  
  From reddrum@attglobal.net Fri Jun  9 13:49:27 2000
  Reply-To: new-httpd@apache.org
  From: Bill Stoddard <re...@attglobal.net>
  To: new-httpd@apache.org
  Subject: Re: Windows 2.0  MPM design issues
  Date: Wed, 31 May 2000 09:52:14 -0400
  Organization: Apache Software Foundation
  
  
  > MPMs are how the server maps a request to an execution
  > primitive.  Basically, the server starts, configures itself, and calls the
  > MPM code.
  >
  > The MPM is responsible for starting child processes and monitoring
  > them.  It is also responsible for starting threads within the child
  > process and having those threads (whether there be 1 or more) accept on
  > the socket.  Once a request is made, the MPM is responsible for having an
  > execution primitive handle the request.  For all of the current MPMs the
  > same primitive that accepted the connection also handles the request; however,
  > it would be possible to have one thread accept all connections and hand
  > the requests off to other threads to actually serve the request.
  >
  > Lastly, the MPM is responsible for managing the threads and other child
  > processes, and killing them when a restart/shutdown is requested.
  >
  
  Very good MPM overview. Continuing with Win32-specific MPM details...
  
  The WIN32 MPM uses two processes: the parent process, and a child process that the parent creates.
  The child process is multithreaded and is responsible for processing all HTTP requests.
  
  The parent process:
  1. Creates the AcceptEx IO Completion Port (IOCP for short) (I won't attempt to explain this. It'd take
  too many words)
  2. Opens all the listen sockets
  3. Creates the child process
  4. Passes a duplicated IOCP handle to the child (parent communicates to the child via pipe, NOT
  shared memory)
  5. Passes duplicated listen socket handles to the child (via a pipe)
  6. Waits for a restart or shutdown event (from an external process) or a child exit (normal or
  abnormal)
  7. When one of the above events is signaled, the parent does the right thing (restarts the failed
  child process, or signals the child process to die gracefully and, on a restart, starts a new
  child process to take the place of the old child process.)
  8. That's it! The parent process's job is very simple.
  
  Steps 1, 2, 4 & 5 could be done directly in the child process. Doing them in the parent process
  allows the sockets (and pending connections in the listen queue) to be maintained across a server
  restart. Doing these steps in the child process would cause all pending connections to be dropped
  across a restart.
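
To illustrate why keeping the listen sockets in the parent matters,
here is a rough POSIX analogy in Python.  It uses fork() inheritance in
place of the Win32 handle duplication over a pipe described above, so
it shows the idea and not the winnt mechanism.  The function names are
hypothetical, and it only runs where os.fork() exists.

```python
import os
import socket


def child_serve_one(listener, reply):
    """Child: accept exactly one connection, answer it, and exit."""
    try:
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(reply)
    finally:
        os._exit(0)   # never fall back into the parent's code path


def run_generation(listener, reply):
    """Parent: fork a child that inherits the listener, then wait for it."""
    pid = os.fork()
    if pid == 0:
        child_serve_one(listener, reply)
    os.waitpid(pid, 0)
```

If two clients connect before any child exists and run_generation() is
then called twice, each child generation picks one pending connection
off the same listen queue.  Nothing is dropped when the first child
exits, because the parent still holds the listening socket.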
  
  The main thread in the child process:
  1. Receives the IOCP and duplicated sockets
  2. Does initialization required to begin accepting connections
  3. Creates a pool of worker threads which accept requests off the listen sockets (the details differ
  a bit depending on whether you are on NT or not). When a connection is received, a thread accepts
  the connection and processes the request that comes in on that connection.
  4. Starts accepting requests
      On 95/98, requests are accepted on a separate thread. NT uses an IOCP.
  5. The main child thread then waits for a shutdown event (or on NT, a server maintenance event)
  
  If the main thread receives a server maintenance event, it does some IOCP magic to increase the
  number of connections that can be handled. When it receives a shutdown event (either generated by a
  worker thread or generated by the parent process), it shuts down the worker threads gracefully and
  eventually exits.
  
  There is a start_mutex which prevents more than one child process from accepting requests at once.
  
  The WIN32 MPM uses threads as the "execution primitive". Apache 1.3 on Windows used threads as well.
  The NT specific code uses IO Completion ports and does accepts asynchronously. Worker threads are
  dispatched off the IOCP in LIFO order, which is pretty cool. This is the first step to getting to a
  fully asynchronous server (which is my goal). What does this mean? Today the server is not fully
  asynchronous, which means that you require 1 thread per concurrently connected client. 2000
  concurrent clients implies the need for 2000 threads. Now threads are lighter weight than processes,
  but they still cost resources. A fully async server could handle those 2000 concurrent clients with
  1 thread (or realistically, a few threads) because that thread would NEVER block on network I/O.
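
The thread-count argument above can be made concrete with a small event
loop.  This Python sketch uses the standard selectors module, which is
readiness-based, as a stand-in for IOCP, which is completion-based and
works differently.  Treat it only as an illustration of "one thread,
many clients"; echo_many() and the upper-casing reply are made up for
the example.

```python
import selectors
import socket


def echo_many(server_ends):
    """Serve every socket in server_ends from the calling thread alone."""
    sel = selectors.DefaultSelector()
    for s in server_ends:
        s.setblocking(False)
        sel.register(s, selectors.EVENT_READ)

    open_count = len(server_ends)
    while open_count:
        # Wake up only for sockets that have something to read.
        for key, _events in sel.select():
            sock = key.fileobj
            data = sock.recv(1024)
            if data:
                # Simplification: assumes the small reply fits in the
                # socket buffer, so sendall() does not need to wait.
                sock.sendall(data.upper())
            else:
                # Peer finished sending: drop this client.
                sel.unregister(sock)
                sock.close()
                open_count -= 1
    sel.close()
```

No client gets a thread of its own here; the loop never waits on any
single connection, which is the property the fully async design is
after.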
  
  Hope this helps.
  
  Bill Stoddard
  
  
  
  
  
  From wrowe@lnd.com Fri Jun  9 13:49:27 2000
  Reply-To: new-httpd@apache.org
  From: "William A. Rowe, Jr." <wr...@lnd.com>
  To: new-httpd@apache.org
  Subject: RE: Windows 2.0  MPM design issues
  Date: Wed, 31 May 2000 20:53:30 -0500
  
  > From: jlwpc1 [mailto:jlwpc1@earthlink.net]
  > Sent: Wednesday, May 31, 2000 5:47 PM
  > 
  > Windows starts process A (Windows or Console) and main thread 
  > Aa (process A thread a) and then what happens?
  >
  > Just in words what it is?
  
  In the new scheme?  We start in http_main() and start calling
  the MPM's hooks, in the order I just documented for you.
  
  Under the winnt MPM, we are dual process:
  
  Primary process
  
    Primary thread of master process creates sockets,
    spawns the child and pipes the handles to the child.
    Please skim mpm_winnt.c master_main() for the flow.
    
    Second thread (if needed) handles the 9x windows message
    pump for shutdown or NT service control manager handler.
    It is known to NT or 9x as the 'service' process.
    See mpm_service_to_start() in the new service.c for how 
    that hooks in to the server.
  
  Second process
  
    Many worker threads spewing off web pages, I think they
    are coordinated by a master thread.
    Please skim mpm_winnt.c child_main() for the flow.
  
  Just a general thought - Apache is really only understood by 
  walking the code - no, there is no master blueprint of the 
  application.  If you want to understand the code, jump into
  it, break it, rework it, and spend some time in the debugger
  walking it.
  
  Bill
  
  
  From rbb@covalent.net Fri Jun  9 13:49:27 2000
  Reply-To: new-httpd@apache.org
  Date: Fri, 2 Jun 2000 09:31:22 -0700 (PDT)
  From: rbb@covalent.net
  To: new-httpd@apache.org
  Subject: Re: Windows 2.0  MPM design issues
  
  
  > Yes but there is no written down reason _why_ Apache is dual process in
  > the Windows version. Why?
  > 
  > Windows is a multi-threaded OS not process - it looks like what Apache
  > is doing would be cleaner and quicker using just threads.  So why is
  > there another process used as if it were a thread?
  
  Because the problem with threads is that when one thread causes a GPF, the
  whole process goes away.  The nice part about having that first process
  around to monitor the second, is that if/when a thread dies for some
  unknown reason, we can restart and the user doesn't need to know about it.
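
The monitoring idea can be sketched with a tiny supervisor.  This is a
Python illustration using the standard subprocess module, not the winnt
code; supervise() and crashy_child_argv() are hypothetical names, and
the "child" is just a throwaway interpreter that aborts on its first
run.

```python
import subprocess
import sys


def crashy_child_argv(attempt):
    """Hypothetical child: dies abnormally on the first attempt only."""
    body = "import os; os.abort()" if attempt == 0 else "pass"
    return [sys.executable, "-c", body]


def supervise(argv_for_attempt, max_restarts):
    """Run a child; while it exits abnormally, start a replacement."""
    exit_codes = []
    for attempt in range(max_restarts + 1):
        done = subprocess.run(argv_for_attempt(attempt))
        exit_codes.append(done.returncode)
        if done.returncode == 0:   # clean exit: nothing to restart
            break
    return exit_codes
```

supervise(crashy_child_argv, 3) runs two children: the first dies, the
second exits cleanly, and the supervising process itself is never at
risk from the crash.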
  
  Ryan
  _______________________________________________________________________________
  Ryan Bloom                        	rbb@apache.org
  406 29th St.
  San Francisco, CA 94131
  -------------------------------------------------------------------------------