Posted to dev@httpd.apache.org by Ryan Bloom <rb...@covalent.net> on 2001/11/13 03:55:43 UTC

MPM re-write for network logic

Grrrrrrrrrr.........

MPMs suck sometimes.   :-)

I am trying to remove the network logic from the MPMs, so that modules can
implement different transport layers.  I am looking at using a couple of hooks to
accomplish this.  The problem is that Windows just doesn't fit into this model at
all.

Every other MPM fits a simple model.  The child process starts and it creates an
apr_pollfd_t with every socket.  If there are multiple sockets, then the MPM
calls apr_poll.  It then takes the socket and calls apr_accept. The socket is
then passed to ap_run_create_connection, and the server can process
the socket.

This model suggests the following flow to solve the problem.

1)  a hook to add items to poll set
2)  the MPM calls apr_poll
3)  a hook to accept the connection and pass the information
     to ap_run_create_connection.
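
A minimal sketch of that three-step flow, in Python rather than APR C for
brevity. Everything named here is an assumption made for illustration:
add_to_pollset and tcp_accept_hook stand in for the two proposed hooks,
create_connection stands in for ap_run_create_connection, and the selectors
module plays the role of apr_poll.

```python
import selectors
import socket

def create_connection(conn):
    """Stand-in for ap_run_create_connection: read one request, echo it back."""
    data = conn.recv(1024)
    conn.sendall(b"got: " + data)
    conn.close()
    return data

def tcp_accept_hook(listener):
    """Step 3 for a plain TCP transport: accept and return the new connection."""
    conn, _addr = listener.accept()
    return conn

def add_to_pollset(pollset, listener, accept_hook):
    """Step 1: a transport module registers its listener and its own acceptor."""
    pollset.register(listener, selectors.EVENT_READ, data=accept_hook)

def mpm_poll_once(pollset, timeout=1.0):
    """Step 2: the MPM polls, then lets the owning module's hook do the accept."""
    handled = []
    for key, _events in pollset.select(timeout):
        accept_hook = key.data
        conn = accept_hook(key.fileobj)
        handled.append(create_connection(conn))
    return handled

def demo():
    """Drive one loopback connection through the three steps."""
    pollset = selectors.DefaultSelector()
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    add_to_pollset(pollset, listener, tcp_accept_hook)

    client = socket.create_connection(listener.getsockname())
    client.sendall(b"ping")
    handled = mpm_poll_once(pollset)
    reply = client.recv(1024)

    client.close()
    pollset.close()
    listener.close()
    return handled, reply
```

The polling loop never calls accept itself; it only looks up the hook that was
registered with the descriptor, which is what would let a non-TCP transport
supply its own acceptor.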

The only problem I have with this is that it requires that every transport layer
can be represented as the same primitive as a socket.  I don't think this is an
issue, because on Unix and Unix-like platforms, this will be an int, and on
Windows, this will be a Handle.  OS/2 might have a problem, because it uses
SOCKET, which is not generic at all.

I thought of having a hook to replace the call to apr_poll, but I couldn't get
past the starvation problem that always resulted.

The problem that remains is Windows.  Windows starts the server, and creates
one thread for each socket that is configured.  That thread sits in accept, and
passes the accepted socket to worker threads.  This seems like a waste of
resources, but I will accept that the Windows experts know what they are doing.
My problem is that it doesn't really fit the model above. I guess that Windows 
could work by using the first hook above, and then looping through the 
apr_pollfd_t, creating threads that call the third hook above.

How does this sound to everybody?

Ryan

______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: MPM re-write for network logic

Posted by Ryan Bloom <rb...@covalent.net>.

On Monday 12 November 2001 07:55 pm, Bill Stoddard wrote:
> I need to study your proposal in detail... but for now I'll try to plant a
> seed...
>
> Now seems to be a good time to consider what an async event driven network
> API would look like :-)
>
> The worker and Windows MPM architectures lend themselves to this. One (or
> more) thread doing accepting; multiple workers blocked on a queue
> (completion port, pthread condition variable, kqueue, /dev/poll, etc).
> Hopefully it should be a simple extension to do non-blocking/async reads and
> writes and have the completion status posted to the same queue the worker
> threads are blocked on :-)

It would be simple to do.  Because the socket is now stored in the filter
itself, it should be relatively easy to have a single thread that does all
of the writing.  Of course, I'm not going to look at writing that MPM until
this work is over.   :-)

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: MPM re-write for network logic

Posted by Bill Stoddard <bi...@wstoddard.com>.

I need to study your proposal in detail... but for now I'll try to plant a seed...

Now seems to be a good time to consider what an async event driven network API would look
like :-)

The worker and Windows MPM architectures lend themselves to this. One (or more) thread
doing accepting; multiple workers blocked on a queue (completion port, pthread condition
variable, kqueue, /dev/poll, etc). Hopefully it should be a simple extension to do
non-blocking/async reads and writes and have the completion status posted to the same
queue the worker threads are blocked on :-)

Bill

>
> Grrrrrrrrrr.........
>
> MPMs suck sometimes.   :-)
>
> I am trying to remove the network logic from the MPMs, so that modules can
> implement different transport layers.  I am looking at using a couple of hooks to
> accomplish this.  The problem is that Windows just doesn't fit into this model at
> all.
>
> Every other MPM fits a simple model.  The child process starts and it creates an
> apr_pollfd_t with every socket.  If there are multiple sockets, then the MPM
> calls apr_poll.  It then takes the socket and calls apr_accept. The socket is
> then passed to ap_run_create_connection, and the server can process
> the socket.
>
> This model suggests the following flow to solve the problem.
>
> 1)  a hook to add items to poll set
> 2)  the MPM calls apr_poll
> 3)  a hook to accept the connection and pass the information
>      to ap_run_create_connection.
>
> The only problem I have with this is that it requires that every transport layer
> can be represented as the same primitive as a socket.  I don't think this is an
> issue, because on Unix and Unix-like platforms, this will be an int, and on
> Windows, this will be a Handle.  OS/2 might have a problem, because it uses
> SOCKET, which is not generic at all.
>
> I thought of having a hook to replace the call to apr_poll, but I couldn't get
> past the starvation problem that always resulted.
>
> The problem that remains is Windows.  Windows starts the server, and creates
> one thread for each socket that is configured.  That thread sits in accept, and
> passes the accepted socket to worker threads.  This seems like a waste of
> resources, but I will accept that the Windows experts know what they are doing.
> My problem is that it doesn't really fit the model above. I guess that Windows
> could work by using the first hook above, and then looping through the
> apr_pollfd_t, creating threads that call the third hook above.
>
> How does this sound to everybody?
>
> Ryan
>
> ______________________________________________________________
> Ryan Bloom rbb@apache.org
> Covalent Technologies rbb@covalent.net
> --------------------------------------------------------------
>

Re: MPM re-write for network logic

Posted by Ryan Bloom <rb...@covalent.net>.

On Wednesday 14 November 2001 09:28 pm, Harrie Hazewinkel wrote:
> Hi,
>
> Sorry to jump in so late on this thread, but still.
> Some food for thought.
>
> --On Monday, November 12, 2001 6:55 PM -0800 Ryan Bloom <rb...@covalent.net>
>
> wrote:
> > I am trying to remove the network logic from the MPMs, so that modules
> > can implement different transport layers.  I am looking at using a couple
> > of hooks to accomplish this.
>
> Is this feature really necessary to have in order to release
> an Apache 2.0 version?? Is it not time that the focus is on getting
> all things fixed instead of adding features and redesigning things??
> If everyone keeps going like this Apache 2.0 will
> never be released and this group will start looking like a bunch
> of hackers who never get it right. They want the perfect
> release, but what is perfect??

Considering that this change is actually allowing us to remove a lot of 
duplicate code and a lot of special cases that make the code more complex,
yes this is an important change.  This change makes the code easier
to maintain, and less likely to have bugs.

Ryan 
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: MPM re-write for network logic

Posted by Harrie Hazewinkel <ha...@lisanza.net>.

Hi,

Sorry to jump in so late on this thread, but still.
Some food for thought.

--On Monday, November 12, 2001 6:55 PM -0800 Ryan Bloom <rb...@covalent.net> 
wrote:
>
> I am trying to remove the network logic from the MPMs, so that modules can
> implement different transport layers.  I am looking at using a couple of
> hooks to accomplish this.

Is this feature really necessary to have in order to release
an Apache 2.0 version?? Is it not time that the focus is on getting
all things fixed instead of adding features and redesigning things??
If everyone keeps going like this Apache 2.0 will
never be released and this group will start looking like a bunch
of hackers who never get it right. They want the perfect
release, but what is perfect??

Harrie Hazewinkel
mailto: harrie@lisanza.net
http://www.lisanza.net/

Re: MPM re-write for network logic

Posted by Ryan Bloom <rb...@covalent.net>.

On Tuesday 13 November 2001 01:24 am, Greg Stein wrote:
> On Mon, Nov 12, 2001 at 06:55:43PM -0800, Ryan Bloom wrote:
> >...
> > I am trying to remove the network logic from the MPMs, so that modules
> > can implement different transport layers.  I am looking at using a couple
> > of hooks to accomplish this.  The problem is that Windows just doesn't
> > fit into this model at all.
>
> I think your original premise is incorrect.
>
> MPMs are all about waiting on one or more sockets, accepting connections,
> handling those connections, and mapping units of work to workers. By its
> very nature, it depends upon the sockets.
>
> Apache as a whole should not know about the socket (only the core filters),
> but the MPM definitely should. It is the beast which maps *sockets* to
> *workers*. It has to know.

The problem is that other modules need to be able to insert socket-like
entities into the list of sockets that the MPM will listen to.  My goal isn't to
remove sockets from the MPM, but rather to standardize how MPMs deal with
sockets.  In this case, I am trying to move everything to a pollset.  In time, I
would like to remove the listensocks array, but I'm not at that point yet.

MPMs will always control the threads that are listening to sockets, but the
actual accepting needs to move to a hook, so that other modules can implement
their own acceptors.  By moving this into the core, we can eliminate a lot of
duplicate code that has fallen out of synch.

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: MPM re-write for network logic

Posted by Greg Stein <gs...@lyra.org>.

On Mon, Nov 12, 2001 at 06:55:43PM -0800, Ryan Bloom wrote:
>...
> I am trying to remove the network logic from the MPMs, so that modules can
> implement different transport layers.  I am looking at using a couple of hooks to
> accomplish this.  The problem is that Windows just doesn't fit into this model at
> all.

I think your original premise is incorrect.

MPMs are all about waiting on one or more sockets, accepting connections,
handling those connections, and mapping units of work to workers. By its
very nature, it depends upon the sockets.

Apache as a whole should not know about the socket (only the core filters),
but the MPM definitely should. It is the beast which maps *sockets* to
*workers*. It has to know.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: MPM re-write for network logic

Posted by Bill Stoddard <bi...@wstoddard.com>.

>
> > On Monday 12 November 2001 07:22 pm, William A. Rowe, Jr. wrote:
> > > From: "Ryan Bloom" <rb...@covalent.net>
> > > Sent: Monday, November 12, 2001 8:55 PM
> > >
> > > > The problem that remains is Windows.  Windows starts the server, and
> > > > creates one thread for each socket that is configured.  That thread sits
> > > > in accept, and passes the accepted socket to worker threads.  This seems
> > > > like a waste of resources, but I will accept that the Windows experts
> > > > know what they are doing. My problem is that it doesn't really fit the
> > > > model above. I guess that Windows could work by using the first hook
> > > > above, and then looping through the apr_pollfd_t, creating threads that
> > > > call the third hook above.
> > >
> > > Uh... no, that's AcceptEx, and it has entirely different mechanics.  There
> > > will always be data to process when a winsock has accept-ex-ed a socket
> > > (thus the different API.)  Ergo, no thread is woken until it has a job to
> > > do.
> >
> > I have a stupid question.  I have been looking at the Windows code, and I
> > can't see where the data that is read by AcceptEx ever gets to the processing
> > thread.  Does that data ever get to the thread doing the work?  If so, how????
> >
>
> See PostQueuedCompletionStatus() and GetQueuedCompletionStatus().
>
> The worker thread pool blocks on GetQueuedCompletionStatus() in winnt_get_connection().
> winnt_accept() accepts connections and calls PostQueuedCompletionStatus() to wake up one
> of the workers blocked on GetQueuedCompletionStatus().
>
> Notice the queue of 'completion contexts' that flow between the two calls. And the
> ThreadDispatchIOCP is a Windows "Completion Port", a kernel-resident FIFO queue of sorts.
>
> Give me a call tomorrow and I'll walk you through it in detail if you like.
>
> Bill
>

BTW, the NT code path for accepting/dispatching connections is different from the Win9*
path. 9* does not support completion ports.

Bill

Re: MPM re-write for network logic

Posted by Bill Stoddard <bi...@wstoddard.com>.

> On Monday 12 November 2001 07:22 pm, William A. Rowe, Jr. wrote:
> > From: "Ryan Bloom" <rb...@covalent.net>
> > Sent: Monday, November 12, 2001 8:55 PM
> >
> > > The problem that remains is Windows.  Windows starts the server, and
> > > creates one thread for each socket that is configured.  That thread sits
> > > in accept, and passes the accepted socket to worker threads.  This seems
> > > like a waste of resources, but I will accept that the Windows experts
> > > know what they are doing. My problem is that it doesn't really fit the
> > > model above. I guess that Windows could work by using the first hook
> > > above, and then looping through the apr_pollfd_t, creating threads that
> > > call the third hook above.
> >
> > Uh... no, that's AcceptEx, and it has entirely different mechanics.  There
> > will always be data to process when a winsock has accept-ex-ed a socket
> > (thus the different API.)  Ergo, no thread is woken until it has a job to
> > do.
>
> I have a stupid question.  I have been looking at the Windows code, and I
> can't see where the data that is read by AcceptEx ever gets to the processing
> thread.  Does that data ever get to the thread doing the work?  If so, how????
>

See PostQueuedCompletionStatus() and GetQueuedCompletionStatus().

The worker thread pool blocks on GetQueuedCompletionStatus() in winnt_get_connection().
winnt_accept() accepts connections and calls PostQueuedCompletionStatus() to wake up one
of the workers blocked on GetQueuedCompletionStatus().

Notice the queue of 'completion contexts' that flow between the two calls. And the
ThreadDispatchIOCP is a Windows "Completion Port", a kernel-resident FIFO queue of sorts.
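
As a rough, portable analogy only (this is not the Win32 API, and not the code
in the Windows MPM): a thread-safe FIFO can stand in for the completion port,
with put() playing the part of PostQueuedCompletionStatus() and get() the part
of GetQueuedCompletionStatus(). The CompletionContext class and its fields are
hypothetical. The point of the sketch is that the context carries the data that
arrived with the accept, which is how that data reaches the worker.

```python
import queue
import threading
from dataclasses import dataclass

@dataclass
class CompletionContext:
    """Hypothetical context: what the acceptor hands to a worker."""
    conn_id: int
    first_bytes: bytes

SHUTDOWN = None  # sentinel posted once per worker to end its loop

def worker(port, results, lock):
    """Block on the queue; wake only when there is a job to do."""
    while True:
        ctx = port.get()  # analog of GetQueuedCompletionStatus()
        if ctx is SHUTDOWN:
            break
        with lock:
            # "Process" the request using the bytes that came with the accept.
            results[ctx.conn_id] = ctx.first_bytes.upper()

def acceptor(port, incoming):
    """Post one context per accepted connection, data already attached."""
    for conn_id, data in incoming:
        port.put(CompletionContext(conn_id, data))  # analog of Post...()

def run(incoming, n_workers=3):
    """Start the worker pool, feed it, then shut it down cleanly."""
    port = queue.Queue()  # FIFO standing in for the completion port
    results, lock = {}, threading.Lock()
    workers = [threading.Thread(target=worker, args=(port, results, lock))
               for _ in range(n_workers)]
    for t in workers:
        t.start()
    acceptor(port, incoming)
    for _ in workers:
        port.put(SHUTDOWN)
    for t in workers:
        t.join()
    return results
```

For example, run([(1, b"get /a"), (2, b"get /b")]) hands two contexts to the
pool; whichever worker dequeues each one already has the request bytes in hand.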

Give me a call tomorrow and I'll walk you through it in detail if you like.

Bill

> Ryan
> ______________________________________________________________
> Ryan Bloom rbb@apache.org
> Covalent Technologies rbb@covalent.net
> --------------------------------------------------------------
>

Re: MPM re-write for network logic

Posted by Ryan Bloom <rb...@covalent.net>.

On Monday 12 November 2001 07:22 pm, William A. Rowe, Jr. wrote:
> From: "Ryan Bloom" <rb...@covalent.net>
> Sent: Monday, November 12, 2001 8:55 PM
>
> > The problem that remains is Windows.  Windows starts the server, and
> > creates one thread for each socket that is configured.  That thread sits
> > in accept, and passes the accepted socket to worker threads.  This seems
> > like a waste of resources, but I will accept that the Windows experts
> > know what they are doing. My problem is that it doesn't really fit the
> > model above. I guess that Windows could work by using the first hook
> > above, and then looping through the apr_pollfd_t, creating threads that
> > call the third hook above.
>
> Uh... no, that's AcceptEx, and it has entirely different mechanics.  There
> will always be data to process when a winsock has accept-ex-ed a socket
> (thus the different API.)  Ergo, no thread is woken until it has a job to
> do.

I have a stupid question.  I have been looking at the Windows code, and I 
can't see where the data that is read by AcceptEx ever gets to the processing
thread.  Does that data ever get to the thread doing the work?  If so, how????

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: MPM re-write for network logic

Posted by "William A. Rowe, Jr." <wr...@covalent.net>.

From: "Ryan Bloom" <rb...@covalent.net>
Sent: Monday, November 12, 2001 8:55 PM

> The problem that remains is Windows.  Windows starts the server, and creates
> one thread for each socket that is configured.  That thread sits in accept, and
> passes the accepted socket to worker threads.  This seems like a waste of
> resources, but I will accept that the Windows experts know what they are doing.
> My problem is that it doesn't really fit the model above. I guess that Windows 
> could work by using the first hook above, and then looping through the 
> apr_pollfd_t, creating threads that call the third hook above.

Uh... no, that's AcceptEx, and it has entirely different mechanics.  There will
always be data to process when a winsock has accept-ex-ed a socket (thus the
different API.)  Ergo, no thread is woken until it has a job to do.

You don't have an empty 'I'm Here' cycle on Windows.  Therefore the mechanics of
your hook mechanism should reflect that.  Having a single acceptor thread would
actually incur a much larger problem in scalability, for which there is no
reason to force threads to wait in the queue.

Also, WaitForMultipleObjects only accepts some 65 events (timeout + 64 entries)
so it's a rather limited server (as 1.3 was) that can only listen to 64 sockets.

Bill

Re: MPM re-write for network logic

Posted by Ryan Bloom <rb...@covalent.net>.

On Tuesday 13 November 2001 04:56 pm, dean gaudet wrote:
> On Tue, 13 Nov 2001, Ryan Bloom wrote:
> > On Tuesday 13 November 2001 04:35 pm, dean gaudet wrote:
> > > On Mon, 12 Nov 2001, Ryan Bloom wrote:
> you might also want to think about webmux.  'cause i think it breaks some
> more assumptions you're making (such as 1:1 mapping between client and
> kernel network object).

I don't think the current code makes that assumption any more than the code
did last week.  Essentially, the server gets a connection, and any module is
allowed to execute the accept function for that socket.  Once the accept is
done, the code works the way it always has.

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: MPM re-write for network logic

Posted by Bill Stoddard <bi...@wstoddard.com>.

> On Tue, 13 Nov 2001, Ryan Bloom wrote:
>
> > On Tuesday 13 November 2001 04:35 pm, dean gaudet wrote:
> > > On Mon, 12 Nov 2001, Ryan Bloom wrote:
> > > > I am trying to remove the network logic from the MPMs, so that modules
> > > > can implement different transport layers.
> > >
> > > are you referring to multiplexing transport layers?  'cause what's there
> > > already should work fine for non-multiplexed transports... i.e. you've got
> > > SSL implemented already.
> >
> > The idea is to allow an MPM to use multiple communication mediums.  For
> > example, IBM has the AFPA cache, which doesn't communicate over regular
> > sockets.  It uses its own socket type.  Our SSL implementation encrypts the
> > data in memory, and we just write the data to the socket using the standard
> > apr network calls.
>
> jeez, that's so stupid.  years ago when IBM asked me for input on the
> design they were planning to do it right:  cache misses appear in userland
> as sockets.

This AFPA cache funkiness is just on Windows.  Sort of hard to "do it right" when you
don't have the Windows socket source code to play with. Well, yeah, it would be possible to
use the loopback interface but that has problems of its own.

This is a filter design discussion, not an AFPA discussion. AFPA, right or wrong, seems to
be a good test of the filter design. I am -absolutely- against anything unnatural going
into Apache 2.0 on behalf of AFPA or any other proprietary hack. Go back and read that
last sentence again to make sure the message sinks in.

BTW, your IOLs were a perfect solution. Last week I posted a patch to the list that
reintroduced a socket IOL that solved the problem. Both Ryan and Roy said that if filters
could not do what I needed them to do, then they were broken. Well, they are broken and
Ryan's trying to get them right.  And Roy posted an excellent summary of why Apache 2.0
filters are broken.

>
>
>
> > I hope I answered your question, but I'm not sure that I did.
>
> you might also want to think about webmux.  'cause i think it breaks some
> more assumptions you're making (such as 1:1 mapping between client and
> kernel network object).

Yep. This is a variation of the event driven network API. And now is the time to get this
right IMHO.

Bill

Re: MPM re-write for network logic

Posted by dean gaudet <de...@arctic.org>.

On Tue, 13 Nov 2001, Ryan Bloom wrote:

> On Tuesday 13 November 2001 04:35 pm, dean gaudet wrote:
> > On Mon, 12 Nov 2001, Ryan Bloom wrote:
> > > I am trying to remove the network logic from the MPMs, so that modules
> > > can implement different transport layers.
> >
> > are you referring to multiplexing transport layers?  'cause what's there
> > already should work fine for non-multiplexed transports... i.e. you've got
> > SSL implemented already.
>
> The idea is to allow an MPM to use multiple communication mediums.  For
> example, IBM has the AFPA cache, which doesn't communicate over regular
> sockets.  It uses its own socket type.  Our SSL implementation encrypts the
> data in memory, and we just write the data to the socket using the standard
> apr network calls.

jeez, that's so stupid.  years ago when IBM asked me for input on the
design they were planning to do it right:  cache misses appear in userland
as sockets.

i'd tell IBM the same thing i told sun:  fix your interface.

note that TUX does it right.

> I hope I answered your question, but I'm not sure that I did.

you might also want to think about webmux.  'cause i think it breaks some
more assumptions you're making (such as 1:1 mapping between client and
kernel network object).

-dean

Re: MPM re-write for network logic

Posted by Ryan Bloom <rb...@covalent.net>.

On Tuesday 13 November 2001 04:35 pm, dean gaudet wrote:
> On Mon, 12 Nov 2001, Ryan Bloom wrote:
> > I am trying to remove the network logic from the MPMs, so that modules
> > can implement different transport layers.
>
> are you referring to multiplexing transport layers?  'cause what's there
> already should work fine for non-multiplexed transports... i.e. you've got
> SSL implemented already.

The idea is to allow an MPM to use multiple communication mediums.  For
example, IBM has the AFPA cache, which doesn't communicate over regular
sockets.  It uses its own socket type.  Our SSL implementation encrypts the
data in memory, and we just write the data to the socket using the standard
apr network calls.

This will also allow us to easily abstract out some of the stuff that has been
hacked in in the past.  For example, I am re-writing the worker MPM to
use this logic, and the pipe_of_death is no longer a special case, it works
just like every other socket as far as Apache is concerned.  The Unix Domain
sockets in the perchild MPM are another example of a hack that can be
cleaned up a bit with this logic.
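
One way to picture the pipe_of_death no longer being a special case, sketched
in Python for a Unix-like system: the pipe's read end sits in the same poll set
as everything else, and its hook simply reports that the child should exit.
All function names here are hypothetical, an ordinary pipe stands in for a
listening socket, and the selectors module stands in for apr_poll.

```python
import os
import selectors

SHUTDOWN = object()  # marker returned by the pipe-of-death hook

def pod_hook(read_fd):
    """Hook for the pipe of death: drain one byte and ask the loop to stop."""
    os.read(read_fd, 1)
    return SHUTDOWN

def work_hook(read_fd):
    """Hook for an ordinary descriptor: hand back whatever arrived."""
    return os.read(read_fd, 1024)

def child_main(pollset):
    """Generic loop: every ready descriptor is handled by calling its hook."""
    seen, running = [], True
    while running:
        for key, _events in pollset.select(timeout=1.0):
            result = key.data(key.fileobj)
            if result is SHUTDOWN:
                running = False  # finish this batch, then leave the loop
            else:
                seen.append(result)
    return seen

def demo():
    """Register a work pipe and a pipe of death, then run the child loop."""
    pollset = selectors.DefaultSelector()
    work_r, work_w = os.pipe()
    pod_r, pod_w = os.pipe()
    pollset.register(work_r, selectors.EVENT_READ, data=work_hook)
    pollset.register(pod_r, selectors.EVENT_READ, data=pod_hook)

    os.write(work_w, b"request")
    os.write(pod_w, b"!")  # tell the child to exit after pending work
    seen = child_main(pollset)

    pollset.close()
    for fd in (work_r, work_w, pod_r, pod_w):
        os.close(fd)
    return seen
```

The loop has no branch that knows which descriptor is the pipe of death; the
only difference between the two entries is the hook registered with each.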

I hope I answered your question, but I'm not sure that I did.

Ryan

______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: MPM re-write for network logic

Posted by dean gaudet <de...@arctic.org>.

On Mon, 12 Nov 2001, Ryan Bloom wrote:

> I am trying to remove the network logic from the MPMs, so that modules can
> implement different transport layers.

are you referring to multiplexing transport layers?  'cause what's there
already should work fine for non-multiplexed transports... i.e. you've got
SSL implemented already.

-dean