Posted to dev@httpd.apache.org by Paul Querna <pa...@querna.org> on 2009/07/07 07:20:51 UTC

Events, Destruction and Locking

Can't sleep, so finally writing this email I've been meaning to write
for about 7 months now :D

One of the challenges in the Simple MPM, and to a smaller degree in
the Event MPM, is how to manage memory allocation, destruction, and
thread safety.

A 'simple' example:
 - 1) Thread A: Client Connection Created
 - 2) Thread A: Timer Event Added for 10 seconds in the future to detect IO timeout
 - 3) Thread B: Client Socket closes in 9.99 seconds.
 - 4) Thread C: Timer Event for IO timeout is triggered after 10 seconds

The simple answer is placing a Mutex around the connection object.
Any operation in which two threads could be working on the connection
locks this Mutex.

This has many problems, the first of which is destruction.  In this
case, Thread B would start destroying the connection, since the
socket was closed, but Thread C would already be waiting for this
mutex... and then the object underneath it would just have been free'd.

To solve this, Thread B would first unregister all existing (and unfired)
triggers/timeouts.  Events would increment a reference count on
the connection object, and Thread B would schedule a future event to
check this reference count.  If the reference count is zero, this timer
would free the connection object; if there was still an outstanding
reference held by a running event, it would schedule itself for a future
cleanup attempt.
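
In code, that scheme would look something like this (only a sketch;
simple_conn_t and simple_timer_add are hypothetical names, not the real
Simple MPM API):

#include "apr_pools.h"
#include "apr_time.h"
#include "apr_atomic.h"

typedef struct simple_conn_t {
    apr_pool_t *pool;                /* owns the connection's memory     */
    volatile apr_uint32_t refcount;  /* events currently using this conn */
} simple_conn_t;

/* hypothetical: run cb(baton) 'usec' microseconds from now */
void simple_timer_add(apr_interval_time_t usec,
                      void (*cb)(void *baton), void *baton);

/* every event takes a reference before it is queued... */
static void conn_event_ref(simple_conn_t *c)
{
    apr_atomic_inc32(&c->refcount);
}

/* ...and drops it once its callback has finished running */
static void conn_event_unref(simple_conn_t *c)
{
    apr_atomic_dec32(&c->refcount);
}

/* Thread B schedules this after unregistering all unfired events */
static void conn_try_destroy(void *baton)
{
    simple_conn_t *c = baton;

    if (apr_atomic_read32(&c->refcount) == 0) {
        apr_pool_destroy(c->pool);     /* nothing is running, safe to free */
    }
    else {
        /* an event is still in flight on some thread; try again later */
        simple_timer_add(APR_USEC_PER_SEC, conn_try_destroy, c);
    }
}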

All of this is insanely error prone, difficult to debug, and painful to explain.

Pools don't help, but they don't really make it worse either, and are good
enough for the actual cleanup part -- the difficulty lies in knowing *when*
you can clean up an object.

A related problem of using Mutex Guards on a connection object is that
if a single connection 'locks up' a thread, it's feasible for other
worker threads to get stuck waiting for this connection, and we would
have no way to 'recover' these lost threads.

I think it is possible to write a complete server that deals with all
these intricacies and gets everything just 'right', but as soon as you
introduce 3rd party module writers, no matter how 'smart' we are, our
castle of event goodness will crumble.

I am looking for an alternative that doesn't expose all this craziness
of when to free, destruct, or lock things.  The best idea I can come
up with is for each Connection to become 'semi-sticky' to a
single thread.  Meaning each worker thread would have its own queue of
upcoming events to process, and all events for connection X would sit
on the same 'queue'.  This would prevent two threads waiting for
destruction, and other cases of a single connection's mutex locking up
all your workers, essentially providing basic fault isolation.

These queues could be mutable, and you could 'move' a connection
between queues, but you would always take all of its events and
triggers, and move them together to a different queue.
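
Roughly, the structures I'm picturing (all hypothetical names, none of this
exists today):

#include "apr_pools.h"
#include "apr_thread_mutex.h"

typedef struct conn_event_t {
    void (*cb)(void *baton);
    void *baton;
    struct conn_event_t *next;
} conn_event_t;

typedef struct connection_t {
    struct worker_t *owner;        /* the one thread this conn is sticky to */
    conn_event_t *pending;         /* timeouts, readability, close, etc.    */
    conn_event_t **pending_tail;   /* starts out pointing at &pending       */
} connection_t;

typedef struct worker_t {
    apr_thread_mutex_t *lock;      /* taken only to push or move connections */
    connection_t **conns;          /* connections whose events run here      */
    int nconns;
} worker_t;

/* All events for 'c' land on c->owner's queue, so two threads can never be
 * running (or destroying) the same connection at once.  Moving a connection
 * to another worker means moving 'c' and all of c->pending together, while
 * holding both workers' locks. */
static void conn_event_push(connection_t *c, void (*cb)(void *), void *baton,
                            apr_pool_t *pool)
{
    conn_event_t *ev = apr_pcalloc(pool, sizeof(*ev));

    ev->cb = cb;
    ev->baton = baton;
    *c->pending_tail = ev;
    c->pending_tail = &ev->next;
}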

Does the 'connection event queue' idea make sense?

I'm not sure I'm expressing the idea fully over email.... but I'll be
at OSCON in a few weeks if anyone wants beer :)

-Paul

Re: Events, Destruction and Locking

Posted by Mladen Turk <mt...@apache.org>.
Paul Querna wrote:
> 
> This deals with removing an event from the pollset, but what about an
> event that has already fired, as in my original example of a
> timeout event firing at the same time as a socket close event?
> 

In that case I suppose the only solution is to make the operations
atomic. Since both operations would lead to the same result
(closing a connection) I suppose an atomic state flag should be enough.
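
Something along these lines (only a sketch; conn_t and conn_do_close() are
made-up names):

#include "apr_atomic.h"

#define CONN_STATE_OPEN    0
#define CONN_STATE_CLOSING 1

typedef struct conn_t {
    volatile apr_uint32_t state;
    /* ... socket, pool, etc ... */
} conn_t;

void conn_do_close(conn_t *c);   /* made up: the real teardown */

/* Called from both the timeout event and the socket-close event.
 * Whichever thread flips the flag first does the close; the loser
 * sees CONN_STATE_CLOSING and simply backs off. */
static void conn_close_once(conn_t *c)
{
    if (apr_atomic_cas32(&c->state, CONN_STATE_CLOSING, CONN_STATE_OPEN)
        == CONN_STATE_OPEN) {
        conn_do_close(c);
    }
}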

> In that state you have two threads both in a 'run state' for a
> connection, and I'm not sure how the pre-cleanup to pools solves this
> in any way?
>

It won't, because the pool cleanup API doesn't bother with
cleanup callback return values, so there's no way to bail out
from the pool cleanup call.  I suppose we could modify the
pre-cleanup to handle the retval from the callback and break the
entire pool cleanup if one of them returns something other
than APR_SUCCESS.  Then the callback function can decide
whether there is a pending close operation or not.


Regards
-- 
^(TM)

Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Mon, Jul 6, 2009 at 10:56 PM, Mladen Turk<mt...@apache.org> wrote:
> Paul Querna wrote:
>>
>> Can't sleep, so finally writing this email I've been meaning to write
>> for about 7 months now :D
>>
>> Pools don't help, but don't really make it worse, and are good enough
>> for the actual cleanup part -- the difficulty lies in knowing *when*
>> you can cleanup an object.
>>
>
> Pool pre cleanup is meant to deal with such issues.
> You register a pre-cleanup and it will run before any of the
> pool objects are actually destroyed.
>
> In your case pre-cleanup callback could break the wait loop
> and make sure you don't reference a zombie object.
> The only issue left is guarding thread access to a singleton
> pollset interrupt (we even have pollset_interrupt with latest APR)
> from a pre-cleanup callback (or simply using a queue to serialize
> the objects that needs to get removed from the pollset)

This deals with removing an event from the pollset, but what about an
event that has already fired, as in my original example of a
timeout event firing at the same time as a socket close event?

In that state you have two threads both in a 'run state' for a
connection, and I'm not sure how the pool pre-cleanup solves this
in any way?

Thanks,

Paul

Re: Events, Destruction and Locking

Posted by Mladen Turk <mt...@apache.org>.
Paul Querna wrote:
> Can't sleep, so finally writing this email I've been meaning to write
> for about 7 months now :D
> 
> Pools don't help, but don't really make it worse, and are good enough
> for the actual cleanup part -- the difficulty lies in knowing *when*
> you can cleanup an object.
>

Pool pre-cleanup is meant to deal with such issues.
You register a pre-cleanup and it will run before any of the
pool objects are actually destroyed.

In your case the pre-cleanup callback could break the wait loop
and make sure you don't reference a zombie object.
The only issue left is guarding thread access to a singleton
pollset interrupt (we even have pollset_interrupt with latest APR)
from a pre-cleanup callback (or simply using a queue to serialize
the objects that need to be removed from the pollset).
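
Something like this (just a sketch; push_to_removal_queue() is a made-up
serialized queue that the pollset thread drains, and I'm assuming the new
apr_pollset_wakeup() as the interrupt):

#include "apr_pools.h"
#include "apr_poll.h"

typedef struct conn_t {
    apr_pollset_t *pollset;     /* the singleton pollset watching this fd */
    apr_pollfd_t   pfd;         /* the descriptor as added to the pollset */
} conn_t;

/* made up: queue 'c' so the pollset thread calls apr_pollset_remove()
 * on it before it can fire again for a soon-to-be-destroyed connection */
void push_to_removal_queue(conn_t *c);

static apr_status_t conn_pre_cleanup(void *baton)
{
    conn_t *c = baton;

    push_to_removal_queue(c);        /* serialize the removal...     */
    apr_pollset_wakeup(c->pollset);  /* ...and kick the poller awake */
    return APR_SUCCESS;
}

static void conn_register(conn_t *c, apr_pool_t *pool)
{
    /* runs before any objects in 'pool' are destroyed, so the pollset
     * can no longer hand us a zombie connection */
    apr_pool_pre_cleanup_register(pool, c, conn_pre_cleanup);
}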

This is the exact problem we have/had with Tomcat Native, where shutting
down the server can lead to a JVM crash if the connections
are still in the pool waiting for the network event to happen.


Regards
-- 
^(TM)

Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Mon, Jul 6, 2009 at 10:50 PM, Justin Erenkrantz<ju...@erenkrantz.com> wrote:
> On Mon, Jul 6, 2009 at 10:20 PM, Paul Querna<pa...@querna.org> wrote:
>> I am looking for an alternative that doesn't expose all this craziness
>> of when to free, destruct, or lock things.  The best idea I can come
>> up with is for each Connection, it would become 'semi-sticky' to a
>> single thread.  Meaning each worker thread would have its own queue of
>> upcoming events to process, and all events for connection X would sit
>> on the same 'queue'.  This would prevent two threads waiting for
>> destruction, and other cases of a single connection's mutex locking up
>> all your workers, essentially providing basic fault isolation.
>>
>> These queues could be mutable, and you could 'move' a connection
>> between queues, but you would always take all of its events and
>> triggers, and move them together to a different queue.
>>
>> Does the 'connection event queue' idea make sense?
>
> I think I see where you're going with this...being so dependent upon
> mutexes seems...like going into a jungle full of guerillas armed with
> only a dull kitchen knife.
>
> So, a connection gets assigned to a 'thread' - but it has only two
> states: running or waiting for a network event.  The critical part is
> that the thread *never* blocks on network traffic...all the 'network
> event' thread does is detect "yup, ready to go" and throws it back to
> that 'assigned' thread to process the event.  Seems trivial enough to
> do with a serf-centric system.  =)

Yes, I think the connection having the "two states: running or waiting
for a network event" is the key to making this work.  The
thread-stickiness is really just the conceptual model, but basically
if a connection is already 'running', all other events that would have
fired for it, like a timeout, would just queue up behind the running
operation, rather than running directly on another thread.  This starts
solving a multitude of locking and cleanup issues (I think).
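
In rough code (hypothetical names, nothing real yet):

typedef enum { CONN_IDLE, CONN_RUNNING } conn_state_e;

typedef struct conn_event_t {
    void (*cb)(struct conn_t *c);
    struct conn_event_t *next;
} conn_event_t;

typedef struct conn_t {
    conn_state_e state;
    conn_event_t *deferred;   /* events that fired while we were running */
} conn_t;

/* Runs only on the connection's owning thread, so no lock is needed
 * around 'state' or 'deferred'. */
static void conn_deliver(conn_t *c, conn_event_t *ev)
{
    if (c->state == CONN_RUNNING) {
        /* e.g. the timeout firing while the close handler is running:
         * park it behind the current operation instead of letting it
         * run concurrently on another thread */
        ev->next = c->deferred;
        c->deferred = ev;
        return;
    }

    c->state = CONN_RUNNING;
    ev->cb(c);
    c->state = CONN_IDLE;
    /* ...then drain c->deferred in order before going back to waiting */
}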

Re: Events, Destruction and Locking

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Mon, Jul 6, 2009 at 10:20 PM, Paul Querna<pa...@querna.org> wrote:
> I am looking for an alternative that doesn't expose all this craziness
> of when to free, destruct, or lock things.  The best idea I can come
> up with is for each Connection, it would become 'semi-sticky' to a
> single thread.  Meaning each worker thread would have its own queue of
> upcoming events to process, and all events for connection X would sit
> on the same 'queue'.  This would prevent two threads waiting for
> destruction, and other cases of a single connection's mutex locking up
> all your workers, essentially providing basic fault isolation.
>
> These queues could be mutable, and you could 'move' a connection
> between queues, but you would always take all of its events and
> triggers, and move them together to a different queue.
>
> Does the 'connection event queue' idea make sense?

I think I see where you're going with this...being so dependent upon
mutexes seems...like going into a jungle full of guerillas armed with
only a dull kitchen knife.

So, a connection gets assigned to a 'thread' - but it has only two
states: running or waiting for a network event.  The critical part is
that the thread *never* blocks on network traffic...all the 'network
event' thread does is detect "yup, ready to go" and throws it back to
that 'assigned' thread to process the event.  Seems trivial enough to
do with a serf-centric system.  =)

> I'm not sure I'm expressing the idea fully over email.... but I'll be
> at OSCON in a few weeks if anyone wants beer :)

I'll take you up on the beer and we can mull it over...  -- justin

Re: Events, Destruction and Locking

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Jul 7, 2009, at 10:17 AM, Graham Leggett wrote:

> Paul Querna wrote:
>
>> It breaks the 1:1: connection mapping to thread (or process) model
>> which is critical to low memory footprint, with thousands of
>> connections, maybe I'm just insane, but all of the servers taking
>> market share, like lighttpd, nginx, etc, all use this model.
>>
>> It also prevents all variations of the slowaris stupidity, because  
>> its
>> damn hard to overwhelm the actual connection processing if its all
>> async, and doesn't block a worker.
>
> But as you've pointed out, it makes our heads bleed, and locks slow  
> us down.
>
> At the lowest level, the event loop should be completely async, and be
> capable of supporting an arbitrary (probably very high) number of
> concurrent connections.
>

We are looking at a kernel-like scheduler more than anything
else...

Re: Events, Destruction and Locking

Posted by Graham Dumpleton <gr...@gmail.com>.
2009/7/9 Rainer Jung <ra...@kippdata.de>:
> On 08.07.2009 15:55, Paul Querna wrote:
>> On Wed, Jul 8, 2009 at 3:05 AM, Graham
>> Dumpleton<gr...@gmail.com> wrote:
>>> 2009/7/8 Graham Leggett <mi...@sharp.fm>:
>>>> Paul Querna wrote:
>>>>
>>>>> It breaks the 1:1: connection mapping to thread (or process) model
>>>>> which is critical to low memory footprint, with thousands of
>>>>> connections, maybe I'm just insane, but all of the servers taking
>>>>> market share, like lighttpd, nginx, etc, all use this model.
>>>>>
>>>>> It also prevents all variations of the slowaris stupidity, because its
>>>>> damn hard to overwhelm the actual connection processing if its all
>>>>> async, and doesn't block a worker.
>>>> But as you've pointed out, it makes our heads bleed, and locks slow us down.
>>>>
>>>> At the lowest level, the event loop should be completely async, and be
>>>> capable of supporting an arbitrary (probably very high) number of
>>>> concurrent connections.
>>>>
>>>> If one connection slows or stops (deliberately or otherwise), it won't
>>>> block any other connections on the same event loop, which will continue
>>>> as normal.
>>> But which for a multiprocess web server screws up if you then have a
>>> blocking type model for an application running on top. Specifically,
>>> the greedy nature of accepting connections may mean a process accepts
>>> more connections than it has high level threads to handle. If the
>>> high level threads end up blocking, then any accepted connections for
>>> the blocking high level application, for which request headers are
>>> still being read, or are pending, will be blocked as well even though
>>> another server process may be idle. In the current Apache model a
>>> process will only accept connections if it knows it is able to process
>>> it at that time. If a process doesn't have the threads available, then
>>> a different process would pick it up instead. I have previously
>>> commented how this causes problems with nginx for potentially blocking
>>> applications running in nginx worker processes. See:
>>>
>>>  http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html
>>>
>>> To prevent this you are forced to run event driven system for
>>> everything and blocking type applications can't be run in same
>>> process. Thus, anything like that has to be shoved out into a separate
>>> process. FASTCGI was mentioned for that, but frankly I believed
>>> FASTCGI is getting a bit crufty these days. It perhaps really needs to
>>> be modernised, with the byte protocol layout simplified to get rid of
>>> these varying size length indicator bytes. This may have been
>>> warranted when networks were slower and amount of body data being
>>> passed around less, but I can't see that that extra complexity is
>>> warranted any more. FASTCGI also can't handle things like end to end
>>> 100-continue processing and perhaps has other problems as well in
>>> respect of handling logging outside of request context etc etc.
>>>
>>> So, I personally would really love to see a good review of FASTCGI,
>>> AJP and any other similar/pertinent protocols done to distill what in
>>> these modern times is required and would be a better mechanism. The
>>> implementations of FASTCGI could also perhaps be modernised. Of
>>> course, even though FASTCGI may not be the most elegant of systems,
>>> probably too entrenched to get rid of it. The only way perhaps might
>>> be if a improved version formed the basis of any internal
>>> communications for a completely restructured internal model for Apache
>>> 3.0 based on serf which had segregation between processes handling
>>> static files and applications, with user separation etc etc.
>>
>> TBH, I think the best way to modernize FastCGI or AJP is to just proxy
>> HTTP over a daemon socket, then you solve all the protocol issues...
>> and just treat it like another reverse proxy.  The part we really need
>> to write is the backend process manager, to spawn/kill more of these
>> workers.
>
> Though there is one nice feature in the AJP protocol: since it knows
> it's serving via a reverse proxy, the back end patches some
> communication data like it were the front end. So if the context on the
> back end asks for port, protocol, host name etc. it automatically gets
> the data that looks like the one of the front end. That way cookies,
> self-referencing links etc. work right.
>
> Most of that can be simulated by appropriate configuration with HTTP to
> (yes, there are a lot of proxy options for this), but in AJP its
> automatic. Some parts are not configurable right now, like e.g. the
> client IP. You always have to introduce something that's aware e.g. of
> the X-Forwarded-For header. Another example would be whether the
> communication to the reverse proxy was via https. You can transport all
> that info va custom headers, but the backend usually doesn't know how to
> handle it.

Yes, these are the sort of things which would be nice to have be
transparent. Paul's comment is valid though in that HTTP itself could
be used as the protocol. Right now you couldn't do that over a UNIX
socket for a local backend process, and you lose the ability to
feed error logging back into the main Apache error logs in a similar local
setup. So, in some respects what I see is a better FASTCGI being
used for communicating with local processes only. Anything else would
use normal mod_proxy to another server, thus in effect getting rid of
external mode in FASTCGI. For the local stuff, what it then comes down
to is better process management and dealing with running as a distinct
user in a better way. Solving the problem of how to log errors to
distinct error logs in a mass virtual hosting environment would be
good as well.

Graham

Re: Events, Destruction and Locking

Posted by Rainer Jung <ra...@kippdata.de>.
On 08.07.2009 15:55, Paul Querna wrote:
> On Wed, Jul 8, 2009 at 3:05 AM, Graham
> Dumpleton<gr...@gmail.com> wrote:
>> 2009/7/8 Graham Leggett <mi...@sharp.fm>:
>>> Paul Querna wrote:
>>>
>>>> It breaks the 1:1: connection mapping to thread (or process) model
>>>> which is critical to low memory footprint, with thousands of
>>>> connections, maybe I'm just insane, but all of the servers taking
>>>> market share, like lighttpd, nginx, etc, all use this model.
>>>>
>>>> It also prevents all variations of the slowaris stupidity, because its
>>>> damn hard to overwhelm the actual connection processing if its all
>>>> async, and doesn't block a worker.
>>> But as you've pointed out, it makes our heads bleed, and locks slow us down.
>>>
>>> At the lowest level, the event loop should be completely async, and be
>>> capable of supporting an arbitrary (probably very high) number of
>>> concurrent connections.
>>>
>>> If one connection slows or stops (deliberately or otherwise), it won't
>>> block any other connections on the same event loop, which will continue
>>> as normal.
>> But which for a multiprocess web server screws up if you then have a
>> blocking type model for an application running on top. Specifically,
>> the greedy nature of accepting connections may mean a process accepts
>> more connections than it has high level threads to handle. If the
>> high level threads end up blocking, then any accepted connections for
>> the blocking high level application, for which request headers are
>> still being read, or are pending, will be blocked as well even though
>> another server process may be idle. In the current Apache model a
>> process will only accept connections if it knows it is able to process
>> it at that time. If a process doesn't have the threads available, then
>> a different process would pick it up instead. I have previously
>> commented how this causes problems with nginx for potentially blocking
>> applications running in nginx worker processes. See:
>>
>>  http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html
>>
>> To prevent this you are forced to run event driven system for
>> everything and blocking type applications can't be run in same
>> process. Thus, anything like that has to be shoved out into a separate
>> process. FASTCGI was mentioned for that, but frankly I believed
>> FASTCGI is getting a bit crufty these days. It perhaps really needs to
>> be modernised, with the byte protocol layout simplified to get rid of
>> these varying size length indicator bytes. This may have been
>> warranted when networks were slower and amount of body data being
>> passed around less, but I can't see that that extra complexity is
>> warranted any more. FASTCGI also can't handle things like end to end
>> 100-continue processing and perhaps has other problems as well in
>> respect of handling logging outside of request context etc etc.
>>
>> So, I personally would really love to see a good review of FASTCGI,
>> AJP and any other similar/pertinent protocols done to distill what in
>> these modern times is required and would be a better mechanism. The
>> implementations of FASTCGI could also perhaps be modernised. Of
>> course, even though FASTCGI may not be the most elegant of systems,
>> probably too entrenched to get rid of it. The only way perhaps might
>> be if a improved version formed the basis of any internal
>> communications for a completely restructured internal model for Apache
>> 3.0 based on serf which had segregation between processes handling
>> static files and applications, with user separation etc etc.
> 
> TBH, I think the best way to modernize FastCGI or AJP is to just proxy
> HTTP over a daemon socket, then you solve all the protocol issues...
> and just treat it like another reverse proxy.  The part we really need
> to write is the backend process manager, to spawn/kill more of these
> workers.

Though there is one nice feature in the AJP protocol: since it knows
it's serving via a reverse proxy, the back end patches some
communication data as if it were the front end. So if the context on the
back end asks for the port, protocol, host name etc. it automatically gets
data that looks like that of the front end. That way cookies,
self-referencing links etc. work right.

Most of that can be simulated by appropriate configuration with HTTP too
(yes, there are a lot of proxy options for this), but in AJP it's
automatic. Some parts are not configurable right now, like e.g. the
client IP. You always have to introduce something that's aware e.g. of
the X-Forwarded-For header. Another example would be whether the
communication to the reverse proxy was via https. You can transport all
that info via custom headers, but the backend usually doesn't know how to
handle it.

Regards,

Rainer

Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Wed, Jul 8, 2009 at 3:05 AM, Graham
Dumpleton<gr...@gmail.com> wrote:
> 2009/7/8 Graham Leggett <mi...@sharp.fm>:
>> Paul Querna wrote:
>>
>>> It breaks the 1:1: connection mapping to thread (or process) model
>>> which is critical to low memory footprint, with thousands of
>>> connections, maybe I'm just insane, but all of the servers taking
>>> market share, like lighttpd, nginx, etc, all use this model.
>>>
>>> It also prevents all variations of the slowaris stupidity, because its
>>> damn hard to overwhelm the actual connection processing if its all
>>> async, and doesn't block a worker.
>>
>> But as you've pointed out, it makes our heads bleed, and locks slow us down.
>>
>> At the lowest level, the event loop should be completely async, and be
>> capable of supporting an arbitrary (probably very high) number of
>> concurrent connections.
>>
>> If one connection slows or stops (deliberately or otherwise), it won't
>> block any other connections on the same event loop, which will continue
>> as normal.
>
> But which for a multiprocess web server screws up if you then have a
> blocking type model for an application running on top. Specifically,
> the greedy nature of accepting connections may mean a process accepts
> more connections than it has high level threads to handle. If the
> high level threads end up blocking, then any accepted connections for
> the blocking high level application, for which request headers are
> still being read, or are pending, will be blocked as well even though
> another server process may be idle. In the current Apache model a
> process will only accept connections if it knows it is able to process
> it at that time. If a process doesn't have the threads available, then
> a different process would pick it up instead. I have previously
> commented how this causes problems with nginx for potentially blocking
> applications running in nginx worker processes. See:
>
>  http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html
>
> To prevent this you are forced to run event driven system for
> everything and blocking type applications can't be run in same
> process. Thus, anything like that has to be shoved out into a separate
> process. FASTCGI was mentioned for that, but frankly I believed
> FASTCGI is getting a bit crufty these days. It perhaps really needs to
> be modernised, with the byte protocol layout simplified to get rid of
> these varying size length indicator bytes. This may have been
> warranted when networks were slower and amount of body data being
> passed around less, but I can't see that that extra complexity is
> warranted any more. FASTCGI also can't handle things like end to end
> 100-continue processing and perhaps has other problems as well in
> respect of handling logging outside of request context etc etc.
>
> So, I personally would really love to see a good review of FASTCGI,
> AJP and any other similar/pertinent protocols done to distill what in
> these modern times is required and would be a better mechanism. The
> implementations of FASTCGI could also perhaps be modernised. Of
> course, even though FASTCGI may not be the most elegant of systems,
> probably too entrenched to get rid of it. The only way perhaps might
> be if a improved version formed the basis of any internal
> communications for a completely restructured internal model for Apache
> 3.0 based on serf which had segregation between processes handling
> static files and applications, with user separation etc etc.

TBH, I think the best way to modernize FastCGI or AJP is to just proxy
HTTP over a daemon socket, then you solve all the protocol issues...
and just treat it like another reverse proxy.  The part we really need
to write is the backend process manager, to spawn/kill more of these
workers.

Re: Events, Destruction and Locking

Posted by Graham Dumpleton <gr...@gmail.com>.
2009/7/8 Graham Leggett <mi...@sharp.fm>:
> Paul Querna wrote:
>
>> It breaks the 1:1: connection mapping to thread (or process) model
>> which is critical to low memory footprint, with thousands of
>> connections, maybe I'm just insane, but all of the servers taking
>> market share, like lighttpd, nginx, etc, all use this model.
>>
>> It also prevents all variations of the slowaris stupidity, because its
>> damn hard to overwhelm the actual connection processing if its all
>> async, and doesn't block a worker.
>
> But as you've pointed out, it makes our heads bleed, and locks slow us down.
>
> At the lowest level, the event loop should be completely async, and be
> capable of supporting an arbitrary (probably very high) number of
> concurrent connections.
>
> If one connection slows or stops (deliberately or otherwise), it won't
> block any other connections on the same event loop, which will continue
> as normal.

But for a multiprocess web server that screws up if you then have a
blocking type model for an application running on top. Specifically,
the greedy nature of accepting connections may mean a process accepts
more connections than it has high level threads to handle. If the
high level threads end up blocking, then any accepted connections for
the blocking high level application, for which request headers are
still being read, or are pending, will be blocked as well even though
another server process may be idle. In the current Apache model a
process will only accept a connection if it knows it is able to process
it at that time. If a process doesn't have the threads available, then
a different process would pick it up instead. I have previously
commented on how this causes problems with nginx for potentially blocking
applications running in nginx worker processes. See:

  http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html

To prevent this you are forced to run an event driven system for
everything, and blocking type applications can't be run in the same
process. Thus, anything like that has to be shoved out into a separate
process. FASTCGI was mentioned for that, but frankly I believe
FASTCGI is getting a bit crufty these days. It perhaps really needs to
be modernised, with the byte protocol layout simplified to get rid of
these varying size length indicator bytes. This may have been
warranted when networks were slower and the amount of body data being
passed around less, but I can't see that that extra complexity is
warranted any more. FASTCGI also can't handle things like end to end
100-continue processing and perhaps has other problems as well in
respect of handling logging outside of request context etc etc.

So, I personally would really love to see a good review of FASTCGI,
AJP and any other similar/pertinent protocols done to distill what in
these modern times is required and would be a better mechanism. The
implementations of FASTCGI could also perhaps be modernised. Of
course, even though FASTCGI may not be the most elegant of systems, it is
probably too entrenched to get rid of it. The only way perhaps might
be if an improved version formed the basis of any internal
communications for a completely restructured internal model for Apache
3.0 based on serf which had segregation between processes handling
static files and applications, with user separation etc etc.

Graham

Re: Events, Destruction and Locking

Posted by Graham Leggett <mi...@sharp.fm>.
Paul Querna wrote:

> It breaks the 1:1: connection mapping to thread (or process) model
> which is critical to low memory footprint, with thousands of
> connections, maybe I'm just insane, but all of the servers taking
> market share, like lighttpd, nginx, etc, all use this model.
> 
> It also prevents all variations of the slowaris stupidity, because its
> damn hard to overwhelm the actual connection processing if its all
> async, and doesn't block a worker.

But as you've pointed out, it makes our heads bleed, and locks slow us down.

At the lowest level, the event loop should be completely async, and be
capable of supporting an arbitrary (probably very high) number of
concurrent connections.

If one connection slows or stops (deliberately or otherwise), it won't
block any other connections on the same event loop, which will continue
as normal.

The only requirement is that each request accurately registers event
deregistration functions in its cleanups, so that the request is
cleanly deregistered and future events canceled on apr_pool_destroy().

The event loop can also choose to proactively kill too-slow connections
if certain memory or concurrent connection thresholds are reached.

Regards,
Graham
--

Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Tue, Jul 7, 2009 at 10:01 AM, Graham Leggett<mi...@sharp.fm> wrote:
> Paul Querna wrote:
>
>> Yes, but in a separate process it has fault isolation.. and we can
>> restart it when it fails, neither of which are true for modules using
>> the in-process API directly -- look at the reliability of QMail, or
>> the newer architecture of Google's Chrome, they are both great
>> examples of fault isolation.
>
> As is httpd prefork :)
>
> I think the key target for the event model is for low-complexity
> scenarios like shipping raw files, or being a cache, or a reverse proxy.
>
> If we have three separate levels, a process, containing threads,
> containing an event loop, we could allow the behaviour of prefork (many
> processes, one thread, one-request-per-event-loop-at-a-time), or the
> bahaviour of worker (one or many processes, many threads,
> one-request-per-event-loop-at-a-time), or an event model (one or many
> processes, one or many threads,
> many-requests-per-event-loop-at-one-time) at the same time.
>
> I am not sure that splitting request handling across threads (in your
> example, connection close handled by event on thread A, timeout handled
> by event on thread B) buys us anything (apart from the complexity you
> described).

It breaks the 1:1 connection-to-thread (or process) mapping model,
which is critical to a low memory footprint with thousands of
connections; maybe I'm just insane, but all of the servers taking
market share, like lighttpd, nginx, etc, all use this model.

It also prevents all variations of the slowaris stupidity, because it's
damn hard to overwhelm the actual connection processing if it's all
async, and doesn't block a worker.

Re: Events, Destruction and Locking

Posted by Bojan Smojver <bo...@rexursive.com>.
On Wed, 2009-07-08 at 22:53 -0400, Paul Querna wrote:
> But the event mpm doesn't have an accept mutex :D

Yeah, I know. I was talking about making prefork async too.

-- 
Bojan


Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Wed, Jul 8, 2009 at 9:11 PM, Bojan Smojver<bo...@rexursive.com> wrote:
> On Wed, 2009-07-08 at 11:01 +1000, Bojan Smojver wrote:
>> So, the loop would be:
>>
>> - poll()
>> - try assembling a full request from data read so far
>>   - process if successful
>>   - go back to poll() if not
>>
>> Too naive?
>
> I see that we'd most likely get stuck with the accept mutex (i.e. if
> another process had it, we would not be poll()-ing already accepted fds
> any more).
>

But the event mpm doesn't have an accept mutex :D

> We could work around this by using apr_proc_mutex_trylock() if there are
> any already accepted fds. If this fails, we just poll() already accepted
> fds (i.e. someone is already poll()-ing to accept()). Otherwise, we
> poll() the lot.
>
> --
> Bojan
>
>

Re: Events, Destruction and Locking

Posted by Bojan Smojver <bo...@rexursive.com>.
On Wed, 2009-07-08 at 11:01 +1000, Bojan Smojver wrote:
> So, the loop would be:
> 
> - poll()
> - try assembling a full request from data read so far
>   - process if successful
>   - go back to poll() if not
> 
> Too naive?

I see that we'd most likely get stuck with the accept mutex (i.e. if
another process had it, we would not be poll()-ing already accepted fds
any more).

We could work around this by using apr_proc_mutex_trylock() if there are
any already accepted fds. If this fails, we just poll() already accepted
fds (i.e. someone is already poll()-ing to accept()). Otherwise, we
poll() the lot.
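
Roughly (very much simplified, with two pollsets, one with and one without
the listeners):

#include "apr_poll.h"
#include "apr_proc_mutex.h"

static void one_iteration(apr_proc_mutex_t *accept_mutex,
                          int have_accepted_fds,
                          apr_pollset_t *accepted_only,
                          apr_pollset_t *listeners_and_accepted)
{
    apr_int32_t nready;
    const apr_pollfd_t *ready;

    if (!have_accepted_fds) {
        /* nothing accepted yet: block on the accept mutex as today */
        apr_proc_mutex_lock(accept_mutex);
        apr_pollset_poll(listeners_and_accepted, -1, &nready, &ready);
        apr_proc_mutex_unlock(accept_mutex);
    }
    else if (apr_proc_mutex_trylock(accept_mutex) == APR_SUCCESS) {
        /* we got the mutex: poll() the lot */
        apr_pollset_poll(listeners_and_accepted, -1, &nready, &ready);
        apr_proc_mutex_unlock(accept_mutex);
    }
    else {
        /* someone else is already poll()-ing to accept(); just watch
         * the fds we have already accepted */
        apr_pollset_poll(accepted_only, -1, &nready, &ready);
    }

    /* ...dispatch the 'nready' descriptors in 'ready' here... */
}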

-- 
Bojan


Re: Events, Destruction and Locking

Posted by Bojan Smojver <bo...@rexursive.com>.
On Tue, 2009-07-07 at 16:01 +0200, Graham Leggett wrote:
> As is httpd prefork :)

Yeah, definitely my favourite MPM :-)

As far as I understand this, the deal is that we need to have a complete
request before we start processing it. Otherwise, we can get stuck and
one of our precious resources is tied up for a long time.

Is there anything stopping us from having not just the listening fds in that
apr_pollset_poll() of prefork.c, but also a bunch of already accepted
fds that are waiting for more data to come in? I'm guessing we'd have to
use ap_process_http_async_connection() and have multiple ptrans pools,
but that should not be all that hard to do.

So, the loop would be:

- poll()
- try assembling a full request from data read so far
  - process if successful
  - go back to poll() if not

Too naive?
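
Per accepted fd, I mean something like this (sketch only; request_complete()
and process_request() are made up):

#include "apr_network_io.h"

typedef struct pending_conn_t {
    apr_socket_t *sock;
    char buf[8192];
    apr_size_t used;
} pending_conn_t;

int  request_complete(const char *buf, apr_size_t len);  /* made up */
void process_request(pending_conn_t *pc);                /* made up */

/* called when poll() says this already-accepted fd is readable */
static void on_readable(pending_conn_t *pc)
{
    apr_size_t len = sizeof(pc->buf) - pc->used;
    apr_status_t rv = apr_socket_recv(pc->sock, pc->buf + pc->used, &len);

    if (rv != APR_SUCCESS && !APR_STATUS_IS_EAGAIN(rv)) {
        return;                      /* error or EOF: drop the connection */
    }
    pc->used += len;

    if (request_complete(pc->buf, pc->used)) {
        process_request(pc);         /* we finally have the whole request */
    }
    /* otherwise leave it in the pollset and go back to poll() */
}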

-- 
Bojan


Re: Events, Destruction and Locking

Posted by Graham Leggett <mi...@sharp.fm>.
Paul Querna wrote:

> Yes, but in a separate process it has fault isolation.. and we can
> restart it when it fails, neither of which are true for modules using
> the in-process API directly -- look at the reliability of QMail, or
> the newer architecture of Google's Chrome, they are both great
> examples of fault isolation.

As is httpd prefork :)

I think the key target for the event model is for low-complexity
scenarios like shipping raw files, or being a cache, or a reverse proxy.

If we have three separate levels, a process, containing threads,
containing an event loop, we could allow the behaviour of prefork (many
processes, one thread, one-request-per-event-loop-at-a-time), or the
behaviour of worker (one or many processes, many threads,
one-request-per-event-loop-at-a-time), or an event model (one or many
processes, one or many threads,
many-requests-per-event-loop-at-one-time) at the same time.
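
In other words, something like this (purely illustrative, not an existing
structure):

typedef struct mpm_model_t {
    int processes;            /* child processes                        */
    int threads_per_process;  /* event loops per process                */
    int requests_per_loop;    /* in-flight requests one loop may juggle */
} mpm_model_t;

/* prefork-like:  { many,  1,    1    }
 * worker-like:   { some,  many, 1    }
 * event-like:    { some,  many, many }
 * ...all three are just points in the same configuration space. */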

I am not sure that splitting request handling across threads (in your
example, connection close handled by event on thread A, timeout handled
by event on thread B) buys us anything (apart from the complexity you
described).

Regards,
Graham
--

Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Tue, Jul 7, 2009 at 12:54 PM, Akins, Brian<Br...@turner.com> wrote:
> This is how I envisioned the async stuff working.
>
> -Async event thread is used only for input/output of httpd to/from network*
> -After we read the headers, we pass the request/connection to the worker
> threads.  Each request is "sticky" to a thread.  Request stuff may block,
> etc, so this thread pool size is configurable and in mod_status, etc.
> -any "writes" out of the request to the client are passed into the async
> thread.  This may be wrapped in filters, whatever.
>
> *We may allow there to be multiple ones of these, ie one for proxies, or
> have a very well defined way to add watches to this.
>
> This is a very simplistic view.  I was basically thinking that all conn_rec
> "stuff" is handled in the async event thread, all the request_rec "stuff" is
> handled in the worker threads.

Right, but I think the 'waiting for X' while processing is a very
important case, it can get you to a fully async reverse proxy, which
opens up lots of possibilities.

Re: Events, Destruction and Locking

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Graham Leggett wrote:
> 
> Ideally any async implementation should be 100% async end to end. I
> don't believe that its necessary though for a single request to be
> handled by more than one thread.

That pretty much ensures you are congesting unequal requests on the same
CPU, not necessarily a good idea.

I expect the ideal number of threads will end up looking like 100 in this
sort of a scenario, for serving ~5k requests in parallel.

Assuming an 8 cpu system, we can get much closer to some 20 threads if all
requests were free-threaded. (assuming here a thread per cpu for traffic
and some worker delegation - if they sat on the same thread perhaps this
is closer to 12).





Re: Events, Destruction and Locking

Posted by Ruediger Pluem <rp...@apache.org>.

On 07/07/2009 07:02 PM, Graham Leggett wrote:

> Ideally any async implementation should be 100% async end to end. I
> don't believe that its necessary though for a single request to be
> handled by more than one thread.

I agree. I see no reason for multiple threads working on the same request at
the same time (at least handler-wise). On the other hand it may be interesting
to develop async handlers that wait for external events like a post body or a
database response and that might want to free the thread until this event happens.
The same may be interesting for filters.
So it should be possible for a request to move over to a different thread,
but no more than one thread should be working on the same request at the
same time.

Regards

Rüdiger



Re: Events, Destruction and Locking

Posted by "Akins, Brian" <Br...@turner.com>.
On 7/7/09 1:02 PM, "Graham Leggett" <mi...@sharp.fm> wrote:

> Ideally any async implementation should be 100% async end to end. I
> don't believe that its necessary though for a single request to be
> handled by more than one thread.

True.  However, what about things that may be "process" intensive, i.e.
running lua in process?  And we'd want to run multiple async threads (or
processes). One of the issues with lighttpd with multiple processes (to use
multiple cores, etc) is that lots of stuff is broken (i.e. log files
interleave).  We just need to be aware of the issues that other servers have
uncovered in this area.

-- 
Brian Akins


Re: Events, Destruction and Locking

Posted by Graham Leggett <mi...@sharp.fm>.
Akins, Brian wrote:

> This is how I envisioned the async stuff working.
> 
> -Async event thread is used only for input/output of httpd to/from network*
> -After we read the headers, we pass the request/connection to the worker
> threads.  Each request is "sticky" to a thread.  Request stuff may block,
> etc, so this thread pool size is configurable and in mod_status, etc.
> -any "writes" out of the request to the client are passed into the async
> thread.  This may be wrapped in filters, whatever.
> 
> *We may allow there to be multiple ones of these, ie one for proxies, or
> have a very well defined way to add watches to this.
> 
> This is a very simplistic view.  I was basically thinking that all conn_rec
> "stuff" is handled in the async event thread, all the request_rec "stuff" is
> handled in the worker threads.

The trouble with this is that all you need to do to wedge one of the
worker threads is to promise to send two bytes as a request body, and
then send one (or zero), then hang.

Ideally any async implementation should be 100% async end to end. I
don't believe that it's necessary though for a single request to be
handled by more than one thread.

Regards,
Graham
--

Re: Events, Destruction and Locking

Posted by "Akins, Brian" <Br...@turner.com>.
This is how I envisioned the async stuff working.

-Async event thread is used only for input/output of httpd to/from network*
-After we read the headers, we pass the request/connection to the worker
threads.  Each request is "sticky" to a thread.  Request stuff may block,
etc, so this thread pool size is configurable and in mod_status, etc.
-any "writes" out of the request to the client are passed into the async
thread.  This may be wrapped in filters, whatever.

*We may allow there to be multiple ones of these, ie one for proxies, or
have a very well defined way to add watches to this.

This is a very simplistic view.  I was basically thinking that all conn_rec
"stuff" is handled in the async event thread, all the request_rec "stuff" is
handled in the worker threads.


-- 
Brian Akins


Re: Events, Destruction and Locking

Posted by Jeff Trawick <tr...@gmail.com>.
On Tue, Jul 7, 2009 at 9:39 AM, Paul Querna <pa...@querna.org> wrote:

> On Tue, Jul 7, 2009 at 8:39 AM, Graham Leggett<mi...@sharp.fm> wrote:
> > Paul Querna wrote:
> >
> >> Nah, 90% of what is done in modules today should be out of process aka
> >> in FastCGI.... or another method, but out of process. (regardless of
> >> MPM)
> >
> > You're just moving the problem from one server to another, the problem
> > remains unsolved. Whether the code runs within httpd space, or fastcgi
> > space, the code still needs to run, and if it's written badly, the code
> > will still leak/crash, and you still have to cater for it.
>
> Yes, but in a separate process it has fault isolation.. and we can
> restart it when it fails, neither of which are true for modules using
> the in-process API directly -- look at the reliability of QMail, or
> the newer architecture of Google's Chrome, they are both great
> examples of fault isolation.
>

Also, it simplifies the programming problem by reducing the number of
separate memory and concurrency models that must be accommodated by the
application-level code.

Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Tue, Jul 7, 2009 at 8:39 AM, Graham Leggett<mi...@sharp.fm> wrote:
> Paul Querna wrote:
>
>> Nah, 90% of what is done in modules today should be out of process aka
>> in FastCGI.... or another method, but out of process. (regardless of
>> MPM)
>
> You're just moving the problem from one server to another, the problem
> remains unsolved. Whether the code runs within httpd space, or fastcgi
> space, the code still needs to run, and if it's written badly, the code
> will still leak/crash, and you still have to cater for it.

Yes, but in a separate process it has fault isolation... and we can
restart it when it fails, neither of which is true for modules using
the in-process API directly -- look at the reliability of QMail, or
the newer architecture of Google's Chrome; they are both great
examples of fault isolation.

Re: Events, Destruction and Locking

Posted by Graham Leggett <mi...@sharp.fm>.
Paul Querna wrote:

> Nah, 90% of what is done in modules today should be out of process aka
> in FastCGI.... or another method, but out of process. (regardless of
> MPM)

You're just moving the problem from one server to another, the problem
remains unsolved. Whether the code runs within httpd space, or fastcgi
space, the code still needs to run, and if it's written badly, the code
will still leak/crash, and you still have to cater for it.

Regards,
Graham
--

Re: Events, Destruction and Locking

Posted by Paul Querna <pa...@querna.org>.
On Tue, Jul 7, 2009 at 7:34 AM, Graham Leggett<mi...@sharp.fm> wrote:
> Paul Querna wrote:
>> I think it is possible to write a complete server that deals with all
>> these intricacies and gets everything just 'right', but as soon as you
>> introduce 3rd party module writers, no matter how 'smart' we are, our
>> castle of event goodness will crumble.
>
> You've hit the nail on the head as to why the prefork and worker models
> are still relevant - they are very forgiving of "irresponsible
> behaviour" by modules.

Nah, 90% of what is done in modules today should be out of process aka
in FastCGI.... or another method, but out of process. (regardless of
MPM)

Re: Events, Destruction and Locking

Posted by Graham Leggett <mi...@sharp.fm>.
Paul Querna wrote:

> Can't sleep, so finally writing this email I've been meaning to write
> for about 7 months now :D
> 
> One of the challenges in the Simple MPM, and to a smaller degree in
> the Event MPM, is how to manage memory allocation, destruction, and
> thread safety.
> 
> A 'simple' example:
>  - 1) Thread A: Client Connection Created
>    -  2) Thread A: Timer Event Added for 10 seconds in the future to
> detect  IO timeout,
>  - 3) Thread B: Client Socket closes in 9.99 seconds.
>  - 4) Thread C: Timer Event for IO timeout is triggered after 10 seconds
> 
> The simple answer is placing a Mutex around the connection object.
> Any operation which two threads are working on the connection, locks
> this Mutex.

As you've said, locks create many problems, the most fatal of which is
that locks potentially block the event loop. Ideally if a try_lock
fails, the event should reschedule itself to try again at some point in
the near future, but that relies on people bothering to write this
logic, and I suspect many won't.

A pragmatic approach might be to handle a request completely within a
single event loop running in a single thread. In this case the timer
event for IO timeout is in the same thread as the socket close event.

At this point you just need to make sure that your pool cleanups are
handled correctly. So if a timeout runs, all the timeout does is
apr_pool_destroy(r->pool), and that's it. It is up to the pool cleanup
to make sure that all events (such as the event that calls connection
close) are cleanly deregistered so that they won't get called in future.
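
In rough code (event_deregister_conn() standing in for whatever the event
loop's removal call would be):

#include "httpd.h"
#include "apr_pools.h"

void event_deregister_conn(conn_rec *c);   /* stand-in, not a real API */

static apr_status_t cancel_conn_events(void *baton)
{
    request_rec *r = baton;

    /* the close/readability events can no longer fire for this request */
    event_deregister_conn(r->connection);
    return APR_SUCCESS;
}

/* registered once, when the request is set up */
static void setup_request_cleanup(request_rec *r)
{
    apr_pool_cleanup_register(r->pool, r, cancel_conn_events,
                              apr_pool_cleanup_null);
}

/* the timer event itself: runs in the same single-threaded event loop
 * as the socket events, and only destroys the pool -- the cleanup above
 * does all the real deregistration work */
static void io_timeout_cb(void *baton)
{
    request_rec *r = baton;

    apr_pool_destroy(r->pool);
}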

We may offer a mechanism (such as a watchdog) that allows a request to
kick off code in another thread, but a prerequisite for that is that a
pool cleanup will have to be registered to make sure that the other thread is
terminated cleanly, or the request is cleanly removed from that other
thread's event loop.

> I think it is possible to write a complete server that deals with all
> these intricacies and gets everything just 'right', but as soon as you
> introduce 3rd party module writers, no matter how 'smart' we are, our
> castle of event goodness will crumble.

You've hit the nail on the head as to why the prefork and worker models
are still relevant - they are very forgiving of "irresponsible
behaviour" by modules.

Regards,
Graham
--