Posted to dev@httpd.apache.org by Paul Querna <pa...@querna.org> on 2011/06/16 00:01:52 UTC

3.0, the 2011 thread.

I think we have all joked on and off about 3.0 for... well about 8 years now.

I think we are nearing the point we might actually need to be serious about it.

The web has changed.

SPDY is coming down the pipe pretty quickly.

WebSockets might actually be standardized this year.

Two protocols which HTTPD is unable to be good at. Ever.

The problem is our process model, and our module APIs.

The Event MPM was a valiant effort in some ways, but mod_ssl and other
filters will always block its progress, and with protocols like SPDY,
falling back to Worker MPM behaviors is pointless.

I think there are exciting things happening in C however.

Four projects that might form the baseline for something new:

pocore: For base OS portability and memory pooling system.
  <http://code.google.com/p/pocore/>
libuv: Portable, fast, Network IO. (IOCP programming model, brought to Unix)
  <https://github.com/joyent/libuv>
http-parser: HTTP really broken out to simple callbacks.
  <https://github.com/ry/http-parser>
selene: SSL, redone to better support Async IO.
  <https://github.com/pquerna/selene>

All of these are young.  Most are incomplete.

But they could be the tools to build a real 3.0 upon.
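To give a flavour of the style these libraries push you toward, here is a
minimal sketch of http-parser's callback model (names as in the upstream
http_parser.h; details may vary between versions):

#include <stdio.h>
#include <string.h>
#include "http_parser.h"

static int on_url(http_parser *p, const char *at, size_t len)
{
    printf("url: %.*s\n", (int)len, at);
    return 0;                       /* non-zero aborts parsing */
}

static int on_headers_complete(http_parser *p)
{
    printf("method: %s, keep-alive: %d\n",
           http_method_str(p->method), http_should_keep_alive(p));
    return 0;
}

int main(void)
{
    const char *req = "GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n";
    http_parser parser;
    http_parser_settings settings;
    size_t nparsed;

    memset(&settings, 0, sizeof(settings));
    settings.on_url = on_url;
    settings.on_headers_complete = on_headers_complete;

    http_parser_init(&parser, HTTP_REQUEST);

    /* Feed whatever bytes the socket produced; the parser keeps its own
     * state, so partial reads can be fed as they arrive. */
    nparsed = http_parser_execute(&parser, &settings, req, strlen(req));
    if (nparsed != strlen(req))
        fprintf(stderr, "parse error\n");
    return 0;
}

No protocol knowledge lives in the server loop itself; it is all callbacks,
which is exactly what an evented core wants.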

If we don't, I'm sure others in the web server market will continue to
gain market share.

But I think we could do it better.  We have the experience, we
know the value of a module ecosystem, we build stable, quality
software.  We just need to step up to how the internet is changing.

Thoughts?

Thanks,

Paul

Re: 3.0, the 2011 thread.

Posted by Paul Querna <pa...@querna.org>.
2011/6/15 Colm MacCárthaigh <co...@allcosts.net>:
> On Wed, Jun 15, 2011 at 3:01 PM, Paul Querna <pa...@querna.org> wrote:
>> I think we have all joked on and off about 3.0 for... well about 8 years now.
>
> At least!
>
>> I think there are exciting things happening in C however.
>
> I love C, but unless we can come up with something radical, it's hard
> to see a way out of the prison it creates. That realisation led me to
> hacking mostly on functional-oriented servers. I'll try to explain why
> - in case any of those thoughts are useful here too :)
>
> I like the things you've pointed out, but they seem relatively
> cosmetic. Things like the parser, async, event and portability
> frameworks are really cool - but hardly fundamental. Anyone could use
> those, in any language - it's not a real leap in the field. Similarly,
> SPDY, websockets, COMET and so on are ultra-cool - but are still
> potential bolt-ons to almost any kind of webserver. It sucks that we
> don't do them well, but doing them better won't fundamentally change
> the market or the pressures on adoption.
>
> Today webservers are almost entirely network I/O bound - disk seek and
> CPU speeds are pretty great these days, way faster than is really
> necessary. In a properly architected set-up, end-user delay is really
> about the limitations of TCP. You can multiplex and use keepalives as
> much as you want, you'll eventually realise that the size of the world
> and speed of light mean that this inevitably ends up being slow
> without a lot of distributed endpoints.
>
> But we have some cool secret sauce to help fix that. I think the best
> architectural thing about Apache is buckets and brigades. Using a list
> structure to represent portions of differently-generated content like
> that is great. Imagine how much better wordpress would run if PHP
> represented the php scripts as some dynamic buckets intermingled
> with some static file io buckets (and even repeated when in loops).
> There'd be a lot less data to copy around.
>
> Now imagine a backend that could identify the dynamic buckets and, by
> forbidding side effects, parallelise work on them - a bucket as a
> message in a message-passing system of co-routines, for example. Imagine
> that in turn feeding into a set of co-routine filters. That's
> fundamentally different - it parallelises content generation, but it's
> really really hard to do in C.
>
> Then next, imagine a backend that could identify the static buckets
> and re-order them so that they come first - it could understand things
> like XML and Javascript and intelligently front-load your transfer so
> that the content we have ready goes first, while the dynamic stuff is
> being built. It's a real layer-8-aware scheduler and content
> re-compiler. Again it's really really hard to do in C - but imagine
> the benefits of a server layer that really understood how to model and
> re-order content.
>
> These are the kinds of transform that make a webserver's job as optimal
> as it can be. Network data is the most expensive part of any modern
> web application, in terms of both time and money, so the ecosystem
> faces huge economic pressure to make these as optimal as possible over
> time. Things like SPDY are just the first generation.
>
> It'd be cool if Apache 3.0 could do those things - we have some great
> building blocks and experience - but it feels like a language with
> support for first-order functions and co-routines would be better at
> it.
>
> Again, I'm just thinking out loud :)

I think it's an interesting idea, focusing on a content-aware server
architecture.  An example of this is SPDY's ability to push content
to a client -- for example, when a client requests /index.html, you can
push /css/main.css to them without waiting for them to request it.

I think as far as the implementation language goes, you seem to be asking for Go?

Which I really really like, but it seems hard to make extensions/module APIs.

Of course, the other option is to just write it in Node.js....  I mean,
most of the web server is not about the lower bits, it's about
configuration and content generation.

Re: 3.0, the 2011 thread.

Posted by "Akins, Brian" <Br...@turner.com>.
On 6/15/11 7:40 PM, "Colm MacCárthaigh" <co...@allcosts.net> wrote:
>  Imagine
> that in turn feeding into a set of co-routine filters. That's
> fundamentally different - it parallelises content generation, but it's
> really really hard to do in C.

Depending on how far you want to push the model, it's not that hard.
Obviously you can't do "co-routines" but just using the current ideas about
requests and sub requests, you could easily do the subrequests in parallel.
FWIW, nginx can use Lua co-routines to do this and does it "natively" with
SSI's. The code, however, will make you go blind ;)

My biggest issue with HTTPD really comes down to connections per OS image.
In general, threads suck at this - memory "per connection" and context
switches just kill you.  "C1M" is just not that hard to achieve nowadays.

-- 
Brian Akins



Re: 3.0, the 2011 thread.

Posted by Colm MacCárthaigh <co...@allcosts.net>.
On Wed, Jun 15, 2011 at 3:01 PM, Paul Querna <pa...@querna.org> wrote:
> I think we have all joked on and off about 3.0 for... well about 8 years now.

At least!

> I think there are exciting things happening in C however.

I love C, but unless we can come up with something radical, it's hard
to see a way out of the prison it creates. That realisation led me to
hacking mostly on functional-oriented servers. I'll try to explain why
- in case any of those thoughts are useful here too :)

I like the things you've pointed out, but they seem relatively
cosmetic. Things like the parser, async, event and portability
frameworks are really cool - but hardly fundamental. Anyone could use
those, in any language - it's not a real leap in the field. Similarly,
SPDY, websockets, COMET and so on are ultra-cool - but are still
potential bolt-ons to almost any kind of webserver. It sucks that we
don't do them well, but doing them better won't fundamentally change
the market or the pressures on adoption.

Today webservers are almost entirely network I/O bound - disk seek and
CPU speeds are pretty great these days, way faster than is really
necessary. In a properly architected set-up, end-user delay is really
about the limitations of TCP. You can multiplex and use keepalives as
much as you want, you'll eventually realise that the size of the world
and speed of light mean that this inevitably ends up being slow
without a lot of distributed endpoints.

But we have some cool secret sauce to help fix that. I think the best
architectural thing about Apache is buckets and brigades. Using a list
structure to represent portions of differently-generated content like
that is great. Imagine how much better wordpress would run if PHP
represented the php scripts as some dynamic buckets intermingled
with some static file io buckets (and even repeated when in loops).
There'd be a lot less data to copy around.
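To make that concrete, the existing apr-util bucket API already expresses
that kind of intermingling; a minimal sketch (the PHP-in-buckets idea above
is this pattern applied inside the script engine):

#include <string.h>
#include "apr_buckets.h"
#include "apr_file_io.h"

static apr_bucket_brigade *build_response(apr_pool_t *pool, apr_file_t *css)
{
    apr_bucket_alloc_t *ba = apr_bucket_alloc_create(pool);
    apr_bucket_brigade *bb = apr_brigade_create(pool, ba);
    apr_bucket *b;
    apr_off_t len = 0;
    const char *hdr = "<html><head><title>generated</title></head>";

    /* Dynamic piece: a generated fragment, copied into a heap bucket. */
    b = apr_bucket_heap_create(hdr, strlen(hdr), NULL, ba);
    APR_BRIGADE_INSERT_TAIL(bb, b);

    /* Static piece: the file is referenced, not copied, so the core
     * output filter can still push it out with sendfile. */
    apr_file_seek(css, APR_END, &len);          /* len now holds the size */
    b = apr_bucket_file_create(css, 0, (apr_size_t)len, pool, ba);
    APR_BRIGADE_INSERT_TAIL(bb, b);

    b = apr_bucket_eos_create(ba);
    APR_BRIGADE_INSERT_TAIL(bb, b);
    return bb;
}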

Now imagine a backend that could identify the dynamic buckets and, by
forbidding side effects, parallelise work on them - a bucket as a
message in a message-passing system of co-routines, for example. Imagine
that in turn feeding into a set of co-routine filters. That's
fundamentally different - it parallelises content generation, but it's
really really hard to do in C.

Then next, imagine a backend that could identify the static buckets
and re-order them so that they come first - it could understand things
like XML and Javascript and intelligently front-load your transfer so
that the content we have ready goes first, while the dynamic stuff is
being built. It's a real layer-8-aware scheduler and content
re-compiler. Again it's really really hard to do in C - but imagine
the benefits of a server layer that really understood how to model and
re-order content.

These are the kinds of transform that make a webserver's job as optimal
as it can be. Network data is the most expensive part of any modern
web application, in terms of both time and money, so the ecosystem
faces huge economic pressure to make these as optimal as possible over
time. Things like SPDY are just the first generation.

It'd be cool if Apache 3.0 could do those things - we have some great
building blocks and experience - but it feels like a language with
support for first-order functions and co-routines would be better at
it.

Again, I'm just thinking out loud :)

-- 
Colm

Re: 3.0, the 2011 thread.

Posted by Johannes Roith <jo...@jroith.de>.
On Thu, Jun 16, 2011 at 6:10 PM, William A. Rowe Jr.
<wr...@rowe-clan.net> wrote:
> On 6/16/2011 4:18 AM, bswen wrote:
>>
>> I think the only major problem of httpd is its "one thread per connection" I/O model. It's an inherently unscalable design. Httpd-3.0 will be meaningless if it keeps this I/O design.
>
> That is no longer its design; it is now "one thread per request".
> Long lived requests still pose a challenge.

What I really miss is an efficient way to push server events (through
websockets, long-polling, chunked encoding or
multipart/x-mixed-replace responses). In particular, many use cases
could probably be addressed if a (long-running) request could be held
and managed by the server without tying up a thread, just like
connections are by the event mpm, and if modules had an easy way to
send/append a message/chunk whenever they decide to do so.

Re: 3.0, the 2011 thread.

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Thursday 16 June 2011, William A. Rowe Jr. wrote:
> On 6/16/2011 4:18 AM, bswen wrote:
> > I think the only major problem of httpd is its "one thread per
> > connection" I/O model. It's an inherently unscalable design.
> > Httpd-3.0 will be meaningless if it keeps this I/O design.
> 
> That is no longer its design; it is now "one thread per request".
> 
> Intra-connection mechanics are now handled in an event loop when
> using the event mpm, and this has worked well in real life.  Some
> modules which manipulate the connection or make connection-scope
> assumptions about threading will require the user to stay on
> worker or prefork, of course.
> 
> Long lived requests still pose a challenge.

I think the more urgent challenge lies with ssl, for which we still 
have the one-thread-per-connection principle.

Re: 3.0, the 2011 thread.

Posted by "William A. Rowe Jr." <wr...@rowe-clan.net>.
On 6/16/2011 4:18 AM, bswen wrote:
> 
> I think the only major problem of httpd is its "one thread per connection" I/O model. It's an inherently unscalable design. Httpd-3.0 will be meaningless if it keeps this I/O design.

That is no longer its design; it is now "one thread per request".

Intra-connection mechanics are now handled in an event loop when using
the event mpm, and this has worked well in real life.  Some modules which
manipulate the connection or make connection-scope assumptions about
threading will require the user to stay on worker or prefork, of course.

Long lived requests still pose a challenge.

> it's quite possible to use only a few threads (one per CPU core) to handle tens of thousands of (slow) connections.
> 
> An optimal network I/O model needs a scheduling layer that maps a *request* (not a connection) to a thread, so a worker thread can be scheduled to handle different connections, instead of being tied up entirely with a single connection for the connection's whole lifetime. We hope such a layer would unify the upper interface over event I/O, Windows I/O completion ports, and many other async I/O mechanisms. With luck and careful design, the current filtered I/O chain and the module API can remain the same.
> 
> That might serve as a good target for the httpd-3.x releases?

RE: 3.0, the 2011 thread.

Posted by bswen <bs...@pku.edu.cn>.
Paul Querna [mailto:paul@querna.org] sent on Thursday, June 16, 2011 6:02 AM
>
> I think we have all joked on and off about 3.0 for... well about 8 years now.
> ...
> The problem is our process model, and our module APIs.
>
> The Event MPM was a valiant effort in some ways, but mod_ssl and other filters will always block
> ...
> libuv: Portable, fast, Network IO. (IOCP programming model, brought to Unix)
>  <https://github.com/joyent/libuv>
> ...

I think the only major problem of httpd is its "one thread per connection" I/O model. It's an inherently unscalable design. Httpd-3.0 will be meaningless if it keeps this I/O design.

As we discussed a long time ago,
(e.g., 
	Sent: Sunday, September 21, 2008 2:17 PM
	Subject: Re: Future direction of MPMs, was Re: svn commit: r697357 - in /httpd/httpd/trunk: include/ modules/http/ modules/test/ server/ server/mpm/experimental/event/

	Sent: Sunday, August 31, 2008 9:49 PM
	Subject: Re: [community] 2.3.0 alpha on October 1?
)
it's quite possible to use only a few threads (one per CPU core) to handle tens of thousands of (slow) connections.

An optimal network I/O model needs a scheduling layer that maps a *request* (not a connection) to a thread, so a worker thread can be scheduled to handle different connections, instead of being tied up entirely with a single connection for the connection's whole lifetime. We hope such a layer would unify the upper interface over event I/O, Windows I/O completion ports, and many other async I/O mechanisms. With luck and careful design, the current filtered I/O chain and the module API can remain the same.
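Purely as an illustration of that layer's shape, a hypothetical interface
(none of these names exist in httpd; they are only here to show the idea):

/* A connection is owned by one event layer (epoll/kqueue/IOCP underneath);
 * requests, not connections, are what get scheduled onto worker threads. */
typedef struct sched_conn sched_conn_t;
typedef struct sched_req  sched_req_t;

/* Invoked on an event-loop thread once a complete request has been read. */
typedef void (*sched_req_ready_cb)(sched_req_t *req, void *baton);

/* The event layer watches the socket; nothing here may block. */
int sched_conn_watch(sched_conn_t *conn, sched_req_ready_cb on_ready,
                     void *baton);

/* Hand the request to any idle worker thread.  The connection itself stays
 * with the event layer, so a slow or idle client never pins a worker for
 * the connection's whole lifetime. */
int sched_req_dispatch(sched_req_t *req);

/* When the handler has produced its response, ownership returns to the
 * event layer, which writes it out as the socket (or completion port)
 * allows; the filtered I/O chain could stay as the hand-off format. */
int sched_req_complete(sched_req_t *req, void *response_brigade);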

That might serve as a good target for the httpd-3.x releases?

Regards,
Bing




Re: 3.0, the 2011 thread.

Posted by Graham Leggett <mi...@sharp.fm>.
On 17 Jun 2011, at 6:14 PM, Paul Querna wrote:

>> - Existing APIs in unix and windows really really suck at non  
>> blocking
>> behaviour. Standard APR file handling couldn't do it, so we  
>> couldn't use it
>> properly. DNS libraries are really terrible at it. The vast  
>> majority of
>> "async" DNS libraries are just hidden threads which wrap attempts  
>> to make
>> blocking calls, which in turn means unknown resource limits are hit  
>> when you
>> least expect it. Database and LDAP calls are blocking. What this  
>> means
>> practically is that you can't link to most software out there.
>
>
> Yes.
>
> Don't use the existing APIs.
>
> Use libuv for IO.
>
> Use c-ares for DNS.
>
> Don't use LDAP and Databases in the Event Loop;  Not all content
> generation needs to be in the main event loop, but lots of content
> generation and handling of clients should be.

This is where the premise falls down. You can't advertise yourself as  
a generally extensible webserver, and then tell everybody that the  
only libraries they are allowed to use come from a tiny exclusive list.

People who extend httpd will use whatever library is most convenient  
to them, and when their server becomes unstable, they will quite  
rightly blame httpd, they won't blame their own code. There has been  
no shortage of other projects learning this lesson over the last ten  
years.

> You are confusing the 'core' network IO model with fault isolation.
> The Worker MPM has actually been quite good on most platforms for the
> last decade.   There is little reason to use prefork anymore.

In our experience, prefork is still the basis for our dynamic code  
servers. As a media organisation we experience massive thundering  
herds, and so fault isolation for us is a big deal. We certainly don't  
prefork exclusively, just where we need it, but it remains our  
required lowest common denominator.

With load balancers in front of httpd to handle massive concurrent  
connections, having massive concurrent connections in httpd isn't  
necessary for us. We pipe the requests from the load balancers down a  
modest number of parallel keepalive connections, keeping concurrent  
connections to a sane level. Modern multi-core hardware is really good
at this sort of stuff.

Obviously, one size doesn't fit all, which is why we have mpms.

Yes, an event loop in the core will be an awesome thing to have, but  
we need the option to retain both prefork and worker behaviour, and it  
has to be designed very carefully so that we remain good at being  
reliable.

> Should we run PHP inside the core event loop?  Hell no.

Will people who extend our code try to run PHP inside the event loop?  
Hell yes, and this is where the problem lies. We need to design our  
server around what our users will do. It's no use berating users  
afterwards for code they choose to write in good faith.

> I think as Stefan alludes to, there is a reasonable middle ground where
> network IO is done well in an Event loop, but we can still maintain
> easy extendability, with some multi-process and multi-thread systems
> for content generators that have their own needs, like file io.
>
> But certain things in the core, like SSL, must be done right, and done
> in an evented way.  It'll be hard, but we are programmers after all
> aren't we?

I don't understand the problem with SSL - openssl supports  
asynchronous io, we just need to use it.

Regards,
Graham
--


Re: 3.0, the 2011 thread.

Posted by Paul Querna <pa...@querna.org>.
On Wed, Jun 15, 2011 at 4:33 PM, Graham Leggett <mi...@sharp.fm> wrote:
> On 16 Jun 2011, at 12:01 AM, Paul Querna wrote:
>
>> I think we have all joked on and off about 3.0 for... well about 8 years
>> now.
>>
>> I think we are nearing the point we might actually need to be serious
>> about it.
>>
>> The web has changed.
>>
>> SPDY is coming down the pipe pretty quickly.
>>
>> WebSockets might actually be standardized this year.
>>
>> Two protocols which HTTPD is unable to be good at. Ever.
>>
>> The problem is our process model, and our module APIs.
>
> I am not convinced.
>
> Over the last three years, I have developed a low level stream serving
> system that we use to disseminate diagnostic data across datacentres, and
> one of the basic design decisions was that  it was to be lock free and event
> driven, because above all it needed to be fast. The event driven stuff was
> done properly, based on religious application of the following rule:
>
> "Thou shalt not attempt any single read or write without the event loop
> giving you permission to do that single read or write first. Not a single
> attempt, ever."
>
> From that effort I've learned the following:
>
> - Existing APIs in unix and windows really really suck at non blocking
> behaviour. Standard APR file handling couldn't do it, so we couldn't use it
> properly. DNS libraries are really terrible at it. The vast majority of
> "async" DNS libraries are just hidden threads which wrap attempts to make
> blocking calls, which in turn means unknown resource limits are hit when you
> least expect it. Database and LDAP calls are blocking. What this means
> practically is that you can't link to most software out there.


Yes.

Don't use the existing APIs.

Use libuv for IO.

Use c-ares for DNS.

Don't use LDAP and Databases in the Event Loop;  Not all content
generation needs to be in the main event loop, but lots of content
generation and handling of clients should be.
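To make that concrete, a minimal sketch of libuv's completion-callback
model, written against the current libuv 1.x API (which differs from the
2011-era API; error handling trimmed):

#include <stdlib.h>
#include <uv.h>

static void on_close(uv_handle_t *handle)
{
    free(handle);
}

static void on_alloc(uv_handle_t *handle, size_t suggested, uv_buf_t *buf)
{
    buf->base = malloc(suggested);
    buf->len  = suggested;
}

static void on_read(uv_stream_t *client, ssize_t nread, const uv_buf_t *buf)
{
    if (nread > 0) {
        /* This is where the bytes would be fed to something like
         * http-parser, entirely inside the loop thread. */
    } else if (nread < 0) {
        uv_close((uv_handle_t *)client, on_close);    /* EOF or error */
    }
    free(buf->base);
}

static void on_connection(uv_stream_t *server, int status)
{
    uv_tcp_t *client;

    if (status < 0)
        return;
    client = malloc(sizeof(*client));
    uv_tcp_init(server->loop, client);
    if (uv_accept(server, (uv_stream_t *)client) == 0)
        uv_read_start((uv_stream_t *)client, on_alloc, on_read);
    else
        uv_close((uv_handle_t *)client, on_close);
}

int main(void)
{
    uv_loop_t *loop = uv_default_loop();
    uv_tcp_t server;
    struct sockaddr_in addr;

    uv_tcp_init(loop, &server);
    uv_ip4_addr("0.0.0.0", 8080, &addr);
    uv_tcp_bind(&server, (const struct sockaddr *)&addr, 0);
    uv_listen((uv_stream_t *)&server, 128, on_connection);

    /* One loop, callbacks only: the IOCP model, brought to Unix. */
    return uv_run(loop, UV_RUN_DEFAULT);
}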

> - You cannot block, ever. Think you can cheat and just make a cheeky attempt
> to load that file quickly while nobody is looking? Your hard disk spins
> down, your network drive is slow for whatever reason, and your entire server
> stops dead in its tracks. We see this choppy behaviour in poorly written
> user interface code, we see the same choppy behaviour in cheating event
> driven webservers.

Node.js doesn't cheat.  It works fine.  It's not that hard to ... not
do file IO in the event loop thread.

> - You have zero room for error. Not a single mistake can be tolerated. One
> foot wrong, the event loop spins. Step one foot wrong the other way, and
> your task you were doing evaporates. Finding these problems is painful, and
> your server is unstable until you do.

This sounds like an implementation problem.  This is not a problem in Node.js.

> - You have to handle every single possible error condition. Every single
> one. Miss one? You suddenly drop out of an event handler, and your event
> loop spins, or the request becomes abandoned. You have no room for error at
> all.

I'm not suggesting the whole thing is trivial, but how is this worse
than our current situation?

> We have made our event driven code work because it does a number of very
> simple and small things, and it's designed to do these simple and small
> things well, and we want it to be as compact and fast as humanly possible,
> given that datacentre footprint is our primary constraint.
>
> Our system is like a sportscar, it's fast, but it breaks down if we break
> the rules. But for us, we are prepared to abide by the rules to achieve the
> speed we need.
>
> Let's contrast this with a web server.
>
> Webservers are traditionally fluid beasts, that have been and continue to be
> moulded and shaped that way through many many ever changing requirements
> from webmasters. They have been made modular and extensible, and those
> modules and extensions are written by people with different programming
> ability, to different levels of tolerances, within very different budget
> constraints.
>
> Simply put, webservers need to tolerate error. They need to be built like
> tractors.
>
> Unreliable code? We have to work despite that. Unhandled error conditions?
> We have to work despite that. Code that was written in a hurry on a budget?
> We have to work despite that.

You are confusing the 'core' network IO model with fault isolation.
The Worker MPM has actually been quite good on most platforms for the
last decade.   There is little reason to use prefork anymore.

Should we run PHP inside the core event loop?  Hell no.

We can build reasonable fault isolation for modules that wish to have
it, probably even do it by default, and if a module 'opts' in, or
maybe there are different APIs, it gets to run in the Event Loop.

> Are we going to be sexy? Of course not. But while the sportscar is broken
> down at the side of the road, the tractor just keeps going.
>
> Why does our incredibly unsexy architecture help webmasters? Because prefork
> is bulletproof. Leak, crash, explode, hang, the parent will clean up after
> us. Whatever we do, within reason, doesn't affect the process next door. If
> things get really dire, we're delayed for a while, and we recover when the
> problems pass. Does the server die? Pretty much never. What if we trust our
> code? Well, worker may help us. Crashes do affect the request next door, but
> if they're rare enough we can tolerate it. The event mpm? It isn't truly an
> event mpm, it is rather more efficient when it comes to keepalives and
> waiting for connections, where we hand this problem to an event loop that
> doesn't run anyone else's code within it, so we're still reliable despite
> the need for a higher standard of code accuracy.

Wait, so you are saying the only valid configuration of httpd is
prefork?  This doesn't match at all how I've been using httpd for the
last 5 years.

> If you've ever been in a situation where a company demands more speed out of
> a webserver, wait until you sacrifice reliability giving them the speed.
> Suddenly they don't care about the speed, reliability becomes top priority
> again, as it should be.
>
> So, to get round to my point. If we decide to relook at the architecture of
> v3.0, we should be careful to ensure that we don't stop offering a "tractor
> mode", as this mode is our killer feature.. There are enough webservers out
> there that try to be event driven and sexy, and then fall over on
> reliability. Or alternatively, there are webservers out there that try to be
> event driven and sexy, and succeed at doing so because they keep their
> feature set modest, keep extensibility to a minimum and avoid touching
> blocking calls to disks and other blocking devices. Great for load
> balancers, not so great for anything else.
>
> Apache httpd has always had at its heart the ability to be practically
> extensible, while remaining reliable, and I think we should continue to do
> that.

I think as Stefan alludes to, there is a reasonable middle ground where
network IO is done well in an Event loop, but we can still maintain
easy extendability, with some multi-process and multi-thread systems
for content generators that have their own needs, like file io.

But certain things in the core, like SSL, must be done right, and done
in an evented way.  It'll be hard, but we are programmers after all
aren't we?

Re: 3.0, the 2011 thread.

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> 2011/6/18 Igor Galić <i....@brainsware.org>:
> >
> >
> > ----- Original Message -----
> >> On Friday 17 June 2011, Graham Leggett wrote:
> >> > We used openssl to make our non blocking event driven stuff
> >> > work,
> >> > and it works really well (once you've properly handled
> >> > SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE). There is no
> >> > reason
> >> > I can see that would stop us using openssl to be async in httpd,
> >> > we just need to refactor the mod_ssl code to actually do it.
> >>
> >> Someone (Paul?) once told me that openssl is not very good when it
> >> comes to async access to the session cache (which could need
> >> network
> >> io if using memcached), CRLs (which I could imagine to reside in
> >> LDAP)
> >> and similar things. But this would have to be evaluated.
> >>
> >> > The tricky part with event driven code is the really bad support
> >> > for   event driven file access. We used libev as our core event
> >> > loop, which does a significant amount of work to make files and
> >> > sockets work the same way in the event loop as best it can in a
> >> > portable way. Don't know of any other event loop that does this.
> >> > It's difficult trying to do the event driven thing if you
> >> > intersperse event driven socket handling with blocking file
> >> > handling, you end up with many requests blocked by an unrelated
> >> > system call.
> >>
> >> Yes, I guess we would need a pool of lightweight worker threads
> >> that
> >> does the file io (especially sendfile). Those threads would
> >> probably
> >> get by with very small stack sizes and use little resources. If
> >> the
> >> event library we choose already has this built in, we can of
> >> course
> >> use that, too.
> >
> > This kind of reminds me of the architecture of Apache Traffic
> > Server
> > see http://www.slideshare.net/zwoop/rit-2011-ats
> >
> 
> Yes, ATS is the right model I think, but I don't think the... C++isms
> and general baggage that come along with it are ideal for httpd
> though?

I was merely referring to the architecture,
not its implementation :)

i

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/

Re: 3.0, the 2011 thread.

Posted by Paul Querna <pa...@querna.org>.
2011/6/18 Igor Galić <i....@brainsware.org>:
>
>
> ----- Original Message -----
>> On Friday 17 June 2011, Graham Leggett wrote:
>> > We used openssl to make our non blocking event driven stuff work,
>> > and it works really well (once you've properly handled
>> > SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE). There is no reason
>> > I can see that would stop us using openssl to be async in httpd,
>> > we just need to refactor the mod_ssl code to actually do it.
>>
>> Someone (Paul?) once told me that openssl is not very good when it
>> comes to async access to the session cache (which could need network
>> io if using memcached), CRLs (which I could imagine to reside in
>> LDAP)
>> and similar things. But this would have to be evaluated.
>>
>> > The tricky part with event driven code is the really bad support
>> > for   event driven file access. We used libev as our core event
>> > loop, which does a significant amount of work to make files and
>> > sockets work the same way in the event loop as best it can in a
>> > portable way. Don't know of any other event loop that does this.
>> > It's difficult trying to do the event driven thing if you
>> > intersperse event driven socket handling with blocking file
>> > handling, you end up with many requests blocked by an unrelated
>> > system call.
>>
>> Yes, I guess we would need a pool of lightweight worker threads that
>> does the file io (especially sendfile). Those threads would probably
>> get by with very small stack sizes and use little resources. If the
>> event library we choose already has this built in, we can of course
>> use that, too.
>
> This kind of reminds me of the architecture of Apache Traffic Server
> see http://www.slideshare.net/zwoop/rit-2011-ats
>

Yes, ATS is the right model I think, but I don't think the... C++isms
and general baggage that come along with it are ideal for httpd
though?

Re: 3.0, the 2011 thread.

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> On Friday 17 June 2011, Graham Leggett wrote:
> > We used openssl to make our non blocking event driven stuff work,
> > and it works really well (once you've properly handled
> > SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE). There is no reason
> > I can see that would stop us using openssl to be async in httpd,
> > we just need to refactor the mod_ssl code to actually do it.
> 
> Someone (Paul?) once told me that openssl is not very good when it
> comes to async access to the session cache (which could need network
> io if using memcached), CRLs (which I could imagine to reside in
> LDAP)
> and similar things. But this would have to be evaluated.
> 
> > The tricky part with event driven code is the really bad support
> > for   event driven file access. We used libev as our core event
> > loop, which does a significant amount of work to make files and
> > sockets work the same way in the event loop as best it can in a
> > portable way. Don't know of any other event loop that does this.
> > It's difficult trying to do the event driven thing if you
> > intersperse event driven socket handling with blocking file
> > handling, you end up with many requests blocked by an unrelated
> > system call.
> 
> Yes, I guess we would need a pool of lightweight worker threads that
> does the file io (especially sendfile). Those threads would probably
> get by with very small stack sizes and use little resources. If the
> event library we choose already has this built in, we can of course
> use that, too.

This kind of reminds me of the architecture of Apache Traffic Server
see http://www.slideshare.net/zwoop/rit-2011-ats


i

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/

Re: 3.0, the 2011 thread.

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Friday 17 June 2011, Graham Leggett wrote:
> We used openssl to make our non blocking event driven stuff work,
> and it works really well (once you've properly handled
> SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE). There is no reason
> I can see that would stop us using openssl to be async in httpd,
> we just need to refactor the mod_ssl code to actually do it.

Someone (Paul?) once told me that openssl is not very good when it 
comes to async access to the session cache (which could need network 
io if using memcached), CRLs (which I could imagine to reside in LDAP) 
and similar things. But this would have to be evaluated.

> The tricky part with event driven code is the really bad support
> for   event driven file access. We used libev as our core event
> loop, which does a significant amount of work to make files and
> sockets work the same way in the event loop as best it can in a
> portable way. Don't know of any other event loop that does this.
> It's difficult trying to do the event driven thing if you
> intersperse event driven socket handling with blocking file
> handling, you end up with many requests blocked by an unrelated
> system call.

Yes, I guess we would need a pool of lightweight worker threads that 
does the file io (especially sendfile). Those threads would probably 
get by with very small stack sizes and use little resources. If the 
event library we choose already has this built in, we can of course 
use that, too.
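As a sketch of that hand-off, here is a blocking file read pushed onto a
worker thread and completed back on the loop, using libuv's built-in
thread pool (uv_queue_work, libuv 1.x API) as a stand-in for whatever pool
we would actually build:

#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <uv.h>

typedef struct {
    uv_work_t   req;            /* must outlive the queued work item */
    const char *path;
    char       *buf;
    ssize_t     nread;
} file_job_t;

/* Runs on a worker thread: blocking calls are fine here. */
static void read_file(uv_work_t *req)
{
    file_job_t *job = req->data;
    int fd = open(job->path, O_RDONLY);

    job->buf = malloc(65536);
    job->nread = (fd >= 0) ? read(fd, job->buf, 65536) : -1;
    if (fd >= 0)
        close(fd);
}

/* Runs back on the event-loop thread: hand the bytes to the connection. */
static void read_done(uv_work_t *req, int status)
{
    file_job_t *job = req->data;

    /* ... append job->buf to the client's output here ... */
    free(job->buf);
    free(job);
}

void start_file_read(uv_loop_t *loop, const char *path)
{
    file_job_t *job = calloc(1, sizeof(*job));

    job->path = path;
    job->req.data = job;
    uv_queue_work(loop, &job->req, read_file, read_done);
}

(libuv also routes its own uv_fs_* calls, including uv_fs_sendfile,
through this same pool.)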

Re: 3.0, the 2011 thread.

Posted by Graham Leggett <mi...@sharp.fm>.
On 16 Jun 2011, at 10:27 AM, Stefan Fritsch wrote:

> I mostly agree with Graham. I propose a hybrid approach. Make the  
> MPM and the network/connection filters (this includes ssl) event  
> driven and keep the request handling based on threads and workers.

We used openssl to make our non blocking event driven stuff work, and  
it works really well (once you've properly handled SSL_ERROR_WANT_READ  
and SSL_ERROR_WANT_WRITE). There is no reason I can see that would  
stop us using openssl to be async in httpd, we just need to refactor  
the mod_ssl code to actually do it.
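That handling boils down to something like the following minimal sketch
(standard OpenSSL calls, error handling trimmed): the SSL object sits on a
non-blocking socket, and every call may ask to be retried once the event
loop reports readability or writability.

#include <openssl/ssl.h>

enum want { WANT_NOTHING, WANT_READABLE, WANT_WRITABLE, WANT_CLOSE };

/* Called from the event loop when the underlying fd is ready. */
static enum want try_ssl_read(SSL *ssl, char *buf, int len, int *nread)
{
    int n = SSL_read(ssl, buf, len);

    if (n > 0) {
        *nread = n;
        return WANT_NOTHING;              /* got data, carry on */
    }

    switch (SSL_get_error(ssl, n)) {
    case SSL_ERROR_WANT_READ:
        return WANT_READABLE;             /* re-arm for readability */
    case SSL_ERROR_WANT_WRITE:
        return WANT_WRITABLE;             /* a read can wait on write
                                             during renegotiation */
    case SSL_ERROR_ZERO_RETURN:           /* clean TLS shutdown */
    default:
        return WANT_CLOSE;
    }
}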

The tricky part with event driven code is the really bad support for  
event driven file access. We used libev as our core event loop, which  
does a significant amount of work to make files and sockets work the  
same way in the event loop as best it can in a portable way. Don't  
know of any other event loop that does this. It's difficult trying to  
do the event driven thing if you intersperse event driven socket  
handling with blocking file handling, you end up with many requests  
blocked by an unrelated system call.

One needs to keep a clear separation between "we're event driven and  
non blocking" code and "we're now in worker mode, and are allowed to  
block" code.

Regards,
Graham
--


Re: 3.0, the 2011 thread.

Posted by Stefan Fritsch <sf...@sfritsch.de>.
On Thu, 16 Jun 2011, Graham Leggett wrote:

> On 16 Jun 2011, at 12:01 AM, Paul Querna wrote:
>> The problem is our process model, and our module APIs.


> Apache httpd has always had at its heart the ability to be practically 
> extensible, while remaining reliable, and I think we should continue to do 
> that.

I mostly agree with Graham. I propose a hybrid approach. Make the MPM and 
the network/connection filters (this includes ssl) event driven and keep 
the request handling based on threads and workers. This would make the 
architecture change more incremental, maybe allow some moderate API 
compatibility for a significant part of the modules out there, make it 
easier to write modules and to use normal non-event driven third party 
libraries. As an option, it should also be possible to write event-driven 
request handlers (proxy CONNECT handling would be the first candidate for 
that).

Maybe it would even be possible to use pocore/libuv/whatever in the async 
part and still make it create an apr request pool in the non-async part 
for compatibility.

Re: 3.0, the 2011 thread.

Posted by Graham Leggett <mi...@sharp.fm>.
On 16 Jun 2011, at 12:01 AM, Paul Querna wrote:

> I think we have all joked on and off about 3.0 for... well about 8  
> years now.
>
> I think we are nearing the point we might actually need to be  
> serious about it.
>
> The web has changed.
>
> SPDY is coming down the pipe pretty quickly.
>
> WebSockets might actually be standardized this year.
>
> Two protocols which HTTPD is unable to be good at. Ever.
>
> The problem is our process model, and our module APIs.

I am not convinced.

Over the last three years, I have developed a low level stream serving  
system that we use to disseminate diagnostic data across datacentres,  
and one of the basic design decisions was that  it was to be lock free  
and event driven, because above all it needed to be fast. The event  
driven stuff was done properly, based on religious application of the  
following rule:

"Thou shalt not attempt any single read or write without the event  
loop giving you permission to do that single read or write first. Not  
a single attempt, ever."

From that effort I've learned the following:

- Existing APIs in unix and windows really really suck at non blocking  
behaviour. Standard APR file handling couldn't do it, so we couldn't  
use it properly. DNS libraries are really terrible at it. The vast  
majority of "async" DNS libraries are just hidden threads which wrap  
attempts to make blocking calls, which in turn means unknown resource  
limits are hit when you least expect it. Database and LDAP calls are  
blocking. What this means practically is that you can't link to most  
software out there.

- You cannot block, ever. Think you can cheat and just make a cheeky  
attempt to load that file quickly while nobody is looking? Your hard  
disk spins down, your network drive is slow for whatever reason, and  
your entire server stops dead in its tracks. We see this choppy  
behaviour in poorly written user interface code, we see the same  
choppy behaviour in cheating event driven webservers.

- You have zero room for error. Not a single mistake can be tolerated.  
One foot wrong, the event loop spins. Step one foot wrong the other  
way, and your task you were doing evaporates. Finding these problems  
is painful, and your server is unstable until you do.

- You have to handle every single possible error condition. Every  
single one. Miss one? You suddenly drop out of an event handler, and  
your event loop spins, or the request becomes abandoned. You have no  
room for error at all.

We have made our event driven code work because it does a number of  
very simple and small things, and it's designed to do these simple and  
small things well, and we want it to be as compact and fast as humanly  
possible, given that datacentre footprint is our primary constraint.

Our system is like a sportscar, it's fast, but it breaks down if we  
break the rules. But for us, we are prepared to abide by the rules to  
achieve the speed we need.

Let's contrast this with a web server.

Webservers are traditionally fluid beasts, that have been and continue  
to be moulded and shaped that way through many many ever changing  
requirements from webmasters. They have been made modular and  
extensible, and those modules and extensions are written by people  
with different programming ability, to different levels of tolerances,  
within very different budget constraints.

Simply put, webservers need to tolerate error. They need to be built  
like tractors.

Unreliable code? We have to work despite that. Unhandled error  
conditions? We have to work despite that. Code that was written in a  
hurry on a budget? We have to work despite that.

Are we going to be sexy? Of course not. But while the sportscar is  
broken down at the side of the road, the tractor just keeps going.

Why does our incredibly unsexy architecture help webmasters? Because  
prefork is bulletproof. Leak, crash, explode, hang, the parent will  
clean up after us. Whatever we do, within reason, doesn't affect the  
process next door. If things get really dire, we're delayed for a  
while, and we recover when the problems pass. Does the server die?  
Pretty much never. What if we trust our code? Well, worker may help  
us. Crashes do affect the request next door, but if they're rare  
enough we can tolerate it. The event mpm? It isn't truly an event mpm,  
it is rather more efficient when it comes to keepalives and waiting  
for connections, where we hand this problem to an event loop that  
doesn't run anyone else's code within it, so we're still reliable  
despite the need for a higher standard of code accuracy.

If you've ever been in a situation where a company demands more speed  
out of a webserver, wait until you sacrifice reliability giving them  
the speed. Suddenly they don't care about the speed, reliability  
becomes top priority again, as it should be.

So, to get round to my point. If we decide to relook at the  
architecture of v3.0, we should be careful to ensure that we don't  
stop offering a "tractor mode", as this mode is our killer feature..  
There are enough webservers out there that try to be event driven and  
sexy, and then fall over on reliability. Or alternatively, there are  
webservers out there that try to be event driven and sexy, and succeed  
at doing so because they keep their feature set modest, keep  
extensibility to a minimum and avoid touching blocking calls to disks  
and other blocking devices. Great for load balancers, not so great for  
anything else.

Apache httpd has always had at its heart the ability to be  
practically extensible, while remaining reliable, and I think we  
should continue to do that.

Regards,
Graham
--


Re: 3.0, the 2011 thread.

Posted by Jim Jagielski <ji...@jaguNET.com>.
Regardless of everything else, we *for sure* look at
alternatives for some of what we're using now... I
really, really, REALLY like the pocore-version of APR,
and that would be a relatively quick and easy improvement.

On Jun 15, 2011, at 6:01 PM, Paul Querna wrote:

> I think we have all joked on and off about 3.0 for... well about 8 years now.
> 


Re: 3.0, the 2011 thread.

Posted by Tim Bannister <is...@jellybaby.net>.
On 15 Jun 2011, at 23:01, Paul Querna wrote:

> I think we have all joked on and off about 3.0 for... well about 8 years now.
> 
> I think we are nearing the point we might actually need to be serious about it.
…
> If we don't, I'm sure others in the web server market will continue to gain market share.

That's true, perhaps. But this is going to look again like a young, innovation-stage market. It's not so easy to look at potential features and back a winner.
I think the httpd developers should be wary of changing too much. If market share moves because of the quality of the coding, that would suggest a missed opportunity. But if market share switches because of significant technological change, then I'd look more at accepting this. And focusing on serving the many people who want httpd to work much as it does today, because they have already adopted it and come to rely on it.

-- 
Tim Bannister - +44 7980408788 - isoma@jellybaby.net


Re: 3.0, the 2011 thread.

Posted by Paul Querna <pa...@querna.org>.
On Wed, Jun 15, 2011 at 3:26 PM, Akins, Brian <Br...@turner.com> wrote:
> On 6/15/11 6:01 PM, "Paul Querna" <pa...@querna.org> wrote:
>
>> pocore: For base OS portability and memory pooling system.
>>   <http://code.google.com/p/pocore/>
>
> How does this compare to APR?

It's like an APR version 3.0.

It has a faster pools system, with the ability to free() items, and it
drops all of the apr-util-isms like databases, ldap, etc.
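For contrast, a minimal sketch of the APR model being compared against
(standard APR API): allocations are cheap, but nothing can be handed back
before the pool or a subpool is destroyed, which is the gap pocore's
per-allocation free() is meant to close.

#include <apr_general.h>
#include <apr_pools.h>

void pool_demo(void)
{
    apr_pool_t *pool;
    char *scratch;

    apr_initialize();
    apr_pool_create(&pool, NULL);

    scratch = apr_palloc(pool, 8192);   /* no matching free() exists */
    (void)scratch;

    /* A long-lived process therefore only reclaims memory at clear
     * points, e.g. end-of-request, by destroying a (sub)pool. */
    apr_pool_destroy(pool);
    apr_terminate();
}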

Re: 3.0, the 2011 thread.

Posted by "Akins, Brian" <Br...@turner.com>.
On 6/15/11 6:01 PM, "Paul Querna" <pa...@querna.org> wrote:

> pocore: For base OS portability and memory pooling system.
>   <http://code.google.com/p/pocore/>

How does this compare to APR?

> libuv: Portable, fast, Network IO. (IOCP programming model, brought to Unix)
>   <https://github.com/joyent/libuv>

I've played with it.  It's rough - particularly dealing with memory.

> http-parser: HTTP really broken out to simple callbacks.
>   <https://github.com/ry/http-parser>

I like this one a lot.

> selene: SSL, redone to better support Async IO.
>   <https://github.com/pquerna/selene>

Haven't had a chance.
 

+1 to the idea.  I still like Lua ;) People said I was crazy when I said Lua
should be the config and the runtime - now look at node.js

-- 
Brian Akins