Posted to dev@httpd.apache.org by Paul Querna <ch...@force-elite.com> on 2007/02/14 08:33:27 UTC

3.0 - Proposed Goals

So, I've been kicking around some ideas about where I personally would
like trunk to go for a couple months now.

My personal goals for 3.0:
 - Write some cool stuff, that is fun to hack on.
 - Create an environment that encourages others to contribute. A project
this large cannot and should not be done alone or by a small group. Part
of this means my goal is to meet whatever personal goals others have
for httpd; others might not want cool stuff, or fun stuff, or
they might want something else.

More Product-type Goals:

- Rewrite the Core to be an Async Event state machine and data router.
The core should only route events to protocol modules.  All it handles
is the state machine of connection A is waiting for input, Connection B
is waiting for Disk IO, connection C is being run by another thread,
etc..  This could lead to things like mod_proxy being able to run
completely asynchronously, eliminating many performance issues we see
today.

- Rewrite how Brigades, Buckets and filters work.  Possibly replace them
with other models. I haven't been able to personally consolidate my
thoughts on how to 'fix' filters, but I am sure we can have plenty of long
threads about it :-)

- Break the 1:1 mapping of a worker to a single request.  In trunk (and
2.2) with the Event MPM, we have broken the 1:1 mapping of a single
thread to a connection, the next step is to break up a request into a
series of state changes, which would contain attributes stating if they
require a blocking operation. If any operation could block, we would
assign a worker to handle it (or perhaps use the leader model). Linux
also has some interesting ideas going with Syslets to execute system
calls in an async manner, and this is something I would like to
experiment with:
http://lkml.org/lkml/2007/2/13/142

- Change the meaning of MPMs. The problem with MPMs today is they are
really mostly platform abstractions -- not just abstractions of the
process model itself.  For example, if the Worker MPM was ported to use
the correct windows functions, there is no real reason it could not
replace the winnt MPM.  I believe we should try to move platform
abstractions back into APR or other util functions, and try to have a
single MPM that runs on all multi-threaded platforms.

- Include support for Waka. Roy has less than 1 year to get us an RFC :-)

- Build a cleaner configuration system, enabling runtime
reconfiguration. Today's system basically requires a complete restart of
everything to change configurations.  I would like to move to an
internal DOM like representation of the configuration, but preserve the
current file format as the 'default'. (Modules could easily write an XML
config file format, or pull from LDAP).

- Experiment with embedding scripting languages or something like
Varnish's VCL if and where it makes sense. (Cache Rules, Rewrite Rules,
Require Rules, and the like).

- Experiment with the right way to abstract state machines,
multi-threading, and async IO from module developers who want a 'simple
world view'.  Most modules just want to run a few hooks, or generate
content.  We should preferably make this easier to do than it is today.

- Find a better release model for a 3.0/trunk.  I don't think many
people are happy with how 2.0.x was handled in this respect, but I do
believe we need to release early and often.

- Promote and include an external-process communication method in the
core.  This could be used to communicate with PHP, a JVM, Ruby or many
other things that do not wish to be run inside a highly-threaded and
async core.  The place for large dynamic languages is not in the core
'data router' process. Choices include AJP, FastCGI, or just HTTP.  We
should optionally include a process management framework, to spawn these
as needed, to make configuration for server administrators easier.

- <insert your ideas here>

-Paul
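
As a rough illustration of the first product goal above (a core that only
tracks connection state and routes events), here is a minimal Python sketch.
The state names, event names and the route() function are all hypothetical;
nothing here reflects an actual httpd design, and protocol handling is left
out entirely.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    WAIT_INPUT = auto()   # waiting for bytes from the client
    WAIT_DISK = auto()    # waiting for disk I/O
    RUNNING = auto()      # being run by a worker thread
    DONE = auto()


class Connection:
    def __init__(self, name):
        self.name = name
        self.state = State.WAIT_INPUT


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.WAIT_INPUT, "readable"): State.RUNNING,
    (State.RUNNING, "need_disk"): State.WAIT_DISK,
    (State.WAIT_DISK, "disk_ready"): State.RUNNING,
    (State.RUNNING, "finished"): State.DONE,
}


def route(events, connections):
    """Core loop: only moves connections between states, no protocol logic."""
    queue = deque(events)
    while queue:
        name, event = queue.popleft()
        conn = connections[name]
        key = (conn.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{name}: {event!r} invalid in {conn.state.name}")
        conn.state = TRANSITIONS[key]
    return {name: conn.state for name, conn in connections.items()}


conns = {name: Connection(name) for name in ("A", "B", "C")}
final = route([("B", "readable"), ("B", "need_disk"), ("C", "readable")], conns)
```

After these three events, A is still waiting for input, B is waiting for
disk I/O and C is running, which matches the situation described in the
bullet.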


Re: 3.0 - Proposed Goals

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Feb 14, 2007, at 10:45 AM, Justin Erenkrantz wrote:

> On 2/13/07, Paul Querna <ch...@force-elite.com> wrote:
>> - Rewrite how Brigades, Buckets and filters work.  Possibly replace them
>> with other models. I haven't been able to personally consolidate my
>> thoughts on how to 'fix' filters, but I am sure we can have plenty of long
>> threads about it :-)
>
> The collective design experiences behind serf tell me it's a lot
> easier (and more performant) to follow serf's bucket model.  Remember
> that serf's design came out of everyone's (me, Greg, Cliff, Roy, etc.)
> grief with filters and brigades and such - so I think it represents at
> least a good step in the right direction.
>
> I think httpd's bucket brigade model became overly complex and missed
> some goals.  I really like how Serf standardized on one model for
> 'input' and 'output' - which is a sore point with httpd's filters.
> Serf's buckets themselves are also about as close to Roy's original
> 'onions' model as you'll find anywhere.
>
> For those that haven't seen serf, it lives here now:
> http://code.google.com/p/serf/
>
> So, it'd be nice if Serf would be a starting point - plus, if we
> switched to that, we'd have most of the core design done for 3.x.  --
> justin
>

+1... I hadn't tracked serf for a few months but
I had been thinking the same thing.

Re: 3.0 - Proposed Goals

Posted by Plüm, Rüdiger, VF EITO <ru...@vodafone.com>.

> -----Original Message-----
> From: justin.erenkrantz@gmail.com
>  On behalf of Justin Erenkrantz
> Sent: Wednesday, 14 February 2007 16:45
> To: dev@httpd.apache.org
> Subject: Re: 3.0 - Proposed Goals
> 
> 
> On 2/13/07, Paul Querna <ch...@force-elite.com> wrote:
> > - Rewrite how Brigades, Buckets and filters work.  Possibly replace them
> > with other models. I haven't been able to personally consolidate my
> > thoughts on how to 'fix' filters, but I am sure we can have plenty of long
> > threads about it :-)
> 
> The collective design experiences behind serf tell me it's a lot
> easier (and more performant) to follow serf's bucket model.  Remember
> that serf's design came out of everyone's (me, Greg, Cliff, Roy, etc.)
> grief with filters and brigades and such - so I think it represents at
> least a good step in the right direction.
> 
> I think httpd's bucket brigade model became overly complex and missed
> some goals.  I really like how Serf standardized on one model for
> 'input' and 'output' - which is a sore point with httpd's filters.
> Serf's buckets themselves are also about as close to Roy's original
> 'onions' model as you'll find anywhere.

Maybe we should keep in mind new possibilities like Linux's splice,
vmsplice and tee, which could be useful for quickly transferring files
(local on slow disks or on NFS) to a fast local cache.

> 
> For those that haven't seen serf, it lives here now:
> http://code.google.com/p/serf/
> 
> So, it'd be nice if Serf would be a starting point - plus, if we

Plus, we could integrate serf itself into 3.x to

- Have an HTTP client API available inside the core httpd, or in a module
  delivered with httpd, which has been requested in the past.
- We could use this API to improve the HTTP proxy (the current access method
  with the reversed filter chains seems to me to be some sort of hack).
- Support things like OCSP, which also need an HTTP client API.

Regards

Rüdiger

Re: 3.0 - Proposed Goals

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 2/13/07, Paul Querna <ch...@force-elite.com> wrote:
> - Rewrite how Brigades, Buckets and filters work.  Possibly replace them
> with other models. I haven't been able to personally consolidate my
> thoughts on how to 'fix' filters, but I am sure we can have plenty of long
> threads about it :-)

The collective design experiences behind serf tell me it's a lot
easier (and more performant) to follow serf's bucket model.  Remember
that serf's design came out of everyone's (me, Greg, Cliff, Roy, etc.)
grief with filters and brigades and such - so I think it represents at
least a good step in the right direction.

I think httpd's bucket brigade model became overly complex and missed
some goals.  I really like how Serf standardized on one model for
'input' and 'output' - which is a sore point with httpd's filters.
Serf's buckets themselves are also about as close to Roy's original
'onions' model as you'll find anywhere.

For those that haven't seen serf, it lives here now:
http://code.google.com/p/serf/

So, it'd be nice if Serf would be a starting point - plus, if we
switched to that, we'd have most of the core design done for 3.x.  --
justin
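
To make the "one model for input and output" and "onions" ideas a little
more concrete, here is a minimal Python sketch of a pull-style bucket chain.
It is only an illustration: the class names and the read() signature are
hypothetical and are not serf's actual API.

```python
class Bucket:
    """Hypothetical pull-style bucket: read() returns b"" at end of data."""

    def read(self, max_bytes):
        raise NotImplementedError


class BytesBucket(Bucket):
    """Innermost layer: hands out a fixed byte string in chunks."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def read(self, max_bytes):
        chunk = self._data[self._pos:self._pos + max_bytes]
        self._pos += len(chunk)
        return chunk


class UpperBucket(Bucket):
    """Wraps another bucket and transforms whatever it pulls from it."""

    def __init__(self, inner):
        self._inner = inner

    def read(self, max_bytes):
        return self._inner.read(max_bytes).upper()


class PrefixBucket(Bucket):
    """Emits a fixed prefix first, then the wrapped bucket's data."""

    def __init__(self, prefix, inner):
        self._prefix = BytesBucket(prefix)
        self._inner = inner

    def read(self, max_bytes):
        chunk = self._prefix.read(max_bytes)
        return chunk if chunk else self._inner.read(max_bytes)


def drain(bucket, chunk_size=4):
    """The consumer pulls from the outermost layer until it runs dry."""
    parts = []
    while True:
        chunk = bucket.read(chunk_size)
        if not chunk:
            break
        parts.append(chunk)
    return b"".join(parts)


response = PrefixBucket(b"HTTP/1.1 200 OK\r\n\r\n",
                        UpperBucket(BytesBucket(b"hello")))
result = drain(response)
```

Every layer exposes the same read() call, so the same wrapping works
whether the innermost bucket holds data arriving from a client or data
headed to one.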

Re: 3.0 - Proposed Goals

Posted by Joachim Zobel <jz...@heute-morgen.de>.
On Thursday, 15.02.2007, at 11:51 -0800, Paul Querna wrote:

> XML isn't important.

But validation is. And it would be really nice to have a uniqueness
constraint for the configuration that makes sure certain settings are
only set once. An error message is really preferable to a silent
overwrite.

And an XML schema would be a way to say what is considered a valid
configuration. But I am not sure if this is a good idea.

Sincerely,
Joachim
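
The uniqueness constraint does not strictly need XML. As a minimal Python
sketch (assuming a simplified "Name value" line format, and a hypothetical
set of settings that may appear only once), a loader could reject a repeat
with an error instead of silently overwriting:

```python
class DuplicateSettingError(ValueError):
    pass


# Hypothetical choice of settings that may appear only once.
UNIQUE = {"ServerRoot", "PidFile"}


def load_settings(lines):
    """Parse 'Name value' lines; raise if a unique setting is repeated."""
    settings = {}
    seen_at = {}
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        if name in UNIQUE and name in seen_at:
            raise DuplicateSettingError(
                f"line {lineno}: {name} already set on line {seen_at[name]}"
            )
        seen_at[name] = lineno
        # Non-unique settings simply accumulate their values.
        settings.setdefault(name, []).append(value.strip())
    return settings
```

The error message names both line numbers, which addresses the "why does my
change have no effect" case directly.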



Re: 3.0 - Proposed Goals

Posted by Paul Querna <ch...@force-elite.com>.
Joachim Zobel wrote:
> On Tuesday, 13.02.2007, at 23:33 -0800, Paul Querna wrote:
>> - Build a cleaner configuration system, enabling runtime
>> reconfiguration. Today's system basically requires a complete restart of
>> everything to change configurations.  I would like to move to an
>> internal DOM like representation of the configuration, but preserve the
>> current file format as the 'default'. (Modules could easily write an XML
>> config file format, or pull from LDAP).
> 
> A configuration command line similar to a database's command-line
> interface would be nice to have.
> 
> The good thing with an XML configuration would be, that you could
> validate it against a schema. Especially certain settings could be
> unique. This would put an end to wondering why a conf. change has no
> effect for an hour until discovering that the same value is set again
> somewhere else in the config.
> 
> An XML command line however ... :)

XML isn't important.

What is important is the internal API of how we represent the configuration.

XML, or the current config format, or something completely different,
are just INPUTS to the API for configuration......

-Paul
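
A minimal Python sketch of that idea: two different input formats feeding
the same internal tree, with consumers only ever touching one lookup call.
The Node type, the lookup() function and the XML layout are all made up for
illustration, and nested sections are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical internal config node: directive name, args, children."""
    name: str
    args: list = field(default_factory=list)
    children: list = field(default_factory=list)


def from_text(text):
    """Input 1: flat 'Name arg arg' lines."""
    root = Node("root")
    for line in text.splitlines():
        parts = line.split()
        if parts and not parts[0].startswith("#"):
            root.children.append(Node(parts[0], parts[1:]))
    return root


def from_xml(text):
    """Input 2: <config><Name>arg arg</Name>...</config> (made-up layout)."""
    root = Node("root")
    for elem in ET.fromstring(text):
        root.children.append(Node(elem.tag, (elem.text or "").split()))
    return root


def lookup(tree, name):
    """The one API consumers use, whatever the input format was."""
    return [child.args for child in tree.children if child.name == name]


a = from_text("Listen 80\nKeepAlive On\n")
b = from_xml("<config><Listen>80</Listen><KeepAlive>On</KeepAlive></config>")
```

Both parsers produce equal trees here, so code calling lookup() cannot tell
which format the configuration came from.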

Re: 3.0 - Proposed Goals

Posted by Joachim Zobel <jz...@heute-morgen.de>.
On Tuesday, 13.02.2007, at 23:33 -0800, Paul Querna wrote:
> - Build a cleaner configuration system, enabling runtime
> reconfiguration. Today's system basically requires a complete restart of
> everything to change configurations.  I would like to move to an
> internal DOM like representation of the configuration, but preserve the
> current file format as the 'default'. (Modules could easily write an XML
> config file format, or pull from LDAP).

A configuration command line similar to a database's command-line
interface would be nice to have.

The good thing with an XML configuration would be, that you could
validate it against a schema. Especially certain settings could be
unique. This would put an end to wondering why a conf. change has no
effect for an hour until discovering that the same value is set again
somewhere else in the config.

An XML command line however ... :)

Sincerely,
Joachim



Re: 3.0 - Proposed Goals

Posted by Aaron Bannert <aa...@clove.org>.
On Wed, Feb 14, 2007 at 07:08:32PM +0000, Colm MacCarthaigh wrote:
> On Wed, Feb 14, 2007 at 01:57:27PM -0500, Brian Akins wrote:
> > Would be nice if we could do HTTP over unix domain sockets, for example.  
> > No need for full TCP stack just to pass things back and forth between 
> > Apache and "back-end" processes.
> 
> Or over standard input, so that we can have an admin debug mode. Type
> HTTP on standard in, see corresponding log messages on standard out.
> Exim has this feature and it is very useful.

For this you just need telnet or netcat, and then to tail the error log
in another window. I do this all the time to debug requests/responses.

-aaron

Re: 3.0 - Proposed Goals

Posted by Colm MacCarthaigh <co...@stdlib.net>.
On Wed, Feb 14, 2007 at 01:57:27PM -0500, Brian Akins wrote:
> Would be nice if we could do HTTP over unix domain sockets, for example.  
> No need for full TCP stack just to pass things back and forth between 
> Apache and "back-end" processes.

Or over standard input, so that we can have an admin debug mode. Type
HTTP on standard in, see corresponding log messages on standard out.
Exim has this feature and it is very useful.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: 3.0 - Proposed Goals

Posted by Paul Querna <ch...@force-elite.com>.
Brian Akins wrote:
> Jim Jagielski wrote:
> 
>>
>> This makes a lot of sense, but please NOT AJP... It
>> seems to me that staying with HTTP is the most scalable,
>> easiest to debug and troubleshoot, and the most straightforward.
> 
> 
> Would be nice if we could do HTTP over unix domain sockets, for 
> example.  No need for full TCP stack just to pass things back and forth 
> between Apache and "back-end" processes.
> 

+1

Re: 3.0 - Proposed Goals

Posted by Brian Akins <br...@turner.com>.
Jim Jagielski wrote:

> 
> This makes a lot of sense, but please NOT AJP... It
> seems to me that staying with HTTP is the most scalable,
> easiest to debug and troubleshoot, and the most straightforward.


Would be nice if we could do HTTP over unix domain sockets, for example.  No 
need for full TCP stack just to pass things back and forth between Apache and 
"back-end" processes.

-- 
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies
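
For illustration, here is a minimal Python sketch of plain HTTP spoken over
a Unix domain socket, with a toy back-end in a thread. It assumes a
Unix-like platform; the socket path, the backend() and fetch() helpers and
the response body are all hypothetical, and the "HTTP" handling is
deliberately naive (one request, no parsing).

```python
import os
import socket
import tempfile
import threading


def backend(server_sock):
    """Toy back-end: answer one HTTP request on a Unix domain socket."""
    conn, _ = server_sock.accept()
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:
            data = conn.recv(4096)
            if not data:
                break
            request += data
        body = b"hello from the back-end\n"
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: close\r\n\r\n" + body
        )


def fetch(path):
    """Front-end side: ordinary HTTP bytes, but over AF_UNIX instead of TCP."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


sock_dir = tempfile.mkdtemp()
sock_path = os.path.join(sock_dir, "backend.sock")  # hypothetical path
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)
worker = threading.Thread(target=backend, args=(server,))
worker.start()
response = fetch(sock_path)
worker.join()
server.close()
os.unlink(sock_path)
os.rmdir(sock_dir)
```

The bytes on the wire are the same as they would be over TCP; only the
socket family and the address (a filesystem path) differ.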

Re: 3.0 - Proposed Goals

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Feb 14, 2007, at 2:33 AM, Paul Querna wrote:

>
> - Promote and include an external-process communication method in the
> core.  This could be used to communicate with PHP, a JVM, Ruby or many
> other things that do not wish to be run inside a highly-threaded and
> async core.  The place for large dynamic languages is not in the core
> 'data router' process. Choices include AJP, FastCGI, or just HTTP.  We
> should optionally include a process management framework, to spawn these
> as needed, to make configuration for server administrators easier.
>

This makes a lot of sense, but please NOT AJP... It
seems to me that staying with HTTP is the most scalable,
easiest to debug and troubleshoot, and the most straightforward.
I agree that we need to improve our FastCGI capability as well.


Re: 3.0 - Proposed Goals

Posted by Niklas Edmundsson <ni...@acc.umu.se>.
On Wed, 14 Feb 2007, Nick Kew wrote:

> On Wed, 14 Feb 2007 15:41:38 +0100 (MET)
> Niklas Edmundsson <ni...@acc.umu.se> wrote:
>
>> One problem here is that this kind of docco usually needs to be made
>> by those who hate to write it: the core programmers.
>
> The core programmers use the core programmer documentation,
> aka the source code.  In particular, the .h files, which
> give you detailed API documentation.

Mkay. However, the source and header files aren't very good in the 
"how it's supposed to work" department. You usually end up looking at 
a module that implements stuff the wrong way. mod_example might be the 
ultimate example of this ;)

> For higher-level documentation of Apache 2.2, follow my .sig.

Remove stale docco and point there from the httpd website, then?

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  A bird in the bush can't mess in your hand!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Re: 3.0 - Proposed Goals

Posted by Vinko Vrsalovic <vi...@gmail.com>.
>
> Of course if no one wants to do it, we'll have to do with what we've got,
> but saying that it's not a problem doesn't seem wise to me.
>

I punish myself for talking before following the instructions. There are
good docs about module/core development at apachetutor.org.

And they are even referenced at http://httpd.apache.org/docs/2.2/developer/

Sorry for the noise,
V.

Re: 3.0 - Proposed Goals

Posted by Vinko Vrsalovic <vi...@gmail.com>.
On 2/14/07, Nick Kew <ni...@webthing.com> wrote:

> > One problem here is that this kind of docco usually needs to be made
> > by those who hate to write it: the core programmers.
>
> The core programmers use the core programmer documentation,
> aka the source code.  In particular, the .h files, which
> give you detailed API documentation.


That's true for current core programmers, but the lack of sane "doccos"
about core/module development raises the entry barrier for potential future
core programmers and makes things such as "start depending on behavior in
the system that isn't actually documented [supposed] to work that way"
happen very easily.

This of course might be an intended side effect of the lack of documentation
(ie, only real hackers that invest time learning can hack on Apache), but
the presence of those outdated doc versions leads me to believe it isn't so.

The existence of your book, while very welcome, doesn't solve the problem.

Of course if no one wants to do it, we'll have to do with what we've got,
but saying that it's not a problem doesn't seem wise to me.

Sadly, I can't resist making a lousy analogy (not based on true events):
'Real sysadmins use the best documentation available, the comments in
httpd.conf. For higher-level documentation, buy the "Apache Cookbook"'.

V.

Re: 3.0 - Proposed Goals

Posted by Nick Kew <ni...@webthing.com>.
On Wed, 14 Feb 2007 15:41:38 +0100 (MET)
Niklas Edmundsson <ni...@acc.umu.se> wrote:

> One problem here is that this kind of docco usually needs to be made 
> by those who hate to write it: the core programmers.

The core programmers use the core programmer documentation,
aka the source code.  In particular, the .h files, which
give you detailed API documentation.

For higher-level documentation of Apache 2.2, follow my .sig.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

Re: 3.0 - Proposed Goals

Posted by Niklas Edmundsson <ni...@acc.umu.se>.
On Wed, 14 Feb 2007, Garrett Rooney wrote:

>> - Rewrite how Brigades, Buckets and filters work.  Possibly replace them
>> with other models. I haven't been able to personally consolidate my
>> thoughts on how to 'fix' filters, but I am sure we can have plenty of long
>> threads about it :-)
>
> I think a big part of this should be documenting how filters are
> supposed to interact with the rest of the system.  Right now it seems
> to be very much a "well, I looked at this other module and did what it
> did", and it's quite easy to start depending on behavior in the system
> that isn't actually documented to work that way.

This hits a rather sweet spot it seems. Browsing the current httpd 
module/developer docco I find gems like:
http://httpd.apache.org/docs/2.2/developer/modules.html

One would think that now that 2.2 is released at least the 1.3->2.0 
converting docco would have evolved to something better than "it's a 
start" ...

Also, we have http://httpd.apache.org/docs/2.2/developer/API.html ...
It seems that the most current API docco is for 1.3, but at least
there's a nice disclaimer saying that it's obsolete but that some
information might still be correct.

So yes, I fully agree that documentation is needed. It's a pain trying
to figure out how stuff works (or is supposed to work) when the docco
is two major releases behind...

One problem here is that this kind of docco usually needs to be made 
by those who hate to write it: the core programmers.


/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  "I should have done this a long time ago." - Picard
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Re: 3.0 - Proposed Goals

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 2/14/07, Paul Querna <ch...@force-elite.com> wrote:

> - Rewrite how Brigades, Buckets and filters work.  Possibly replace them
> with other models. I haven't been able to personally consolidate my
> thoughts on how to 'fix' filters, but I am sure we can have plenty of long
> threads about it :-)

I think a big part of this should be documenting how filters are
supposed to interact with the rest of the system.  Right now it seems
to be very much a "well, I looked at this other module and did what it
did", and it's quite easy to start depending on behavior in the system
that isn't actually documented to work that way.

> - Build a cleaner configuration system, enabling runtime
> reconfiguration. Today's system basically requires a complete restart of
> everything to change configurations.  I would like to move to an
> internal DOM like representation of the configuration, but preserve the
> current file format as the 'default'. (Modules could easily write an XML
> config file format, or pull from LDAP).

This seems like a rather invasive change.  Virtually every module
currently caches configuration info into global variables.  Are we
expecting these modules to dynamically query the core config system
whenever they want to access this sort of information?  What will the
performance implications of this sort of thing be?

> - Experiment with embedding scripting languages or something like
> Varnish's VCL if and where it makes sense. (Cache Rules, Rewrite Rules,
> Require Rules, and the like).

This seems like a Good Idea (tm).

> - Promote and include an external-process communication method in the
> core.  This could be used to communicate with PHP, a JVM, Ruby or many
> other things that do not wish to be run inside a highly-threaded and
> async core.  The place for large dynamic languages is not in the core
> 'data router' process. Choices include AJP, FastCGI, or just HTTP.  We
> should optionally include a process management framework, to spawn these
> as needed, to make configuration for server administrators easier.

+1

-garrett

Re: 3.0 - Proposed Goals

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Jim Jagielski wrote:
> 
> On Feb 14, 2007, at 3:28 PM, William A. Rowe, Jr. wrote:
> 
>>
>> It's always been small groups ;-)  But we are loath to drop the 'barrier
>> to entry' of demonstrating that the new coder is 'cluefull'.  This is a
>> server platform, rife with the security issues that go along with that.
>>
> 
> We need to remind ourselves of the long period of
> time between 1.3 and 2.0, and the relatively short
> one between 2.0 and 2.2. The reason? Easy-to-manage
> changes. 2.0 was a victim of "just one more thing"itis,
> which made it prolonged as well as more complex than
> it needed to be. Most of the things we want to
> fix are things that were done "quickly" to get
> the basic implementation in there.
> 
> IMO, an Async baseline, event-based state model
> is, no doubt, the direction we need to be heading.
> Unless we do that right, the other stuff really won't
> be that worthwhile in the server we eventually come up
> with.

To turn this around, there's no reason not to start working towards
a flattened brigade passing schema such as serf's in a 3.0 prior to
an async 4.0.  Or begin to refactor the conf schema in 2.4.

Pick something.

My comment overall to Paul's post is that there's nothing stopping
this now; he did a fairly good job of identifying things that would
be nice to have. So: show us the code, or thank you for the kick in
the pants, as the case might be.

If there are obstacles, please start a thread on those.

Bill

Re: 3.0 - Proposed Goals

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Feb 14, 2007, at 3:28 PM, William A. Rowe, Jr. wrote:

>
> It's always been small groups ;-)  But we are loath to drop the 'barrier
> to entry' of demonstrating that the new coder is 'cluefull'.  This is a
> server platform, rife with the security issues that go along with that.
>

We need to remind ourselves of the long period of
time between 1.3 and 2.0, and the relatively short
one between 2.0 and 2.2. The reason? Easy-to-manage
changes. 2.0 was a victim of "just one more thing"itis,
which made it prolonged as well as more complex than
it needed to be. Most of the things we want to
fix are things that were done "quickly" to get
the basic implementation in there.

IMO, an Async baseline, event-based state model
is, no doubt, the direction we need to be heading.
Unless we do that right, the other stuff really won't
be that worthwhile in the server we eventually come up
with.


Re: 3.0 - Proposed Goals

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Paul Querna wrote:
> So, I've been kicking around some ideas about where I personally would
> like trunk to go for a couple months now.
> 
> My personal goals for 3.0:
>  - Write some cool stuff, that is fun to hack on.
>  - Create an environment that encourages others to contribute. A project
> this large cannot and should not be done alone or by a small group. Part
> of this means my goal is to meet whatever personal goals others have
> for httpd; others might not want cool stuff, or fun stuff, or
> they might want something else.

It's always been small groups ;-)  But we are loath to drop the 'barrier
to entry' of demonstrating that the new coder is 'cluefull'.  This is a
server platform, rife with the security issues that go along with that.

But I have no issue with us identifying new committers who successfully
and consistently add worthwhile code.  Propose them to private@.

See proxy-dev, the netware guts, the win32 guts, even rewrite or ssl
for some code that very few understand but makes for a bigger whole.

> More Product-type Goals:
> 
> - Rewrite the Core to be an Async Event state machine and data router.
> The core should only route events to protocol modules.  All it handles
> is the state machine of connection A is waiting for input, Connection B
> is waiting for Disk IO, connection C is being run by another thread,
> etc..  This could lead to things like mod_proxy being able to run
> completely asynchronously, eliminating many performance issues we see
> today.

+.999995

-1 to requiring an async model OS/communications stack.

You might think you are doing everyone a huge favor by setting a high
bar, but you neglect all the micro OS / embedded places that apache
can live today.  You might get a kick out of the March 07 Dr Dobbs
on this topic.

> - Rewrite how Brigades, Buckets and filters work.  Possibly replace them
> with other models. I haven't been able to personally consolidate my
> thoughts on how to 'fix' filters, but I am sure we can have plenty of long
> threads about it :-)

+1 - let's start with serf.  Right now we have a pretty big stack space
burden, something that sucks for thousands of parallel threads.  Let's
eliminate that stack nightmare and pass brigades sideways, not down the
stack, and remove the input/output distinctions.  Also, let's get the
metadata right this time :)
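
One possible reading of "sideways, not down the stack", as a minimal Python
sketch: in the nested style each filter hands data to the next from inside
its own call, so call depth grows with the number of filters; in the flat
style a driver loop calls each filter in turn and takes the result back.
The function names are hypothetical, and the depth counter is only a
stand-in for real stack usage.

```python
def upper(data):
    return data.upper()


def exclaim(data):
    return data + b"!"


def twice(data):
    return data * 2


def pass_down(filters, data, depth, stats):
    """Nested style: each filter's call stays live while the next one runs."""
    if not filters:
        return data
    stats["max_depth"] = max(stats["max_depth"], depth)
    return pass_down(filters[1:], filters[0](data), depth + 1, stats)


def pass_sideways(filters, data, stats):
    """Flat style: a driver loop calls each filter and gets the data back."""
    for transform in filters:
        stats["max_depth"] = max(stats["max_depth"], 1)
        data = transform(data)
    return data


chain = [upper, exclaim, twice]
down_stats = {"max_depth": 0}
side_stats = {"max_depth": 0}
down = pass_down(chain, b"ok", 1, down_stats)
side = pass_sideways(chain, b"ok", side_stats)
```

Both styles produce the same bytes; the nested one reaches a depth equal to
the number of filters, while the flat one stays at a depth of one no matter
how long the chain is.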

> - Break the 1:1 mapping of a worker to a single request.  In trunk (and
> 2.2) with the Event MPM, we have broken the 1:1 mapping of a single
> thread to a connection, the next step is to break up a request into a
> series of state changes, which would contain attributes stating if they
> require a blocking operation. If any operation could block, we would
> assign a worker to handle it (or perhaps use the leader model). Linux
> also has some interesting ideas going with Syslets to execute system
> calls in an async manner, and this is something I would like to
> experiment with:
> http://lkml.org/lkml/2007/2/13/142

Fun technologies.  Fix brigade passing and you can accomplish whatever
you would like.  Provided that brigade passing and filtering can be
"incomplete" and we solve that issue, then everything else you want to
do can fall into the MPM pattern.

> - Change the meaning of MPMs. The problem with MPMs today is they are
> really mostly platform abstractions -- not just abstractions of the
> process model itself.  For example, if the Worker MPM was ported to use
> the correct windows functions, there is no real reason it could not
> replace the winnt MPM.  I believe we should try to move platform
> abstractions back into APR or other util functions, and try to have a
> single MPM that runs on all multi-threaded platforms.

Ok, I'm confused.  First, worker requires fork, and that isn't likely to
change; some platforms don't fork, so we will always have forking and
non-forking models.  A better question/example is why Netware and Win
don't share one, overall more effective MPM.

+1 to few, well identified MPM's, to do the traditional request queue
or the event model.  And the real test, the truly async MPM (third party
module users beware ;-)

-1 to even suggesting we change what they are.

+1 to making them loadable ;-)

> - Include support for Waka. Roy has less than 1 year to get us an RFC :-)

Ha.  Seriously, you missed one: let users drop mod_http.  If they want to
run with only mod_waka, or mod_ftp, or mod_pop3+mod_snmp that's their
choice.  Loadable mod_http is the start of that.

The second aspect gets tricky; resource abstraction in a protocol neutral
manner.

> - Build a cleaner configuration system, enabling runtime
> reconfiguration. Today's system basically requires a complete restart of
> everything to change configurations.  I would like to move to an
> internal DOM like representation of the configuration, but preserve the
> current file format as the 'default'. (Modules could easily write an XML
> config file format, or pull from LDAP).

We are getting closer since 2.0.0, +1.  Offer the appropriate abstract
containers for folks to play mod_perl'ish games.

> - Experiment with embedding scripting languages or something like
> Varnish's VCL if and where it makes sense. (Cache Rules, Rewrite Rules,
> Require Rules, and the like).

You'll get a lot of -1's to this as a part of the core. As a module - cool.

> - Experiment with the right way to abstract state machines,
> multi-threading, and async IO from module developers who want a 'simple
> world view'.  Most modules just want to run a few hooks, or generate
> content.  We should preferably make this easier to do than it is today.

Take it to apr.

> - Find a better release model for a 3.0/trunk.  I don't think many
> people are happy with how 2.0.x was handled in this respect, but I do
> believe we need to release early and often.

2.0.x, how so?  We won't improve it if you don't elaborate on what you didn't
like.  If you refer back to the constantly broken state of releases
prior to, oh, say 2.0.36 or so, then we already have it solved.  Start
releasing 2.9.0 releases unstable and let's let people play.  But 2.1.x
showed that not enough of them do, maybe this is more of an 'advertising'
sort of problem.

> - Promote and include an external-process communication method in the
> core.  This could be used to communicate with PHP, a JVM, Ruby or many
> other things that do not wish to be run inside a highly-threaded and
> async core.  The place for large dynamic languages is not in the core
> 'data router' process. Choices include AJP, FastCGI, or just HTTP.  We
> should optionally include a process management framework, to spawn these
> as needed, to make configuration for server administrators easier.

???  Ok, you confused me here ;-)  You want to reinvent the System.Web.Host
model?


Re: 3.0 - Proposed Goals

Posted by Graham Leggett <mi...@sharp.fm>.
On Mon, February 19, 2007 11:44 am, Nick Kew wrote:

> You've missed the most important consideration here.
> Namely, don't break everything that's gone before.
>
> Specifically, a big -1 on forcing substantial rewrites of
> existing applications.  Or in other words, the API must
> continue to work (with at most trivial breakages).

From the experience of other projects, grand rewrites have a tendency to
take a long time, causing the developers to lose interest eventually and
for the effort to ultimately fizzle out.

I would suggest pick one well defined goal out of the list proposed, and
focus on that till you can get it working, then move to the next item (and
probably release a v4.x).

Regards,
Graham
--



RE: 3.0 - Proposed Goals

Posted by "Peter J. Cranstone" <pe...@5o9inc.com>.
So might I make a humble suggestion.

Ask your 65 million customers what they would like in Apache 3.0 - this time
around let someone else tell you what they want.

It's the only way to build something.


Peter J. Cranstone
5o9, Inc.
303.809.7342 | peter.cranstone@5o9inc.com

Making Web Applications Location and User Aware
URL: www.5o9inc.com 

-----Original Message-----
From: Nick Kew [mailto:nick@webthing.com] 
Sent: Monday, February 19, 2007 2:44 AM
To: dev@httpd.apache.org
Subject: Re: 3.0 - Proposed Goals

On Tue, 13 Feb 2007 23:33:27 -0800
Paul Querna <ch...@force-elite.com> wrote:

> So, I've been kicking around some ideas about where I personally would
> like trunk to go for a couple months now.

You've missed the most important consideration here.
Namely, don't break everything that's gone before.

Specifically, a big -1 on forcing substantial rewrites of
existing applications.  Or in other words, the API must
continue to work (with at most trivial breakages).

Of course, deprecating things is fine.  And where parts of
the existing API do not fit well, they might be moved outwards
from the core to a compatibility layer - provided that's
going to be maintainable.

The breakage between 1.x and 2.0 was far too much.  If we
do it again, the world will rightly conclude that Apache
is not a solution fit for the long term.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: 3.0 - Proposed Goals

Posted by Sander Temme <sc...@apache.org>.
On Feb 19, 2007, at 12:51 PM, Roy T. Fielding wrote:

> We work best as a collaboration when we give people the freedom to
> explore their own personal wild ideas (or even just reasonable ideas
> for which the solution has no clear timeline).  If we artificially
> constrain the scope of what can be done based on the group's a priori
> perception, then we effectively go nowhere new (because collectives
> fear the new).

I really like Roy's comment.  Yes, let's hack!  When the time comes,  
we can close the loop on the module API and get with the third party  
guys.

S.

-- 
sctemme@apache.org            http://www.temme.net/sander/
PGP FP: 51B4 8727 466A 0BC3 69F4  B7B8 B2BE BC40 1529 24AF



Re: 3.0 - Proposed Goals

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
On Feb 19, 2007, at 11:06 AM, Sander Temme wrote:
> On Feb 19, 2007, at 1:44 AM, Nick Kew wrote:
>
>> The breakage between 1.x and 2.0 was far too much.  If we
>> do it again, the world will rightly conclude that Apache
>> is not a solution fit for the long term.
>
> +1.  While it's fun and rewarding to hack on advanced stuff in its  
> own right, the project as a whole must keep an eye on the user  
> community, their needs and their wants.  Many of us wouldn't be  
> here if the server wasn't popular to begin with: we need the user  
> community as a sounding board and a source of new contributors.

Yes and no.  There is an inherent reflex of any group of people
to resist change.  That is the main reason I wanted the work to
be done in one or more sandboxes, instead of some place that is
perceptually limited by a version number.

New design is not a group effort.  Critique of designs is a group
effort, as is tweaking the results to fit new ideas or setting
constraints within which a particular solution can be found.
New implementation is not a group effort either, at least no more
so than can be cleanly broken out into components and worked on
separately.

It is important to always keep in mind that the group as a whole
does not *do* anything in particular for Apache other than to
critique each others' designs and implementations in the hope that
our collection of experience will compensate for the time spent
collaborating on a common solution.

We work best as a collaboration when we give people the freedom to
explore their own personal wild ideas (or even just reasonable ideas
for which the solution has no clear timeline).  If we artificially
constrain the scope of what can be done based on the group's a priori
perception, then we effectively go nowhere new (because collectives
fear the new).  The fact of the matter is that any one of us could,
given adequate energy, rewrite the entire server in a couple months
of focused time, and the only thing holding us back (aside from lack
of said time) is the fear of acceptance at the end.  We need to lose
that fear.

In short, worrying about API backwards compatibility is not an
issue yet.  If you want to make it an issue, then you have to
create a way to provide backwards compatibility for whatever
ends up being merged to trunk.  You don't have to worry about
what breaks in amsterdam until a later time, when we have evidence
to evaluate whether the breakage is worth the cost and we have
a reasonable chance of determining whether a bridge API could
be built as part of the merge.

In the mean time, work on what matters to you.  Do not work against
what matters to someone else.

....Roy

Re: 3.0 - Proposed Goals

Posted by Jim Jagielski <ji...@devsys.jaguNET.com>.
Sander Temme wrote:
> 
> How many Apache 'D' versions do we want to maintain?  Popularity of  
> 1.3 is still too high for us to completely ignore, and there is much  
> 2.0 still out there.
> 

And many people taking up 2.2...

-- 
===========================================================================
   Jim Jagielski   [|]   jim@jaguNET.com   [|]   http://www.jaguNET.com/
	    "If you can dodge a wrench, you can dodge a ball."

Re: 3.0 - Proposed Goals

Posted by Sander Temme <sc...@apache.org>.
On Feb 19, 2007, at 1:44 AM, Nick Kew wrote:

> The breakage between 1.x and 2.0 was far too much.  If we
> do it again, the world will rightly conclude that Apache
> is not a solution fit for the long term.

+1.  While it's fun and rewarding to hack on advanced stuff in its  
own right, the project as a whole must keep an eye on the user  
community, their needs and their wants.  Many of us wouldn't be here  
if the server wasn't popular to begin with: we need the user  
community as a sounding board and a source of new contributors.

How many Apache 'D' versions do we want to maintain?  Popularity of  
1.3 is still too high for us to completely ignore, and there is much  
2.0 still out there.

We need to actively engage with third party module authors,  
*especially* the PHP community, to make sure that the most popular  
third party modules will be ready for 3.0 right out of the gate.  I  
don't know whether we should preserve 2.0 API support... don't know  
if we *could*, but if we can replace it with a kinder, simpler API we  
can promote uptake before we make the release.

S.

-- 
sctemme@apache.org            http://www.temme.net/sander/
PGP FP: 51B4 8727 466A 0BC3 69F4  B7B8 B2BE BC40 1529 24AF



Re: 3.0 - Proposed Goals

Posted by Nick Kew <ni...@webthing.com>.
On Tue, 13 Feb 2007 23:33:27 -0800
Paul Querna <ch...@force-elite.com> wrote:

> So, I've been kicking around some ideas about where I personally would
> like trunk to go for a couple months now.

You've missed the most important consideration here.
Namely, don't break everything that's gone before.

Specifically, a big -1 on forcing substantial rewrites of
existing applications.  Or in other words, the API must
continue to work (with at most trivial breakages).

Of course, deprecating things is fine.  And where parts of
the existing API do not fit well, they might be moved outwards
from the core to a compatibility layer - provided that's
going to be maintainable.

The breakage between 1.x and 2.0 was far too much.  If we
do it again, the world will rightly conclude that Apache
is not a solution fit for the long term.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

Re: 3.0 - Proposed Goals

Posted by Jean-Frederic <jf...@gmail.com>.
> - Provide a generic inter-process data-sharing framework.  Currently
>   mod_ssl, mod_auth_digest, mod_ldap, and the scoreboard all use
>   more-or-less independent implementations of shared memory data stores.
>   As someone who maintains a module with yet another such data store,
>   I think a standard interface for such things (beyond apr_rmm) might be
>   useful.  Perhaps something key/value based; maybe aligned with memcached
>   somehow?  See my final musings below.
> 
> - Provide a generic scoreboard interface for use by modules.  The
>   current scoreboard is effectively sized at initial startup to
>   max MPM processes * max MPM threads.  That wastes space, but also
>   provides no way for modules to register their own private threads.
>   As someone who maintains a module with such threads, I'd love to
>   see them in mod_status.  I'd also like to see the non-worker threads
>   from an MPM like worker in there too (i.e., listener, start, main);
>   I have a collection of incomplete patches to do this hanging around.
> 
>    Admittedly, this may be a hard problem: how do you size the scoreboard's
> block of shared memory if modules can be added at restarts, and might
> suddenly require extra scoreboard space for their threads?  I have no good
> solution.  (Should the scoreboard use the above-mentioned generic
> data-sharing framework, or not?  Perhaps shared memory isn't even the
> right tool?)

You need different kinds of "shared" memory:
- stuff with one writer and several readers.
- stuff with several writers and several readers.
httpd-proxy-scoreboard in fact only solved the first one, and only
provides memory organised in slotmem (records).
Having a provider of "shared" memory and using it for the scoreboard
is of course needed for the next version, and there should be a way to
access the httpd scoreboard from an external process (for example a
Java process for JMX access). Once the configuration is readable and
dynamic it could be forwarded to the next node of an "httpd" running
in a cluster.

If you have something like the workers of mod_proxy, it must be
possible to add a new worker (a new cluster node front-ended by httpd)
without restarting httpd. Removing a worker is a harder problem.

> 
>    If we're generally moving to an increasingly asynchronous, threaded
> design then I think such a scoreboard might also serve as a valuable
> sanity check during development ("What the heck is that thread doing?")

One thing needed is for a module to be able to ask for a "timeout":
something like a dummy request that happens after a timeout if no
requests are coming in (to query a back-end server, for example).
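
To make the "timeout" request concrete, here is a minimal C sketch of an
idle timer that a core event loop could poll on behalf of a module. It is
only an illustration under assumptions: the names idle_timer,
idle_timer_touch, idle_timer_poll and idle_count_cb are hypothetical and
do not exist in httpd or APR, and time is passed in as plain seconds so
the logic stays independent of any clock API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: what the module wants run when idle. */
typedef void (*idle_cb)(void *data);

typedef struct {
    long interval;       /* seconds of inactivity before the callback fires */
    long last_activity;  /* time of the last request or last callback */
    idle_cb cb;          /* e.g. "query the back-end server" */
    void *data;          /* opaque argument handed to cb */
} idle_timer;

/* Called whenever a real request arrives for the module. */
void idle_timer_touch(idle_timer *t, long now)
{
    t->last_activity = now;
}

/* Called from the event loop. Fires the callback if nothing has happened
 * for at least `interval` seconds; returns 1 if it fired, else 0. */
int idle_timer_poll(idle_timer *t, long now)
{
    assert(t != NULL && t->cb != NULL);
    if (now - t->last_activity < t->interval)
        return 0;
    t->cb(t->data);
    t->last_activity = now;  /* restart the idle period */
    return 1;
}

/* Example callback: counts how often the "dummy request" ran. */
void idle_count_cb(void *data)
{
    ++*(int *)data;
}
```

A module would call idle_timer_touch() from its request handler, and the
core would call idle_timer_poll() on each pass of its loop. Real code
would also need to decide which thread runs the callback.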

> 
>    An API which allowed threads to register their possible states might
> be valuable; this would allow modules/providers/MPMs to define what
> states were meaningful to them, rather than trying to define them
> all in scoreboard.h.
> 
>    I confess I haven't followed the progress of the httpd-proxy-scoreboard
> branch; maybe there's some work in there that would apply to these issues.
> 
>    

It is a bit dormant for the moment.

Cheers

Jean-Frederic


Re: 3.0 - Proposed Goals

Posted by Chris Darroch <ch...@pearsoncmg.com>.
Hi --

Paul Querna wrote:

> - Rewrite the Core to be an Async Event state machine and data router.
> - Break the 1:1 mapping of a worker to a single request.
> - Change the meaning of MPMs. The problem with MPMs today is they are
> really mostly platform abstractions -- not just abstractions of the
> process model itself.
> - Build a cleaner configuration system, enabling runtime
> reconfiguration.
> - Experiment with the right way to abstract state machines,
> multi-threading, and async IO from module developers who want a 'simple
> world view'.

   At a high level, I like these goals!  I'd agree with others that
it's probably most important to set specific goals which are achievable
in a mid-term timeframe (3-6 months?) and focus on those.  The process
of actually implementing them is likely to spur new ideas, as well.

   I'm not keen on requiring an XML configuration file format; so long
as it's optional, that's OK with me.

   Here are a few itches that I'd personally like to see scratched as
well; they may or may not align with your proposals:

- Stop passing sub-requests through the potentially expensive authn/z
  steps if they share the same authn/z configuration as the main
  request.  This would have security implications for certain external
  modules like mod_authz_svn, which currently expect to be called to do
  private authorization of each sub-request.  However, there are otherwise
  some serious performance implications when using, say, mod_dav
  with mod_authn_dbd.[1]

- Provide a generic inter-process data-sharing framework.  Currently
  mod_ssl, mod_auth_digest, mod_ldap, and the scoreboard all use
  more-or-less independent implementations of shared memory data stores.
  As someone who maintains a module with yet another such data store,
  I think a standard interface for such things (beyond apr_rmm) might be
  useful.  Perhaps something key/value based; maybe aligned with memcached
  somehow?  See my final musings below.

- Provide a generic scoreboard interface for use by modules.  The
  current scoreboard is effectively sized at initial startup to
  max MPM processes * max MPM threads.  That wastes space, but also
  provides no way for modules to register their own private threads.
  As someone who maintains a module with such threads, I'd love to
  see them in mod_status.  I'd also like to see the non-worker threads
  from an MPM like worker in there too (i.e., listener, start, main);
  I have a collection of incomplete patches to do this hanging around.

   Admittedly, this may be a hard problem: how do you size the scoreboard's
block of shared memory if modules can be added at restarts, and might
suddenly require extra scoreboard space for their threads?  I have no good
solution.  (Should the scoreboard use the above-mentioned generic
data-sharing framework, or not?  Perhaps shared memory isn't even the
right tool?)

   If we're generally moving to an increasingly asynchronous, threaded
design then I think such a scoreboard might also serve as a valuable
sanity check during development ("What the heck is that thread doing?")

   An API which allowed threads to register their possible states might
be valuable; this would allow modules/providers/MPMs to define what
states were meaningful to them, rather than trying to define them
all in scoreboard.h.
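
   As a rough illustration of that registration idea, here is a minimal
C sketch. Every name in it (sb_state_registry, sb_register_state,
sb_state_label) is hypothetical and not part of httpd's scoreboard.h, and
it keeps the table in ordinary process memory rather than shared memory,
so it sidesteps the sizing question raised above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limits; a real scoreboard would have to size these. */
#define SB_MAX_STATES 32
#define SB_OWNER_LEN  32
#define SB_LABEL_LEN  32

typedef struct {
    char owner[SB_OWNER_LEN];  /* module, provider, or MPM that owns it */
    char label[SB_LABEL_LEN];  /* text a status page could display */
} sb_state_def;

typedef struct {
    sb_state_def defs[SB_MAX_STATES];
    int count;
} sb_state_registry;

/* Register a state and return its numeric id. Registering the same
 * owner/label pair again returns the existing id. Returns -1 if the
 * table is full or a name does not fit. */
int sb_register_state(sb_state_registry *reg,
                      const char *owner, const char *label)
{
    int i;

    assert(reg != NULL);
    if (strlen(owner) >= SB_OWNER_LEN || strlen(label) >= SB_LABEL_LEN)
        return -1;
    for (i = 0; i < reg->count; i++) {
        if (strcmp(reg->defs[i].owner, owner) == 0 &&
            strcmp(reg->defs[i].label, label) == 0)
            return i;
    }
    if (reg->count == SB_MAX_STATES)
        return -1;
    strcpy(reg->defs[reg->count].owner, owner);
    strcpy(reg->defs[reg->count].label, label);
    return reg->count++;
}

/* Map an id stored in a thread's scoreboard slot back to its label. */
const char *sb_state_label(const sb_state_registry *reg, int id)
{
    if (id < 0 || id >= reg->count)
        return "unknown";
    return reg->defs[id].label;
}
```

   A thread would then store only the small integer id in its slot, and
a status display would call sb_state_label() to print it.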

   I confess I haven't followed the progress of the httpd-proxy-scoreboard
branch; maybe there's some work in there that would apply to these issues.

   As a long-term goal I think it would be interesting to try to design
these interfaces in such a way as to allow them to work between multiple
instances of httpd.  This obviously heads into the tricky territory of
distributed computing, clustering, etc.  If one can't permit stale or
cached data you may need write replication, a distributed lock manager,
leader election schemes, and so forth.

   This is obviously complex stuff and maybe out of scope for httpd.
Still, I can't help but feel like there's a logical continuum here
from the existing httpd 2.x shared memory data stores, to memcached,
to a distributed locking and data storage system like Google's Chubby
lock service.[2]

   Moving httpd's existing uses of inter-process data stores to a
generic key/value interface might allow us to start with just a
default provider that had a shared memory implementation no different
than today's.  Other providers could then be developed later to
replicate the data across a cluster, if so desired.
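
   The provider idea above could be sketched in C roughly as follows.
This is an illustration under assumptions, not an existing httpd or APR
interface: kv_provider, kv_table and kv_table_provider are hypothetical
names, the default provider is a fixed-size table in ordinary memory
standing in for a shared memory segment, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KV_SLOTS   16
#define KV_KEY_LEN 32
#define KV_VAL_LEN 64

/* Hypothetical provider interface: callers see only set/get, so a later
 * provider could replicate across a cluster behind the same calls. */
typedef struct {
    int (*set)(void *store, const char *key, const void *val, size_t len);
    int (*get)(void *store, const char *key, void *buf, size_t *len);
} kv_provider;

typedef struct {
    int used;
    char key[KV_KEY_LEN];
    unsigned char val[KV_VAL_LEN];
    size_t len;
} kv_slot;

typedef struct {
    kv_slot slots[KV_SLOTS];  /* stands in for a shared memory block */
} kv_table;

static kv_slot *kv_find(kv_table *t, const char *key)
{
    int i;
    for (i = 0; i < KV_SLOTS; i++) {
        if (t->slots[i].used && strcmp(t->slots[i].key, key) == 0)
            return &t->slots[i];
    }
    return NULL;
}

/* Store or overwrite a value; -1 if it does not fit or the table is full. */
static int table_set(void *store, const char *key,
                     const void *val, size_t len)
{
    kv_table *t = store;
    kv_slot *s;
    int i;

    if (strlen(key) >= KV_KEY_LEN || len > KV_VAL_LEN)
        return -1;
    s = kv_find(t, key);
    for (i = 0; s == NULL && i < KV_SLOTS; i++) {
        if (!t->slots[i].used)
            s = &t->slots[i];  /* first free slot */
    }
    if (s == NULL)
        return -1;
    s->used = 1;
    strcpy(s->key, key);
    memcpy(s->val, val, len);
    s->len = len;
    return 0;
}

/* Copy a value out; *len is the buffer size on entry and the value size
 * on success. Returns -1 if the key is missing or the buffer too small. */
static int table_get(void *store, const char *key, void *buf, size_t *len)
{
    kv_table *t = store;
    kv_slot *s;

    assert(len != NULL);
    s = kv_find(t, key);
    if (s == NULL || *len < s->len)
        return -1;
    memcpy(buf, s->val, s->len);
    *len = s->len;
    return 0;
}

/* The "default provider": same behaviour as a local data store today. */
const kv_provider kv_table_provider = { table_set, table_get };
```

   A module such as an SSL session cache would hold a kv_provider
pointer and an opaque store, and never learn whether the data lives in
local shared memory or somewhere else.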

Chris.

[1] http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=116556310814307&w=2
[2] http://labs.google.com/papers/chubby.html

-- 
GPG Key ID: 366A375B
GPG Key Fingerprint: 485E 5041 17E1 E2BB C263  E4DE C8E3 FA36 366A 375B

Re: 3.0 - Proposed Goals

Posted by Paul Querna <ch...@force-elite.com>.
I have created a file in svn to track the discussion about these ideas,
and others at:
  <https://svn.apache.org/repos/asf/httpd/sandbox/amsterdam/ROADMAP>

As new ideas were added later on in the thread, please add them to this
file.

Paul Querna wrote:
> So, I've been kicking around some ideas about where I personally would
> like trunk to go for a couple months now.
....
> -Paul