Posted to dev@httpd.apache.org by Brian Behlendorf <br...@organic.com> on 1997/06/11 05:22:17 UTC

internal data format critical to api/config issue

More smoke to blow.

After reading all the posts on the API/config file issues, I think
we should start with something else: the core internal data
structures. I believe that we should have a config file format that
maps very closely, metaphorically, onto the internal structure of
Apache.  I think we've strayed a little from that only because new
requirements have been placed on old code, but I'd like to ask the
question: are people happy with the way the internal data structures
are currently organized?  It seems that some of the problems we have
had recently may have come from stretching the definition of a
conn_rec, a server_rec, a command_rec, etc.  Thoughts?

The answer might be that the current model is fine and good, but I just
wanted to see what people thought.

If people are not happy with it, we should *first* think about changes
to the internal structures, then the module API, then the config file
format.  If they are happy, then move on. :)

I advocate mapping the basic configuration file as closely as
possible, metaphorically, to the data structures.  In addition, we
should find a way to support configuration engines - be it a Perl
engine, an SNMP MIB, a Java app, etc.

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  www.apache.org  hyperreal.com  http://www.organic.com/JOBS


Re: internal data format critical to api/config issue

Posted by Dean Gaudet <dg...@arctic.org>.
On Wed, 11 Jun 1997, Brian Behlendorf wrote:

> On Wed, 11 Jun 1997, Dean Gaudet wrote:
> > Every time I walk down this path I think of how it'd be cool to compile
> > the <Directory>/<Files>/<Location> sections into a lex grammar and generate
> > a pattern matching engine.
> 
> Could that be done at runtime?

Well, all that lex/flex do is generate a bunch of tables and copy a
template .c file, so it's possible to do that at runtime.  It is mostly
idle thinking though, very dreamy thinking.  It'd make for a kick-ass
dedicated webserver (i.e. no .htaccess files) but you can't mix it with
.htaccess files as effectively.  So it's of limited use.  Maybe.

Given the complications with .htaccess I've temporarily given up on this
approach for parsing the URL.  There are also complications with
regexes, wildcards and pre-merging directory configs.

A lot of what happens during the request parsing phases can be phrased
as a scanner problem.  mod_browser (mod_setenvif) and mod_rewrite
could probably benefit from a config-time generated scanning engine.
At least on some sites ... Hotwired's set of BrowserMatch directives
numbers in the 50 or 60 range.

Dean


Re: internal data format critical to api/config issue

Posted by Brian Behlendorf <br...@organic.com>.
On Wed, 11 Jun 1997, Dean Gaudet wrote:
> Every time I walk down this path I think of how it'd be cool to compile
> the <Directory>/<Files>/<Location> sections into a lex grammar and generate
> a pattern matching engine.

Could that be done at runtime?

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  www.apache.org  hyperreal.com  http://www.organic.com/JOBS


Re: internal data format critical to api/config issue

Posted by Dean Gaudet <dg...@arctic.org>.
On Tue, 10 Jun 1997, Marc Slemko wrote:

> Really unrelated to this, but two things I would like to see are expanded
> security abilities, so control over commands is far more fine-grained and
> the ability for users to specify startup-time config files; like .htaccess
> files, only they are only parsed at startup or at certain intervals.  Of
> course good caching of .htaccess files (erm... more likely the parsed
> structures resulting from them) could perhaps accomplish much of that
> without the pain.

There is already a per-request cache of htaccess files, see
parse_htaccess().  I've also got the basis of a "directory contents cache"
designed, something which caches only the data we use from readdir().

Oh btw, as the code is presently written, it's not necessarily true that
"AllowOverride None" is a performance win.  The loop in directory_walk
iterates O(n*m) times, where n is the number of directory sections and m
is the number of components in the filename.  With no directory
sections, using only htaccess files, you can reduce that to O(m).

We should be able to reduce the "AllowOverride None" case to something
like O(lg n) by doing a pre-merge which lets us search only for longest
matches.
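
As a rough sketch of the pre-merge idea (not Apache code): the names
prefixes(), premerge() and lookup() are made up for illustration, and
merging is modelled as a plain dict update, which is an assumption
since real per-module merging is more involved.  Each section is
merged with its ancestors once at config time, so a request only has
to find the single longest matching section.

```python
def prefixes(path):
    """Directory prefixes of path, shortest first: '/a/b' -> ['/', '/a', '/a/b']."""
    parts = [p for p in path.split("/") if p]
    out = ["/"]
    for i in range(1, len(parts) + 1):
        out.append("/" + "/".join(parts[:i]))
    return out


def premerge(sections):
    """Config-time step (hypothetical): fold every ancestor section into each section.

    sections maps a directory path to a dict of directives.  Outer
    sections are applied first, so inner sections override them.
    """
    merged = {}
    for path in sections:
        config = {}
        for prefix in prefixes(path):
            config.update(sections.get(prefix, {}))
        merged[path] = config
    return merged


def lookup(merged, filename):
    """Request-time step: return the config of the longest matching section."""
    for prefix in reversed(prefixes(filename)):
        if prefix in merged:
            return merged[prefix]
    return {}


sections = {
    "/": {"Options": "None"},
    "/home": {"Options": "Indexes", "AllowOverride": "None"},
    "/home/dean": {"Options": "All"},
}
merged = premerge(sections)
config = lookup(merged, "/home/dean/public_html/index.html")
```

Note this version does up to m hash lookups per request, independent of
n, rather than the O(lg n) search mentioned above; a sorted table with a
binary search would be another way to get at the longest match.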

We should also be able to reduce the combined htaccess and <Directory>
case to O(n + m) by partitioning the <Directory>s by the number of
components in each.

Oh yeah, arbitrary wildcards (and regexes) screw all of that up :)  But if
we had a wildcard ^ which matched exactly one component we could still do
cute stuff... I'm thinking of cases like <Directory /home/*/public_html>.
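
A minimal sketch of how partitioning by component count could combine
with a one-component wildcard.  Everything here is hypothetical: the
function names are invented, and the wildcard is written "*" as in the
example above, though no character for it has been settled on.  A
pattern with k components can only match the first k components of the
filename, so each section is examined at most once per request.

```python
from collections import defaultdict


def split_components(path):
    """'/home/dean/x' -> ['home', 'dean', 'x']; '/' -> []."""
    return [p for p in path.split("/") if p]


def partition_by_depth(patterns):
    """Config-time step: group section patterns by their number of components."""
    by_depth = defaultdict(list)
    for pattern in patterns:
        by_depth[len(split_components(pattern))].append(pattern)
    return by_depth


def component_match(pattern, components, wildcard="*"):
    """True if pattern matches components one for one.

    The wildcard stands for exactly one component and never spans a '/'.
    """
    pat = split_components(pattern)
    return len(pat) == len(components) and all(
        p == wildcard or p == c for p, c in zip(pat, components)
    )


def matching_sections(by_depth, filename):
    """Request-time step: walk the filename once, shortest prefix first."""
    parts = split_components(filename)
    hits = []
    for depth in range(len(parts) + 1):
        for pattern in by_depth.get(depth, ()):
            if component_match(pattern, parts[:depth]):
                hits.append(pattern)
    return hits


by_depth = partition_by_depth(
    ["/", "/home", "/home/*/public_html", "/usr/local"]
)
hits = matching_sections(by_depth, "/home/dean/public_html/index.html")
```

An arbitrary wildcard that can swallow a "/" has no fixed component
count, which is why it breaks this kind of partitioning.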

Every time I walk down this path I think of how it'd be cool to compile
the <Directory>/<Files>/<Location> sections into a lex grammar and generate
a pattern matching engine.

Dean


Re: internal data format critical to api/config issue

Posted by Marc Slemko <ma...@worldgate.com>.
Really unrelated to this, but two things I would like to see are expanded
security abilities, so control over commands is far more fine-grained, and
the ability for users to specify startup-time config files: like .htaccess
files, except that they are parsed only at startup or at certain intervals.
Of course good caching of .htaccess files (erm... more likely the parsed
structures resulting from them) could perhaps accomplish much of that
without the pain.

On Tue, 10 Jun 1997, Brian Behlendorf wrote:

> 
> More smoke to blow.
> 
> After reading all the posts on the API/config file issues, I think
> we should start with something else: the core internal data
> structures. I believe that we should have a config file format which
> very closely metaphorically maps the internal structure of Apache.  I
> think we've strayed a little from that only because new requirements
> have been placed on old code, but I'd like to ask the question: are
> people happy with the way the internal data structures are currently
> organized?  It seems that some of the problems we have had recently
> may have come from stretching the definition of a conn_rec, a
> server_rec, command_rec, etc.  Thoughts?  
> 
> The answer might be no, the current model is fine and good, but I just
> wanted to see what people thought.
> 
> If the answer is yes, we should *first* think about changes to the
> internal structures, then the module API, then the config file format.
> If the answer is no, then move on. :)
> 
> I advocate making the basic configuration file as closely
> metaphorically mapped to the data structures as possible.  In
> addition, we should find a way to support configuration engines - be
> it a perl engine, an SNMP MIB, a Java app, etc.
> 
> 	Brian
> 
> --=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
> brian@organic.com  www.apache.org  hyperreal.com  http://www.organic.com/JOBS
>