Posted to dev@httpd.apache.org by Alexei Kosut <ak...@leland.Stanford.EDU> on 1999/01/03 13:56:28 UTC

more wacky A2 ideas

[I've decided not to write out "Apache 2.0" anymore. I'm declaring A2 to
be the new code name for the set of enhancements previously planned for
the 'next' revision of Apache. I'm also using A1 to refer to the current
version of Apache]

...so I was trying to fall asleep, and I suddenly had a horrid thought
that reveals a nasty surprise inherent in the plans for A2's content
delivery system that we discussed in June, and that I've been expanding
and expounding on for the past few months.

To recap: based on its parameters (URL, headers, etc...), the server maps
the request to what I've been calling an LRI (Local Resource Identifier):
an object from a 'data store', plus zero or more content filters. These
are layered, with a 'magic cache' in between each layer (see my previous
emails for great detail - it doesn't affect what I'm talking about here).
The object is created, filtered, processed, diced, sliced and delivered to
the client.

So far so good. One goal of this plan is to remove Apache's file-centrism. 
The idiom that each request maps to a file, as A1 currently expects,
wreaks havoc on content served from databases, or remote servers, or
whatnot. So getting filenames out of the picture seems a desirable plan. 

The problem: Presumably, we want to keep all of the current Apache
features in A2. Including the ability to do things that are related to the
filename of the request, e.g., htaccess files, symlink following,
<Directory> sections, etc... Currently, this is all done in the Apache
core based on the filename associated with a request. But without a
filename, only an opaque LRI, this sort of processing simply cannot be
done.

The first thought I had was "why does this have to be in the core at all?" 
We could move all this sort of thing into the filesystem data store (aka
A1's http_core handler), and give the chosen data store a chance to add
per-request configuration information based on the LRI. This seems like
a neat idea: One could store configuration information in a database, or
even generate it dynamically! (No, I'm not sure what that's good for,
either)

The problem is that so many things use the filesystem. A lot of them can
be rewritten as filters on top of the filesystem data store, e.g.,
includes, PHP, imagemaps, type maps. But some simply cannot: CGI scripts
cannot be run as a stream of data read off the filesystem (even if the OS
somehow supports it, it's a bad idea). Yet we want to ensure that all of
these filesystem-based data stores properly implement today's features in
A2, without requiring that the code exist in each data store module.

One option is to special-case filenames in the core. I don't like this
idea, but it would work. Another is to provide, as part of the core, a
library of filesystem routines that data stores can use to easily
implement filesystem-based stuff (e.g., just have
your_module_add_config_data() call
ap_filesystem_add_config_data("filename")). This is probably the best
idea, but it doesn't *quite* sit well with me.
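
A minimal sketch of that shared-library option, in Python for brevity
rather than the C the server is written in. The helper mirrors the
ap_filesystem_add_config_data() idea above, but its signature, the
HTACCESS table and both store classes are assumptions made up for
illustration, not an existing Apache API.

```python
import posixpath

# Hypothetical stand-in for per-directory config files, keyed by directory.
# A real server would read .htaccess files from disk instead.
HTACCESS = {
    "/": {"Options": "None"},
    "/usr/www": {"Options": "Indexes"},
    "/usr/www/cgi-bin": {"ExecCGI": "On"},
}

def filesystem_add_config_data(filename, config):
    """Merge directory-level settings into config, outermost directory first."""
    parts = [p for p in filename.split("/") if p][:-1]  # drop the file name
    directory = "/"
    config.update(HTACCESS.get(directory, {}))
    for part in parts:
        directory = posixpath.join(directory, part)
        config.update(HTACCESS.get(directory, {}))
    return config

class FileStore:
    def add_config_data(self, lri, config):
        # The LRI is opaque to the core; only this store knows it is a path.
        return filesystem_add_config_data(lri, config)

class CgiStore:
    def add_config_data(self, lri, config):
        # Same filesystem semantics, reused rather than reimplemented.
        return filesystem_add_config_data(lri, config)
```

Both stores get identical per-directory behaviour from one routine, which
is the point of the option; the cost is that every filesystem-based store
has to remember to call it.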

At any rate, I was curious to know what anyone else might think about this
topic. Any comments?

Thanks!

P.S. One of these days, I might even write some code to exercise my
thoughts... I had planned to do just that these last few weeks, but those
of you who have been in school no doubt recall how 'productive' vacations
(winter quarter starts Tuesday) can be ;)

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Dirk-Willem van Gulik <di...@jrc.it>.

Ben Laurie wrote:
> 
> Hmmm ... there is a certain appeal in doing it all in LRIs...
> 
> URI:/x/y
> -> LRI:sandwich:/a/b/y
> -> LRI:cat:file:/a/b/header,file:/a/b/y,file:/a/b/footer
> -> concatenate (LRI:file:/a/b/header,
>                 LRI:file:/a/b/y,
>                 LRI:file:/a/b/footer)
> 
> the last bit being what the "cat:" LRI handler does, but everything up
> to there being done by regexps (or some other mechanism)...
> 
> Am I making _any_ sense?

Actually, yes. We started changing things (the log files and other
places where references are made to files, pipes, HTTP proxies
or straight sockets) to use things like

	ErrorLog	socket://loghost:1234
	ResourceType	http://confhost.jrc.it/srm.conf

And so on. And it certainly works well when you start combining
it with multiple alternatives to do fall-through or load 
sharing; it is a natural syntax.
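
A rough sketch of how such URL-style targets with fall-through might be
dispatched, in Python for brevity. The open_first() function and the
openers table are hypothetical names for illustration; nothing here is
taken from the actual patches described above.

```python
from urllib.parse import urlsplit

def open_first(targets, openers):
    """Try each target in order and return the first sink that opens."""
    errors = []
    for target in targets:
        # A bare path has no scheme, so treat it as a plain file.
        scheme = urlsplit(target).scheme or "file"
        try:
            return openers[scheme](target)
        except (KeyError, OSError) as exc:
            # Unknown scheme or unreachable target: fall through to the next.
            errors.append((target, exc))
    raise OSError("no usable target: %r" % (errors,))
```

Listing several targets for one directive then gives fall-through for
free: the first one that can be opened wins.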

Dw.

Re: more wacky A2 ideas

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Mon, 4 Jan 1999, Ben Laurie wrote:

> > I suppose... that might not be a bad idea; splitting the data store into
> > two sections, in (possibly) different modules: one that determines which
> > object in the data store the request refers to, and the other that
> > actually delivers it. Hmm.
> 
> Not just which object, but other aspects (such as what configuration
> applies, who owns it, etc and blah).

Right. That's what I meant: "Everything except the part where the data
store writes the data out to the stream." The equivalent in A1 of the
'handler' phase, I imagine.

> > > I'm noticing a trend in your snipping, BTW - you seem to be avoiding
> > > anything that leads towards inheritance - I think that is a mistake.
>
> Well, one view is that CGIs and files inherit from an abstract
> "hierarchical" type. Multiple inheritance may be needed to make this
> useful, though... hmmm...

Interesting idea. It could provide another way to handle the CGI/file
problem - rather than separating out the data store functions as described
above, the CGI data store could just inherit from the file data store, and
override the content delivery (handler) method.

I like that...
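
A small sketch of that inheritance idea, written in Python for brevity.
The class and method names (locate, extension, deliver) are hypothetical;
the only thing taken from the discussion is that the CGI store reuses
everything from the file store except the delivery step.

```python
class FileDataStore:
    """Locates a resource and knows its filesystem metadata."""

    def __init__(self, root):
        self.root = root

    def locate(self, lri):
        # Shared step: map the opaque LRI string to a path under the root.
        return self.root.rstrip("/") + "/" + lri.lstrip("/")

    def extension(self, lri):
        # Shared metadata: text after the last dot of the last path segment.
        name = lri.rsplit("/", 1)[-1]
        return name.rsplit(".", 1)[1] if "." in name else ""

    def deliver(self, lri):
        # Default delivery: describe sending the file contents as-is.
        return ("send-file", self.locate(lri))


class CgiDataStore(FileDataStore):
    def deliver(self, lri):
        # Only delivery differs: the file is executed, not streamed.
        return ("execute", self.locate(lri))
```

Everything that depends on the path (location, extension, and by
extension permissions and per-directory config) is inherited unchanged.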

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Ben Laurie <be...@algroup.co.uk>.

Alexei Kosut wrote:
> 
> On Sun, 3 Jan 1999, Ben Laurie wrote:
> 
> > OK, I get that. Isn't the solution to separate LRI metadata from LRI
> > data? So CGI and file share (some) metadata but have their own way of
> > deriving content.
> 
> I suppose... that might not be a bad idea; splitting the data store into
> two sections, in (possibly) different modules: one that determines which
> object in the data store the request refers to, and the other that
> actually delivers it. Hmm.

Not just which object, but other aspects (such as what configuration
applies, who owns it, etc and blah).

> > I'm noticing a trend in your snipping, BTW - you seem to be avoiding
> > anything that leads towards inheritance - I think that is a mistake.
> 
> Well, I'm not doing it intentionally, I promise. I like inheritance.
> Really. Nearly all the code I've written in the last year has been object
> oriented code (C++, Java and Perl 5). I'm just not sure where
> inheritance fits into this. Could you elucidate?

Well, one view is that CGIs and files inherit from an abstract
"hierarchical" type. Multiple inheritance may be needed to make this
useful, though... hmmm...

> 
> [...]
> 
> > I'm trying hard not to mention C++ here.... :-) (and, seriously, this
> > kind of inheritance should not be linked with C++ - we're talking about
> > a more limited kind of object-orientation).
> 
> I seem to recall the proper language term for classes that have
> inheritance without polymorphism as being 'object-based'. But that could
> be wrong.

Well, polymorphism may not be so bad, either, but what I don't want to
do is link "object orientation" and "C++" in this conversation.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: more wacky A2 ideas

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Sun, 3 Jan 1999, Ben Laurie wrote:

> OK, I get that. Isn't the solution to separate LRI metadata from LRI
> data? So CGI and file share (some) metadata but have their own way of
> deriving content.

I suppose... that might not be a bad idea; splitting the data store into
two sections, in (possibly) different modules: one that determines which
object in the data store the request refers to, and the other that
actually delivers it. Hmm.

> I'm noticing a trend in your snipping, BTW - you seem to be avoiding
> anything that leads towards inheritance - I think that is a mistake.

Well, I'm not doing it intentionally, I promise. I like inheritance.
Really. Nearly all the code I've written in the last year has been object
oriented code (C++, Java and Perl 5). I'm just not sure where
inheritance fits into this. Could you elucidate?

[...]

> I'm trying hard not to mention C++ here.... :-) (and, seriously, this
> kind of inheritance should not be linked with C++ - we're talking about
> a more limited kind of object-orientation).

I seem to recall the proper language term for classes that have
inheritance without polymorphism as being 'object-based'. But that could
be wrong.

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Ben Laurie <be...@algroup.co.uk>.

Alexei Kosut wrote:
> 
> On Sun, 3 Jan 1999, Ben Laurie wrote:
> 
> > I'm obviously missing something - clearly _something_ knows about files.
> > All we need to do (All? Hah!) is push the file-dependent stuff down into
> > the file module (which is where it belongs anyway, right?).
> 
> Okay, I know what you're missing. This is what I was talking about in my
> initial message - yes, something needs to know about files. And putting
> that all into the filesystem data store was my first inclination, too. But
> how do you do CGIs then? You can't make them a filter (at least, I've
> never heard of an OS that supports streamed executables); they really
> should be their own data store. But they also need to behave exactly like
> files...

OK, I get that. Isn't the solution to separate LRI metadata from LRI
data? So CGI and file share (some) metadata but have their own way of
deriving content.

I'm noticing a trend in your snipping, BTW - you seem to be avoiding
anything that leads towards inheritance - I think that is a mistake.

> > > URI-to-LRI ^/imagemaps/(.*) file-data-store("$1") imap-filter("CERN")
> > >
> > > URI-to-LRI ^/html(.*) file-data-store("/usr/www/htdocs$1")
> >
> > I find the lack of slashes here disconcerting. Was it deliberate?
> 
> I suppose. The exact syntax isn't terribly concerning to me right now.

Good. Just a passing thought.

> > > <LRI file-data-store *>
> > > AddHandler server-side-includes shtml
> > > AddFilterToLRIByHandler server-side-includes includes-filter()
> > > </LRI>
> >
> > Bong! shtml is an extension, but * matches an opaque string, no? Or is
> > it sufficiently unopaque that we are going to believe in "extensions"?
> 
> Oh. Good point. Yeah, I guess extensions are a file-based thing. Those go
> in the file module too, I guess (makes it an interesting exercise to
> figure out how negotiation works, though - who gets to pick the variant,
> the core or the data store. It should be the core, but that means we need
> a mechanism for the data store to say "well, I've got eight
> representations of this object - which one do you want?" to the core.  It
> may mean you need three representations of each resource: the URI, the
> data store+all object variants, and the LRI (i.e., the chosen resource +
> filters). Ergh.)

Err. I'm not quite up to thinking this through carefully (yet). Can I
take a rain check?

> No, wait... AddHandler is probably a configuration directive of the file
> data store. That would make sense - it'd be illegal to use it inside of an
> <LRI oracle-db> section, but in an <LRI file-data-store> section, it's
> valid, yes?

AddHandler should be a config directive of "things that have
extensions". Of which "hierarchical things" is a derivative. Of which
"file" is a derivative.

I'm trying hard not to mention C++ here.... :-) (and, seriously, this
kind of inheritance should not be linked with C++ - we're talking about
a more limited kind of object-orientation).
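
One way to picture that chain of derivatives, sketched in Python for
brevity. Apart from AddHandler sitting on "things that have extensions",
which is what the text proposes, the placement of the other directives
and all class names are assumptions for illustration only.

```python
class Resource:
    directives = set()

    @classmethod
    def allowed_directives(cls):
        # Collect directives declared anywhere along the inheritance chain.
        allowed = set()
        for klass in cls.__mro__:
            allowed |= getattr(klass, "directives", set())
        return allowed

class HasExtensions(Resource):
    directives = {"AddHandler"}

class Hierarchical(HasExtensions):
    directives = {"AllowOverride"}   # assumed placement, for illustration

class File(Hierarchical):
    directives = {"Options"}         # assumed placement, for illustration

class OracleDb(Resource):
    directives = set()               # opaque keys: no extensions, no hierarchy
```

With this, a config parser could reject AddHandler inside a section for
a database store while accepting it for files, with no inheritance
machinery beyond a lookup along the chain.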

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: more wacky A2 ideas

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Mon, 4 Jan 1999, Konstantin Chuguev wrote:

> IMO, there should be an abstract data store layer (module), offering
> direct and/or sequential access to the raw data.

While this seems plausible at first, I don't think it would necessarily
work out. The requirements quickly become too complex and specific to
individual data stores. Better to make an API (as Ben suggested) that
allows the separation of delivery of a resource from the location of that
resource, but where the two processes are tied to the same data store.

The main application of what you propose - access to streamed data (either
in entirety or partial) from a data store - is already made available in
the A2 plans via filters or sub-requests (which, unlike A1, will I imagine
be readable by the module instead of being shunted directly to the
output). And 'raw' access to the data store requires knowledge of the data
store, so we may as well integrate them.

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Konstantin Chuguev <jo...@urc.ac.ru>.

Alexei Kosut wrote:
> 
> After sending this, I thought of one possible solution: The problem is
> that we have multiple data stores (the equivalent of probably 80% of the
> A1 modules today) that all share exactly the same characteristics dealing
> with the file system, except one: actually turning the file on disk into
> content. The metadata (extension), permissions, config (htaccess) stuff is
> all identical. So why not give the file data store modules of its own?
> A1's 'handler' phase concept is perfect for this. Allow the file data
> store to configure each of its LRIs with a handler (probably by LRI or
> extension, like we do today), and add modules to the file data store to
> actually serve them. They'd be passed a filename/fd and an output stream
> (of whatever sort), and they could serve it up straight, or execute it
> somehow, or parse it, or whatnot.
> 
IMO, there should be an abstract data store layer (module), offering
direct and/or sequential access to the raw data.
And various implementations based on this layer could be:
- file (more precisely, file system) data store - as now provided by A1
  core;
- database;
- HTTP/FTP/... client-side modules, getting raw data via various
  protocols (A1 already has mod_proxy :-).

The way you process the raw data provided by those modules is not
connected with their internal functionality. I mean, you may want to
use SHTML or CGI scripts stored both as files and in a database.
In the first case all you need is to obtain a FD for sequential access;
in the second one, you could ask the store layer to map temporarily a
CGI executable to a file system in some way for direct access (which is
necessary for execution).

--
	Konstantin V. Chuguev.		System administrator of Southern
	http://www.urc.ac.ru/~joy/	Ural Regional Center of FREEnet,
	mailto:joy@urc.ac.ru		Chelyabinsk, Russia.

Re: more wacky A2 ideas

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Sun, 3 Jan 1999, Alexei Kosut wrote:

> Okay, I know what you're missing. This is what I was talking about in my
> initial message - yes, something needs to know about files. And putting
> that all into the filesystem data store was my first inclination, too. But
> how do you do CGIs then? You can't make them a filter (at least, I've
> never heard of an OS that supports streamed executables); they really
> should be their own data store. But they also need to behave exactly like
> files...

After sending this, I thought of one possible solution: The problem is
that we have multiple data stores (the equivalent of probably 80% of the
A1 modules today) that all share exactly the same characteristics dealing
with the file system, except one: actually turning the file on disk into
content. The metadata (extension), permissions, config (htaccess) stuff is
all identical. So why not give the file data store modules of its own? 
A1's 'handler' phase concept is perfect for this. Allow the file data
store to configure each of its LRIs with a handler (probably by LRI or
extension, like we do today), and add modules to the file data store to
actually serve them. They'd be passed a filename/fd and an output stream
(of whatever sort), and they could serve it up straight, or execute it
somehow, or parse it, or whatnot.

I kind of like this solution, it has a certain elegance... Heck, maybe we
could even use the A1 API (or a subset) as the API for file data store
modules. It fits pretty well to what I'm thinking.
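
A sketch of that arrangement, in Python for brevity: the file data store
owns the extension-to-handler mapping, and handler modules only turn a
file into content. All names here (register, add_handler, serve, the two
handler functions) are hypothetical, and the handlers write placeholder
text rather than reading or executing anything.

```python
import io
import posixpath

class FileDataStore:
    def __init__(self):
        self.handlers = {}       # handler name -> callable(filename, out)
        self.by_extension = {}   # file extension -> handler name

    def register(self, name, func):
        self.handlers[name] = func

    def add_handler(self, name, extension):
        self.by_extension[extension] = name

    def serve(self, filename, out):
        # The store picks the handler; the handler produces the content.
        extension = posixpath.splitext(filename)[1].lstrip(".")
        name = self.by_extension.get(extension, "default")
        self.handlers[name](filename, out)

def send_as_is(filename, out):
    out.write("contents of " + filename)

def run_cgi(filename, out):
    out.write("output of running " + filename)

store = FileDataStore()
store.register("default", send_as_is)
store.register("cgi-script", run_cgi)
store.add_handler("cgi-script", "cgi")

out = io.StringIO()
store.serve("/usr/www/cgi-bin/test.cgi", out)
```

Permissions, extensions and per-directory config would all live in
serve(), before the handler is called, so no handler has to repeat them.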

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Sun, 3 Jan 1999, Ben Laurie wrote:

> I'm obviously missing something - clearly _something_ knows about files.
> All we need to do (All? Hah!) is push the file-dependent stuff down into
> the file module (which is where it belongs anyway, right?).

Okay, I know what you're missing. This is what I was talking about in my
initial message - yes, something needs to know about files. And putting
that all into the filesystem data store was my first inclination, too. But
how do you do CGIs then? You can't make them a filter (at least, I've
never heard of an OS that supports streamed executables); they really
should be their own data store. But they also need to behave exactly like
files...

> > URI-to-LRI ^/imagemaps/(.*) file-data-store("$1") imap-filter("CERN")
> > 
> > URI-to-LRI ^/html(.*) file-data-store("/usr/www/htdocs$1")
> 
> I find the lack of slashes here disconcerting. Was it deliberate?

I suppose. The exact syntax isn't terribly concerning to me right now.

> 
> > <LRI file-data-store *>
> > AddHandler server-side-includes shtml
> > AddFilterToLRIByHandler server-side-includes includes-filter()
> > </LRI>
> 
> Bong! shtml is an extension, but * matches an opaque string, no? Or is
> it sufficiently unopaque that we are going to believe in "extensions"?

Oh. Good point. Yeah, I guess extensions are a file-based thing. Those go
in the file module too, I guess (makes it an interesting exercise to
figure out how negotiation works, though - who gets to pick the variant,
the core or the data store. It should be the core, but that means we need
a mechanism for the data store to say "well, I've got eight
representations of this object - which one do you want?" to the core.  It
may mean you need three representations of each resource: the URI, the
data store+all object variants, and the LRI (i.e., the chosen resource +
filters). Ergh.)
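
A toy sketch of that division of labour, in Python for brevity: the data
store reports the variants it has and the core chooses among them. The
list_variants() and choose_variant() names, and the shape of the variant
records, are assumptions for illustration; real negotiation weighs far
more than one media-type preference.

```python
class FileDataStore:
    def __init__(self, variants):
        self.variants = variants   # opaque LRI string -> list of variant dicts

    def list_variants(self, lri):
        # The store reports what it has; it does not choose.
        return self.variants.get(lri, [])

def choose_variant(variants, accept):
    # Core-side choice: highest client preference wins, first listed on ties.
    best, best_q = None, 0.0
    for variant in variants:
        q = accept.get(variant["type"], 0.0)
        if q > best_q:
            best, best_q = variant, q
    return best
```

The chosen variant, plus whatever filters apply, would then become the
final LRI, which matches the three-representation picture above.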

No, wait... AddHandler is probably a configuration directive of the file
data store. That would make sense - it'd be illegal to use it inside of an
<LRI oracle-db> section, but in an <LRI file-data-store> section, it's
valid, yes?

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Ben Laurie <be...@algroup.co.uk>.

Alexei Kosut wrote:
> 
> On Sun, 3 Jan 1999, Ben Laurie wrote:
> 
> > Hmmm ... perhaps I'm missing the point here, but wouldn't the point of
> > an LRI be that it actually identified a resource, including, of course,
> > a file. So I'd imagine that an LRI that referred to a file would look
> > something like "file:/a/b/c". Then a <Directory /x/y/z> directive maps
> > neatly on to the (new) <LRI file:/x/y/z/.*> directive... and so on.
> 
> Probably. The problem I was thinking of was specifically htaccess files,
> and a few others, which can't be solved in a generic manner.

I don't understand why htaccess can't be solved in a generic manner? I
suppose you mean it can't be solved for non-hierarchical resources?
However, files are not the only example of hierarchical resources.
Access control can be solved for non-hierarchical resources, of course,
but not in a way that relies on hierarchy. So, I contend that htaccess
as we know it is something that only makes sense on a hierarchical
resource. So it isn't a problem (but it does imply that we need to know
whether a resource is hierarchical).

> > So "file:" LRIs do have some special properties (e.g. they know about
> > symlinks) but they also have general ones ("directories").
> 
> Right. What I was referring to is that the same properties that the file
> data store has to deal with, the CGI data store does as well, etc...; how
> to produce all current Apache functionality with a model that knows
> nothing about files does present some issues.

I'm obviously missing something - clearly _something_ knows about files.
All we need to do (All? Hah!) is push the file-dependent stuff down into
the file module (which is where it belongs anyway, right?).

> Especially if you want to promote other backing systems to the same level
> as filesystems. i.e., if several data store modules are based on access to
> an Oracle database, and that database is set up with certain properties,
> e.g., the equivalent of file permissions, htaccess files, whichever, there
> should be some way of sharing that code between data stores. The obvious
> way to do that is just to have them both call the same functions, in a
> library or something. Which is why I suggested doing it the same way for a
> filesystem.

There are two aspects to htaccess which need to be separated, IMO - what
they contain (i.e. configuration stuff) and which ones we use (see
above). Clearly configuration stuff can be handled by the general
configuration machinery. Which ones we use is an (the?) interesting
question.

> > Now the thing that exercises me is that LRIs may, in theory, map on to
> > each other in mysterious ways. For example, URI:/x/y maps to
> > LRI:sandwich:/a/b/y which maps to the concatenation of
> > LRI:file:/a/b/header, LRI:file:/a/b/y, LRI:file:/a/b/footer. So what's
> > the problem? Perhaps none, but at what points do the configuration rules
> > apply in all this? Will configuring/debugging something like this drive
> > everyone mad? Or will some kind of beautiful elegance drop out of the
> > bottom?
> 
> I hope so. Simon was claiming he had some brilliant ideas on how to do
> configuration for all this, but we haven't heard from him recently :)
> 
> I would say that "maps to the concatenation of" is probably not a function
> of the server, but of a data store. i.e., a 'cat' data store module might
> take a list of other LRIs, which it included (via a sub-request-like
> mechanism).

Yes, that's what I meant, too (at least, I did by the end of my mail).

> An LRI, as I envision it, is the name of a data store module, an opaque
> string to pass to that module, and a list of filters (with opaque strings
> on those too) to apply to the output of the data store. That's
> possibly simple enough to configure:
> 
> URI-to-LRI ^/imagemaps/(.*) file-data-store("$1") imap-filter("CERN")
> 
> URI-to-LRI ^/html(.*) file-data-store("/usr/www/htdocs$1")

I find the lack of slashes here disconcerting. Was it deliberate?

> <LRI file-data-store *>
> AddHandler server-side-includes shtml

Bong! shtml is an extension, but * matches an opaque string, no? Or is
it sufficiently unopaque that we are going to believe in "extensions"?

> AddFilterToLRIByHandler server-side-includes includes-filter()
> </LRI>
> 
> Or something.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: more wacky A2 ideas

Posted by Ben Hyde <bh...@pobox.com>.

Humm...

> LRI:sandwich:/a/b/y which maps to the concatenation of
> LRI:file:/a/b/header, LRI:file:/a/b/y, LRI:file:/a/b/footer. So what's
> the problem? Perhaps none, but at what points do the configuration rules

In my more abstract moods I look at this as two phases:
  A) Request -> plan how to respond -> Pipeline
  B) Pipeline -> evaluate plan -> Response

If the planning phase doesn't work out we return an error.  After that
we are committed to sending the response.  In step B we have to worry
about all the issues of extreme efficiency like zero copy, buffering
to avoid tiny packets, etc. etc.

The planning phase looks to be a matter of interleaving the fleshing out
of the pipeline and the validation of that plan.  Selecting caches, 
files, transforms, and paste up to use.  All the esoteric fun
of pattern matching (regex, production systems, whatever floats your
boat).  The validation checks permissions, configuration settings,
etc. etc.
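
A bare-bones sketch of the two phases, in Python for brevity. The plan()
and evaluate() names, the PlanError exception, and the use of a flat
list for the pipeline are all simplifications chosen for illustration.

```python
class PlanError(Exception):
    """Raised in phase A, while an error response is still possible."""

def plan(path, sources, filters, allowed):
    # Phase A: flesh out the pipeline and validate it, interleaved.
    if path not in sources:
        raise PlanError("not found: " + path)
    if path not in allowed:
        raise PlanError("forbidden: " + path)
    return [sources[path]] + list(filters.get(path, []))

def evaluate(pipeline):
    # Phase B: no more decisions, just push data through the stages.
    source, *stages = pipeline
    data = source()
    for stage in stages:
        data = stage(data)
    return data
```

Every failure that maps to an error response happens inside plan();
once evaluate() starts, the only remaining concerns are the efficiency
ones listed under step B.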

The real nub of the problem is how rich a data structure comes out of
A for use by B.  Looks like a tree to me.  Looks like the nodes are
the direct evolutionary descendants of the request_rec, i.e. the leaves
need to support ap_rprintf et al.  If only to avoid alienating the
users.  What are the methods on request_rec, how _few_ can we leave in
the core?

It's all a little frightening, or fun depending on one's mood.

It's fun if all the core has to do is support a way to hook in new
kinds of nodes, document the _few_ required methods, and allow some
amount of additional methods on them.  It would be extremely nice to
avoid having the group decide how symbolic vs. textual all this
planning and evaluating ought to be and instead just focus on fast
primitive elements and a good API to add elements.  I.e. you don't do
caches, or charsets, or compression, or pasteup, or new negotiation in
A2, you try to be sure they can be implemented in A2.  Probably by
shipping proofs as plug ins with A2.

Part of what the core delivers is a place to store configuration
information.  One module can collect that info, and another can
use it.  The configuration settings in the various hierarchical
namespaces (file, directory, server) can be tapped into during
the planning as it checks permissions.

I'll shut up now.

 - ben

Re: more wacky A2 ideas

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Sun, 3 Jan 1999, Ben Laurie wrote:

> Hmmm ... perhaps I'm missing the point here, but wouldn't the point of
> an LRI be that it actually identified a resource, including, of course,
> a file. So I'd imagine that an LRI that referred to a file would look
> something like "file:/a/b/c". Then a <Directory /x/y/z> directive maps
> neatly on to the (new) <LRI file:/x/y/z/.*> directive... and so on.

Probably. The problem I was thinking of was specifically htaccess files,
and a few others, which can't be solved in a generic manner.

> So "file:" LRIs do have some special properties (e.g. they know about
> symlinks) but they also have general ones ("directories").

Right. What I was referring to is that the same properties that the file
data store has to deal with, the CGI data store does as well, etc...; how
to produce all current Apache functionality with a model that knows
nothing about files does present some issues.

Especially if you want to promote other backing systems to the same level
as filesystems. i.e., if several data store modules are based on access to
an Oracle database, and that database is set up with certain properties,
e.g., the equivalent of file permissions, htaccess files, whichever, there
should be some way of sharing that code between data stores. The obvious
way to do that is just to have them both call the same functions, in a
library or something. Which is why I suggested doing it the same way for a
filesystem.

> Now the thing that exercises me is that LRIs may, in theory, map on to
> each other in mysterious ways. For example, URI:/x/y maps to
> LRI:sandwich:/a/b/y which maps to the concatenation of
> LRI:file:/a/b/header, LRI:file:/a/b/y, LRI:file:/a/b/footer. So what's
> the problem? Perhaps none, but at what points do the configuration rules
> apply in all this? Will configuring/debugging something like this drive
> everyone mad? Or will some kind of beautiful elegance drop out of the
> bottom?

I hope so. Simon was claiming he had some brilliant ideas on how to do
configuration for all this, but we haven't heard from him recently :)

I would say that "maps to the concatenation of" is probably not a function
of the server, but of a data store. i.e., a 'cat' data store module might
take a list of other LRIs, which it included (via a sub-request-like
mechanism).

An LRI, as I envision it, is the name of a data store module, an opaque
string to pass to that module, and a list of filters (with opaque strings
on those too) to apply to the output of the data store. That's
possibly simple enough to configure:

URI-to-LRI ^/imagemaps/(.*) file-data-store("$1") imap-filter("CERN")

URI-to-LRI ^/html(.*) file-data-store("/usr/www/htdocs$1")
<LRI file-data-store *>
AddHandler server-side-includes shtml
AddFilterToLRIByHandler server-side-includes includes-filter()
</LRI>

Or something.
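
To make the shape of an LRI concrete, here is a small Python sketch of
the two URI-to-LRI rules above. The LRI tuple, the RULES table and
uri_to_lri() are hypothetical names, and the "$1" back-reference is
written as "\1" because that is what Python's re module expects.

```python
import re
from typing import NamedTuple

class LRI(NamedTuple):
    store: str       # name of the data store module
    arg: str         # opaque string handed to that module
    filters: tuple   # (filter name, opaque string) pairs, applied in order

# (pattern, data store, argument template, filters), mirroring the rules above.
RULES = [
    (r"^/imagemaps/(.*)", "file-data-store", r"\1", (("imap-filter", "CERN"),)),
    (r"^/html(.*)", "file-data-store", r"/usr/www/htdocs\1", ()),
]

def uri_to_lri(uri):
    # First matching rule wins; no match means no LRI for this URI.
    for pattern, store, template, filters in RULES:
        match = re.match(pattern, uri)
        if match:
            return LRI(store, match.expand(template), filters)
    return None
```

For example, uri_to_lri("/html/index.html") yields the file-data-store
with the argument "/usr/www/htdocs/index.html" and no filters.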

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Ben Laurie <be...@algroup.co.uk>.

Alexei Kosut wrote:
> One option is to special-case filenames in the core. I don't like this
> idea, but it would work. Another is to provide, as part of the core, a
> library of filesystem routines that data stores can use to easily
> implement filesystem-based stuff (e.g., just have
> your_module_add_config_data() call
> ap_filesystem_add_config_data("filename")). This is probably the best
> idea, but it doesn't *quite* sit well with me.

Hmmm ... perhaps I'm missing the point here, but wouldn't the point of
an LRI be that it actually identified a resource, including, of course,
a file. So I'd imagine that an LRI that referred to a file would look
something like "file:/a/b/c". Then a <Directory /x/y/z> directive maps
neatly on to the (new) <LRI file:/x/y/z/.*> directive... and so on.

So "file:" LRIs do have some special properties (e.g. they know about
symlinks) but they also have general ones ("directories").

Now the thing that exercises me is that LRIs may, in theory, map on to
each other in mysterious ways. For example, URI:/x/y maps to
LRI:sandwich:/a/b/y which maps to the concatenation of
LRI:file:/a/b/header, LRI:file:/a/b/y, LRI:file:/a/b/footer. So what's
the problem? Perhaps none, but at what points do the configuration rules
apply in all this? Will configuring/debugging something like this drive
everyone mad? Or will some kind of beautiful elegance drop out of the
bottom?

Hmmm ... there is a certain appeal in doing it all in LRIs...

URI:/x/y
-> LRI:sandwich:/a/b/y
-> LRI:cat:file:/a/b/header,file:/a/b/y,file:/a/b/footer
-> concatenate (LRI:file:/a/b/header,
                LRI:file:/a/b/y,
                LRI:file:/a/b/footer)

the last bit being what the "cat:" LRI handler does, but everything up
to there being done by regexps (or some other mechanism)...
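
A sketch of that chain in Python, for illustration only. The rewrite
table, the in-memory FILES dict and the function names are all
hypothetical; the regexps are just one way to get from URI:/x/y to the
"cat:" form shown above.

```python
import re

# Hypothetical in-memory stand-in for the files named in the example.
FILES = {"/a/b/header": "<h>", "/a/b/y": "body", "/a/b/footer": "</h>"}

# Regexp rewrites, applied repeatedly until none of them matches.
REWRITES = [
    (r"^URI:/x/(.*)$", r"LRI:sandwich:/a/b/\1"),
    (r"^LRI:sandwich:(.*)/([^/]+)$",
     r"LRI:cat:file:\1/header,file:\1/\2,file:\1/footer"),
]

def rewrite(ident):
    changed = True
    while changed:
        changed = False
        for pattern, template in REWRITES:
            new, count = re.subn(pattern, template, ident)
            if count:
                ident, changed = new, True
                break
    return ident

def resolve(lri):
    scheme, rest = lri[len("LRI:"):].split(":", 1)
    if scheme == "file":
        return FILES[rest]
    if scheme == "cat":
        # Each part is fetched like a sub-request, then concatenated.
        return "".join(resolve("LRI:" + part) for part in rest.split(","))
    raise ValueError("no handler for " + scheme)
```

Note that nothing here stops a badly written rule set from looping
forever, which is one concrete form of the configuration worry raised
above.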

Am I making _any_ sense?

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html

"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
     - Indira Gandhi

Re: more wacky A2 ideas

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.

On Tue, 5 Jan 1999, Martin Pool wrote:

> On Sun, Jan 03, 1999 at 04:56:28AM -0800, Alexei Kosut wrote:
> 
> > To recap: based on its parameters (URL, headers, etc...), the server maps
> > the request to what I've been calling an LRI (Local Resource Identifier):
> > an object from a 'data store', plus zero or more content filters. These
> > are layered, with a 'magic cache' in between each layer (see my previous
> > emails for great detail - it doesn't affect what I'm talking about here).
> > The object is created, filtered, processed, diced, sliced and delivered to
> > the client.
> 
> That sounds a lot like what the Java-Apache project (the
> great-grandchild of Alexei's servlet interface) does now.  We want to
> use a portion of the server's namespace to identify Java classnames
> rather than filenames:
> http://victim/servlets/org/apache/java/TestServlet is looked up using
> a Java classloader, rather than by descending through directories.
> What's more, we want to map different directories to different
> 'zones' within the Java half, and there's a concept rather like LRIs
> in place already.

Precisely; what I want from LRIs (or at least the data store part of
them - my concept for LRIs also includes a stack of content filters) is
the ability to make any arbitrary namespace for locating an object a
first-class citizen, not hacked around the current file-based model.

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *

Re: more wacky A2 ideas

Posted by Martin Pool <mb...@wistful.humbug.org.au>.

On Sun, Jan 03, 1999 at 04:56:28AM -0800, Alexei Kosut wrote:

> To recap: based on its parameters (URL, headers, etc...), the server maps
> the request to what I've been calling an LRI (Local Resource Identifier):
> an object from a 'data store', plus zero or more content filters. These
> are layered, with a 'magic cache' in between each layer (see my previous
> emails for great detail - it doesn't affect what I'm talking about here).
> The object is created, filtered, processed, diced, sliced and delivered to
> the client.

That sounds a lot like what the Java-Apache project (the
great-grandchild of Alexei's servlet interface) does now.  We want to
use a portion of the server's namespace to identify Java classnames
rather than filenames:
http://victim/servlets/org/apache/java/TestServlet is looked up using
a Java classloader, rather than by descending through directories.
What's more, we want to map different directories to different
'zones' within the Java half, and there's a concept rather like LRIs
in place already.

> This seems like
> a neat idea: One could store configuration information in a database, or
> even generate it dynamically! (No, I'm not sure what that's good for,
> either)

Perhaps for companies who generate their server configuration from a
database of customers?

-- 
Martin Pool