Posted to dev@httpd.apache.org by Alexei Kosut <ak...@leland.Stanford.EDU> on 1998/09/10 03:55:40 UTC

A Magic Cache example

During the drive home, I came up with a good example of how I envision the
new module/cache/layer model thingy working. Comments please:

The middle end of the server is responsible for taking the request the
front end gives it and somehow telling the back end how to fulfill it. I
look at it like this: The request is a URI (Uniform Resource Identifier)
and a set of request dimensions (the request headers, the remote IP
address, the time of day, etc...). The middle end, via its configuration,
translates this into a request for content from a backing store module,
plus possibly some filter modules. Since the term "filename" is too
flat-file specific, let's call the parameter we pass to the backing store
an SRI (Specific Resource Identifier), in a format specific to that module.

Our example is similar to the one I was using earlier, with some
additions: The request is for a URI, say "/skzb/teckla.html". The response
is a lookup from a (slow) database. The URI maps to the mod_database SRI
of "BOOK:0-441-79977-9" (I made that format up). We want to take that
output and convert it from whatever charset it's in into Unicode. We then
have a PHP script that works on a Unicode document and does things based
on whether the browser is Netscape or not. Then we translate the document
to the best charset that matches the characters used and the client's
capabilities and send it.

So upon request for /skzb/teckla.html, the middle end translates the
request into the following "equation":

        SRI: mod_database("BOOK:0-441-79977-9")
    +   filter: mod_charset("Unicode")
    +   filter: mod_php()
    +   filter: mod_charset("best_fit")
 -------------------------------------------------
        URI: /skzb/teckla.html

It then constructs a stack of IO (NSPR) filters like this:

mod_database -> cache-write -> mod_charset -> cache-write -> mod_php ->
cache-write -> mod_charset -> cache-write -> client

And sets it to running. Each of the cache filters is a write-through
filter that copies its data into the cache with a tag based on what
equation the middle end uses to get to it, plus the request dimensions it
uses (info it gets from the modules).
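A rough sketch of what one of those write-through filters might look
like, written in Python rather than against NSPR for brevity. Every name
here (cache_write, to_upper, the dict used as a cache) is made up for
illustration and is not part of any Apache API:

```python
def cache_write(chunks, cache, tag, dimensions=()):
    """Write-through filter: pass chunks on unchanged, keep a copy in the cache."""
    saved = []
    for chunk in chunks:
        saved.append(chunk)
        yield chunk
    # Store only after the upstream stage has been fully consumed.
    cache[(tag, tuple(dimensions))] = b"".join(saved)


def to_upper(chunks):
    """Stand-in for a real filter module such as mod_charset."""
    for chunk in chunks:
        yield chunk.upper()


cache = {}
source = iter([b"tec", b"kla"])  # stand-in for the backing store output
stream = cache_write(
    to_upper(cache_write(source, cache, "SRI: source()")),
    cache,
    "SRI: source() + filter: to_upper()",
)
body = b"".join(stream)  # what the client would receive
```

Each cache_write stage sees the data exactly once on its way to the
client, so the caching costs one extra copy per stage and no extra pass.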

The database access is stored under "SRI: mod_database(BOOK:0-441-79977-9)"
with no dimensions (because it's the same for all requests). The first
charset manipulation is stored under "SRI: mod_database(BOOK...) + filter:
mod_charset(Unicode)", again with no dimensions. The PHP output is stored
under "SRI: mod_database(BOOK...) + filter: mod_charset(Unicode) + filter:
mod_php()" with dimensions of (User-Agent). The final output is stored both
as "SRI: mod_database(BOOK...) + filter: mod_charset(Unicode) + filter:
mod_php() + filter: mod_charset(best_fit)" and "URI: /skzb/teckla.html"
(they're the same thing), both with dimensions of (User-Agent,
Accept-Charset).
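To make the tagging concrete, here is a small sketch that derives those
tags and dimension lists from the equation. The Stage class and
cache_tags function are hypothetical, and I'm assuming each module
reports which request dimensions it reads and that the dimensions simply
accumulate down the stack:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    kind: str               # "SRI" for the backing store, "filter" otherwise
    module: str
    arg: str = ""
    dimensions: tuple = ()  # request dimensions this stage reports using

    def term(self):
        return f"{self.kind}: {self.module}({self.arg})"


def cache_tags(stages):
    """Return one (equation prefix, accumulated dimensions) pair per stage."""
    terms, dims, tags = [], [], []
    for stage in stages:
        terms.append(stage.term())
        for name in stage.dimensions:
            if name not in dims:
                dims.append(name)
        tags.append((" + ".join(terms), tuple(dims)))
    return tags


PIPELINE = [
    Stage("SRI", "mod_database", "BOOK:0-441-79977-9"),
    Stage("filter", "mod_charset", "Unicode"),
    Stage("filter", "mod_php", "", ("User-Agent",)),
    Stage("filter", "mod_charset", "best_fit", ("Accept-Charset",)),
]

tags = cache_tags(PIPELINE)
for tag, dims in tags:
    print(dims, tag)
```

The last tag is the one that would also be stored under the URI itself.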

So far so good. Now, when another request for /skzb/teckla.html comes in,
the cache is consulted to see how much we can use. First, the URI is
looked up. This can be done by a kernel or other streamlined part of the
server. So "URI: /skzb/teckla.html" is looked up, and one entry pops out
with dimensions of (User-Agent, Accept-Charset). The user-agent and
accept-charset of the request are compared against the ones of the stored
entry(ies). If one matches, it can be sent directly.

If not, the server proceeds to look up "SRI: mod_database(BOOK...) +
filter: mod_charset(Unicode) + filter: mod_php()". If the request has a
different accept-charset, but the same user-agent, then this can be
reprocessed by mod_charset and used. Otherwise, the server proceeds back
to "SRI: mod_database(BOOK...) + filter: mod_charset(Unicode)", which will
match any request. There's probably some sort of cache invalidation
(expires, etc...) that happens eventually to result in a new database
lookup, but mostly, that very costly operation is avoided.
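The lookup described above could be sketched roughly as follows. This is
only an illustration: the lookup function, the layout of the cache dict,
and the header values are all assumptions, and invalidation is left out
entirely:

```python
def lookup(cache, uri, equation, request):
    """Return (cached body, filters still to run), most complete entry first.

    cache maps a tag to a list of (stored dimension values, body) pairs.
    """
    # The URI entry is the same thing as the full equation, so try it first.
    candidates = [("URI: " + uri, [])]
    for n in range(len(equation) - 1, 0, -1):
        candidates.append((" + ".join(equation[:n]), equation[n:]))
    for tag, remaining in candidates:
        for stored, body in cache.get(tag, []):
            # An entry with no dimensions matches every request.
            if all(request.get(k) == v for k, v in stored.items()):
                return body, remaining
    return None, list(equation)  # nothing usable: start from the backing store


EQUATION = [
    "SRI: mod_database(BOOK...)",
    "filter: mod_charset(Unicode)",
    "filter: mod_php()",
    "filter: mod_charset(best_fit)",
]
CACHE = {
    "URI: /skzb/teckla.html": [
        ({"User-Agent": "Mozilla/4.0", "Accept-Charset": "iso-8859-1"}, b"final"),
    ],
    " + ".join(EQUATION[:3]): [({"User-Agent": "Mozilla/4.0"}, b"php output")],
    " + ".join(EQUATION[:2]): [({}, b"unicode text")],
}

same_agent = {"User-Agent": "Mozilla/4.0", "Accept-Charset": "utf-8"}
hit = lookup(CACHE, "/skzb/teckla.html", EQUATION, same_agent)
```

Here the request shares the stored user-agent but not the accept-charset,
so the PHP output is reused and only the last mod_charset has to run.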

I think I've made it out to be a bit more complicated than it is, with the
long equation strings mixed in there. But the above reflects my
understanding of how the new Apache 2.0 system should work.

Note 1: The cache is smarter than I make it out here when it comes to
adding new entries. It should realize that, since the translation to
Unicode doesn't change or restrict the dimensions of the request, it
really is pointless to cache the original database lookup, since it will
always be translated in exactly the same manner. Knowing this, it will
only cache the Unicode version. 
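One way that decision might be made, as a sketch: skip a stage's output
whenever the next stage adds no dimensions of its own. The worth_caching
function is hypothetical, and "adds no dimensions" is my simplification
of "doesn't change or restrict the dimensions":

```python
def worth_caching(stages):
    """stages: list of (term, dimensions added by that stage).

    Keep a stage's output only if it is the final one, or if the next
    stage depends on a request dimension (so its output can vary).
    """
    keep = []
    for i, (term, _dims) in enumerate(stages):
        is_last = i == len(stages) - 1
        if is_last or stages[i + 1][1]:
            keep.append(term)
    return keep


STAGES = [
    ("SRI: mod_database(BOOK...)", ()),
    ("filter: mod_charset(Unicode)", ()),
    ("filter: mod_php()", ("User-Agent",)),
    ("filter: mod_charset(best_fit)", ("Accept-Charset",)),
]

kept = worth_caching(STAGES)
```

For the example, the raw database output is dropped and the other three
stages are kept.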

Note 2: PHP probably doesn't work with Unicode. And there may not be a way
to identify a script as only acting on the User-Agent dimension. That's
not the point.

Note 3: Ten bonus points to anyone who's read this far, and is the first
person to answer today's trivia question: What does the skzb referred to
in the example URI stand for? There's enough information in this mail to
figure it out (with some help from the Net), even if you don't know
offhand (though if you do, I'd be happier). 

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *



Re: A Magic Cache example

Posted by Dean Gaudet <dg...@arctic.org>.

On Wed, 9 Sep 1998, Alexei Kosut wrote:

> Oh, and documentation. I'm worried about documentation. We should write
> some one of these days. I think it'd be really neat to have an API that
> was documented in a language other than C.

There are several tech writer volunteers in our midst...  We need to
figure out how to work with them :) 

Dean


Re: A Magic Cache example

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.
On Wed, 9 Sep 1998, Alexei Kosut wrote:

[...]

> I think I've made it out to be a bit more complicated than it is, with the
> long equation strings mixed in there. But the above reflects my
> understanding of how the new Apache 2.0 system should work.

One thing I forgot to mention, which Simon brought up, is the inclusion of
other objects, e.g., included files, stylesheets, templates, whatever. As
Cliff said in SF, no distinction needs to be made. They get treated just
like sub-requests today, and go through the same mechanism as a normal
request - including all the caching, etc... - except that something
different happens to the output (instead of going to the client directly
as an HTTP response, the data gets included in the main request stream, or
it gets parsed by the module, or whichever).

BTW, the discussions of the past ten hours or so have made me much more at
ease with the feasibility of getting this working in the 2.0 timeframe. The
only part I'm concerned about is how exactly the middle end figures out
how to assemble a request out of which modules and in what order. I
understand Simon has some brilliant, Nobel-prize winning ideas on this
subject, and I'm eagerly anticipating his thoughts :)

Oh, and documentation. I'm worried about documentation. We should write
some one of these days. I think it'd be really neat to have an API that
was documented in a language other than C.

(and C++ doesn't count, Ben.)

(Unless Babelfish has a C++-to-English option now. That would be *really*
funny, I'd wager.)

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *



Re: A Magic Cache example

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.
On Wed, 9 Sep 1998, Rasmus Lerdorf wrote:

> > Note 3: Ten bonus points to anyone who's read this far, and is the first
> > person to answer today's trivia question: What does the skzb referred to
> > in the example URI stand for? There's enough information in this mail to
> > figure it out (with some help from the Net), even if you don't know
> > offhand (though if you do, I'd be happier). 
> 
> I'm guessing KZ == Karl Zoltan which is Steven Brust's pen-name.  Hence
> skzb.  (And yes, I did have to look up the ISBN number to clue in)

Actually, his full name is Steven Karl Zoltan Brust.  But close enough. 
You get the ten bonus points.  Unfortunately, I'm a bit short right now,
so I'll leave an IOU in the source code of Apache 2.0 somewhere...

/* Alexei owes Rasmus ten bonus points */

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *



Re: A Magic Cache example

Posted by Rasmus Lerdorf <ra...@lerdorf.on.ca>.
> Note 3: Ten bonus points to anyone who's read this far, and is the first
> person to answer today's trivia question: What does the skzb referred to
> in the example URI stand for? There's enough information in this mail to
> figure it out (with some help from the Net), even if you don't know
> offhand (though if you do, I'd be happier). 

I'm guessing KZ == Karl Zoltan which is Steven Brust's pen-name.  Hence
skzb.  (And yes, I did have to look up the ISBN number to clue in)

-Rasmus


Re: A Magic Cache example

Posted by Alexei Kosut <ak...@leland.Stanford.EDU>.
On Wed, 9 Sep 1998, Rasmus Lerdorf wrote:

> > Note 2: PHP probably doesn't work with Unicode. And there may not be a way
> > to identify a script as only acting on the User-Agent dimension. That's
> > not the point.
> 
> Why not?  At the very least the script could specifically state that it
> only acts on the user-agent dimension with a simple, 
> <? ap_set_dimension("User-Agent") ?> call, for example.

Yes, yes. What I was saying wasn't the point was whether PHP could do
those things or not. For the purposes of the example, I was granting it
those abilities :)

-- Alexei Kosut <ak...@stanford.edu> <http://www.stanford.edu/~akosut/>
   Stanford University, Class of 2001 * Apache <http://www.apache.org> *



Re: A Magic Cache example

Posted by Rasmus Lerdorf <ra...@lerdorf.on.ca>.
> Note 2: PHP probably doesn't work with Unicode. And there may not be a way
> to identify a script as only acting on the User-Agent dimension. That's
> not the point.

Why not?  At the very least the script could specifically state that it
only acts on the user-agent dimension with a simple, 
<? ap_set_dimension("User-Agent") ?> call, for example.
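On the server side, a call like that might amount to something like the
sketch below. It is in Python only for brevity; FilterContext and
set_dimension are invented names standing in for whatever the real
interface would be, and the Netscape check is deliberately crude:

```python
class FilterContext:
    """Hypothetical per-request handle a filter uses to declare dimensions."""

    def __init__(self, headers):
        self.headers = headers
        self.dimensions = []  # what the cache would record for this output

    def set_dimension(self, name):
        """Declare that the output depends on this header; return its value."""
        if name not in self.dimensions:
            self.dimensions.append(name)
        return self.headers.get(name, "")


def script(ctx, document):
    agent = ctx.set_dimension("User-Agent")
    # Crude stand-in for "is the browser Netscape or not".
    flavour = "netscape" if agent.startswith("Mozilla") else "other"
    return f"{document} [{flavour}]"


ctx = FilterContext({"User-Agent": "Mozilla/4.06", "Accept-Charset": "utf-8"})
output = script(ctx, "teckla")
```

After the script runs, ctx.dimensions holds only User-Agent, which is
what the cache would need in order to tag the output.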

-Rasmus