Posted to dev@httpd.apache.org by Brian Akins <ba...@web.turner.com> on 2003/11/17 17:19:37 UTC

caching includes parse tree

Has any thought gone into caching the parse results of the includes 
filter (offsets, etc.)?  In our environment, parsing the include files 
is a huge performance hit.

We are willing to help in any way.


Re: caching includes parse tree

Posted by Brian Akins <ba...@web.turner.com>.
Glenn wrote:

> For files where server-side includes are used for page fragment reuse
> rather than complicated server-side conditional processing, this could
> be an easy win, and a bit more flexible than the XBitHack.


In our environment, we have several includes on a page, only one of 
which is actually dynamic (it's handled by another module).  We'd like 
to be able to cache most of the page, but not that dynamic part; 
mod_cache is all or nothing.  We are looking at several solutions, but 
it seems "wrong" to parse the same old stuff over and over again...



Re: caching includes parse tree

Posted by Glenn <gs...@gluelogic.com>.
On Mon, Nov 17, 2003 at 08:54:50PM +0100, André Malo wrote:
> * Brian Akins <ba...@web.turner.com> wrote:
> 
> > Has any thought gone into caching the parse results of the includes 
> > filter (offsets, etc.)?  In our environment, parsing the include 
> > files is a huge performance hit.
> 
> Just some thoughts off the top of my head:
> 
> I'd say, if we do it, only with the new code. The old one is not very
> suitable for such a task.
> 
> However, it sounds interesting. First we must define the cache keys. Cache by
> URL? By file? By ETag? And what about dynamic resources?
> 
> The second point is that caching in a performant manner should probably only
> occur on a per-child basis (like rewrite map caching). This is highly useful
> in multithreaded environments, of course.
> 
> The third point is that there's no real parse tree ;-) We just skim through
> the document and remember only the necessary things (FLAG_PRINTING etc.).
> We'd need to remember what happened when, i.e. build a (kind of) parse tree
> for exactly this purpose. (An array of "tokens" as a first thought.)
> 
> But there's still the question of whether this really saves performance. That
> depends on how fast the surrounding bucket handling is (compared to the
> parsing code). We still must read the buckets and pass them through at a
> minimum.
> 
> Last but not least, it would be quite interesting to create just an abstract
> API for this in mod_include and build another module that does the actual
> caching. I'm not sure whether this really makes sense, however ;)
> 
> Further thoughts if I have more time ...

How about an env variable flag extension in SSI that can be set to,
say, "cache me"?  If a page does any complicated processing, it would
say "don't cache me", which would be the default.

For files where server-side includes are used for page fragment reuse
rather than complicated server-side conditional processing, this could
be an easy win, and a bit more flexible than the XBitHack.
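
For example (hypothetical variable name; mod_include knows nothing of
it today), a page that is safe to cache could opt in with:

    <!--#set var="SSI_CACHEABLE" value="yes" -->
    <!--#include virtual="/fragments/header.html" -->

and anything doing complicated conditional processing would just leave
it unset, keeping the default "don't cache me" behavior.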

Cheers,
Glenn

Re: caching includes parse tree

Posted by André Malo <nd...@perlig.de>.
* Brian Akins <ba...@web.turner.com> wrote:

> Has any thought gone into caching the parse results of the includes 
> filter (offsets, etc.)?  In our environment, parsing the include files 
> is a huge performance hit.

Just some thoughts off the top of my head:

I'd say, if we do it, only with the new code. The old one is not very
suitable for such a task.

However, it sounds interesting. First we must define the cache keys. Cache by
URL? By file? By ETag? And what about dynamic resources?
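
For instance, a first stab at a key could pair the file path with its
mtime (sketch only; an invented struct, not existing mod_include code):

    #include "apr_time.h"

    /* Hypothetical cache key for a parsed SSI document: identify the
     * resource by filesystem path, invalidate when the file changes. */
    typedef struct ssi_cache_key {
        const char *filename;   /* r->filename of the parsed file */
        apr_time_t  mtime;      /* r->finfo.mtime */
    } ssi_cache_key;

    /* Dynamic resources (CGI etc.) have no stable file/mtime identity,
     * so under this scheme they would simply never be cached. */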

The second point is that caching in a performant manner should probably only
occur on a per-child basis (like rewrite map caching). This is highly useful
in multithreaded environments, of course.
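
Roughly (invented names, httpd 2.x style): each child builds its own
cache in its child_init hook, so threaded MPMs need only an in-process
mutex and there is no shared memory or cross-process locking:

    #include "httpd.h"
    #include "http_config.h"
    #include "apr_hash.h"
    #include "apr_thread_mutex.h"

    static apr_hash_t *ssi_tree_cache;          /* key -> token array */
    static apr_thread_mutex_t *ssi_cache_lock;  /* for threaded MPMs */

    static void ssi_cache_child_init(apr_pool_t *pchild, server_rec *s)
    {
        /* One private cache per child process, exactly like
         * rewrite map caching. */
        ssi_tree_cache = apr_hash_make(pchild);
        apr_thread_mutex_create(&ssi_cache_lock,
                                APR_THREAD_MUTEX_DEFAULT, pchild);
    }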

The third point is that there's no real parse tree ;-) We just skim through
the document and remember only the necessary things (FLAG_PRINTING etc.).
We'd need to remember what happened when, i.e. build a (kind of) parse tree
for exactly this purpose. (An array of "tokens" as a first thought.)
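
A very rough cut at such a token (purely illustrative; nothing like
this exists in mod_include yet):

    #include "apr_tables.h"

    typedef enum { TOK_LITERAL, TOK_DIRECTIVE } ssi_token_type;

    /* One remembered event from a previous parse of the document. */
    typedef struct ssi_token {
        ssi_token_type type;
        apr_off_t      start;  /* byte offset into the source file */
        apr_size_t     len;    /* length of the span */
        const char    *tag;    /* TOK_DIRECTIVE: "include", "if", ... */
        apr_table_t   *args;   /* TOK_DIRECTIVE: parsed tag attributes */
    } ssi_token;

    /* Replaying an apr_array_header_t of these avoids re-scanning the
     * bytes: literals pass straight through, directives re-evaluate. */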

But there's still the question of whether this really saves performance. That
depends on how fast the surrounding bucket handling is (compared to the
parsing code). We still must read the buckets and pass them through at a
minimum.

Last but not least, it would be quite interesting to create just an abstract
API for this in mod_include and build another module that does the actual
caching. I'm not sure whether this really makes sense, however ;)
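
If we did, the seam could be cut with optional functions, e.g. (names
invented):

    #include "httpd.h"
    #include "apr_optional.h"
    #include "apr_tables.h"

    /* mod_include would only declare and call these; a separate module
     * (mod_include_cache, say) registers the real implementations. */
    APR_DECLARE_OPTIONAL_FN(apr_array_header_t *, ssi_tree_lookup,
                            (request_rec *r));
    APR_DECLARE_OPTIONAL_FN(void, ssi_tree_store,
                            (request_rec *r, apr_array_header_t *tokens));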

Further thoughts if I have more time ...

nd

Re: caching includes parse tree

Posted by Ian Holsman <li...@holsman.net>.
Brian Akins wrote:

> Has any thought gone into caching the parse results of the includes 
> filter (offsets, etc.)?  In our environment, parsing the include files 
> is a huge performance hit.
> 
> We are willing to help in any way.
> 
> 

Hey Brian,
it has been discussed before, and these are the two approaches I recall:
we store the parse results in a cache, similar to how mod_mem_cache does it,

*or*

run 2 mod_includes with a cache in between them. The reasoning for this 
is that 90% of the work assembling the page can be done once, leaving 
only a small, user-specific part to be done per request.
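
In filter chain terms, something like this (pseudo-config: it assumes a
cache filter that can sit in the middle of the output filter chain,
which we don't have today):

    # First INCLUDES pass expands the static fragments, the cache
    # filter stores/serves that intermediate result, and the second
    # INCLUDES pass fills in the user-specific part on every request.
    SetOutputFilter INCLUDES;CACHE;INCLUDES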

FWIW, you couldn't cache the offsets of a dynamic (CGI/Java/PHP) page, 
as the offsets would/could change every time.