Posted to axkit-dev@xml.apache.org by Kip Hampton <kh...@totalcinema.com> on 2003/04/09 14:24:50 UTC

Incremental Caching Patches [Long]

Howdy AxDevers,

Okay, I finally had a chance to install the proposed caching patches
that came from Chris L through the users' list (CC'ing you, Chris, since
I'm not sure if you're subbed here or not). The changes proposed touch a 
fair amount of code and reorganize some of AxKit's most basic behaviors 
into different classes, so it just seems smart to review them a 
bit more closely than we do the usual patch.

First, the hard numbers:

Pentium II 366, 96M RAM (Hey, it's my old laptop...)
Running RedHat 7.2

The following reflects performance for 10,000 total requests, 10
concurrent, using ab on the loopback (see the attached bench.txt for
full details). Stripped-down XSLT stylesheets were used in an effort to
localize processing time to AxKit itself, rather than XSLT transformation.

## Current CVS, AxNoCache On, 3-step (trivial) XSLT
Requests per second:    50.99 [#/sec] (mean)

## Incremental Patch Applied, AxNoCache On, 3-step (trivial) XSLT
Requests per second:    51.79 [#/sec] (mean)


## Current CVS, AxNoCache Off, 3-step (trivial) XSLT
Requests per second:    80.45 [#/sec] (mean)

## Incremental Patch Applied, AxNoCache Off, 3-step (trivial) XSLT
Requests per second:    68.41 [#/sec] (mean)

Some things to note and random ideas:

* There's a small increase in requests per second with caching turned off 
(51.79 vs. 50.99, about 1.6%), but a modest loss with caching turned on 
(68.41 vs. 80.45, about 15%).

* The content-length returned from requests to the CVS and patched 
versions is different.

The changes to the way data is returned from the LibXSLT Language module 
in the new Provider/Cache/Language interactions further expose 
differences in the document returned from XML::LibXSLT when the result 
is selected as a DOM vs. as a string. That is, given:

my $results = $stylesheet->transform($source);

$results->toString();

is *not* guaranteed to give the same result as

$stylesheet->output_string($results);

hence, if only the DOM is returned, there may be unexpected results.

This is not specifically related to Chris' proposed patches, 
but it does point to the fact that we may have to rethink the very idea of 
passing around a DOM tree between Provider and Language modules and 
between different Language modules in the processing chain. The sad fact 
is that if we are to adhere to the notion of least surprise for AxKit 
users, we may need to fully serialize and re-parse at every processing 
stage in order to allow each processor the chance to perform whatever 
serialization magic it needs to deliver the proper result.

* Given that the Cache and Provider interfaces are more or less 
identical, making Cache a subclass of Provider makes a lot of sense. +1 
on this part.

* Changing the way Language modules get content from the current "it 
might be in pnotes('dom_tree'), pnotes('xml_string') or one of the 
Provider methods and you'll have to check" interface to the "give me 
what I ask for" approach in Chris' patches is a definite change for the 
better.

* It wasn't clear to me whether or not the new Language module design 
allows direct output to the client if the current processor is the last 
in the chain. If not, this presumption puts limits on what can be 
integrated as last-in-chain Language modules that may need access to the 
Apache object to return content appropriately.

* Bolting on the result of the transformation to the return code in the 
new Language modules (e.g. return (Apache::Constants::OK, \%results); ) 
seems a smidge hacky to me. They should return one or the other; returning 
both is an ugly mix of coding strategies.

So, I guess the sum of my evaluation is "I'm not sure". There are 
definitely some good things in the proposed patches, but I think they go 
too far in some places.

The questions that need to be answered are:

In general, does switching to incremental caching give us something that 
we don't already have, or is it arguably a better generic solution than 
the current all-or-nothing implementation (especially in light of the 
fact that the typical use-case seems to put the dynamic parts at the 
front of the processing chain)?

Do the proposed patches make it easier for users to write custom Cache 
modules? How or how not?

Thoughts?

-kip

Re: Incremental Caching Patches

Posted by Robin Berjon <ro...@knowscape.com>.

Kip Hampton wrote:
> Here's what I suggest:
> 
> 1) Accept the patches.
> 2) Re-implement the current "all-or-nothing" caching behavior of the 
> current Cache.pm using the new interfaces and keep this behavior as the 
> default.
> 3) Ship Cache::Incremental (or whatever) as a standard alternative.
> 
> Reasonable? Doable? Anyone have thoughts?

+1 for me

--r

Re: Incremental Caching Patches [plus offtopic]

Posted by Jörg Walter <eh...@ich.bin.kein.hoschi.de>.

Hi AxDevers...

After being submerged in my new contractor's office, which got hacked 
by some annoying trojan right away, and cleaning stuff up, I'll finally give 
my 2 cents on this issue.

On Friday, 11. April 2003 13:24, Chris Leishman wrote:

> >> It is a smidge hacky I agree.  I did it to be backwards compatible.
> >> I guess the same could be achieved by returning a hashref with a
> >> 'status' member, and then detecting whether a scalar or a hash ref
> >> was returned by the processor back in AxKit.
> >
> > +1

Sounds reasonable.

> > XSP (eXtensible Server Pages) is all about *generating* content. It is
> > not a transformative processor in the way that XSLT and XPathScript
> > are. There is no XSP stylesheet that gets applied to a source document
> > and there's nothing in the language itself that presumes an input
> > document. Markup, inlined code, and taglibs (markup that maps to
> > code) are combined in one document and the only "transformation"
> > happens when the XSP processor executes the inlined or taglib-added
> > code.
> >
> > So, unless you mean "a pipeline of styles that generates an XSP page
> > that gets executed at the end to generate further content" I'm not
> > sure what you mean by using XSP as a "form of styling". It's not that you
> > *couldn't* do things that way, but it's hardly the common case. The
> > XSP(to generate content)->XSLT(to style it for the given client)
> > pattern is far more common.
>
> Well, my way of thinking about XML-based content management is that
> everything should start, where possible, from a simple XML document
> that contains only the data of interest.  Eg, if I wanted to display a
> page that has a list of blog entries, then I'd start with some XML
> similar to RSS - or if I wanted an article on my site, I might start
> with a document in docbook format.  These documents would either come
> from a file or somewhere else based on the initial Provider AxKit uses.

I have been doing quite heavy XSLT->XSP->... processing. It boils down to some 
data document (be it from the filesystem or generated from SQL via a nice 
provider I did, or retrieved from XMLDB, whatever...) that is the 'object' of 
some processing. A user might want to edit it, because it's his own. A user 
might want to add a comment, because he is interested in it, whatever.
So there are actually two stylesheet directives to apply one logical 
transformation: The XSLT which converts the input data to some XSP page 
usually using some taglib which was written for that very purpose, and the 
XSP transformation after that. The result is some internal XML which is later 
transformed into the correct HTML forms, links, whatever.
So this is definitely useful, especially since it makes you independent of 
storage. I transferred the whole site from XMLDB to SQL in a matter of a few 
days, including writing/extending the SQL provider. Re-doing all that in ESQL 
would have been deadly.

Yet, let me take you off topic. Matt already said it: XSP is no styling 
language per se. This is a problem. The split approach can get very 
confusing. XSLT doesn't give you enough access to foreign data, even using 
Perl extension functions, nor does it let you leverage Perl's module library 
easily. XPathScript allows you to do fairly complex processing, but lacks 
XSP's mix-and-match taglib flexibility. People actually doing XSP-after-XSLT 
are a good indicator that something is indeed missing. I agree with Matt that 
XSP was meant to *generate* content, but the world seems to want something 
between XSP, XSLT and XPathScript for active transformations. At least I want 
;-) XPathScript with taglibs and some selected XSLT constructs, sort of.

> Maybe I'm just being a content purist though ;-)

Oh, you are not alone. I am a purist about everything that makes the job easier or 
cooler ;-)

> > Here's what I suggest:
> >
> > 1) Accept the patches.
> > 2) Re-implement the current "all-or-nothing" caching behavior of the
> > current Cache.pm using the new interfaces and keep this behavior as
> > the default.
> > 3) Ship Cache::Incremental (or whatever) as a standard alternative.
> >
> > Reasonable? Doable? Anyone have thoughts?

+1 from me, with the variations discussed elsewhere optionally included. I have 
never had much insight into the caching system, so I am not much of a judge.

-- 
CU
  Joerg

PGP Public Key at http://ich.bin.kein.hoschi.de/~trouble/public_key.asc
PGP Key fingerprint = D34F 57C4 99D8 8F16 E16E  7779 CDDC 41A4 4C48 6F94

Re: XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Chris Leishman <ch...@leishman.org>.

On Monday, April 14, 2003, at 10:09 AM, Matt Sergeant wrote:

> On Sun, 13 Apr 2003, Chris Leishman wrote:
>
>> Actually - the Cocoon guys have covered this topic in their basic
>> concepts introduction:
>>
>> http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-abstractions
>>
>> Their conclusion is pretty much the same as mine - starting with XSP in
>> the pipeline is ok for examples but bad for real systems.  Better to
>> use a 'logicsheet' (xslt) to create an xsp page to be processed.  And
>> as soon as people start trying to do this, incremental caching is
>> invaluable...
>
> That's why I say never use a logicsheet ;-)
>
> Cocoon logicsheets are flawed ideas IMHO when faced with AxKit's ability
> to write taglibs.

But I thought you can write taglibs in Cocoon too.....

Regards,
Chris

Re: XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Matt Sergeant <ma...@sergeant.org>.

On Sun, 13 Apr 2003, Chris Leishman wrote:

> Actually - the Cocoon guys have covered this topic in their basic
> concepts introduction:
>
> http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-abstractions
>
> Their conclusion is pretty much the same as mine - starting with XSP in
> the pipeline is ok for examples but bad for real systems.  Better to
> use a 'logicsheet' (xslt) to create an xsp page to be processed.  And
> as soon as people start trying to do this, incremental caching is
> invaluable...

That's why I say never use a logicsheet ;-)

Cocoon logicsheets are flawed ideas IMHO when faced with AxKit's ability
to write taglibs.

-- 
<!-- Matt -->
<:->get a SMart net</:->
Spam trap - do not mail: spam-sig@spamtrap.messagelabs.com

Re: XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Chris Leishman <ch...@leishman.org>.

On Friday, April 11, 2003, at 02:53 PM, Chris Leishman wrote:
<snip>
> I guess my way of doing it would be XML -> XSLT (which creates XHTML  
> but also includes XSP tags) -> XSP -> XHTML.
>
> That way the XSLT is still responsible for creating the XHTML and then  
> the XSP just adds in the dynamic elements.  But I could see situations  
> where you might want to use some intermediate XML format and then do a  
> final XSLT to create XHTML.

Actually - the Cocoon guys have covered this topic in their basic  
concepts introduction:

http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-abstractions

Their conclusion is pretty much the same as mine - starting with XSP in  
the pipeline is ok for examples but bad for real systems.  Better to  
use a 'logicsheet' (xslt) to create an xsp page to be processed.  And  
as soon as people start trying to do this, incremental caching is  
invaluable...

Regards,
Chris

XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Chris Leishman <ch...@leishman.org>.

On Friday, April 11, 2003, at 02:42 PM, Robin Berjon wrote:
<snip>
> There's nothing "wrong" with your approach, but you are using XSP to 
> produce XHTML, and the latter /may/ contain styling information. 
> Producing styled output from XSP is imho a bad idea which is why, in 
> the process you describe, starting from an initial document I might do 
> FooML(source) -> XSLT(adds some XSP elements) -> XSP(processes XSP 
> stuff) -> XSLT(styles).
>
> In such cases, what I've done is XSP(does dynamic stuff *and* includes 
> the real content, usually from path info) -> XSLT(styles).

I guess my way of doing it would be XML -> XSLT (which creates XHTML 
but also includes XSP tags) -> XSP -> XHTML.

That way the XSLT is still responsible for creating the XHTML and then 
the XSP just adds in the dynamic elements.  But I could see situations 
where you might want to use some intermediate XML format and then do a 
final XSLT to create XHTML.

>> Maybe I'm just being a content purist though ;-)
>
> I am, and I tend to have http://foo.org/src/foo/bar.xml to access the 
> content in itself (with no decoration) and 
> http://foo.org/cms.xsp/foo/bar.xml to get the page people want to see.
<snip>

That seems rather hackish to me...

I would have thought something like:

http://foo.org/foo/bar.xml                  (gives nice XHTML in the style of the site)
http://foo.org/foo/bar.xml?style=printable  (gives another style suitable for printing)
http://foo.org/foo/bar.xml?style=raw        (returns the original XML document)

or similar would be nicer than separating all the other styles from the 
'raw' one.  And then it saves having to do tricks with path_info...

But each to his own I guess :-)

Regards,
Chris

Re: Incremental Caching Patches

Posted by Robin Berjon <ro...@knowscape.com>.

Chris Leishman wrote:
> On Friday, April 11, 2003, at 01:42 PM, Kip Hampton wrote:
>> It does change it, though. Your patched version always returns a DOM 
>> instance, the current version serializes the result to a string using 
>> the XSLT processor's output_string( $result ) if the current process 
>> is $last_in_chain. The difference is that the latter *alters* the 
>> output based on options that may be set in an <xsl:output/> element in 
>> the stylesheet itself. The last stylesheet in my test suite had
> 
> Ok....that's annoying!  Can someone tell me how XSLT embeds the desired 
> output format into the DOM, so that when 
> XML::LibXSLT::output_string($result) is called, it knows to serialize in 
> a specific manner?

Well, it may be seen as annoying (and I for one hope that method='html' will cease 
to be useful soon enough) but it's done as per the spec and it thus won't go 
away (besides, I don't really see an alternative solution...).

The info isn't stored in the DOM, it's stored in the stylesheet object. 
$xslt->output_string() looks that up and uses its own serialisation code instead 
of libxml's.

> Then to create something suitable to my site, I might apply some XSLT to 
> convert to XHTML.  But if I wanted to include some dynamic stuff - eg. a 
> 'Welcome <logged in user>' or even a form like 'Add new entry to blog', 
> then I'd want to convert from the original RSS to an XSP page that could 
> output XHTML containing the user details and perhaps the results of 
> processing the parameters in the request...
> 
> I can certainly see situations where there is no initial data source and 
> thus one is generated entirely dynamically via XSP based on the request 
> parameters or similar - but I would've thought it more common to have 
> some initial XML to work with.

There's nothing "wrong" with your approach, but you are using XSP to produce 
XHTML, and the latter /may/ contain styling information. Producing styled output 
from XSP is imho a bad idea which is why, in the process you describe, starting 
from an initial document I might do FooML(source) -> XSLT(adds some XSP 
elements) -> XSP(processes XSP stuff) -> XSLT(styles).

In such cases, what I've done is XSP(does dynamic stuff *and* includes the real 
content, usually from path info) -> XSLT(styles).

> Maybe I'm just being a content purist though ;-)

I am, and I tend to have http://foo.org/src/foo/bar.xml to access the content in 
itself (with no decoration) and http://foo.org/cms.xsp/foo/bar.xml to get the 
page people want to see.

--r

Re: Incremental Caching Patches

Posted by Chris Leishman <ch...@leishman.org>.

On Friday, April 11, 2003, at 01:42 PM, Kip Hampton wrote:
<snip>
> The content-length was different because the content returned was 
> different. See below.
<snip>

Ok...

> It does change it, though. Your patched version always returns a DOM 
> instance, the current version serializes the result to a string using 
> the XSLT processor's output_string( $result ) if the current process 
> is $last_in_chain. The difference is that the latter *alters* the 
> output based on options that may be set in an <xsl:output/> element in 
> the stylesheet itself. The last stylesheet in my test suite had
<snip>

Ok....that's annoying!  Can someone tell me how XSLT embeds the desired 
output format into the DOM, so that when 
XML::LibXSLT::output_string($result) is called, it knows to serialize 
in a specific manner?

> It's not a big deal, all it takes is checking $last_in_chain and 
> returning the result of output_string( $result ) if it's defined 
> instead of always returning the DOM.

Rather than this, it could be better to keep passing the DOM back (for 
consistency) - but then at the final serialization stage (when sending 
to the client) we use a serialization method that respects the desired 
output format attached to the DOM.  Not that it really matters.....

>> It is a smidge hacky I agree.  I did it to be backwards compatible.  
>> I guess the same could be achieved by returning a hashref with a 
>> 'status' member, and then detecting whether a scalar or a hash ref 
>> was returned by the processor back in AxKit.
>
> +1

Ok....we'll use that way then.  Makes no difference to me :-)

> Well, really the only part I was worried about was shutting down 
> direct access to sending data to the client from the Language modules, 
> but, given that it's only there in an illusory way now anyway, it 
> really doesn't matter.

Cool.

> XSP (eXtensible Server Pages) is all about *generating* content. It is 
> not a transformative processor in the way that XSLT and XPathScript 
> are. There is no XSP stylesheet that gets applied to a source document 
> and there's nothing in the language itself that presumes an input 
> document. Markup, inlined code, and taglibs (markup that maps to 
> code) are combined in one document and the only "transformation" 
> happens when the XSP processor executes the inlined or taglib-added 
> code.
>
> So, unless you mean "a pipeline of styles that generates an XSP page 
> that gets executed at the end to generate further content" I'm not 
> sure what you mean by using XSP as a "form of styling". It's not that you 
> *couldn't* do things that way, but it's hardly the common case. The 
> XSP(to generate content)->XSLT(to style it for the given client) 
> pattern is far more common.

Well, my way of thinking about XML-based content management is that 
everything should start, where possible, from a simple XML document 
that contains only the data of interest.  Eg, if I wanted to display a 
page that has a list of blog entries, then I'd start with some XML 
similar to RSS - or if I wanted an article on my site, I might start 
with a document in docbook format.  These documents would either come 
from a file or somewhere else based on the initial Provider AxKit uses.

Then to create something suitable to my site, I might apply some XSLT 
to convert to XHTML.  But if I wanted to include some dynamic stuff - 
eg. a 'Welcome <logged in user>' or even a form like 'Add new entry to 
blog', then I'd want to convert from the original RSS to an XSP page 
that could output XHTML containing the user details and perhaps the 
results of processing the parameters in the request...

I can certainly see situations where there is no initial data source 
and thus one is generated entirely dynamically via XSP based on the 
request parameters or similar - but I would've thought it more common 
to have some initial XML to work with.

Maybe I'm just being a content purist though ;-)

> Here's what I suggest:
>
> 1) Accept the patches.
> 2) Re-implement the current "all-or-nothing" caching behavior of the 
> current Cache.pm using the new interfaces and keep this behavior as 
> the default.
> 3) Ship Cache::Incremental (or whatever) as a standard alternative.
>
> Reasonable? Doable? Anyone have thoughts?

I think it's do-able.  Stage 3 wouldn't be done via a different Cache 
module though - it would have to be via some sort of option 
(AxIncrementalCache On/Off ?).

>> [ all: please change the subject to create a new thread if you want 
>> to discuss this point with me further.  The initial discussion on 
>> this was on -users with the topic 'Adding XSP to an XSLT pipeline'. ]
>
> This discussion is not appropriate for the user's list; that's why I 
> brought it here. If you want to continue in the conversation you are 
> certainly most welcome and what you do with the subject line in your 
> replies is your business. :-)

I was actually referring to my comments on the pros/cons of putting XSP 
earlier or later in a pipeline....that's really a different discussion 
from whether the patches are good or not.

Regards,
Chris

Re: Incremental Caching Patches

Posted by Kip Hampton <kh...@totalcinema.com>.

Chris Leishman wrote:

> Really?  I didn't notice that...  AFAIK a Content-Length header is only 
> actually returned when delivering from a cache file (apache calculates 
> the content-length, etc, after the $r->filename is set and the handler 
> declines).  When delivering normally I noticed that there is no 
> Content-Length header (although HTTP/1.1 connections used chunked 
> encoding which is preferable to content length anyway).
> 

The content-length was different because the content returned was 
different. See below.

>> hence, if only the DOM is returned, there may be unexpected results.
> 
> 
> AFAIK my patches shouldn't change this situation at all.  At the moment a 
> language module can return a dom (via pnotes), or just a string.  And 
> the next module has the opportunity to look for the dom or the string in 
> pnotes.  So are you saying that there are actually differences in the 
> output under the new patches, because that seems strange...

It does change it, though. Your patched version always returns a DOM 
instance, the current version serializes the result to a string using 
the XSLT processor's output_string( $result ) if the current process is 
$last_in_chain. The difference is that the latter *alters* the output 
based on options that may be set in an <xsl:output/> element in the 
stylesheet itself. The last stylesheet in my test suite had

<xsl:output method="html"/>

(which is fairly common) so, when your patched version just returned the 
DOM result, the XSLT processor didn't get a chance to do its HTML mode 
serialization tricks (stripping the XML declaration, adding meta 
headers for encoding, etc.) like it does in the current 
Language::LibXSLT; hence the results are different.

It's not a big deal, all it takes is checking $last_in_chain and 
returning the result of output_string( $result ) if it's defined instead 
of always returning the DOM.
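
A rough sketch of that check, for illustration only. To keep it runnable without CPAN modules, FakeStylesheet stands in for an XML::LibXSLT stylesheet object and fakes output_string(); stage_result and the 'dom'/'string' keys are hypothetical names, not what the patches or the current Language::LibXSLT actually use.

```perl
use strict;
use warnings;

# Stand-in for an XML::LibXSLT stylesheet object (assumption: only
# output_string() is modelled, and its behavior here is faked).
package FakeStylesheet;
sub new           { bless {}, shift }
sub output_string { my ($self, $dom) = @_; return "serialized($dom)" }

package main;

# Hypothetical helper: what a Language module could hand back to AxKit.
sub stage_result {
    my ($stylesheet, $results, $last_in_chain) = @_;
    # Last stage: let the XSLT processor serialize, so <xsl:output/> applies.
    return { string => $stylesheet->output_string($results) } if $last_in_chain;
    # Any other stage: pass the DOM along to the next processor untouched.
    return { dom => $results };
}

my $style = FakeStylesheet->new;
my $mid   = stage_result($style, 'DOM', 0);
my $last  = stage_result($style, 'DOM', 1);
print exists $mid->{dom} ? "mid stage: dom\n" : "mid stage: string\n";
print "last stage: $last->{string}\n";
```

With the real module, $stylesheet would come from parse_stylesheet() and $results from transform(), as in the snippet earlier in this thread.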

> I don't see any reason why AxKit should serialise and re-parse at each 
> stage.  If a language processor is known to be broken wrt using a 
> prebuilt DOM, it can query the provider for get_strref and then parse 
> its own DOM internally.  It would be silly to impose that overhead if 
> the modules being used don't need it.  It would also encourage people to 
> fix bugs that stop it working properly.

Agreed. I didn't want it to go away either. The combination of the above 
XSLT output annoyance and Simon Woodside's recent "bug" w.r.t. 
serializing default namespaces made me a little touchy. What can I say, 
I'll try the decaf...

> 
>> * It wasn't clear to me whether or not the new Language module design 
>> allows direct output to the client if the current processor is the 
>> last in the chain. If not, this presumption puts limits on what can be 
>> integrated as last-in-chain Language modules that may need access to 
>> the Apache object to return content appropriately.
> 
> 
> The last-in-chain module can't output directly to the client, since the 
> response should be returned to AxKit for it to deliver.  But I don't 
> think the previous code did either.  I noticed some modules (eg. 
> LibXSLT) took the effort of serialising the DOM and printing it if 
> last-in-chain was in effect, but the AxKit::Apache module simply 
> redirects the print and puts the data in pnotes('xml_string') anyway - 
> so it's still not dynamic (and is thus kind of pointless....maybe this 
> is historic?).

Er, that is pretty silly, actually...
> 
>> * Bolting on the result of the transformation to the return code in 
>> the new Language modules (e.g. return (Apache::Constants::OK, 
>> \%results); ) seems a smidge hacky to me. They should return one or 
>> the other, both is an ugly mix of coding strategies.
> 
> 
> It is a smidge hacky I agree.  I did it to be backwards compatible.  I 
> guess the same could be achieved by returning a hashref with a 'status' 
> member, and then detecting whether a scalar or a hash ref was returned 
> by the processor back in AxKit.

+1

> 
>> So, I guess the sum of my evaluation is "I'm not sure". There are 
>> definitely some good things in the proposed patches, but I think they 
>> go too far in some places.
> 
> 
> I'm not sure you've really said where in particular they 'go too far'...

Well, really the only part I was worried about was shutting down direct 
access to sending data to the client from the Language modules, but, 
given that it's only there in an illusory way now anyway, it really 
doesn't matter.
> 
>> The questions that need to be answered are:
>>
>> In general, does switching to incremental caching give us something 
>> that we don't already have, or is it arguably a better generic 
>> solution than the current all-or-nothing implementation (especially in 
>> light of the fact that the typical use-case seems to put the dynamic 
>> parts at the front of the processing chain)?
> 
> 
> Well, it buys you advantages in the cases where dynamic parts are later 
> in the chain.  And I still haven't heard an overly convincing argument 
> for why that shouldn't be a far more appropriate thing in many cases.  
> My view is that XSP by nature is a form of 'styling', and should be 
> added in rather than being in the original XML document.

XSP (eXtensible Server Pages) is all about *generating* content. It is 
not a transformative processor in the way that XSLT and XPathScript are. 
There is no XSP stylesheet that gets applied to a source document and 
there's nothing in the language itself that presumes an input document. 
Markup, inlined code, and taglibs (markup that maps to code) are 
combined in one document and the only "transformation" happens when the 
XSP processor executes the inlined or taglib-added code.

So, unless you mean "a pipeline of styles that generates an XSP page 
that gets executed at the end to generate further content" I'm not sure 
what you mean by using XSP as a "form of styling". It's not that you 
*couldn't* do things that way, but it's hardly the common case. The 
XSP(to generate content)->XSLT(to style it for the given client) pattern 
is far more common.

Here's what I suggest:

1) Accept the patches.
2) Re-implement the current "all-or-nothing" caching behavior of the 
current Cache.pm using the new interfaces and keep this behavior as the 
default.
3) Ship Cache::Incremental (or whatever) as a standard alternative.

Reasonable? Doable? Anyone have thoughts?

> 
> [ all: please change the subject to create a new thread if you want to 
> discuss this point with me further.  The initial discussion on this was 
> on -users with the topic 'Adding XSP to an XSLT pipeline'. ]

This discussion is not appropriate for the user's list; that's why I 
brought it here. If you want to continue in the conversation you are 
certainly most welcome and what you do with the subject line in your 
replies is your business. :-)

-kip

Re: Incremental Caching Patches [Long]

Posted by Chris Leishman <ch...@leishman.org>.

On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
<snip>
> Some things to note and random ideas:
>
> * There's a small increase in requests per second with caching turned 
> off, but a modest loss with caching turned on.

This agrees with my assessment.  Of course you should also test with 
dynamic objects later in the pipeline (eg. XSLT -> XSLT -> XSP ).  In 
those situations you should see very marked increases in performance as 
the first two stages are entirely cacheable.
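
To make the XSLT -> XSLT -> XSP case concrete, here is a rough sketch of the idea: walk the pipeline stage by stage, reuse cached output for the leading run of cacheable stages, and only execute what comes after. Every name in it (run_pipeline, the 'cacheable' flag, the in-memory %cache, the key scheme) is made up for illustration and is not taken from the patches.

```perl
use strict;
use warnings;

my %cache;      # hypothetical in-memory cache: key -> intermediate output
my $runs = 0;   # counts how many stages were actually executed

sub run_pipeline {
    my ($input, @stages) = @_;
    my $data = $input;
    my $key  = $input;          # cache key grows with each stage name
    my $cacheable_so_far = 1;   # cleared once a dynamic stage is reached

    for my $stage (@stages) {
        $key .= "|$stage->{name}";
        $cacheable_so_far &&= $stage->{cacheable};
        if ($cacheable_so_far && exists $cache{$key}) {
            $data = $cache{$key};   # reuse the cached intermediate result
            next;
        }
        $runs++;
        $data = $stage->{code}->($data);
        $cache{$key} = $data if $cacheable_so_far;
    }
    return $data;
}

# Toy stages: string wrappers standing in for real transformations.
my @stages = (
    { name => 'xslt1', cacheable => 1, code => sub { "[a:$_[0]]" } },
    { name => 'xslt2', cacheable => 1, code => sub { "[b:$_[0]]" } },
    { name => 'xsp',   cacheable => 0, code => sub { "[dyn:$_[0]]" } },
);

run_pipeline('doc', @stages);   # first request: all three stages run
run_pipeline('doc', @stages);   # second request: only the XSP stage runs
print "stages executed: $runs\n";
```

A real cache would also have to invalidate entries when the source document or a stylesheet changes; the sketch leaves that out.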

> * The content-length returned from requests to the CVS and patched 
> versions is different.

Really?  I didn't notice that...  AFAIK a Content-Length header is only 
actually returned when delivering from a cache file (apache calculates 
the content-length, etc, after the $r->filename is set and the handler 
declines).  When delivering normally I noticed that there is no 
Content-Length header (although HTTP/1.1 connections used chunked 
encoding which is preferable to content length anyway).

> The changes to the way data is returned from the LibXSLT Language 
> module in the new Provider/Cache/Language interactions further exposes 
> differences in the document returned from XML::LibXSLT when the result 
> is selected as DOM vs. as a string. That is, given:
<snip>
> hence, if only the DOM is returned, there may be unexpected results.

AFAIK my patches shouldn't change this situation at all.  At the moment a 
language module can return a dom (via pnotes), or just a string.  And 
the next module has the opportunity to look for the dom or the string 
in pnotes.  So are you saying that there are actually differences in 
the output under the new patches, because that seems strange...

> This is not specifically related to Chris' proposed patches directly, 
> but it does point to the fact that we may have to rethink the very idea of 
> passing around a DOM tree between Provider and Language modules and 
> between different Language modules in the processing chain. The sad 
> fact is that if we are to adhere to the notion of least surprise for 
> AxKit users, we may need to fully serialize and re-parse at every 
> processing stage in order to allow each processor the chance to 
> perform whatever serialization magic it needs to deliver the proper 
> result.

I don't see any reason why AxKit should serialise and re-parse at each 
stage.  If a language processor is known to be broken wrt using a 
prebuilt DOM, it can query the provider for get_strref and then parse 
its own DOM internally.  It would be silly to impose that overhead if 
the modules being used don't need it.  It would also encourage people 
to fix bugs that stop it working properly.

> * It wasn't clear to me whether or not the new Language module design 
> allows direct output to the client if the current processor is the 
> last in the chain. If not, this presumption puts limits on what can be 
> integrated as last-in-chain Language modules that may need access to 
> the Apache object to return content appropriately.

The last-in-chain module can't output directly to the client, since the 
response should be returned to AxKit for it to deliver.  But I don't 
think the previous code did either.  I noticed some modules (e.g. 
LibXSLT) took the effort of serialising the DOM and printing it if 
last-in-chain was in effect, but the AxKit::Apache module simply 
redirects the print and puts the data in pnotes('xml_string') anyway, 
so it's still not dynamic (and is thus kind of pointless... maybe this 
is historic?).

> * Bolting on the result of the transformation to the return code in 
> the new Language modules (e.g. return (Apache::Constants::OK, 
> \%results); ) seems a smidge hacky to me. They should return one or 
> the other, both is an ugly mix of coding strategies.

It is a smidge hacky, I agree.  I did it to be backwards compatible.  I 
guess the same could be achieved by returning a hashref with a 'status' 
member, and then detecting whether a scalar or a hash ref was returned 
by the processor back in AxKit.
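
Something along these lines, as a sketch only: OK here is a local 
stand-in for Apache::Constants::OK, and the handler and helper names 
are hypothetical, not taken from the patches.

```perl
use strict;
use warnings;

# Local stand-in for Apache::Constants::OK (which is 0).
use constant OK => 0;

# Old-style processor: returns only a status code.
sub old_style_handler { return OK }

# New-style processor: returns a hashref carrying status and results.
sub new_style_handler {
    return { status => OK, results => { xml_string => '<doc/>' } };
}

# Hypothetical helper on the AxKit side: accept either return shape
# and hand back (status, results) uniformly.
sub normalize_return {
    my ($ret) = @_;
    return ref($ret) eq 'HASH'
        ? ($ret->{status}, $ret->{results})   # new style
        : ($ret, undef);                      # old style: status only
}

my ($status, $results) = normalize_return(new_style_handler());
print "status=$status, has results\n" if defined $results;
```

Existing modules that return a bare status would keep working, and 
new ones get a single return value instead of a list.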

> So, I guess the sum of my evaluation is "I'm not sure". There are 
> definitely some good things in the proposed patches, but I think they 
> go too far in some places.

I'm not sure you've really said where in particular they 'go too far'...

> The questions that need to be answered are:
>
> In general, does switching to incremental caching give us something 
> that we don't already have, or is it arguably a better generic 
> solution than the current all-or-nothing implementation (especially in 
> light of the fact that the typical use-case seems to put the dynamic 
> parts at the front of the processing chain)?

Well, it buys you advantages in the cases where dynamic parts are later 
in the chain.  And I still haven't heard an overly convincing argument 
for why that shouldn't be a far more appropriate thing in many cases.  
My view is that XSP by nature is a form of 'styling', and should be 
added in rather than being in the original XML document.

[ all: please change the subject to create a new thread if you want to 
discuss this point with me further.  The initial discussion on this was 
on -users with the topic 'Adding XSP to an XSLT pipeline'. ]

> Do the proposed patches make it easier for users to write custom Cache 
> modules? How or how not?

My thought (for what it's worth) is that it shouldn't make too much 
difference compared to the previous way.  The basic way the caching 
works internally is pretty much the same - it's just the interface 
that's changed to make it fit better with the whole 'Cache is a 
provider too' idea.

Regards,
Chris

Re: devel list (was Re: Incremental Caching Patches [Long])

Posted by Chris Leishman <ch...@leishman.org>.

On Friday, April 11, 2003, at 02:24 PM, Robin Berjon wrote:
<snip>
> That one is dead, but I thought we had carried the subscriber base 
> over. I guess perhaps not and you missed the switch message.

I probably signed up after the switch message, since I just got the 
address from the axkit.org website.  I've unsubscribed from that list 
and now I'm on the 'real' one.

Regards,
Chris

Re: devel list (was Re: Incremental Caching Patches [Long])

Posted by Robin Berjon <ro...@knowscape.com>.

Chris Leishman wrote:
> On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
>> Okay, I finally had a chance to install the proposed caching patches
>> that came from Chris L through the users' list (CC'ing you, Chris since
>> I'm not sure if you're subbed here or not).
> 
> I thought I was subscribed - but it turns out there are actually two 
> devel lists:
> 
> axkit-devel@axkit.org (which is, it seems, not actually used)

That one is dead, but I thought we had carried the subscriber base over. I guess 
perhaps not and you missed the switch message.

> Someone should probably update the mailing lists page on axkit.org so 
> that it references the correct list.

Ick, yes!

--r

Re: devel list (was Re: Incremental Caching Patches [Long])

Posted by Matt Sergeant <ma...@sergeant.org>.

On Fri, 11 Apr 2003, Chris Leishman wrote:

> On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
> <snip>
> > Okay, I finally had a chance to install the proposed caching patches
> > that came from Chris L through the users' list (CC'ing you, Chris since
> > I'm not sure if you're subbed here or not).
>
> I thought I was subscribed - but it turns out there are actually two
> devel lists:
>
> axkit-devel@axkit.org (which is, it seems, not actually used)
>
> and this one:
>
> axkit-dev@xml.apache.org
>
>
> Someone should probably update the mailing lists page on axkit.org so
> that it references the correct list.

Done. My bad.

-- 
<!-- Matt -->
<:->get a SMart net</:->
Spam trap - do not mail: spam-sig@spamtrap.messagelabs.com

devel list (was Re: Incremental Caching Patches [Long])

Posted by Chris Leishman <ch...@leishman.org>.

On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
<snip>
> Okay, I finally had a chance to install the proposed caching patches
> that came from Chris L through the users' list (CC'ing you, Chris since
> I'm not sure if you're subbed here or not).

I thought I was subscribed - but it turns out there are actually two 
devel lists:

axkit-devel@axkit.org (which is, it seems, not actually used)

and this one:

axkit-dev@xml.apache.org

Someone should probably update the mailing lists page on axkit.org so 
that it references the correct list.

Regards,
Chris