Posted to axkit-dev@xml.apache.org by Kip Hampton <kh...@totalcinema.com> on 2003/04/09 14:24:50 UTC
Incremental Caching Patches [Long]
Howdy AxDevers,
Okay, I finally had a chance to install the proposed caching patches
that came from Chris L through the users' list (CC'ing you, Chris since
I'm not sure if you're subbed here or not). The changes proposed touch a
fair amount of code and reorganize some of AxKit's most basic behaviors
into different classes, so it just seems smart to me to review them a
bit more closely than we do the usual patch.
First, the hard numbers:
Pentium II 366, 96M RAM (Hey, it's my old laptop...)
Running RedHat 7.2
The following reflects performance for 10,000 total requests, 10
concurrent, using ab on the loopback. (See the attached bench.txt for
full details.) Stripped-down XSLT stylesheets were used in an effort to
localize processing time to AxKit itself, rather than the XSLT transformation.
## Current CVS, AxNoCache On, 3-step (trivial) XSLT
Requests per second: 50.99 [#/sec] (mean)
## Incremental Patch Applied, AxNoCache On, 3-step (trivial) XSLT
Requests per second: 51.79 [#/sec] (mean)
## Current CVS, AxNoCache Off, 3-step (trivial) XSLT
Requests per second: 80.45 [#/sec] (mean)
## Incremental Patch Applied, AxNoCache Off, 3-step (trivial) XSLT
Requests per second: 68.41 [#/sec] (mean)
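For scale, the relative changes behind those four figures can be worked out directly. The snippet below is only a small arithmetic sketch using the numbers quoted above; the helper name `percent_change` is made up for illustration. It comes out to roughly a 1.6% gain with AxNoCache On and roughly a 15% loss with AxNoCache Off.

```perl
use strict;
use warnings;

# Requests per second copied from the four runs above: [CVS, patched].
my %runs = (
    'AxNoCache On'  => [ 50.99, 51.79 ],
    'AxNoCache Off' => [ 80.45, 68.41 ],
);

# Relative change from $before to $after, in percent.
sub percent_change {
    my ($before, $after) = @_;
    return 100 * ($after - $before) / $before;
}

for my $label (sort keys %runs) {
    printf "%-14s %+.2f%%\n", $label, percent_change(@{ $runs{$label} });
}
```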
Some things to note and random ideas:
* There's a small increase in requests per second with caching turned off,
but a modest loss with caching turned on.
* The content-length returned from requests to the CVS and patched versions
is different.
The changes to the way data is returned from the LibXSLT Language module
in the new Provider/Cache/Language interactions further expose
differences in the document returned from XML::LibXSLT when the result
is selected as a DOM vs. as a string. That is, given:
my $results = $stylesheet->transform($source);
$results->toString();
is *not* guaranteed to give the same result as
$stylesheet->output_string($results);
hence, if only the DOM is returned, there may be unexpected results.
This is not specifically related to Chris' proposed patches,
but it does point to the fact that we may have to rethink the very idea of
passing around a DOM tree between Provider and Language modules and
between different Language modules in the processing chain. The sad fact
is that if we are to adhere to the notion of least surprise for AxKit
users, we may need to fully serialize and re-parse at every processing
stage in order to allow each processor the chance to perform whatever
serialization magic it needs to deliver the proper result.
* Given that the Cache and Provider interfaces are more or less
identical, making Cache a subclass of Provider makes a lot of sense. +1
on this part.
* Changing the way Language modules get content from the current "it
might be in pnotes('dom_tree'), pnotes('xml_string') or one of the
Provider methods and you'll have to check" interface to the "give me
what I ask for" approach in Chris' patches is a definite change for the
better.
* It wasn't clear to me whether or not the new Language module design
allows direct output to the client if the current processor is the last
in the chain. If not, this presumption puts limits on what can be
integrated as last-in-chain Language modules that may need access to the
Apache object to return content appropriately.
* Bolting on the result of the transformation to the return code in the
new Language modules (e.g. return (Apache::Constants::OK, \%results); )
seems a smidge hacky to me. They should return one or the other; returning
both is an ugly mix of coding strategies.
So, I guess the sum of my evaluation is "I'm not sure". There are
definitely some good things in the proposed patches, but I think they go
too far in some places.
The questions that need to be answered are:
In general, does switching to incremental caching give us something that
we don't already have, or is it arguably a better generic solution than
the current all-or-nothing implementation (especially in light of the
fact that the typical use-case seems to put the dynamic parts at the
front of the processing chain)?
Do the proposed patches make it easier for users to write custom Cache
modules? How or how not?
Thoughts?
-kip
Re: Incremental Caching Patches
Posted by Robin Berjon <ro...@knowscape.com>.
Kip Hampton wrote:
> Here's what I suggest:
>
> 1) Accept the patches.
> 2) Re-implement the current all-or-nothing" caching behavior of the
> current Cache.pm using the new interfaces and keep this behavior as the
> default.
> 3) Ship Cache::Incremental (or whatever) as a standard alternative.
>
> Reasonable? Doable? Anyone have thoughts?
+1 for me
--r
Re: Incremental Caching Patches [plus offtopic]
Posted by Jörg Walter <eh...@ich.bin.kein.hoschi.de>.
Hi AxDevers...
After being submerged in my new contractor's office, which was getting hacked
by some annoying trojan right away and cleaning stuff up, I'll finally give
my 2 cents to this issue.
On Friday, 11. April 2003 13:24, Chris Leishman wrote:
> >> It is a smidge hacky I agree. I did it to be backwards compatible.
> >> I guess the same could be achieved by returning a hashref with a
> >> 'status' member, and then detecting whether a scalar or a hash ref
> >> was returned by the processor back in AxKit.
> >
> > +1
Sounds reasonable.
> > XSP (eXtensible Server Pages) is all about *generating* content. It is
> > not a transformative processor in the way that XSLT and XPathScript
> > are. There is no XSP stylesheet that gets applied to a source document
> > and there's nothing in the language itself that presumes an input
> > document.. Markup, inlined code, and taglibs (markup that maps to
> > code) are combined in one document and the only "transformation"
> > happens when the XSP processor executes the inlined or taglib-added
> > code.
> >
> > So, unless you mean "a pipeline of styles that generates an XSP page
> > that gets executed at the end to generate further content" I'm not
> > sure what you mean by using XSP as "form of styling". Its not that you
> > *couldn't* do things that way, but its hardly the common case. The
> > XSP(to generate content)->XSLT(to style it for the given client)
> > pattern is far more common.
>
> Well, my way of thinking about XML based content management, is that
> everything should start, where possible, from a simple XML document
> that contains only the data of interest. Eg, if I wanted to display a
> page that has a list of blog entries, then I'd start with some XML
> similar to RSS - or if I wanted an article on my site, I might start
> with a document in docbook format. These documents would either come
> from a file or somewhere else based on the initial Provider AxKit uses.
I have been doing quite heavy XSLT->XSP->... processing. It boils down to some
data document (be it from the filesystem or generated from SQL via a nice
provider I did, or retrieved from XMLDB, whatever...) that is the 'object' of
some processing. A user might want to edit it, because it's his own. A user
might want to add a comment, because he is interested in it, whatever.
So there are actually two stylesheet directives to apply one logical
transformation: The XSLT which converts the input data to some XSP page
usually using some taglib which was written for that very purpose, and the
XSP transformation after that. The result is some internal XML which is later
transformed into the correct HTML forms, links, whatever.
So this is definitely useful, especially since it makes you independent of
storage. I transferred the whole site from XMLDB to SQL in a matter of a few
days, including writing/extending the SQL provider. Re-doing all that in ESQL
would have been deadly.
Yet, let me take you off topic. Matt already said it: XSP is no styling
language per se. This is a problem. The split approach can get very
confusing. XSLT doesn't give you enough access to foreign data, even using
Perl extension functions, nor does it allow you to leverage Perl's module
library easily. XPathScript allows you to do fairly complex processing, but lacks
XSP's mix-and-match taglib flexibility. People actually doing XSP-after-XSLT
are a good indicator that something is indeed missing. I agree with Matt that
XSP was meant to *generate* content, but the world seems to want something
between XSP, XSLT and XPathScript for active transformations. At least I want
;-) XPathScript with taglibs and some selected XSLT constructs, sort of.
> Maybe I'm just being a content purist though ;-)
Oh, you are not alone. I am a purist of everything that makes the job easier or
cooler ;-)
> > Here's what I suggest:
> >
> > 1) Accept the patches.
> > 2) Re-implement the current all-or-nothing" caching behavior of the
> > current Cache.pm using the new interfaces and keep this behavior as
> > the default.
> > 3) Ship Cache::Incremental (or whatever) as a standard alternative.
> >
> > Reasonable? Doable? Anyone have thoughts?
+1 from me, with the variations discussed elsewhere optionally included. I have
never had much insight into the caching system, so I am not much of a judge.
--
CU
Joerg
PGP Public Key at http://ich.bin.kein.hoschi.de/~trouble/public_key.asc
PGP Key fingerprint = D34F 57C4 99D8 8F16 E16E 7779 CDDC 41A4 4C48 6F94
Re: XSP in a pipeline (was Re: Incremental Caching Patches)
Posted by Chris Leishman <ch...@leishman.org>.
On Monday, April 14, 2003, at 10:09 AM, Matt Sergeant wrote:
> On Sun, 13 Apr 2003, Chris Leishman wrote:
>
>> Actually - the Cocoon guys have covered this topic in their basic
>> concepts introduction:
>>
>> http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-
>> abstractions
>>
>> Their conclusion is pretty much the same as mine - starting with XSP
>> in
>> the pipeline is ok for examples but bad for real systems. Better to
>> use a 'logicsheet' (xslt) to create an xsp page to be processed. And
>> as soon as people start trying to do this, incremental caching is
>> invaluable...
>
> That's why I say never use a logicsheet ;-)
>
> Cocoon logicsheets are flawed ideas IMHO when faced with AxKit's
> ability
> to write taglibs.
But I thought you can write taglibs in Cocoon too.....
Regards,
Chris
Re: XSP in a pipeline (was Re: Incremental Caching Patches)
Posted by Matt Sergeant <ma...@sergeant.org>.
On Sun, 13 Apr 2003, Chris Leishman wrote:
> Actually - the Cocoon guys have covered this topic in their basic
> concepts introduction:
>
> http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-
> abstractions
>
> Their conclusion is pretty much the same as mine - starting with XSP in
> the pipeline is ok for examples but bad for real systems. Better to
> use a 'logicsheet' (xslt) to create an xsp page to be processed. And
> as soon as people start trying to do this, incremental caching is
> invaluable...
That's why I say never use a logicsheet ;-)
Cocoon logicsheets are flawed ideas IMHO when faced with AxKit's ability
to write taglibs.
--
<!-- Matt -->
<:->get a SMart net</:->
Spam trap - do not mail: spam-sig@spamtrap.messagelabs.com
Re: XSP in a pipeline (was Re: Incremental Caching Patches)
Posted by Chris Leishman <ch...@leishman.org>.
On Friday, April 11, 2003, at 02:53 PM, Chris Leishman wrote:
<snip>
> I guess my way of doing it would be XML -> XSLT (which creates XHTML
> but also includes XSP tags) -> XSP -> XHTML.
>
> That way the XSLT is still responsible for creating the XHTML and then
> the XSP just adds in the dynamic elements. But I could see situations
> where you might want to use some intermediate XML format and then do a
> final XSLT to create XHTML.
Actually - the Cocoon guys have covered this topic in their basic
concepts introduction:
http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-
abstractions
Their conclusion is pretty much the same as mine - starting with XSP in
the pipeline is ok for examples but bad for real systems. Better to
use a 'logicsheet' (xslt) to create an xsp page to be processed. And
as soon as people start trying to do this, incremental caching is
invaluable...
Regards,
Chris
XSP in a pipeline (was Re: Incremental Caching Patches)
Posted by Chris Leishman <ch...@leishman.org>.
On Friday, April 11, 2003, at 02:42 PM, Robin Berjon wrote:
<snip>
> There's nothing "wrong" with your approach, but you are using XSP to
> produce XHTML, and the latter /may/ contain styling information.
> Producing styled output from XSP is imho a bad idea which is why, in
> the process you describe, starting from an initial document I might do
> FooML(source) -> XSLT(adds some XSP elements) -> XSP(processes XSP
> stuff) -> XSLT(styles).
>
> In such cases, what I've done is XSP(does dynamic stuff *and* includes
> the real content, usually from path info) -> XSLT(styles).
I guess my way of doing it would be XML -> XSLT (which creates XHTML
but also includes XSP tags) -> XSP -> XHTML.
That way the XSLT is still responsible for creating the XHTML and then
the XSP just adds in the dynamic elements. But I could see situations
where you might want to use some intermediate XML format and then do a
final XSLT to create XHTML.
>> Maybe I'm just being a content purist though ;-)
>
> I am, and I tend to have http://foo.org/src/foo/bar.xml to access the
> content in itself (with no decoration) and
> http://foo.org/cms.xsp/foo/bar.xml to get the page people want to see.
<snip>
That seems rather hackish to me...
I would have thought something like:
http://foo.org/foo/bar.xml (gives nice XHTML in the style of the site)
http://foo.org/foo/bar.xml?style=printable (gives another style suitable for printing)
http://foo.org/foo/bar.xml?style=raw (returns the original XML document)
or similar would be nicer than separating all the other styles from the
'raw' one. And then it saves having to do tricks with path_info...
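As a rough illustration of the query-string idea, here is a minimal Perl sketch that maps a `style` parameter to a processing chain. Everything in it is hypothetical: the `%styles` table, the stylesheet file names and the `pipeline_for` helper are invented for the example and are not AxKit's API. Unknown or missing styles fall back to the default.

```perl
use strict;
use warnings;

# Hypothetical mapping from the "style" query parameter to a stylesheet
# chain; the file names are placeholders, not AxKit defaults.
my %styles = (
    default   => ['site.xsl'],
    printable => ['print.xsl'],
    raw       => [],    # empty chain: hand back the source XML untouched
);

# Pick a chain from a raw query string such as "style=printable&x=1".
# No URL-decoding is done here; a real handler would need it.
sub pipeline_for {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, defined $query ? $query : '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $param{$key} = defined $value ? $value : '';
    }
    my $style = $param{style};
    $style = 'default' unless defined $style && exists $styles{$style};
    return $styles{$style};
}

print join(',', @{ pipeline_for('style=printable') }), "\n";
```

The point of the design is that one URL identifies the document and the parameter only selects a view of it, so no path_info handling is involved.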
But each to his own I guess :-)
Regards,
Chris
Re: Incremental Caching Patches
Posted by Robin Berjon <ro...@knowscape.com>.
Chris Leishman wrote:
> On Friday, April 11, 2003, at 01:42 PM, Kip Hampton wrote:
>> It does change it, though. Your patched version always returns a DOM
>> instance, the current version serializes the result to a string using
>> the XSLT processor's output_string( $result ) if the current process
>> is $last_in_chain. The difference is that the latter *alters* the
>> output based on options that may be set in an <xsl:output/> element in
>> the stylesheet itself. The last stylesheet in my test suite had
>
> Ok....thats annoying! Can someone tell me how XSLT embeds the desired
> output format into the DOM, so that when
> XML::LibXSLT::output_string($result) is called, it knows to serialize in
> a specific manner?
Well, it may be seen as annoying (and I for one hope that method='html' will cease
to be useful soon enough) but it's done as per the spec and it thus won't go
away (besides, I don't really see an alternative solution...).
The info isn't stored in the DOM, it's stored in the stylesheet object.
$xslt->output_string() looks that up and uses its own serialisation code instead
of libxml's.
> Then to create something suitable to my site, I might apply some XSLT to
> convert to XHTML. But if I wanted to include some dynamic stuff - eg. a
> 'Welcome <logged in user>' or even a form like 'Add new entry to blog',
> then I'd want to convert from the original RSS to an XSP page that could
> output XHTML containing the user details and perhaps the results of
> processing the parameters in the request...
>
> I can certainly see situations where there is no initial data source and
> thus one is generated entirely dynamically via XSP based on the request
> parameters or similar - but I would've thought it more common to have
> some initial XML to work with.
There's nothing "wrong" with your approach, but you are using XSP to produce
XHTML, and the latter /may/ contain styling information. Producing styled output
from XSP is imho a bad idea which is why, in the process you describe, starting
from an initial document I might do FooML(source) -> XSLT(adds some XSP
elements) -> XSP(processes XSP stuff) -> XSLT(styles).
In such cases, what I've done is XSP(does dynamic stuff *and* includes the real
content, usually from path info) -> XSLT(styles).
> Maybe I'm just being a content purist though ;-)
I am, and I tend to have http://foo.org/src/foo/bar.xml to access the content in
itself (with no decoration) and http://foo.org/cms.xsp/foo/bar.xml to get the
page people want to see.
--r
Re: Incremental Caching Patches
Posted by Chris Leishman <ch...@leishman.org>.
On Friday, April 11, 2003, at 01:42 PM, Kip Hampton wrote:
<snip>
> The content-length was different because the content returned was
> different. See below.
<snip>
Ok...
> It does change it, though. Your patched version always returns a DOM
> instance, the current version serializes the result to a string using
> the XSLT processor's output_string( $result ) if the current process
> is $last_in_chain. The difference is that the latter *alters* the
> output based on options that may be set in an <xsl:output/> element in
> the stylesheet itself. The last stylesheet in my test suite had
<snip>
Ok....that's annoying! Can someone tell me how XSLT embeds the desired
output format into the DOM, so that when
XML::LibXSLT::output_string($result) is called, it knows to serialize
in a specific manner?
> Its not a big deal, all it takes is checking $last_in_chain and
> returning the result of output_string( $result ) if its defined
> instead of always returning the DOM.
Rather than this, it could be better to keep passing the DOM back (for
consistency) - but then at the final serialization stage (when sending
to the client) we use a serialization method that respects the desired
output format attached to the DOM. Not that it really matters.....
>> It is a smidge hacky I agree. I did it to be backwards compatible.
>> I guess the same could be achieved by returning a hashref with a
>> 'status' member, and then detecting whether a scalar or a hash ref
>> was returned by the processor back in AxKit.
>
> +1
Ok....we'll use that way then. Makes no difference to me :-)
> Well, really the only part I was worried about was shutting down
> direct access to sending data to the client from the Language modules,
> but, given that its only there in an illusory way now anyway, it
> really doesn't matter.
Cool.
> XSP (eXtensible Server Pages) is all about *generating* content. It is
> not a transformative processor in the way that XSLT and XPathScript
> are. There is no XSP stylesheet that gets applied to a source document
> and there's nothing in the language itself that presumes an input
> document.. Markup, inlined code, and taglibs (markup that maps to
> code) are combined in one document and the only "transformation"
> happens when the XSP processor executes the inlined or taglib-added
> code.
>
> So, unless you mean "a pipeline of styles that generates an XSP page
> that gets executed at the end to generate further content" I'm not
> sure what you mean by using XSP as "form of styling". Its not that you
> *couldn't* do things that way, but its hardly the common case. The
> XSP(to generate content)->XSLT(to style it for the given client)
> pattern is far more common.
Well, my way of thinking about XML based content management, is that
everything should start, where possible, from a simple XML document
that contains only the data of interest. Eg, if I wanted to display a
page that has a list of blog entries, then I'd start with some XML
similar to RSS - or if I wanted an article on my site, I might start
with a document in docbook format. These documents would either come
from a file or somewhere else based on the initial Provider AxKit uses.
Then to create something suitable to my site, I might apply some XSLT
to convert to XHTML. But if I wanted to include some dynamic stuff -
eg. a 'Welcome <logged in user>' or even a form like 'Add new entry to
blog', then I'd want to convert from the original RSS to an XSP page
that could output XHTML containing the user details and perhaps the
results of processing the parameters in the request...
I can certainly see situations where there is no initial data source
and thus one is generated entirely dynamically via XSP based on the
request parameters or similar - but I would've thought it more common
to have some initial XML to work with.
Maybe I'm just being a content purist though ;-)
> Here's what I suggest:
>
> 1) Accept the patches.
> 2) Re-implement the current all-or-nothing" caching behavior of the
> current Cache.pm using the new interfaces and keep this behavior as
> the default.
> 3) Ship Cache::Incremental (or whatever) as a standard alternative.
>
> Reasonable? Doable? Anyone have thoughts?
I think it's do-able. Stage 3 wouldn't be done via a different Cache
module though - it would have to be via some sort of option
(AxIncrementalCache On/Off ?).
>> [ all: please change the subject to create a new thread if you want
>> to discuss this point with me further. The initial discussion on
>> this was on -users with the topic 'Adding XSP to an XSLT pipeline'. ]
>
> This discussion is not appropriate for the user's list; that's why I
> brought it here. If you want to continue in the conversation you are
> certainly most welcome and what you do with the subject line in your
> replies is your business. :-)
I was actually referring to my comments on the pros/cons of putting XSP
earlier or later in a pipeline....that's really a different discussion
from whether the patches are good or not.
Regards,
Chris
Re: Incremental Caching Patches
Posted by Kip Hampton <kh...@totalcinema.com>.
Chris Leishman wrote:
> Really? I didn't notice that... AFAIK a Content-Length header is only
> actually returned when delivering from a cache file (apache calculates
> the content-length, etc, after the $r->filename is set and the handler
> declines). When delivering normally I noticed that there is no
> Content-Length header (although HTTP/1.1 connections used chunked
> encoding which is preferable to content length anyway).
>
The content-length was different because the content returned was
different. See below.
>> hence, if only the DOM is returned, there may be unexpected results.
>
>
> AFAIK my patches should change this situation at all. At the moment a
> language module can return a dom (via pnotes), or just a string. And
> the next module has the opportunity to look for the dom or the string in
> pnotes. So are you saying that there are actually differences in the
> output under the new patches, because that seems strange...
It does change it, though. Your patched version always returns a DOM
instance, the current version serializes the result to a string using
the XSLT processor's output_string( $result ) if the current process is
$last_in_chain. The difference is that the latter *alters* the output
based on options that may be set in an <xsl:output/> element in the
stylesheet itself. The last stylesheet in my test suite had
<xsl:output method="html"/>
(which is fairly common) so, when your patched version just returned the
DOM result, the XSLT processor didn't get a chance to do its HTML mode
serialization tricks (stripping of the XML declaration, adding meta
headers for encoding, etc.) like it does in the current
Language::LibXSLT. Hence the results are different.
It's not a big deal; all it takes is checking $last_in_chain and
returning the result of output_string( $result ) if it's defined, instead
of always returning the DOM.
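A minimal sketch of that difference and of the proposed check, assuming XML::LibXML and XML::LibXSLT are installed. The stylesheet and source document are throwaway examples, and `result_for_next_stage` is a hypothetical helper, not a function from AxKit or from the patches.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Hypothetical one-template stylesheet that requests HTML serialization.
my $xsl = <<'XSL';
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html><head><title>t</title></head><body><br/></body></html>
  </xsl:template>
</xsl:stylesheet>
XSL

my $parser     = XML::LibXML->new;
my $stylesheet = XML::LibXSLT->new->parse_stylesheet( $parser->parse_string($xsl) );
my $source     = $parser->parse_string('<doc/>');
my $results    = $stylesheet->transform($source);

# Generic serialization of the result tree by XML::LibXML.
my $as_dom_string = $results->toString;

# Serialization through the stylesheet object, which applies <xsl:output/>.
my $as_xslt_string = $stylesheet->output_string($results);

# Sketch of the check described above: serialize via the stylesheet only
# when this processor is the last one in the chain, else pass the DOM on.
sub result_for_next_stage {
    my ($last_in_chain) = @_;
    return $last_in_chain ? $stylesheet->output_string($results) : $results;
}

print "serializations differ\n" if $as_dom_string ne $as_xslt_string;
```

Note that output_string() is called on the stylesheet rather than on the result document, so a stage that only receives the DOM has no way to reproduce the same serialization by itself.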
> I don't see any reason why AxKit should serialise and re-parse at each
> stage. If a language processor is known to be broken wrt using a
> prebuilt DOM, it can query the provider for get_strref and then parse
> it's own DOM internally. It would be silly to impose that overhead if
> the modules being used don't need it. It would also encourage people to
> fix bugs that stop it working properly.
Agreed. I didn't want it to go away either. The combination of the above
XSLT output annoyance and Simon Woodside's recent "bug" w.r.t.
serializing default namespaces made me a little touchy. What can I say,
I'll try the decaf...
>
>> * It wasn't clear to me whether or not the new Language module design
>> allows direct output to the client if the current processor is the
>> last in the chain. If not, this presumption puts limits on what can be
>> integrated as last-in-chain Language modules that may need access to
>> the Apache object to return content appropriately.
>
>
> The last-in-chain module can't output directly to the client, since the
> response should be returned to AxKit for it to deliver. But I don't
> think the previous code did either. I noticed some modules (eg.
> LibXSLT) took the effort of serialising the DOM and printing it if
> last-in-chain was in effect, but the AxKit::Apache module simply
> redirects the print and puts the data in pnotes('xml_string') anyway -
> so it's still not dynamic (and is thus kind of pointless....maybe this
> is historic?).
Er, that is pretty silly, actually...
>
>> * Bolting on the result of the transformation to the return code in
>> the new Language modules (e.g. return (Apache::Constants::OK,
>> \%results); ) seems a smidge hacky to me. They should return one or
>> the other, both is an ugly mix of coding strategies.
>
>
> It is a smidge hacky I agree. I did it to be backwards compatible. I
> guess the same could be achieved by returning a hashref with a 'status'
> member, and then detecting whether a scalar or a hash ref was returned
> by the processor back in AxKit.
+1
>
>> So, I guess the sum of my evaluation is "I'm not sure". There are
>> definitely some good things in the proposed patches, but I think they
>> go too far in some places.
>
>
> I'm not sure you've really said where in particular they 'go to far'...
Well, really the only part I was worried about was shutting down direct
access to sending data to the client from the Language modules, but,
given that its only there in an illusory way now anyway, it really
doesn't matter.
>
>> The questions that need to be answered are:
>>
>> In general, does switching to incremental caching give us something
>> that we don't already have, or is it arguably a better generic
>> solution than the current all-or-nothing implementation (especially in
>> light of the fact that the typical use-case seems to put the dynamic
>> parts at the front of the processing chain)?
>
>
> Well, it buys you advantages in the cases where dynamic parts are later
> in the chain. And I still haven't heard an overly convincing argument
> for why that shouldn't be a far more appropriate thing in many cases.
> My view is that XSP by nature is a form of 'styling', and should be
> added in rather than being in the original XML document.
XSP (eXtensible Server Pages) is all about *generating* content. It is
not a transformative processor in the way that XSLT and XPathScript are.
There is no XSP stylesheet that gets applied to a source document and
there's nothing in the language itself that presumes an input document.
Markup, inlined code, and taglibs (markup that maps to code) are
combined in one document and the only "transformation" happens when the
XSP processor executes the inlined or taglib-added code.
So, unless you mean "a pipeline of styles that generates an XSP page
that gets executed at the end to generate further content" I'm not sure
what you mean by using XSP as a "form of styling". It's not that you
*couldn't* do things that way, but it's hardly the common case. The
XSP(to generate content)->XSLT(to style it for the given client) pattern
is far more common.
Here's what I suggest:
1) Accept the patches.
2) Re-implement the current "all-or-nothing" caching behavior of the
current Cache.pm using the new interfaces and keep this behavior as the
default.
3) Ship Cache::Incremental (or whatever) as a standard alternative.
Reasonable? Doable? Anyone have thoughts?
>
> [ all: please change the subject to create a new thread if you want to
> discuss this point with me further. The initial discussion on this was
> on -users with the topic 'Adding XSP to an XSLT pipeline'. ]
This discussion is not appropriate for the user's list; that's why I
brought it here. If you want to continue in the conversation you are
certainly most welcome and what you do with the subject line in your
replies is your business. :-)
-kip
Re: Incremental Caching Patches [Long]
Posted by Chris Leishman <ch...@leishman.org>.
On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
<snip>
> Some things to note and random ideas:
>
> * There's small increase in requests per second with caching turned
> off, but a modest loss with caching turned on.
This agrees with my assessment. Of course you should also test with
dynamic objects later in the pipeline (eg. XSLT -> XSLT -> XSP ). In
those situations you should see very marked increases in performance as
the first two stages are entirely cacheable.
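To make that concrete, here is a toy Perl sketch of incremental caching over a pipeline whose dynamic stage comes last. It is not the code from the patches: the stage list, the `cacheable` flag, the `process` function and the in-memory `%cache` are all assumptions made for illustration, the "transforms" are plain string functions, and cache invalidation is ignored entirely.

```perl
use strict;
use warnings;

# Hypothetical pipeline: each stage has a name, a 'cacheable' flag and a
# transform. The stages are trivial string functions, not real XSLT/XSP.
my @pipeline = (
    { name => 'xslt1', cacheable => 1, run => sub { "[xslt1:$_[0]]" } },
    { name => 'xslt2', cacheable => 1, run => sub { "[xslt2:$_[0]]" } },
    { name => 'xsp',   cacheable => 0, run => sub { "[xsp:$_[0]]" } },
);

my %cache;        # key => output of the cacheable prefix
my %run_count;    # how often each stage actually executed

sub process {
    my ($key, $input) = @_;

    # Find the longest leading run of cacheable stages.
    my $prefix = 0;
    $prefix++ while $prefix < @pipeline && $pipeline[$prefix]{cacheable};

    my $data;
    if ($prefix && exists $cache{$key}) {
        $data = $cache{$key};    # reuse the cached prefix output
    }
    else {
        $data = $input;
        for my $stage (@pipeline[0 .. $prefix - 1]) {
            $run_count{ $stage->{name} }++;
            $data = $stage->{run}->($data);
        }
        $cache{$key} = $data if $prefix;
    }

    # The remaining (dynamic) stages always run.
    for my $stage (@pipeline[$prefix .. $#pipeline]) {
        $run_count{ $stage->{name} }++;
        $data = $stage->{run}->($data);
    }
    return $data;
}

my $first  = process('/index.xml', 'doc');
my $second = process('/index.xml', 'doc');
printf "%s ran %d time(s)\n", $_, $run_count{$_} for sort keys %run_count;
```

With an all-or-nothing cache the trailing dynamic stage would make the whole request uncacheable; here only that stage is repeated on the second request.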
> * The content-length returned from requests to the CVS and patched is
> different.
Really? I didn't notice that... AFAIK a Content-Length header is only
actually returned when delivering from a cache file (apache calculates
the content-length, etc, after the $r->filename is set and the handler
declines). When delivering normally I noticed that there is no
Content-Length header (although HTTP/1.1 connections used chunked
encoding which is preferable to content length anyway).
> The changes to the way data is returned from the LibXSLT Language
> module in the new Provider/Cache/Language interactions further exposes
> differences in the document returned from XML::LibXSLT when the result
> is selected as DOM vs.as a string. That is, given:
<snip>
> hence, if only the DOM is returned, there may be unexpected results.
AFAIK my patches shouldn't change this situation at all. At the moment a
language module can return a dom (via pnotes), or just a string. And
the next module has the opportunity to look for the dom or the string
in pnotes. So are you saying that there are actually differences in
the output under the new patches, because that seems strange...
> This is not specifically related to Chris' proposed patches directly,
> but it does point to the fact that we my have rethink the very idea of
> passing around a DOM tree between Provider and Language modules and
> between different Language modules in the processing chain. The sad
> fact is that if we are to adhere to the notion of least surprise for
> AxKit users, we may need to fully serialize and re-parse at every
> processing stage in order to allow each processor the chance to
> perform whatever serialization magic it needs to to deliver the proper
> result. .
I don't see any reason why AxKit should serialise and re-parse at each
stage. If a language processor is known to be broken wrt using a
prebuilt DOM, it can query the provider for get_strref and then parse
its own DOM internally. It would be silly to impose that overhead if
the modules being used don't need it. It would also encourage people
to fix bugs that stop it working properly.
> * It wasn't clear to me whether or not the new Language module design
> allows direct output to the client if the current processor is the
> last in the chain. If not, this presumption puts limits on what can be
> integrated as last-in-chain Language modules that may need access to
> the Apache object to return content appropriately.
The last-in-chain module can't output directly to the client, since the
response should be returned to AxKit for it to deliver. But I don't
think the previous code did either. I noticed some modules (eg.
LibXSLT) took the effort of serialising the DOM and printing it if
last-in-chain was in effect, but the AxKit::Apache module simply
redirects the print and puts the data in pnotes('xml_string') anyway -
so it's still not dynamic (and is thus kind of pointless....maybe this
is historic?).
> * Bolting on the result of the transformation to the return code in
> the new Language modules (e.g. return (Apache::Constants::OK,
> \%results); ) seems a smidge hacky to me. They should return one or
> the other, both is an ugly mix of coding strategies.
It is a smidge hacky, I agree. I did it to be backwards compatible. I
guess the same could be achieved by returning a hashref with a 'status'
member, and then detecting whether a scalar or a hash ref was returned
by the processor back in AxKit.
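A rough sketch of that alternative, assuming nothing about the actual patch
code: the processor returns either a bare status (old style) or a single
hashref with a 'status' member (new style), and the caller tells them apart
with ref(). The OK constant is a local stand-in for Apache::Constants::OK,
and old_processor, new_processor and unpack_result are hypothetical names.

```perl
use strict;
use warnings;

# Stand-in for Apache::Constants::OK so the sketch runs without mod_perl.
use constant OK => 0;

# Old-style processor: returns a bare status code.
sub old_processor { return OK }

# New-style processor: returns one hashref carrying status and results.
sub new_processor { return +{ status => OK, xml_string => '<out/>' } }

# Caller side: accept either convention and normalise to (status, results).
sub unpack_result {
    my ($ret) = @_;
    return ($ret->{status}, $ret) if ref($ret) eq 'HASH';
    return ($ret, {});
}

my ($old_status, $old_results) = unpack_result(old_processor());
my ($new_status, $new_results) = unpack_result(new_processor());
```

This keeps existing Language modules working unchanged while avoiding the
mixed (status, hashref) return list.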
> So, I guess the sum of my evaluation is "I'm not sure". There are
> definitely some good things in the proposed patches, but I think they
> go too far in some places.
I'm not sure you've really said where in particular they 'go too far'...
> The questions that need to be answered are:
>
> In general, does switching to incremental caching give us something
> that we don't already have, or is it arguably a better generic
> solution than the current all-or-nothing implementation (especially in
> light of the fact that the typical use-case seems to put the dynamic
> parts at the front of the processing chain)?
Well, it buys you advantages in the cases where dynamic parts are later
in the chain. And I still haven't heard an overly convincing argument
for why that shouldn't be a far more appropriate thing in many cases.
My view is that XSP by nature is a form of 'styling', and should be
added in rather than being in the original XML document.
[ all: please change the subject to create a new thread if you want to
discuss this point with me further. The initial discussion on this was
on -users with the topic 'Adding XSP to an XSLT pipeline'. ]
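To illustrate the 'dynamic parts later in the chain' case, here is a toy
sketch of per-stage caching in plain Perl. It is not the patch's
implementation: the stage names, the cacheable flag and the cache key
(stage index plus input string) are all assumptions made up for the
example, and a real cache would key on the resource and its mtime rather
than on the data itself.

```perl
use strict;
use warnings;

my %cache;          # per-stage cache keyed by "stage index:input"
my %runs;           # how many times each stage actually executed
my $requests = 0;   # makes the last stage's output differ per request

# Hypothetical three-stage chain; only the last stage is dynamic.
my @chain = (
    { name => 'xslt1', cacheable => 1, code => sub { 'a(' . $_[0] . ')' } },
    { name => 'xslt2', cacheable => 1, code => sub { 'b(' . $_[0] . ')' } },
    { name => 'xsp',   cacheable => 0, code => sub { $_[0] . '#' . ++$requests } },
);

sub run_chain {
    my ($input) = @_;
    my $data = $input;
    for my $i (0 .. $#chain) {
        my $stage = $chain[$i];
        my $key   = "$i:$data";
        if ($stage->{cacheable} && exists $cache{$key}) {
            $data = $cache{$key};            # reuse the cached stage output
            next;
        }
        $runs{ $stage->{name} }++;
        my $out = $stage->{code}->($data);
        $cache{$key} = $out if $stage->{cacheable};
        $data = $out;
    }
    return $data;
}

my $first  = run_chain('doc');
my $second = run_chain('doc');
```

On the second request the two static stages are served from the cache and
only the dynamic stage runs again; with an all-or-nothing cache the whole
chain would have to be re-run.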
> Do the proposed patches make it easier for users to write custom Cache
> modules? How or how not?
My thought (for what it's worth) is that it shouldn't make too much
difference compared to the previous way. The basic way the caching works
internally is pretty much the same - it's just the interface that's
changed to make it fit better with the whole 'Cache is a provider too'
idea.
Regards,
Chris
Re: devel list (was Re: Incremental Caching Patches [Long])
Posted by Chris Leishman <ch...@leishman.org>.
On Friday, April 11, 2003, at 02:24 PM, Robin Berjon wrote:
<snip>
> That one is dead, but I thought we had carried the subscriber base
> over. I guess perhaps not and you missed the switch message.
I probably signed up after the switch message since I just got the
address off the axkit.org website. I've unsubscribed from that list and
now I'm on the 'real' one.
Regards,
Chris
Re: devel list (was Re: Incremental Caching Patches [Long])
Posted by Robin Berjon <ro...@knowscape.com>.
Chris Leishman wrote:
> On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
>> Okay, I finally had a chance to install the proposed caching patches
>> that came from Chris L through the users' list (CC'ing you, Chris since
>> I'm not sure if you're subbed here or not).
>
> I thought I was subscribed - but it turns out there are actually two
> devel lists:
>
> axkit-devel@axkit.org (which is, it seems, not actually used)
That one is dead, but I thought we had carried the subscriber base over. I guess
perhaps not and you missed the switch message.
> Someone should probably update the mailing lists page on axkit.org so
> that it references the correct list.
Ick, yes!
--r
Re: devel list (was Re: Incremental Caching Patches [Long])
Posted by Matt Sergeant <ma...@sergeant.org>.
On Fri, 11 Apr 2003, Chris Leishman wrote:
> On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
> <snip>
> > Okay, I finally had a chance to install the proposed caching patches
> > that came from Chris L through the users' list (CC'ing you, Chris since
> > I'm not sure if you're subbed here or not).
>
> I thought I was subscribed - but it turns out there are actually two
> devel lists:
>
> axkit-devel@axkit.org (which is, it seems, not actually used)
>
> and this one:
>
> axkit-dev@xml.apache.org
>
>
> Someone should probably update the mailing lists page on axkit.org so
> that it references the correct list.
Done. My bad.
--
<!-- Matt -->
<:->get a SMart net</:->
Spam trap - do not mail: spam-sig@spamtrap.messagelabs.com
devel list (was Re: Incremental Caching Patches [Long])
Posted by Chris Leishman <ch...@leishman.org>.
On Wednesday, April 9, 2003, at 03:24 PM, Kip Hampton wrote:
<snip>
> Okay, I finally had a chance to install the proposed caching patches
> that came from Chris L through the users' list (CC'ing you, Chris since
> I'm not sure if you're subbed here or not).
I thought I was subscribed - but it turns out there are actually two
devel lists:
axkit-devel@axkit.org (which is, it seems, not actually used)
and this one:
axkit-dev@xml.apache.org
Someone should probably update the mailing lists page on axkit.org so
that it references the correct list.
Regards,
Chris