Posted to axkit-dev@xml.apache.org by Kip Hampton <kh...@totalcinema.com> on 2003/04/11 12:42:28 UTC

Re: Incremental Caching Patches

Chris Leishman wrote:

> Really?  I didn't notice that...  AFAIK a Content-Length header is only 
> actually returned when delivering from a cache file (apache calculates 
> the content-length, etc, after the $r->filename is set and the handler 
> declines).  When delivering normally I noticed that there is no 
> Content-Length header (although HTTP/1.1 connections used chunked 
> encoding which is preferable to content length anyway).
> 

The content-length was different because the content returned was 
different. See below.

>> hence, if only the DOM is returned, there may be unexpected results.
> 
> 
> AFAIK my patches shouldn't change this situation at all.  At the moment a 
> language module can return a dom (via pnotes), or just a string.  And 
> the next module has the opportunity to look for the dom or the string in 
> pnotes.  So are you saying that there are actually differences in the 
> output under the new patches, because that seems strange...

It does change it, though. Your patched version always returns a DOM 
instance, the current version serializes the result to a string using 
the XSLT processor's output_string( $result ) if the current process is 
$last_in_chain. The difference is that the latter *alters* the output 
based on options that may be set in an <xsl:output/> element in the 
stylesheet itself. The last stylesheet in my test suite had

<xsl:output method="html"/>

(which is fairly common) so, when your patched version just returned the 
DOM result, the XSLT processor didn't get a chance to do its HTML mode 
serialization tricks (stripping of the XML declaration, adding meta 
headers for encoding, etc.) like it does in the current 
Language::LibXSLT -- hence the results are different.

It's not a big deal; all it takes is checking $last_in_chain and 
returning the result of output_string( $result ) if it's defined instead 
of always returning the DOM.
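
To make the suggested fix concrete, here is a minimal sketch of the branching involved. It is written in Python rather than AxKit's Perl, and every name in it (FakeDom, FakeStylesheet, run_xslt_stage) is a hypothetical stand-in; it is not the Language::LibXSLT code, only an illustration of "serialize through the stylesheet when last in chain, otherwise pass the DOM along".

```python
class FakeDom:
    """Hypothetical stand-in for a parsed result tree."""

    def __init__(self, body):
        self.body = body

    def to_string(self):
        # Generic XML serialization: always emits an XML declaration.
        return '<?xml version="1.0"?>\n' + self.body


class FakeStylesheet:
    """Hypothetical stand-in for a compiled stylesheet.

    It remembers the output method declared in the stylesheet.
    """

    def __init__(self, output_method="xml"):
        self.output_method = output_method

    def transform(self, source):
        # A real processor would apply templates; this just copies the body.
        return FakeDom(source.body)

    def output_string(self, result):
        # The stylesheet, not the DOM, decides how to serialize.
        if self.output_method == "html":
            return result.body  # HTML mode: no XML declaration
        return result.to_string()


def run_xslt_stage(stylesheet, source, last_in_chain):
    result = stylesheet.transform(source)
    if last_in_chain:
        # Final stage: let the processor apply its output rules.
        return stylesheet.output_string(result)
    # Intermediate stage: hand the DOM to the next module unserialized.
    return result
```

With an "html" stylesheet, the last-in-chain call returns a string without the XML declaration, while an intermediate call returns the DOM object untouched.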

> I don't see any reason why AxKit should serialise and re-parse at each 
> stage.  If a language processor is known to be broken wrt using a 
> prebuilt DOM, it can query the provider for get_strref and then parse 
> its own DOM internally.  It would be silly to impose that overhead if 
> the modules being used don't need it.  It would also encourage people to 
> fix bugs that stop it working properly.

Agreed. I didn't want it to go away either. The combination of the above 
XSLT output annoyance and Simon Woodside's recent "bug" w.r.t. 
serializing default namespaces made me a little touchy. What can I say, 
I'll try the decaf...

> 
>> * It wasn't clear to me whether or not the new Language module design 
>> allows direct output to the client if the current processor is the 
>> last in the chain. If not, this presumption puts limits on what can be 
>> integrated as last-in-chain Language modules that may need access to 
>> the Apache object to return content appropriately.
> 
> 
> The last-in-chain module can't output directly to the client, since the 
> response should be returned to AxKit for it to deliver.  But I don't 
> think the previous code did either.  I noticed some modules (eg. 
> LibXSLT) took the effort of serialising the DOM and printing it if 
> last-in-chain was in effect, but the AxKit::Apache module simply 
> redirects the print and puts the data in pnotes('xml_string') anyway - 
> so it's still not dynamic (and is thus kind of pointless....maybe this 
> is historic?).

Er, that is pretty silly, actually...
> 
>> * Bolting on the result of the transformation to the return code in 
>> the new Language modules (e.g. return (Apache::Constants::OK, 
>> \%results); ) seems a smidge hacky to me. They should return one or 
>> the other, both is an ugly mix of coding strategies.
> 
> 
> It is a smidge hacky I agree.  I did it to be backwards compatible.  I 
> guess the same could be achieved by returning a hashref with a 'status' 
> member, and then detecting whether a scalar or a hash ref was returned 
> by the processor back in AxKit.

+1
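
For illustration only, the "hashref with a 'status' member" idea could be handled on the AxKit side roughly as below. This is a Python sketch in which a dict stands in for the Perl hashref; OK and normalize_result are hypothetical names, not part of the patches.

```python
OK = 0  # hypothetical stand-in for Apache::Constants::OK


def normalize_result(ret):
    """Accept either a bare status code or a dict carrying 'status' plus results."""
    if isinstance(ret, dict):
        results = dict(ret)  # copy so the caller's dict is not mutated
        status = results.pop("status")
        return status, results
    # Old-style module: just a status code, no extra results.
    return ret, {}
```

Old modules that return only a status keep working, and new modules return a single value instead of a (status, results) pair.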

> 
>> So, I guess the sum of my evaluation is "I'm not sure". There are 
>> definitely some good things in the proposed patches, but I think they 
>> go too far in some places.
> 
> 
> I'm not sure you've really said where in particular they 'go too far'...

Well, really the only part I was worried about was shutting down direct 
access to sending data to the client from the Language modules, but, 
given that it's only there in an illusory way now anyway, it really 
doesn't matter.
> 
>> The questions that need to be answered are:
>>
>> In general, does switching to incremental caching give us something 
>> that we don't already have, or is it arguably a better generic 
>> solution than the current all-or-nothing implementation (especially in 
>> light of the fact that the typical use-case seems to put the dynamic 
>> parts at the front of the processing chain)?
> 
> 
> Well, it buys you advantages in the cases where dynamic parts are later 
> in the chain.  And I still haven't heard an overly convincing argument 
> for why that shouldn't be a far more appropriate thing in many cases.  
> My view is that XSP by nature is a form of 'styling', and should be 
> added in rather than being in the original XML document.

XSP (eXtensible Server Pages) is all about *generating* content. It is 
not a transformative processor in the way that XSLT and XPathScript are. 
There is no XSP stylesheet that gets applied to a source document and 
there's nothing in the language itself that presumes an input document. 
Markup, inlined code, and taglibs (markup that maps to code) are 
combined in one document and the only "transformation" happens when the 
XSP processor executes the inlined or taglib-added code.

So, unless you mean "a pipeline of styles that generates an XSP page 
that gets executed at the end to generate further content" I'm not sure 
what you mean by using XSP as a "form of styling". It's not that you 
*couldn't* do things that way, but it's hardly the common case. The 
XSP(to generate content)->XSLT(to style it for the given client) pattern 
is far more common.

Here's what I suggest:

1) Accept the patches.
2) Re-implement the current "all-or-nothing" caching behavior of the 
current Cache.pm using the new interfaces and keep this behavior as the 
default.
3) Ship Cache::Incremental (or whatever) as a standard alternative.

Reasonable? Doable? Anyone have thoughts?
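
To show what separates options 2 and 3, here is a rough sketch in Python rather than AxKit's Perl. The stage tuples, the dict used as a cache, and both function names are assumptions made up for the example; the real patches may key and store things quite differently.

```python
def run_all_or_nothing(source, stages, cache, calls):
    """Cache only the final output, and only if every stage is cacheable."""
    cacheable = all(is_cacheable for _, _, is_cacheable in stages)
    key = ("final", source)
    if cacheable and key in cache:
        return cache[key]
    data = source
    for name, func, _ in stages:
        calls.append(name)  # record which stages actually ran
        data = func(data)
    if cacheable:
        cache[key] = data
    return data


def run_incremental(source, stages, cache, calls):
    """Cache each stage's output up to the first non-cacheable stage."""
    data = source
    prefix_ok = True
    for index, (name, func, is_cacheable) in enumerate(stages):
        prefix_ok = prefix_ok and is_cacheable
        key = (index, data)  # keyed on the stage position and its input
        if prefix_ok and key in cache:
            data = cache[key]
            continue
        calls.append(name)
        result = func(data)
        if prefix_ok:
            cache[key] = result
        data = result
    return data
```

With a cacheable XSLT stage followed by a dynamic XSP stage, the all-or-nothing version reruns both stages on every request, while the incremental version reruns only the dynamic one. When the dynamic stage comes first, neither version caches anything.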

> 
> [ all: please change the subject to create a new thread if you want to 
> discuss this point with me further.  The initial discussion on this was 
> on -users with the topic 'Adding XSP to an XSLT pipeline'. ]

This discussion is not appropriate for the user's list; that's why I 
brought it here. If you want to continue in the conversation you are 
certainly most welcome and what you do with the subject line in your 
replies is your business. :-)

-kip

Re: Incremental Caching Patches

Posted by Robin Berjon <ro...@knowscape.com>.

Kip Hampton wrote:
> Here's what I suggest:
> 
> 1) Accept the patches.
> 2) Re-implement the current "all-or-nothing" caching behavior of the 
> current Cache.pm using the new interfaces and keep this behavior as the 
> default.
> 3) Ship Cache::Incremental (or whatever) as a standard alternative.
> 
> Reasonable? Doable? Anyone have thoughts?

+1 for me

--r

Re: Incremental Caching Patches [plus offtopic]

Posted by Jörg Walter <eh...@ich.bin.kein.hoschi.de>.

Hi AxDevers...

After being submerged in my new contractor's office, which got hacked by some 
annoying trojan right away, and then cleaning things up, I'll finally give my 
2 cents on this issue.

On Friday, 11. April 2003 13:24, Chris Leishman wrote:

> >> It is a smidge hacky I agree.  I did it to be backwards compatible.
> >> I guess the same could be achieved by returning a hashref with a
> >> 'status' member, and then detecting whether a scalar or a hash ref
> >> was returned by the processor back in AxKit.
> >
> > +1

Sounds reasonable.

> > XSP (eXtensible Server Pages) is all about *generating* content. It is
> > not a transformative processor in the way that XSLT and XPathScript
> > are. There is no XSP stylesheet that gets applied to a source document
> > and there's nothing in the language itself that presumes an input
> > document. Markup, inlined code, and taglibs (markup that maps to
> > code) are combined in one document and the only "transformation"
> > happens when the XSP processor executes the inlined or taglib-added
> > code.
> >
> > So, unless you mean "a pipeline of styles that generates an XSP page
> > that gets executed at the end to generate further content" I'm not
> > sure what you mean by using XSP as a "form of styling". It's not that you
> > *couldn't* do things that way, but it's hardly the common case. The
> > XSP(to generate content)->XSLT(to style it for the given client)
> > pattern is far more common.
>
> Well, my way of thinking about XML based content management, is that
> everything should start, where possible, from a simple XML document
> that contains only the data of interest.  Eg, if I wanted to display a
> page that has a list of blog entries, then I'd start with some XML
> similar to RSS - or if I wanted an article on my site, I might start
> with a document in docbook format.  These documents would either come
> from a file or somewhere else based on the initial Provider AxKit uses.

I have been doing quite heavy XSLT->XSP->... processing. It boils down to some 
data document (be it from the filesystem or generated from SQL via a nice 
provider I did, or retrieved from XMLDB, whatever...) that is the 'object' of 
some processing. A user might want to edit it, because it's his own. A user 
might want to add a comment, because he is interested in it, whatever.
So there are actually two stylesheet directives to apply one logical 
transformation: The XSLT which converts the input data to some XSP page 
usually using some taglib which was written for that very purpose, and the 
XSP transformation after that. The result is some internal XML which is later 
transformed into the correct HTML forms, links, whatever.
So this is definitely useful, especially since it makes you independent of 
storage. I transferred the whole site from XMLDB to SQL in a matter of a few 
days, including writing/extending the SQL provider. Re-doing all that in ESQL 
would have been deadly.

Yet, let me take you off topic. Matt already said it: XSP is no styling 
language per se. This is a problem. The split approach can get very 
confusing. XSLT doesn't give you enough access to foreign data, even using 
Perl extension functions, nor does it let you leverage Perl's module library 
easily. XPathScript allows you to do fairly complex processing, but lacks 
XSP's mix-and-match taglib flexibility. People actually doing XSP-after-XSLT 
are a good indicator that something is indeed missing. I agree with Matt that 
XSP was meant to *generate* content, but the world seems to want something 
between XSP, XSLT and XPathScript for active transformations. At least I want 
;-) XPathScript with taglibs and some selected XSLT constructs, sort of.

> Maybe I'm just being a content purist though ;-)

Oh, you are not alone. I am a purist about everything that makes the job 
easier or cooler ;-)

> > Here's what I suggest:
> >
> > 1) Accept the patches.
> > 2) Re-implement the current "all-or-nothing" caching behavior of the
> > current Cache.pm using the new interfaces and keep this behavior as
> > the default.
> > 3) Ship Cache::Incremental (or whatever) as a standard alternative.
> >
> > Reasonable? Doable? Anyone have thoughts?

+1 from me, with the variations discussed elsewhere optionally included. I have 
never had much insight into the caching system, so I am not much of a judge.

-- 
CU
  Joerg


Re: XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Chris Leishman <ch...@leishman.org>.

On Monday, April 14, 2003, at 10:09 AM, Matt Sergeant wrote:

> On Sun, 13 Apr 2003, Chris Leishman wrote:
>
>> Actually - the Cocoon guys have covered this topic in their basic
>> concepts introduction:
>>
>> http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-abstractions
>>
>> Their conclusion is pretty much the same as mine - starting with XSP in
>> the pipeline is ok for examples but bad for real systems.  Better to
>> use a 'logicsheet' (xslt) to create an xsp page to be processed.  And
>> as soon as people start trying to do this, incremental caching is
>> invaluable...
>
> That's why I say never use a logicsheet ;-)
>
> Cocoon logicsheets are flawed ideas IMHO when faced with AxKit's ability
> to write taglibs.

But I thought you can write taglibs in Cocoon too.....

Regards,
Chris

Re: XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Matt Sergeant <ma...@sergeant.org>.

On Sun, 13 Apr 2003, Chris Leishman wrote:

> Actually - the Cocoon guys have covered this topic in their basic
> concepts introduction:
>
> http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-abstractions
>
> Their conclusion is pretty much the same as mine - starting with XSP in
> the pipeline is ok for examples but bad for real systems.  Better to
> use a 'logicsheet' (xslt) to create an xsp page to be processed.  And
> as soon as people start trying to do this, incremental caching is
> invaluable...

That's why I say never use a logicsheet ;-)

Cocoon logicsheets are flawed ideas IMHO when faced with AxKit's ability
to write taglibs.

-- 
<!-- Matt -->
<:->get a SMart net</:->

Re: XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Chris Leishman <ch...@leishman.org>.

On Friday, April 11, 2003, at 02:53 PM, Chris Leishman wrote:
<snip>
> I guess my way of doing it would be XML -> XSLT (which creates XHTML  
> but also includes XSP tags) -> XSP -> XHTML.
>
> That way the XSLT is still responsible for creating the XHTML and then  
> the XSP just adds in the dynamic elements.  But I could see situations  
> where you might want to use some intermediate XML format and then do a  
> final XSLT to create XHTML.

Actually - the Cocoon guys have covered this topic in their basic  
concepts introduction:

http://xml.apache.org/cocoon/userdocs/concepts/index.html#c2-abstractions

Their conclusion is pretty much the same as mine - starting with XSP in  
the pipeline is ok for examples but bad for real systems.  Better to  
use a 'logicsheet' (xslt) to create an xsp page to be processed.  And  
as soon as people start trying to do this, incremental caching is  
invaluable...

Regards,
Chris

XSP in a pipeline (was Re: Incremental Caching Patches)

Posted by Chris Leishman <ch...@leishman.org>.

On Friday, April 11, 2003, at 02:42 PM, Robin Berjon wrote:
<snip>
> There's nothing "wrong" with your approach, but you are using XSP to 
> produce XHTML, and the latter /may/ contain styling information. 
> Producing styled output from XSP is imho a bad idea which is why, in 
> the process you describe, starting from an initial document I might do 
> FooML(source) -> XSLT(adds some XSP elements) -> XSP(processes XSP 
> stuff) -> XSLT(styles).
>
> In such cases, what I've done is XSP(does dynamic stuff *and* includes 
> the real content, usually from path info) -> XSLT(styles).

I guess my way of doing it would be XML -> XSLT (which creates XHTML 
but also includes XSP tags) -> XSP -> XHTML.

That way the XSLT is still responsible for creating the XHTML and then 
the XSP just adds in the dynamic elements.  But I could see situations 
where you might want to use some intermediate XML format and then do a 
final XSLT to create XHTML.
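
As a toy illustration of that ordering (Python, not AxKit's Perl): the static stage plays the role of the XSLT that emits XHTML plus a placeholder tag, and the dynamic stage plays the role of XSP filling it in per request. The x:user tag, the request dict, and all function names are invented for the example.

```python
def static_style(entry_title):
    """Stand-in for the XSLT stage: emits XHTML plus a dynamic placeholder tag."""
    return "<body><h1>" + entry_title + "</h1><p>Welcome <x:user/></p></body>"


def dynamic_fill(page, request):
    """Stand-in for the XSP stage: resolves the placeholder per request.

    No escaping is done here; a real implementation would need it.
    """
    return page.replace("<x:user/>", request["user"])


def render(entry_title, request):
    # Static styling first, dynamic elements added last.
    return dynamic_fill(static_style(entry_title), request)
```

The output of static_style depends only on the source document, which is what makes the early part of such a pipeline a candidate for caching.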

>> Maybe I'm just being a content purist though ;-)
>
> I am, and I tend to have http://foo.org/src/foo/bar.xml to access the 
> content in itself (with no decoration) and 
> http://foo.org/cms.xsp/foo/bar.xml to get the page people want to see.
<snip>

That seems rather hackish to me...

I would have thought something like:

http://foo.org/foo/bar.xml
    (gives nice XHTML in the style of the site)
http://foo.org/foo/bar.xml?style=printable
    (gives another style suitable for printing)
http://foo.org/foo/bar.xml?style=raw
    (returns the original XML document)

or similar would be nicer than separating all the other styles from the 
'raw' one.  And then it saves having to do tricks with path_info...

But each to his own I guess :-)
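
A small sketch of how such a style parameter might be resolved, in Python for brevity. KNOWN_STYLES and pick_style are hypothetical names, and falling back to the site default for unknown values is my own assumption rather than anything AxKit does.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical set of styles the site knows how to produce.
KNOWN_STYLES = {"default", "printable", "raw"}


def pick_style(url):
    """Return the style named by ?style=..., falling back to the site default."""
    query = parse_qs(urlsplit(url).query)
    style = query.get("style", ["default"])[0]
    return style if style in KNOWN_STYLES else "default"
```

The same URL path then serves every variant, and only the query string selects which pipeline to run.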

Regards,
Chris

Re: Incremental Caching Patches

Posted by Robin Berjon <ro...@knowscape.com>.

Chris Leishman wrote:
> On Friday, April 11, 2003, at 01:42 PM, Kip Hampton wrote:
>> It does change it, though. Your patched version always returns a DOM 
>> instance, the current version serializes the result to a string using 
>> the XSLT processor's output_string( $result ) if the current process 
>> is $last_in_chain. The difference is that the latter *alters* the 
>> output based on options that may be set in an <xsl:output/> element in 
>> the stylesheet itself. The last stylesheet in my test suite had
> 
> Ok....that's annoying!  Can someone tell me how XSLT embeds the desired 
> output format into the DOM, so that when 
> XML::LibXSLT::output_string($result) is called, it knows to serialize in 
> a specific manner?

Well, it may be seen as annoying (and I for one hope that method='html' will cease 
to be useful soon enough) but it's done as per the spec and it thus won't go 
away (besides, I don't really see an alternative solution...).

The info isn't stored in the DOM, it's stored in the stylesheet object. 
$xslt->output_string() looks that up and uses its own serialisation code instead 
of libxml's.

> Then to create something suitable to my site, I might apply some XSLT to 
> convert to XHTML.  But if I wanted to include some dynamic stuff - eg. a 
> 'Welcome <logged in user>' or even a form like 'Add new entry to blog', 
> then I'd want to convert from the original RSS to an XSP page that could 
> output XHTML containing the user details and perhaps the results of 
> processing the parameters in the request...
> 
> I can certainly see situations where there is no initial data source and 
> thus one is generated entirely dynamically via XSP based on the request 
> parameters or similar - but I would've thought it more common to have 
> some initial XML to work with.

There's nothing "wrong" with your approach, but you are using XSP to produce 
XHTML, and the latter /may/ contain styling information. Producing styled output 
from XSP is imho a bad idea which is why, in the process you describe, starting 
from an initial document I might do FooML(source) -> XSLT(adds some XSP 
elements) -> XSP(processes XSP stuff) -> XSLT(styles).

In such cases, what I've done is XSP(does dynamic stuff *and* includes the real 
content, usually from path info) -> XSLT(styles).

> Maybe I'm just being a content purist though ;-)

I am, and I tend to have http://foo.org/src/foo/bar.xml to access the content in 
itself (with no decoration) and http://foo.org/cms.xsp/foo/bar.xml to get the 
page people want to see.

--r

Re: Incremental Caching Patches

Posted by Chris Leishman <ch...@leishman.org>.

On Friday, April 11, 2003, at 01:42 PM, Kip Hampton wrote:
<snip>
> The content-length was different because the content returned was 
> different. See below.
<snip>

Ok...

> It does change it, though. Your patched version always returns a DOM 
> instance, the current version serializes the result to a string using 
> the XSLT processor's output_string( $result ) if the current process 
> is $last_in_chain. The difference is that the latter *alters* the 
> output based on options that may be set in an <xsl:output/> element in 
> the stylesheet itself. The last stylesheet in my test suite had
<snip>

Ok....that's annoying!  Can someone tell me how XSLT embeds the desired 
output format into the DOM, so that when 
XML::LibXSLT::output_string($result) is called, it knows to serialize 
in a specific manner?

> It's not a big deal, all it takes is checking $last_in_chain and 
> returning the result of output_string( $result ) if it's defined 
> instead of always returning the DOM.

Rather than this, it could be better to keep passing the DOM back (for 
consistency) - but then at the final serialization stage (when sending 
to the client) we use a serialization method that respects the desired 
output format attached to the DOM.  Not that it really matters.....

>> It is a smidge hacky I agree.  I did it to be backwards compatible.  
>> I guess the same could be achieved by returning a hashref with a 
>> 'status' member, and then detecting whether a scalar or a hash ref 
>> was returned by the processor back in AxKit.
>
> +1

Ok....we'll use that way then.  Makes no difference to me :-)

> Well, really the only part I was worried about was shutting down 
> direct access to sending data to the client from the Language modules, 
> but, given that it's only there in an illusory way now anyway, it 
> really doesn't matter.

Cool.

> XSP (eXtensible Server Pages) is all about *generating* content. It is 
> not a transformative processor in the way that XSLT and XPathScript 
> are. There is no XSP stylesheet that gets applied to a source document 
> and there's nothing in the language itself that presumes an input 
> document. Markup, inlined code, and taglibs (markup that maps to 
> code) are combined in one document and the only "transformation" 
> happens when the XSP processor executes the inlined or taglib-added 
> code.
>
> So, unless you mean "a pipeline of styles that generates an XSP page 
> that gets executed at the end to generate further content" I'm not 
> sure what you mean by using XSP as a "form of styling". It's not that you 
> *couldn't* do things that way, but it's hardly the common case. The 
> XSP(to generate content)->XSLT(to style it for the given client) 
> pattern is far more common.

Well, my way of thinking about XML based content management, is that 
everything should start, where possible, from a simple XML document 
that contains only the data of interest.  Eg, if I wanted to display a 
page that has a list of blog entries, then I'd start with some XML 
similar to RSS - or if I wanted an article on my site, I might start 
with a document in docbook format.  These documents would either come 
from a file or somewhere else based on the initial Provider AxKit uses.

Then to create something suitable to my site, I might apply some XSLT 
to convert to XHTML.  But if I wanted to include some dynamic stuff - 
eg. a 'Welcome <logged in user>' or even a form like 'Add new entry to 
blog', then I'd want to convert from the original RSS to an XSP page 
that could output XHTML containing the user details and perhaps the 
results of processing the parameters in the request...

I can certainly see situations where there is no initial data source 
and thus one is generated entirely dynamically via XSP based on the 
request parameters or similar - but I would've thought it more common 
to have some initial XML to work with.

Maybe I'm just being a content purist though ;-)

> Here's what I suggest:
>
> 1) Accept the patches.
> 2) Re-implement the current "all-or-nothing" caching behavior of the 
> current Cache.pm using the new interfaces and keep this behavior as 
> the default.
> 3) Ship Cache::Incremental (or whatever) as a standard alternative.
>
> Reasonable? Doable? Anyone have thoughts?

I think it's do-able.  Stage 3 wouldn't be done via a different Cache 
module though - it would have to be via some sort of option 
(AxIncrementalCache On/Off ?).

>> [ all: please change the subject to create a new thread if you want 
>> to discuss this point with me further.  The initial discussion on 
>> this was on -users with the topic 'Adding XSP to an XSLT pipeline'. ]
>
> This discussion is not appropriate for the user's list; that's why I 
> brought it here. If you want to continue in the conversation you are 
> certainly most welcome and what you do with the subject line in your 
> replies is your business. :-)

I was actually referring to my comments on the pros/cons of putting XSP 
earlier or later in a pipeline....that's really a different discussion 
from whether the patches are good or not.

Regards,
Chris