Posted to docs@cocoon.apache.org by Diana Shannon <sh...@apache.org> on 2002/07/09 17:14:41 UTC

FAQ and snippet, document() function

Appropriate use of the document() function in Cocoon represents one of the
major conceptual FAQs among XSLT users migrating to Cocoon. Recent
discussions on cocoon-dev and cocoon-users have shed a lot of light on the
subject.

Below is a draft FAQ, based on Stefano and Jan's recent discussions on
cocoon-dev. Within it you will also find a pointer to a proposed snippet
(Conal Tuohy's recent post on cocoon-users) which provides an
xinclude-based alternative to the document() function.

Please help to review both and comment here (or on cocoon-users) as necessary.

-- Diana


Q. What's "wrong" with use of the document() function in Cocoon?

A. Using the document() function for aggregation in Cocoon breaks
Separation of Concerns (SoC). That is, the designers of Cocoon
view inclusion and transformation as different functions, best
handled by separate Cocoon components. Treating them
separately allows you to achieve performance gains and increases
the reusability of your pipelines.

Alternatives to the document() function in the Cocoon environment include
aggregation or the use of a multi-stage transformation using the
XInclude Transformer. This involves transforming a list of documents
(generated dynamically or statically) by adding xinclude elements which
reference (via xpointer) specific document content, and then transforming
again via the XInclude Transformer, to obtain the desired result.

For an example of this, see:
  (Proposed Snippet)
   http://marc.theaimsgroup.com/?l=xml-cocoon-users&m=102617106411067&w=2
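
As a rough sketch of that multi-stage approach (this is not the proposed
snippet itself; the pattern, "doclist.xml", and both stylesheet names are
hypothetical, and it assumes the XInclude Transformer is declared in the
sitemap under the name "xinclude"), a pipeline might look like this:

```xml
<!-- Sketch only: file names and the match pattern are made up for illustration. -->
<map:match pattern="combined.html">
  <!-- 1. Generate the list of documents to include -->
  <map:generate src="doclist.xml"/>
  <!-- 2. Rewrite each list entry as an xi:include element -->
  <map:transform src="list2xinclude.xsl"/>
  <!-- 3. Resolve the xi:include elements (assumes a transformer named "xinclude") -->
  <map:transform type="xinclude"/>
  <!-- 4. Style the aggregated result -->
  <map:transform src="page2html.xsl"/>
  <map:serialize type="html"/>
</map:match>
```

Inclusion (steps 2 and 3) and styling (step 4) stay in separate components,
which is the point of the approach.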

You'll achieve better performance if you aggregate content prior to
transformation. This allows you to take full advantage of Cocoon's
pipeline caching. In contrast, making dynamic document() calls inside an
XSLT within a cached pipeline is problematic. At this time, Cocoon does
not recognize changes in documents (called by the document() function)
until the requested page expires from cache.
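
For illustration only, aggregating before transforming could be sketched
in the sitemap roughly as follows. The pattern, the part sources, and the
stylesheet name are hypothetical:

```xml
<!-- Sketch only: "content/header.xml", "content/news.xml" and "page2html.xsl" are made-up names. -->
<map:match pattern="news.html">
  <!-- Aggregation happens in the sitemap, where Cocoon can see each source -->
  <map:aggregate element="page">
    <map:part src="content/header.xml"/>
    <map:part src="content/news.xml"/>
  </map:aggregate>
  <!-- The stylesheet only transforms; it makes no document() calls -->
  <map:transform src="page2html.xsl"/>
  <map:serialize type="html"/>
</map:match>
```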

Understand that the document() function was designed *before* xinclude
with xpointer facilities existed. Had such capabilities been available,
perhaps the document() function, which essentially mimics xinclude and
xpointer, would have never been added to XSLT.

Please note that if you must work with your XML files outside of the
Cocoon environment as well, you may need to use the document() function
in order to utilize the limited capabilities of other pipeline engines.
This includes engines which are not xinclude-capable or which
lack a predefined way to indicate document processing steps.


RE: FAQ and snippet, document() function

Posted by Conal Tuohy <co...@paradise.net.nz>.
That looks really good to me. Go for it!

I had promised to post an example of recursive aggregation using document,
but I've had some problems which I'm still investigating. It looks to me
like the document() function is actually broken (in 2.0.2), for a start. I'm
going to investigate this further, but in the meantime, I won't be posting
this example. :-(

Con






> -----Original Message-----
> From: Diana Shannon [mailto:shannon@apache.org]
> Sent: Tuesday, 23 July 2002 11:35
> To: cocoon-docs@xml.apache.org
> Cc: conalt@paradise.net.nz
> Subject: Re: FAQ and snippet, document() function
>
>
> Here's what I hope to be a less "dogmatic" rewrite, thanks
> to input from Conal and Nicola Ken. I added the term "may" in
> the first sentence of the opening "A." paragraph. I also included
> advice for those who need to use the document() function in
> Cocoon toward the end.
>
> |---------------- start ---------------------|
>
> Q. What's "wrong" with use of the document() function in Cocoon?
>
> A. Using the document() function for aggregation in Cocoon may break
> Separation of Concerns (SoC). That is, the designers of Cocoon
> view inclusion and transformation as different functions, best
> handled by separate Cocoon components. Treating them
> separately allows you to achieve performance gains and increases
> the reusability of your pipelines.
>
> Alternatives to the document() function in the Cocoon environment include
> aggregation or the use of a multi-stage transformation using the
> XInclude Transformer. This involves transforming a list of documents
> (generated dynamically or statically) by adding xinclude elements which
> reference (via xpointer) specific document content, and then transforming
> again via the XInclude Transformer, to obtain the desired result.
>
> For an example of this, see:
>
> http://marc.theaimsgroup.com/?l=xml-cocoon-users&m=102617106411067&w=2
>
> You'll achieve better performance if you aggregate content prior to
> transformation. This allows you to take full advantage of Cocoon's
> pipeline caching. In contrast, making dynamic document() calls inside an
> XSLT within a cached pipeline is problematic. At this time, Cocoon does
> not recognize changes in documents (called by the document() function)
> until the requested page expires from cache.
>
> Understand that the document() function was designed *before* xinclude
> with xpointer facilities existed. Had such capabilities been available,
> perhaps the document() function, which essentially mimics xinclude and
> xpointer, would have never been added to XSLT.
>
> Please note that if you must work with your XML files outside of the
> Cocoon environment, you may need to use the document() function
> in order to utilize the limited capabilities of other pipeline engines.
> This includes engines which are not xinclude-capable or which
> lack a predefined way to indicate document processing steps. If you
> are working with legacy code from non-pipelined engines, you may need to
> use the document() function as well, at least initially.
>
> If you do use the document() function in Cocoon, you can still observe
> SoC by having separate XSLT stylesheets perform inclusion and
> transformation functions. For example, you can put multiple XSLT
> transforms in a pipeline and have the first one perform inclusion and the
> second one perform transformation. However, be mindful of some unresolved
> caching issues in Cocoon related to the document() function. At this
> time, Cocoon is unable to check the validity of content included via the
> document() function. In addition, the document() function implemented by
> Xalan is inefficient. See:
>    http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4257
> Until this bug is fixed, consider using Saxon instead for document()
> function-related parsing needs.
>
> For other aggregation/inclusion approaches, please stay tuned for
> XpathDirectoryGenerator (2.1 scratchpad),
> as well as Forrest's Libre (currently alpha in the Forrest cvs).
>
> |---------------- end ---------------------|
>
> If this is acceptable to everyone, I'll commit it in the XSLT FAQs.
>
> Thanks again to Nicola Ken and Conal.
>
> -- Diana
>
>


Re: FAQ and snippet, document() function

Posted by Diana Shannon <sh...@apache.org>.
Here's what I hope to be a less "dogmatic" rewrite, thanks
to input from Conal and Nicola Ken. I added the term "may" in
the first sentence of the opening "A." paragraph. I also included
advice for those who need to use the document() function in
Cocoon toward the end.

|---------------- start ---------------------|

Q. What's "wrong" with use of the document() function in Cocoon?

A. Using the document() function for aggregation in Cocoon may break
Separation of Concerns (SoC). That is, the designers of Cocoon
view inclusion and transformation as different functions, best
handled by separate Cocoon components. Treating them
separately allows you to achieve performance gains and increases
the reusability of your pipelines.

Alternatives to the document() function in the Cocoon environment include
aggregation or the use of a multi-stage transformation using the
XInclude Transformer. This involves transforming a list of documents
(generated dynamically or statically) by adding xinclude elements which
reference (via xpointer) specific document content, and then transforming
again via the XInclude Transformer, to obtain the desired result.

For an example of this, see:
   http://marc.theaimsgroup.com/?l=xml-cocoon-users&m=102617106411067&w=2

You'll achieve better performance if you aggregate content prior to
transformation. This allows you to take full advantage of Cocoon's
pipeline caching. In contrast, making dynamic document() calls inside an
XSLT within a cached pipeline is problematic. At this time, Cocoon does
not recognize changes in documents (called by the document() function)
until the requested page expires from cache.

Understand that the document() function was designed *before* xinclude
with xpointer facilities existed. Had such capabilities been available,
perhaps the document() function, which essentially mimics xinclude and
xpointer, would have never been added to XSLT.

Please note that if you must work with your XML files outside of the
Cocoon environment, you may need to use the document() function
in order to utilize the limited capabilities of other pipeline engines.
This includes engines which are not xinclude-capable or which
lack a predefined way to indicate document processing steps. If you
are working with legacy code from non-pipelined engines, you may need to
use the document() function as well, at least initially.

If you do use the document() function in Cocoon, you can still observe
SoC by having separate XSLT stylesheets perform inclusion and
transformation functions. For example, you can put multiple XSLT
transforms in a pipeline and have the first one perform inclusion and the
second one perform transformation. However, be mindful of some unresolved
caching issues in Cocoon related to the document() function. At this
time, Cocoon is unable to check the validity of content included via the
document() function. In addition, the document() function implemented by
Xalan is inefficient. See:
   http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4257
Until this bug is fixed, consider using Saxon instead for document()
function-related parsing needs.
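
A minimal sketch of such a two-stylesheet pipeline might look like the
following. The pattern, "report.xml", and both stylesheet names are
hypothetical; "include.xsl" is assumed to contain only the document()
calls, and "report2html.xsl" only the presentation logic:

```xml
<!-- Sketch only: all file names and the match pattern are made up for illustration. -->
<map:match pattern="report.html">
  <map:generate src="report.xml"/>
  <!-- First XSLT: inclusion only (the document() calls live here) -->
  <map:transform src="include.xsl"/>
  <!-- Second XSLT: transformation only -->
  <map:transform src="report2html.xsl"/>
  <map:serialize type="html"/>
</map:match>
```

The caching caveats described above still apply to the first stylesheet.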

For other aggregation/inclusion approaches, please stay tuned for
XpathDirectoryGenerator (2.1 scratchpad), as well as Forrest's Libre
(currently alpha in the Forrest cvs).

|---------------- end ---------------------|

If this is acceptable to everyone, I'll commit it in the XSLT FAQs.

Thanks again to Nicola Ken and Conal.

-- Diana


Re: FAQ and snippet, document() function

Posted by Nicola Ken Barozzi <ni...@apache.org>.

Conal Tuohy wrote:
>>-----Original Message-----
>>From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]
> 
> 
> <snip/>
> 
>>>>You can do many things with XSLT, but also with Java, C,
>>>>Assembler...
>>>>It's just a matter of picking the right tool for the right job.
>>> That's right - that's why we shouldn't tell people "never
>>> use XSLT for
>>> inclusion". That's too dogmatic. Sometimes XSLT is the right tool.
>>
>>It's correct if I tell you not to write your web application in
>>assembler, IMO it's not dogmatic.
>>The fact is that IMO XSLT is not the right tool for this job.
> 
> Well XSLT is not VERY much like assembler ;-)

Some stylesheets though do have the same code clarity ;-)

> I've often found it necessary to use XSLT to transform a source document to
> use <xi:include>, but the tricky thing is when you need to do this
> recursively: to aggregate an entire grove of documents starting from a root
> document ... at each document the XSLT has to identify documents to include,
> and use <xsl:apply-templates select="document(@href)"/> to include them
> recursively, or it can create the xi:include elements, but in that case
> (using xi:include) then you must invoke XIncludeTransformer and again the
> XSLT and again the XIncludeTransformer, until the tree is complete.
> 
> In the worst case (this is an actual case I had, in fact), you have a
> linked-list of documents. If using the XIncludeTransformer, to traverse the
> list n documents long (I had n>100, sometimes >200), requires n simultaneous
> invocations of the sitemap, XIncludeTransformer, XSLT transformer, etc. In
> most cases though you would be traversing a much shallower grove.
> 
> I've lost the code I wrote, but I'll recreate it and post it to the list ...
> it was very small and tidy and worked pretty well (apart from the caching
> problem) - it didn't violate SoC in any significant way I could see.

It seems you didn't because you used XSLT solely as an advanced xinclude.

>>Well, the main problem with using XSLT for aggregation is
>>that you can
>>simply use also the other things that don't only aggregate,
>>and this
>>really asks for abuse ;-)
>>
>>Instead, if I define my XSLTI xml syntax, limited to defining
>>tags that
>>include, and then use XSLT underneath as a motor, but without
>>having the
>>user use it directly, I have effectively wrapped it.
> 
> 
> Hmmm ... I think this is an interesting idea. It could be implemented as a
> compiler, with its own pipeline:
> 
> <map:match pattern="xslti-to-xslt/**">
> 	<map:generate src="xslti/**"/>
> 	<map:transform src="xslti-to-xslt-compiler.xslt"/>
> 	<map:serialize/>
> </map:match>
> 
> <!-- and to call the compiler, e.g. when aggregating a website: -->
> <map:match pattern="traverse-website/**">
> 	<map:generate src="{1}" type="html"/>
> 	<map:transform src="cocoon:/xslti-to-xslt/html-post-order-tree-traversal.xslti"/>
> 	<map:transform src="
> </map:match>

HA :-)

Interesting solution, why don't you do it?
It could be the first real Cocoon mini block, ie cocoon sitemap that 
becomes a block used by another sitemap.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


RE: FAQ and snippet, document() function

Posted by Conal Tuohy <co...@paradise.net.nz>.
> -----Original Message-----
> From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]

<snip/>

> >>You can do many things with XSLT, but also with Java, C,
> Assembler...
> >>It's just a matter of picking the right tool for the right job.
> >
> > That's right - that's why we shouldn't tell people "never
> use XSLT for
> > inclusion". That's too dogmatic. Sometimes XSLT is the right tool.
>
> It's correct if I tell you not to write your web application in
> assembler, IMO it's not dogmatic.
> The fact is that IMO XSLT is not the right tool for this job.

Well XSLT is not VERY much like assembler ;-)

I've often found it necessary to use XSLT to transform a source document to
use <xi:include>, but the tricky thing is when you need to do this
recursively: to aggregate an entire grove of documents starting from a root
document ... at each document the XSLT has to identify documents to include,
and use <xsl:apply-templates select="document(@href)"/> to include them
recursively, or it can create the xi:include elements, but in that case
(using xi:include) then you must invoke XIncludeTransformer and again the
XSLT and again the XIncludeTransformer, until the tree is complete.

In the worst case (this is an actual case I had, in fact), you have a
linked-list of documents. If using the XIncludeTransformer, to traverse the
list n documents long (I had n>100, sometimes >200), requires n simultaneous
invocations of the sitemap, XIncludeTransformer, XSLT transformer, etc. In
most cases though you would be traversing a much shallower grove.

I've lost the code I wrote, but I'll recreate it and post it to the list ...
it was very small and tidy and worked pretty well (apart from the caching
problem) - it didn't violate SoC in any significant way I could see.

> Well, the main problem with using XSLT for aggregation is
> that you can
> simply use also the other things that don't only aggregate,
> and this
> really asks for abuse ;-)
>
> Instead, if I define my XSLTI xml syntax, limited to defining
> tags that
> include, and then use XSLT underneath as a motor, but without
> having the
> user use it directly, I have effectively wrapped it.

Hmmm ... I think this is an interesting idea. It could be implemented as a
compiler, with its own pipeline:

<map:match pattern="xslti-to-xslt/**">
	<map:generate src="xslti/**"/>
	<map:transform src="xslti-to-xslt-compiler.xslt"/>
	<map:serialize/>
</map:match>

<!-- and to call the compiler, e.g. when aggregating a website: -->
<map:match pattern="traverse-website/**">
	<map:generate src="{1}" type="html"/>
	<map:transform src="cocoon:/xslti-to-xslt/html-post-order-tree-traversal.xslti"/>
	<map:transform src="
</map:match>


Re: FAQ and snippet, document() function

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Conal Tuohy wrote:
>>-----Original Message-----
>>From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]
>>Sent: Thursday, 11 July 2002 03:16
>>To: cocoon-docs@xml.apache.org
>>Subject: Re: FAQ and snippet, document() function
> 
> 
> <snip/>
> 
>>You can do many things with XSLT, but also with Java, C, Assembler...
>>It's just a matter of picking the right tool for the right job.
> 
> That's right - that's why we shouldn't tell people "never use XSLT for
> inclusion". That's too dogmatic. Sometimes XSLT is the right tool.

It's correct if I tell you not to write your web application in 
assembler, IMO it's not dogmatic.
The fact is that IMO XSLT is not the right tool for this job.

>>If you find it simpler to do aggregation with XSLT, I think
>>that there
>>is a usage difficulty with XInclude-CInclude stuff; we could
>>simply make
>>a XSLTInclude that wraps the request in an XSLT call, but that's an
>>implementation detail.
> 
> I don't know what you mean by adding a "wrapper" - can you explain?

Well, the main problem with using XSLT for aggregation is that you can
simply use also the other things that don't only aggregate, and this
really asks for abuse ;-)

Instead, if I define my XSLTI xml syntax, limited to defining tags that 
include, and then use XSLT underneath as a motor, but without having the 
user use it directly, I have effectively wrapped it.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


RE: FAQ and snippet, document() function

Posted by Conal Tuohy <co...@paradise.net.nz>.
> -----Original Message-----
> From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]
> Sent: Thursday, 11 July 2002 03:16
> To: cocoon-docs@xml.apache.org
> Subject: Re: FAQ and snippet, document() function

<snip/>

> You can do many things with XSLT, but also with Java, C, Assembler...
> It's just a matter of picking the right tool for the right job.

That's right - that's why we shouldn't tell people "never use XSLT for
inclusion". That's too dogmatic. Sometimes XSLT is the right tool.

> If you find it simpler to do aggregation with XSLT, I think
> that there
> is a usage difficulty with XInclude-CInclude stuff; we could
> simply make
> a XSLTInclude that wraps the request in an XSLT call, but that's an
> implementation detail.

I don't know what you mean by adding a "wrapper" - can you explain?

Con


Re: FAQ and snippet, document() function

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Conal Tuohy wrote:
>>-----Original Message-----
>>From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]
>>Sent: Thursday, 11 July 2002 01:36
>>To: cocoon-docs@xml.apache.org
>>Subject: Re: FAQ and snippet, document() function
>>
>>
>>
>>Diana Shannon wrote:
>>
>>>On Tuesday, July 9, 2002, at 06:04  PM, Conal Tuohy wrote:
>>>
>>>
>>>>Don't be too quick to deprecate document()!
>>>
>>>
>>>Did my draft suggest that? Are you saying that pointing out
>>
>>the problems
>>
>>>associated with using the document() function in a Cocoon
>>
>>environment
>>
>>>(e.g. caching)
>>>which may discourage use of the document() function "in effect"
>>>suggests "deprecation"?
>>
>>It's not deprecated, but simply:
>>1. not supported
>>2. not recommended
>>3. inappropriate, because it mixes concerns.
>>
>>XML is content. XSL style.
> 
> 
> I disagree with the bit in Diana's snippet where she said that "Using the
> document() function for aggregation in Cocoon breaks Separation of Concerns
> (SoC)", which you echoed: "inappropriate, because it mixes concerns". I
> think it MAY break SoC but not necessarily.
> 
> XSLT is a complete programming language. You can do styling with it. And you
> can do aggregation with it. You can do pretty much anything with it. In
> particular you can do some complicated inclusions more easily than with
> xinclude. That's all I'm saying: for some inclusion operations, a small XSLT
> transform is preferable to other inclusion techniques. In particular I'm
> thinking of recursive inclusions, which I'm sure are more efficient coded as
> a single XSLT with a recursive template call, than as a recursive
> multi-stage pipeline with xinclude, easier to maintain, easier to understand
> ... in fact better in every way. I used the document() function once for
> this purpose and was amazed how simple it was to do. NB at no point did I
> mix up styling and inclusion in the same stylesheet!
> 
> If you put multiple XSLT transforms in a pipeline, the first one can do
> inclusion, and the second one can do transformation. There is no need to mix
> them, just because you have written both operations in the same language.
> That's the point I'm trying to make.
> 
> Let's not deprecate the document() function (I didn't mean to imply I
> thought it was being formally deprecated, just that it was being "dissed"
> unfairly), just because people COULD misuse it.
> 
> 
>>Style should never *work* on content, just transform it.
>>Aggregation is basically not something to do at the style
>>phase, because
>>it injects vital information in the data flow.
> 
> 
> I agree.
> 
> 
>>For example, if I used the document() function to aggregate your
>>homepage with some nasty page, would you say it's just style?
> 
> 
> No I wouldn't. But I hope you see why I don't care anyway ;-)

You can do many things with XSLT, but also with Java, C, Assembler...
It's just a matter of picking the right tool for the right job.

If you find it simpler to do aggregation with XSLT, I think that there 
is a usage difficulty with XInclude-CInclude stuff; we could simply make 
a XSLTInclude that wraps the request in an XSLT call, but that's an 
implementation detail.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


RE: FAQ and snippet, document() function

Posted by Conal Tuohy <co...@paradise.net.nz>.
> -----Original Message-----
> From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]
> Sent: Thursday, 11 July 2002 01:36
> To: cocoon-docs@xml.apache.org
> Subject: Re: FAQ and snippet, document() function
>
>
>
> Diana Shannon wrote:
> >
> > On Tuesday, July 9, 2002, at 06:04  PM, Conal Tuohy wrote:
> >
> >> Don't be too quick to deprecate document()!
> >
> >
> > Did my draft suggest that? Are you saying that pointing out
> the problems
> > associated with using the document() function in a Cocoon
> environment
> > (e.g. caching)
> > which may discourage use of the document() function "in effect"
> > suggests "deprecation"?
>
> It's not deprecated, but simply:
> 1. not supported
> 2. not recommended
> 3. inappropriate, because it mixes concerns.
>
> XML is content. XSL style.

I disagree with the bit in Diana's snippet where she said that "Using the
document() function for aggregation in Cocoon breaks Separation of Concerns
(SoC)", which you echoed: "inappropriate, because it mixes concerns". I
think it MAY break SoC but not necessarily.

XSLT is a complete programming language. You can do styling with it. And you
can do aggregation with it. You can do pretty much anything with it. In
particular you can do some complicated inclusions more easily than with
xinclude. That's all I'm saying: for some inclusion operations, a small XSLT
transform is preferable to other inclusion techniques. In particular I'm
thinking of recursive inclusions, which I'm sure are more efficient coded as
a single XSLT with a recursive template call, than as a recursive
multi-stage pipeline with xinclude, easier to maintain, easier to understand
... in fact better in every way. I used the document() function once for
this purpose and was amazed how simple it was to do. NB at no point did I
mix up styling and inclusion in the same stylesheet!
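
A minimal sketch of such a recursive inclusion stylesheet follows. It is
not the code referred to above, and the <link href="..."/> element is a
hypothetical convention for marking the documents to include:

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes links are written as <link href="..."/>, a made-up element. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not a link -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each link with the linked document's root element, processed
       with the same templates so that nested links are followed as well -->
  <xsl:template match="link[@href]">
    <xsl:apply-templates select="document(@href)/*"/>
  </xsl:template>

</xsl:stylesheet>
```

This does inclusion only and no styling. Note that it has no guard against
circular links, which would make it recurse indefinitely.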

If you put multiple XSLT transforms in a pipeline, the first one can do
inclusion, and the second one can do transformation. There is no need to mix
them, just because you have written both operations in the same language.
That's the point I'm trying to make.

Let's not deprecate the document() function (I didn't mean to imply I
thought it was being formally deprecated, just that it was being "dissed"
unfairly), just because people COULD misuse it.

> Style should never *work* on content, just transform it.
> Aggregation is basically not something to do at the style
> phase, because
> it injects vital information in the data flow.

I agree.

> For example, if I used the document() function to aggregate your
> homepage with some nasty page, would you say it's just style?

No I wouldn't. But I hope you see why I don't care anyway ;-)

Cheers!

Con


Re: FAQ and snippet, document() function

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Diana Shannon wrote:
> 
> On Tuesday, July 9, 2002, at 06:04  PM, Conal Tuohy wrote:
> 
>> Don't be too quick to deprecate document()!
> 
> 
> Did my draft suggest that? Are you saying that pointing out the problems
> associated with using the document() function in a Cocoon environment 
> (e.g. caching)
> which may discourage use of the document() function "in effect"
> suggests "deprecation"?

It's not deprecated, but simply:
1. not supported
2. not recommended
3. inappropriate, because it mixes concerns.

XML is content. XSL style.
Style should never *work* on content, just transform it.
Aggregation is basically not something to do at the style phase, because 
it injects vital information in the data flow.
For example, if I used the document() function to aggregate your 
homepage with some nasty page, would you say it's just style?

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


Re: FAQ and snippet, document() function

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Conal Tuohy wrote:
>>-----Original Message-----
>>From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]
> 
> 
> <snip/>
> 
>>Remember that using document(), you are effectively bypassing
>>the Cocoon
>>cache mechanism.
>>We will be able to give a TTL for the cache of single
>>aggregation parts;
>>  with XSLT it will not be possible, because out of our control.
> 
> 
> Yeah the XInclude example of mine (in Diana's posting that started this
> thread) started off using document() (wrongly!), but I changed it to use
> XInclude and fixed that problem.
> 
> I think this cache problem is likely to be resolved soon though.
> Incidentally the same problem exists with <xsl:import>. I read some
> discussion about it on the Cocoon-Dev list recently - it certainly sounded
> feasible.
> http://www.mail-archive.com/cocoon-dev@xml.apache.org/msg16663.html
> 
> 
>>What you need is an XSLT-like syntax for aggregation,
>>because you may need many documents and not only one?
>>
>>Again, you should have them defined in the Generation stage,
>>where you
>>have created the include directives, not after.
> 
> The tricky thing is where the inclusion is recursive. In this situation you
> have a whole series of generation and aggregation stages. You have to use
> recursion, and your choices are to make recursive calls in the sitemap, or
> to make recursive calls in the XSLT.

Ok, see the other mail reply coming :-)

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


RE: FAQ and snippet, document() function

Posted by Conal Tuohy <co...@paradise.net.nz>.
> -----Original Message-----
> From: Nicola Ken Barozzi [mailto:nicolaken@apache.org]

<snip/>

> Remember that using document(), you are effectively bypassing
> the Cocoon
> cache mechanism.
> We will be able to give a TTL for the cache of single
> aggregation parts;
>   with XSLT it will not be possible, because out of our control.

Yeah the XInclude example of mine (in Diana's posting that started this
thread) started off using document() (wrongly!), but I changed it to use
XInclude and fixed that problem.

I think this cache problem is likely to be resolved soon though.
Incidentally the same problem exists with <xsl:import>. I read some
discussion about it on the Cocoon-Dev list recently - it certainly sounded
feasible.
http://www.mail-archive.com/cocoon-dev@xml.apache.org/msg16663.html

> What you need is an XSLT-like syntax for aggregation,
> because you may need many documents and not only one?
>
> Again, you should have them defined in the Generation stage,
> where you
> have created the include directives, not after.

The tricky thing is where the inclusion is recursive. In this situation you
have a whole series of generation and aggregation stages. You have to use
recursion, and your choices are to make recursive calls in the sitemap, or
to make recursive calls in the XSLT.


Re: FAQ and snippet, document() function

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Conal Tuohy wrote:
>>-----Original Message-----
>>From: Diana Shannon [mailto:shannon@apache.org]
> 
> 
> Hi Diana
> 
> 
>>On Tuesday, July 9, 2002, at 06:04  PM, Conal Tuohy wrote:
>>
>>
>>>Don't be too quick to deprecate document()!
>>
>>Did my draft suggest that? Are you saying that pointing out
>>the problems
>>associated with using the document() function in a Cocoon environment
>>(e.g. caching)
>>which may discourage use of the document() function "in effect"
>>suggests "deprecation"?
> 
> No ... it's the bit where you said it breaks SoC. That's the statement I
> take issue with - I do think it is excessively deprecatory. It sounds like
> you're telling people never to use document() in Cocoon.
> 
> Whereas I think that, in some instances, an XSLT (with document()) is
> actually a better choice than the alternatives, and I think the Q&A should
> make that clear.
> 
> Perhaps we could say:
> 
> -------------------------------------------------------------
> Q. What's "wrong" with use of the document() function in Cocoon?
> 
> A. There's nothing wrong - so long as you maintain a proper Separation of
> Concerns. But in most cases, there are simpler alternative techniques, so
> document() should only be used when it is the best tool for the job.
> 
> These other techniques encourage (enforce?) SoC, because the aggregation
> takes place as a distinct step in a pipeline. If you use XSLT to aggregate
> content, you must take care to maintain SoC yourself. To do this, create an
> XSLT which performs only the aggregation. Don't be tempted to transform the
> aggregated data into a final result at the same time- use a separate XSLT
> for this, and chain them together in a pipeline.
> -------------------------------------------------------------
> 
> I'll see if I can put together a simple example of using document() and post
> it to the list. In general, though, I think the compelling case is where you
> need to aggregate lots of documents recursively. i.e.
> 
> doc A refers to doc B, doc B refers to doc C, doc C refers to D, etc.
> 
> You can run an XSLT transform on A which will aggregate A, B, C, and D (etc)
> into a single doc. This is simple and efficient in XSLT. But to do the same
> thing with XInclude, for instance, would require recursive calls into the
> sitemap (potentially hundreds of such calls, depending on the number of
> docs), possibly even caching each partial aggregate as the recursion
> progresses. Ouch!
> 
> These are the heuristics I use (in order of decreasing preference):
> 
> 1) Where the number of source documents is a constant, and the sources don't
> themselves link to each other, then you should impose aggregation on them
> "from outside" with map:aggregate. In this case the aggregation is specified
> entirely in the sitemap - it's all nice and clear. But this isn't always
> possible because sometimes you don't even know how many docs you are
> aggregating.
> 
> 2) Where the number of input sources is variable, and consists of links in a
> document (or constant set of documents), you can transform those links to
> <xi:include> elements, and then perform aggregation with XInclude.
> 
> 3) Where the number of input sources is variable, consisting of links in an
> entire "grove" of documents, then perform the aggregation with a custom XSLT
> script.

Remember that by using document() you are effectively bypassing the Cocoon
cache mechanism.

We will be able to give a TTL for the cache of single aggregation parts;
with XSLT that will not be possible, because it is out of our control.

Is what you need an XSLT-like syntax for aggregation,
because you may need many documents and not only one?

Again, you should have them defined in the generation stage, where you
create the include directives, not afterwards.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


RE: FAQ and snippet, document() function

Posted by Conal Tuohy <co...@paradise.net.nz>.
> -----Original Message-----
> From: Diana Shannon [mailto:shannon@apache.org]

Hi Diana

> On Tuesday, July 9, 2002, at 06:04  PM, Conal Tuohy wrote:
>
> > Don't be too quick to deprecate document()!
>
> Did my draft suggest that? Are you saying that pointing out
> the problems
> associated with using the document() function in a Cocoon environment
> (e.g. caching)
> which may discourage use of the document() function "in effect"
> suggests "deprecation"?

No ... it's the bit where you said it breaks SoC. That's the statement I
take issue with - I do think it is excessively deprecatory. It sounds like
you're telling people never to use document() in Cocoon.

Whereas I think that, in some instances, an XSLT (with document()) is
actually a better choice than the alternatives, and I think the Q&A should
make that clear.

Perhaps we could say:

-------------------------------------------------------------
Q. What's "wrong" with use of the document() function in Cocoon?

A. There's nothing wrong - so long as you maintain a proper Separation of
Concerns. But in most cases, there are simpler alternative techniques, so
document() should only be used when it is the best tool for the job.

These other techniques encourage (enforce?) SoC, because the aggregation
takes place as a distinct step in a pipeline. If you use XSLT to aggregate
content, you must take care to maintain SoC yourself. To do this, create an
XSLT which performs only the aggregation. Don't be tempted to transform the
aggregated data into a final result at the same time; use a separate XSLT
for this, and chain them together in a pipeline.
-------------------------------------------------------------

I'll see if I can put together a simple example of using document() and post
it to the list. In general, though, I think the compelling case is where you
need to aggregate lots of documents recursively, for example:

doc A refers to doc B, doc B refers to doc C, doc C refers to D, etc.

You can run an XSLT transform on A which will aggregate A, B, C, and D (etc)
into a single doc. This is simple and efficient in XSLT. But to do the same
thing with XInclude, for instance, would require recursive calls into the
sitemap (potentially hundreds of such calls, depending on the number of
docs), possibly even caching each partial aggregate as the recursion
progresses. Ouch!
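
A minimal sketch of that recursive case, offered as an illustration rather
than a tested stylesheet. It assumes the documents refer to each other with
a hypothetical <link href="..."/> element; the element name and the file
name aggregate.xsl are made up for this example, so adjust the match
pattern to your own markup.

```xml
<?xml version="1.0"?>
<!-- aggregate.xsl (hypothetical name): inclusion only, no presentation.
     Assumption: documents point at each other with <link href="..."/>. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each link with the document element of the file it names.
       Because templates are applied to the loaded document, any <link>
       elements inside it are resolved in turn: A pulls in B, B pulls in
       C, and so on. Relative hrefs resolve against the document that
       contains the link. -->
  <xsl:template match="link[@href]">
    <xsl:apply-templates select="document(@href)/*"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that this sketch has no guard against circular references: if doc D
linked back to doc A, the recursion would not terminate.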

These are the heuristics I use (in order of decreasing preference):

1) Where the number of source documents is a constant, and the sources don't
themselves link to each other, then you should impose aggregation on them
"from outside" with map:aggregate. In this case the aggregation is specified
entirely in the sitemap - it's all nice and clear. But this isn't always
possible because sometimes you don't even know how many docs you are
aggregating.

2) Where the number of input sources is variable, and consists of links in a
document (or constant set of documents), you can transform those links to
<xi:include> elements, and then perform aggregation with XInclude.

3) Where the number of input sources is variable, consisting of links in an
entire "grove" of documents, then perform the aggregation with a custom XSLT
script.
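
For the first two heuristics, a sitemap fragment might look roughly like
the sketch below. All patterns, file names and the links2xinclude.xsl
stylesheet are hypothetical, and it assumes the XInclude transformer is
declared in the sitemap components under the name "xinclude".

```xml
<!-- Heuristic 1: a fixed set of sources, aggregated "from outside". -->
<map:match pattern="page.xml">
  <map:aggregate element="page">
    <map:part src="docs/header.xml"/>
    <map:part src="docs/body.xml"/>
    <map:part src="docs/footer.xml"/>
  </map:aggregate>
  <map:serialize type="xml"/>
</map:match>

<!-- Heuristic 2: a variable set of sources, listed as links in one
     document. The (hypothetical) stylesheet rewrites each link as an
     <xi:include> element; the XInclude transformer then pulls the
     referenced content in as a separate pipeline step. -->
<map:match pattern="collection.xml">
  <map:generate src="docs/index.xml"/>
  <map:transform src="stylesheets/links2xinclude.xsl"/>
  <map:transform type="xinclude"/>
  <map:serialize type="xml"/>
</map:match>
```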

Regards

Con


Re: FAQ and snippet, document() function

Posted by Diana Shannon <sh...@apache.org>.
On Tuesday, July 9, 2002, at 06:04  PM, Conal Tuohy wrote:

> Don't be too quick to deprecate document()!

Did my draft suggest that? Are you saying that pointing out the problems
associated with using the document() function in a Cocoon environment
(e.g. caching) which may discourage use of the document() function
"in effect" suggests "deprecation"?

Also, don't forget that the document() function is inefficient because of
Xalan bugs. I guess we could advise the use of Saxon if users choose to
continue their use of the document() function -- for whatever reason.

And I forgot to mention other potential (future) approaches:
XPathDirectoryGenerator (2.1 scratchpad), Forrest's Libre, etc. Are there
others? I remember a thread (Speed up DirectoryGenerator) with
Stefan/Steven/Marc on this subject, but it's not clear to me if Stefan's
recent work impacts this FAQ.

Thanks for your input.

Diana


RE: FAQ and snippet, document() function

Posted by Conal Tuohy <co...@paradise.net.nz>.
Don't be too quick to deprecate document()!

I think it is very important to fix Cocoon so that it can support
document(), for two reasons:

1) It's necessary for reusing code from non-pipeline-based systems. I don't
know of many people migrating XSLT from Cocoon to non-pipeline systems, but
the other way is common, and for these people (me included), it's important
to be able to reuse such legacy code in Cocoon, BEFORE refactoring it to
properly separate concerns. Otherwise we're imposing a barrier to entry
which is undesirable.

2) The point about SoC is valid, for sure, but I think there is still a
place for document() to be used properly in Cocoon. A small "inclusion" XSLT
using document() could be used in place of xinclude, and in fact there are
some circumstances where it is more suitable than xinclude or Cocoon
aggregation. It seems to me that the point is not that inclusion should
only be performed by the xinclude, cinclude, or aggregation components, but
that it should be performed as a distinct stage in a pipeline. So I think
document() shouldn't be written off entirely, but instead users should be
encouraged to treat "inclusion" separately from other concerns, even when
the inclusion is performed by an XSLT using the document() function (in
fact, especially then, since xinclude is always a separate step, by
definition, while it's POSSIBLE to mix inclusion with other concerns in a
single XSLT.)
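
As a rough sketch of what "inclusion as a distinct stage" could look like
in a sitemap: the first transform does nothing but pull in content via
document(), and the second handles presentation only. The pattern and both
stylesheet names are hypothetical.

```xml
<map:match pattern="book.html">
  <map:generate src="docs/book.xml"/>
  <!-- Stage 1: inclusion only (an XSLT that uses document()). -->
  <map:transform src="stylesheets/aggregate.xsl"/>
  <!-- Stage 2: presentation only; sees one already-aggregated document. -->
  <map:transform src="stylesheets/book2html.xsl"/>
  <map:serialize type="html"/>
</map:match>
```

Keeping the two stages apart means the inclusion stylesheet could later be
swapped for xinclude or map:aggregate without touching the presentation
stylesheet.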

Cheers!

Con

> -----Original Message-----
> From: Diana Shannon [mailto:shannon@apache.org]
> Sent: Wednesday, 10 July 2002 03:15
> To: cocoon-docs@xml.apache.org
> Cc: conalt@paradise.net.nz
> Subject: FAQ and snippet, document() function
>
>
> Appropriate use of the document() function in Cocoon represents one of
> the major conceptual FAQs among XSLT users migrating to Cocoon. Recent
> discussions on cocoon-dev and cocoon-users have shed a lot of light on
> the subject.
>
> Below is a draft FAQ, based on Stefano and Jan's recent discussions on
> cocoon-dev. Within it you will also find a pointer to a proposed snippet
> (Conal Tuohy's recent post on cocoon-users) which provides an
> xinclude-based alternative to the document() function.
>
> Please help to review both and comment here (or cocoon-users) as
> necessary.
>
> -- Diana
>
>
> Q. What's "wrong" with use of the document() function in Cocoon?
>
> A. Using the document() function for aggregation in Cocoon breaks
> Separation of Concerns (SoC). That is, the designers of Cocoon
> view inclusion and transformation as different functions, best
> handled by separate Cocoon components. Treating them
> separately allows you to achieve performance gains and increases
> the reusability of your pipelines.
>
> Alternatives to the document() function in the Cocoon environment include
> aggregation or the use of a multi-stage transformation using the
> XInclude Transformer. This involves transforming a list of documents
> (generated dynamically or statically) by adding xinclude elements which
> reference (via xpointer) specific document content, and then
> transforming again via the XInclude Transformer, to obtain the desired
> result.
>
> For an example of this, see:
>   (Proposed Snippet)
>
> http://marc.theaimsgroup.com/?l=xml-cocoon-users&m=102617106411067&w=2
>
> You'll achieve better performance if you aggregate content prior to
> transformation. This allows you to take full advantage of Cocoon's
> pipeline caching. In contrast, making dynamic document() calls inside
> an XSLT within a cached pipeline is problematic. At this time, Cocoon
> does not recognize changes in documents (called by the document()
> function) until the requested page expires from cache.
>
> Understand that the document() function was designed *before* xinclude
> with xpointer facilities existed. Had such capabilities been available,
> perhaps the document() function, which essentially mimics xinclude and
> xpointer, would have never been added to XSLT.
>
> Please note that if you must work with your XML files outside of the
> Cocoon environment as well, you may need to use the document() function
> in order to utilize the limited capabilities of other pipeline engines.
> This includes engines which are not xinclude-capable or which
> lack a predefined way to indicate document processing steps.
>
>