Posted to dev@cocoon.apache.org by Stefano Mazzocchi <st...@apache.org> on 2002/07/02 14:54:37 UTC

[RT] reconsidering pipeline semantics

In light of the discussion on blocks, Sylvain pointed out that cocoon
services should be mapped to pipelines and not to resources directly.

This consideration triggered a few RT that I would like to share with
you and trigger further discussion.

NOTE: this is not related to blocks or flow, only to the sitemap
semantics.

                                 - o -

What is a pipeline
------------------

The first and major architectural contribution that Cocoon brought to
the web world is the ability to compose web services using the "pipe and
filters" design pattern. (I'm using 'web services' in the original sense
of the term: any service that is related to the web.)

Cocoon decided to follow an XML-oriented approach to pipelines, forcing
everything into the XML realm and working on that from there. So,
Cocoon's pipeline concept is somewhat an extension of the original GoF
"pipe and filters" pattern: in fact, the Cocoon pipeline implements both
the 'pipe and filters' and 'adaptor' patterns.

Why? Well, this comes from the fact that the HTTP protocol is not XML
oriented (unlike SOAP, for example). So, in order to perform XML piping,
we need to adapt in and out from the generic octet-stream world.

So, unlike the UNIX pipeline which doesn't need adaptation (since the
STDIN/OUT streams are all octet-oriented), Cocoon needed to create ways
to adapt to the rest of the world which is not XML oriented.

For this reason, while a UNIX pipeline is composed like this

 input -> filter -[pipe]-> filter -[pipe]-> filter -> output

a Cocoon pipeline is composed like this

 input -> adaptor -[pipe]-> filter -[pipe]-> adaptor -> output

Unfortunately, the above picture isn't entirely correct, since the two
adaptors can't be exchanged: they are, in fact, different entities.
The first adapts an octet-based world to an XML-based world, the other
does the opposite. They are not symmetrical. In Cocoon terminology, the
first adaptor is a generator, the second is a serializer.

We call 'Cocoon pipeline' the collection of all filters (transformers)
and adaptors (generator and serializer) because there cannot be a
pipeline without adaptors.
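
A minimal sketch of the generator / transformer / serializer split, written in Python rather than as Cocoon components. The function names (generate, uppercase_text, serialize, run_pipeline) are hypothetical and only illustrate why the two adaptors cannot be exchanged: one turns octets into XML, the other turns XML back into octets.

```python
import xml.etree.ElementTree as ET

def generate(octets: bytes) -> ET.Element:
    """Generator: adapts an octet stream into the XML world."""
    return ET.fromstring(octets)

def uppercase_text(root: ET.Element) -> ET.Element:
    """Transformer: XML in, XML out (the 'filter' of pipe and filters)."""
    for element in root.iter():
        if element.text:
            element.text = element.text.upper()
    return root

def serialize(root: ET.Element) -> bytes:
    """Serializer: adapts XML back into an octet stream."""
    return ET.tostring(root)

def run_pipeline(octets, generator, transformers, serializer):
    """Run one generator, any number of transformers, one serializer."""
    document = generator(octets)
    for transformer in transformers:
        document = transformer(document)
    return serializer(document)

result = run_pipeline(b"<doc><p>hello</p></doc>",
                      generate, [uppercase_text], serialize)
```

Swapping generate and serialize in the call would fail immediately, since each only accepts the representation the other produces.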

I think it's time to challenge this concept.

                            - o -

What are sitemap resources?
---------------------------

Let me tell you: they are a mistake, a mistake I made trying to reduce
the sitemap verbosity and fix a problem that had not yet emerged at
that time. Early optimization is the root of all evil and I see that
now: resources overlap with pipelines.

Let me show you why. Consider this sitemap snippet:

 <sitemap>
  <resources>
   <resource name="blah">
    <generate ../>
    <transform ../>
    <serialize ../>
   </resource>
  </resources>
 
  <pipelines>
   <pipeline internal-only="true">
    <match pattern="*">
     <call resource="blah"/>
    </match>
   </pipeline>
  </pipelines>
 </sitemap>

and now this

 <sitemap>
  <pipelines>
   <pipeline name="blah">
    <generate ../>
    <transform ../>
    <serialize ../>
   </pipeline>

   <pipeline>
    <match pattern="*">
     <call pipeline="blah"/>
    </match>
   </pipeline>
  </pipelines>
 </sitemap>

which one is more semantically consistent? Can you say "named XSLT
templates"?

Composing pipelines
-------------------

Let me assume the above syntax gets introduced. At this point, we have
four different ways to call a pipeline:

 - as a pipeline
 - as a generator
 - as a transformer
 - as a serializer

let me write the code so you understand what I mean:

[using a pipeline as a pipeline] (as today)

   <pipeline>
    <match pattern="*">
     <call pipeline="blah"/>
    </match>
   </pipeline>

nothing fancy here. Used mainly for verbosity reduction when the same
pipeline is used in different places.

[using a pipeline as a generator]

   <pipeline>
    <match pattern="*">
     <call pipeline="blah"/>
     <transform .../>
     <serialize ../>
    </match>
   </pipeline>

in this case, the 'serializer' of the called pipeline is not used and
the output of the last transformer of the named pipeline is connected
to the input of the transformer right after the call.

This is equivalent to *overloading* the serializer of the called
pipeline with the rest of the pipeline in place.

[using a pipeline as a transformer]

   <pipeline>
    <match pattern="*">
     <generate ../>
     <call pipeline="blah"/>
     <serialize ../>
    </match>
   </pipeline>

where both the generator and the serializer of the named pipeline are
not used.

This is equivalent to *overloading* both the generator and the
serializer of the called pipeline with the rest of the pipeline in
place.

[using a pipeline as a serializer]

   <pipeline>
    <match pattern="*">
     <generate ../>
     <transform ../>
     <call pipeline="blah"/>
    </match>
   </pipeline>

where the generator of the named pipeline is not used.

This is equivalent to *overloading* the generator of the called pipeline
with the rest of the pipeline in place.
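
The four cases can be sketched as a small position-based expansion. This is only an illustration in Python, not how the sitemap processor is implemented: the (kind, argument) pair model, the NAMED table and the expand function are all assumptions made for the example.

```python
# Hypothetical model: a pipeline body is a list of (kind, argument) pairs.
NAMED = {
    "blah": [("generate", "gen"), ("transform", "xsl"), ("serialize", "html")],
}

def expand(body, named=NAMED):
    """Splice called pipelines into the caller, dropping overloaded parts."""
    out = []
    for index, (kind, arg) in enumerate(body):
        if kind != "call":
            out.append((kind, arg))
            continue
        has_before = index > 0              # caller already feeds the call
        has_after = index < len(body) - 1   # caller continues after the call
        for step in named[arg]:
            if step[0] == "generate" and has_before:
                continue  # generator overloaded by the caller
            if step[0] == "serialize" and has_after:
                continue  # serializer overloaded by the caller
            out.append(step)
    return out

as_pipeline = expand([("call", "blah")])
as_generator = expand([("call", "blah"), ("transform", "t"), ("serialize", "s")])
as_transformer = expand([("generate", "g"), ("call", "blah"), ("serialize", "s")])
as_serializer = expand([("generate", "g"), ("transform", "t"), ("call", "blah")])
```

The role of the called pipeline is inferred purely from where the call sits inside one match body, which is the implicit behavior proposed here.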

                               - o -

So, here is what I propose:

 - add the 'pipeline' attribute to 'map:call'
 - add the 'name' attribute to 'map:pipeline'
 - deprecate the 'map:resources' element
 - deprecate 'internal-only' attribute of 'map:pipeline' 
   [because named pipelines become implicitly internal-only]
 - allow 'map:call' to be executed in any place, performing the pipeline
overloading behavior I explained above.

What do you think?  

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] reconsidering pipeline semantics

Posted by Peter Royal <pr...@apache.org>.

On Tuesday 02 July 2002 08:54 am, Stefano Mazzocchi wrote:
> So, here is what I propose:
>
>  - add the 'pipeline' attribute to 'map:call'
>  - add the 'name' attribute to 'map:pipeline'
>  - deprecate the 'map:resources' element
>  - deprecate 'internal-only' attribute of 'map:pipeline'
>    [because named pipelines become implicitly internal-only]
>  - allow 'map:call' to be executed in any place, performing the pipeline
> overloading behavior I explained above.
>
> What do you think?

So to restate to make sure I got what you said:

 * Resources will go away
 * map:call will now call a named pipeline

and the most important change:

 * map:call now returns to the calling pipeline if the called pipeline does 
not have a serializer.

If so, +1!

I have been abusing resources badly to achieve something like the above.
-pete

-- 
peter royal -> proyal@apache.org


Re: [RT] reconsidering pipeline semantics

Posted by Nicola Ken Barozzi <ni...@apache.org>.


Stefano Mazzocchi wrote:
> Sylvain Wallez wrote:
 >
>>Consider the same sitemap with just
>>an additional "what" attribute to <import> :
>>
>><sitemap>
>>  <pipeline name="doc2html">
>>    <generate src="{doc}.xml"/>
>>    <transform src="doc2html.xsl"/>
>>    <serialize type="html"/>
>>  </pipeline>
>>
>>  <pipeline>
>>    <match pattern="faq.html">
>>      <generate src="faq.xml"/>
>>      <transform src="faq2doc.xsl"/>
>>      <import name="doc2html" what="transform, serialize">
>>        <parameter name="doc" value="dummy"/>
>>      </import>
>>    </match>
>>
>>    <match pattern="*.html">
>>      <import name="doc2html" what="all">
>>        <parameter name="doc" value="{1}"/>
>>      </import>
>>    </match>
>>  </pipeline>
>></sitemap>
>>
>>I don't pretend this notation is nice, but it carries the required
>>information : this simple attribute has removed all ambiguities, and we
>>are able to know exactly what will happen.
>>
>>Has this triggered something in your synapses ?
> 
> 
> No, not really :/

Let me see if my neurons collaborate ;-)
I /think/ that Sylvain wants to convey the same feeling I have.

Pipelines contain Generators, Transformers, Serializers.

  -1 Generator.
  -* Transformers.
  -1 Serializer.

If you add things to a pipeline, it would naturally be assumed that you 
add Transformers, right?

Bzzz... wrong.

It could be that your stream is Serialized, reGenerated somehow, and 
processing continues.
Or that the serialized content is XMLed somehow, bypassing the second 
Generation phase, and retransformed.

What Sylvain says:
 >>      <import name="doc2html" what="transform, serialize">
 >>        <parameter name="doc" value="dummy"/>
 >>      </import>

This explicitly says to import only the transformer and the serializer.

Resource pipelines were thought of as commonly used endpoints.
Named pipelines are seen more as reusable Transformer snippets.
This creates some confusion.

We could just decide that pipelines imported at the beginning give 
Generator+Transformers but no Serializer.
In the center only the Transformers.
At the end Transformer + Serializer.

ie:

<sitemap>
   <pipeline name="doc2html">
     <generate src="{doc}.xml"/>
     <transform src="doc2html.xsl"/>
     <serialize type="html"/>
   </pipeline>

   <pipeline>
     <match pattern="faq.html">
       <import name="doc2html"> (*implicitly generator+transformer*)
         <parameter name="doc" value="dummy"/>
       </import>
       <transform src="faq2doc.xsl"/>
       <serialize type="html"/>
     </match>

     <match pattern="*.html">
       <import name="doc2html"> (*implicitly all*)
         <parameter name="doc" value="{1}"/>
       </import>
     </match>

     <match pattern="faq.html">
       <generate src="faq.xml"/>
       <transform src="faq2doc.xsl"/>
       <import name="doc2html"> (*implicitly transformer+serializer*)
         <parameter name="doc" value="dummy"/>
       </import>
     </match>
    </pipeline>
</sitemap>


What could make all of this declaration more explicit is creating 
something like:

   <transformer-pipeline name="doc2html">
     <transform src="doc2html.xsl"/>
   </transformer-pipeline>

   <generator-pipeline name="doc2html-G">
     <generate src="{doc}.xml"/>
     <import-transformer-pipeline name="doc2html">
        <parameter name="doc" value="dummy"/>
     </import-transformer-pipeline>
   </generator-pipeline>

   <serializer-pipeline name="doc2html-S">
     <import-transformer-pipeline name="doc2html">
        <parameter name="doc" value="dummy"/>
     </import-transformer-pipeline>
     <serialize type="html"/>
   </serializer-pipeline>

and use:

   <pipeline>
     <match pattern="faq.html">
       <import-generator name="doc2html-G">
         <parameter name="doc" value="dummy"/>
       </import-generator>
       <transform src="faq2doc.xsl"/>
       <serialize type="html"/>
     </match>

   etc...

    </pipeline>

But it really gets verbose...

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------



Re: [RT] reconsidering pipeline semantics

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Stefano Mazzocchi wrote:

>:-??? 
>
>sorry, probably I'm stupid and slow, but I still don't get it
>

Don't say that, as we all know you are neither stupid nor slow !

Ok, I give up on this for now, but it will certainly resurface when 
we make this real...

>>Has this triggered something in your synapses ?
>>    
>>
>
>No, not really :/
>
>  
>
Too bad :(

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org





Re: [RT] reconsidering pipeline semantics

Posted by Stefano Mazzocchi <st...@apache.org>.

Sylvain Wallez wrote:

> >Unfortunately, this requires a different attribute, something like
> >
> > <pipeline exposure="(public|internal|block)">
> >
> >
> 
> Interesting thought. Let's keep it for when we will discuss the
> security-related topics of block.

Ok.

> <snip/>
> 
> >For example, we could have done:
> >
> > <map:generate type="aggregate">
> >
> >instead of
> >
> > <map:aggregate ...>
> >
> >but then we would have left the user the ability to 'remove'
> >functionality from the sitemap.
> >
> >This is why I prefer to encode proper sitemap functionality in its
> >semantics, even if, implementation-wise, it would be equivalent to reuse
> >the existing component infrastructure.
> >
> >Do you see my point?
> >
> >
> 
> Yes, and I agree : visible semantic is better than a magical
> implementation of another one.

Cool.

> <snip/>
> 
> >Admittedly, this is very random thinking, but at least it will trigger
> >something in your synapses (hopefully :)
> >
> >
> 
> Yes ! But what it triggered isn't exactly what you may have been waiting
> for : I'm pretty sure now we both have the same understanding of this
> pipeline inheritance and overloading stuff, but each of us re-used
> existing but different terms with their associated meanings, leading to
> this confusion.

All right.

> However, you still did not catch what I want to explain. Let's give it
> another try with your new <import>, which has no previous history,
> unlike <call> or <transform>.

Ok, great.
 
> I wish we could join in front of a whiteboard, this would be much easier :)

Ah, no kidding :)
 
> What I (desperately ;) try to show you is that <import> without
> explicitly saying what is to be imported from the target pipeline is
> sometimes ambiguous in a _real_ sitemap, containing several matchers
> building several different pipelines.
> 
> Let's consider a skin pipeline that renders an xdoc as an html page.
> This pipeline requires a "doc" parameter which should be passed by the
> caller (let's not argue about argument passing, as this is yet another
> problem!) :
> <pipeline name="doc2html">
>   <generate src="{doc}.xml"/>
>   <transform src="doc2html.xsl"/>
>   <serialize type="html"/>
> </pipeline>
> 
> Now let's consider the typical doc generation statements :
> <match pattern="*.html">
>   <import name="doc2html">
>     <parameter name="doc" value="{1}"/>
>   </import>
> </match>
> 
> No problem here : we import the full "doc2html" pipeline.
> 
> Now, let's consider the case of the FAQ which has a special DTD :
> <match pattern="faq.html">
>   <generate src="faq.xml"/>
>   <transform src="faq2doc.xsl"/>
>   <import name="doc2html">
>     <parameter name="doc" value="dummy"/>
>   </import>
> </match>
> 
> We generate our FAQ, transform it into an xdoc and import the
> transformation and serialization of doc2html. The "doc" parameter, used
> by the generator, is actually useless and could even have been avoided.
> 
> Now let's assemble all this in a sitemap : the internal "doc2html" and
> the externally visible pipeline, with "faq.html" matcher first, since it
> is more specific than "*.html" :
> <sitemap>
>   <pipeline name="doc2html">
>     <generate src="{doc}.xml"/>
>     <transform src="doc2html.xsl"/>
>     <serialize type="html"/>
>   </pipeline>
> 
>   <pipeline>
>     <match pattern="faq.html">
>       <generate src="faq.xml"/>
>       <transform src="faq2doc.xsl"/>
>       <import name="doc2html">
>         <parameter name="doc" value="dummy"/>
>       </import>
>     </match>
> 
>     <match pattern="*.html">
>       <import name="doc2html">
>         <parameter name="doc" value="{1}"/>
>       </import>
>     </match>
>   </pipeline>
> 
> </sitemap>
> 
> And here comes the ambiguity that I desperately try to show you : in the
> current sitemap syntax, <serialize> marks the end of the pipeline
> construction, and consequently the end of the processing of sitemap
> statements.
> 
> <import>, on the contrary, can be considered as a generator, a
> transformer or a serializer *depending on the context*. 

Yes, this is correct.

> And the context
> of the two matchers above (assembled in a sitemap) *is different* from
> the context of each matcher considered separately.

:-? I don't get it...
 
> So when we end the <match> for "faq.html", how can we know if pipeline
> construction is finished (with the serializer of "doc2html") or if it
> should continue with the following statements ? The answer is : we
> cannot know.

:-??? 

sorry, probably I'm stupid and slow, but I still don't get it: when it
finds a serializer the sitemap stops. I really don't see the problem
(but since I didn't implement all this stuff, you are seeing something
I'm not)

> And worse, if we continue evaluating statements, the second match
> ("*.html") matches, and we encounter another <import>. What happens here
> ? Since the pipeline being built already has a generator, will we append
> (a second time) the transformer part of "doc2html" ?
> 
> That's why I say a more explicit semantic is required. The calling
> pipeline *must* explicitly say what is to be included and what is to be
> overloaded in the called pipeline.

I keep on not getting it: what is the difference between an explicit
information and an implicit one? The information on *what* to include
and what to overload is taken from the context and the context *is*
specified by where you place the <import> statement.

I'm sorry, but I really can't see your point.

> Consider the same sitemap with just
> an additional "what" attribute to <import> :
> 
> <sitemap>
>   <pipeline name="doc2html">
>     <generate src="{doc}.xml"/>
>     <transform src="doc2html.xsl"/>
>     <serialize type="html"/>
>   </pipeline>
> 
>   <pipeline>
>     <match pattern="faq.html">
>       <generate src="faq.xml"/>
>       <transform src="faq2doc.xsl"/>
>       <import name="doc2html" what="transform, serialize">
>         <parameter name="doc" value="dummy"/>
>       </import>
>     </match>
> 
>     <match pattern="*.html">
>       <import name="doc2html" what="all">
>         <parameter name="doc" value="{1}"/>
>       </import>
>     </match>
>   </pipeline>
> </sitemap>
> 
> I don't pretend this notation is nice, but it carries the required
> information : this simple attribute has removed all ambiguities, and we
> are able to know exactly what will happen.
> 
> Has this triggered something in your synapses ?

No, not really :/

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------




Re: [RT] reconsidering pipeline semantics

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Stefano Mazzocchi wrote:

>Sylvain Wallez wrote:
>  
>
<snip/>

>>Isn't the regular URI-matching enough that you need an additional naming
>>scheme ?
>>    
>>
>
>No, probably not, but in that case, we need another pipeline state.
>
> - [no attribute] -> similar to java 'public'
> - internal-only="true" -> similar to java 'private'
> - ?? -> similar to java 'protected'
>
>where the pipeline is accessible from the inside and from other blocks,
>but not exposed to the public URI space.
>
>Unfortunately, this requires a different attribute, something like
>
> <pipeline exposure="(public|internal|block)">
>  
>

Interesting thought. Let's keep it for when we will discuss the 
security-related topics of block.

<snip/>

>For example, we could have done:
>
> <map:generate type="aggregate">
>
>instead of
> 
> <map:aggregate ...>
>
>but then we would have left the user the ability to 'remove'
>functionality from the sitemap.
>
>This is why I prefer to encode proper sitemap functionality in its
>semantics, even if, implementation-wise, it would be equivalent to reuse
>the existing component infrastructure.
>
>Do you see my point?
>  
>

Yes, and I agree : visible semantic is better than a magical 
implementation of another one.

<snip/>

>Admittedly, this is very random thinking, but at least it will trigger
>something in your synapses (hopefully :)
>  
>

Yes ! But what it triggered isn't exactly what you may have been waiting 
for : I'm pretty sure now we both have the same understanding of this 
pipeline inheritance and overloading stuff, but each of us re-used 
existing but different terms with their associated meanings, leading to 
this confusion.

However, you still did not catch what I want to explain. Let's give it 
another try with your new <import>, which has no previous history, 
unlike <call> or <transform>.

I wish we could join in front of a whiteboard, this would be much easier :)

What I (desperately ;) try to show you is that <import> without 
explicitly saying what is to be imported from the target pipeline is 
sometimes ambiguous in a _real_ sitemap, containing several matchers 
building several different pipelines.

Let's consider a skin pipeline that renders an xdoc as an html page. 
This pipeline requires a "doc" parameter which should be passed by the 
caller (let's not argue about argument passing, as this is yet another 
problem!) :
<pipeline name="doc2html">
  <generate src="{doc}.xml"/>
  <transform src="doc2html.xsl"/>
  <serialize type="html"/>
</pipeline>

Now let's consider the typical doc generation statements :
<match pattern="*.html">
  <import name="doc2html">
    <parameter name="doc" value="{1}"/>
  </import>
</match>

No problem here : we import the full "doc2html" pipeline.

Now, let's consider the case of the FAQ which has a special DTD :
<match pattern="faq.html">
  <generate src="faq.xml"/>
  <transform src="faq2doc.xsl"/>
  <import name="doc2html">
    <parameter name="doc" value="dummy"/>
  </import>
</match>

We generate our FAQ, transform it into an xdoc and import the 
transformation and serialization of doc2html. The "doc" parameter, used 
by the generator, is actually useless and could even have been avoided.

Now let's assemble all this in a sitemap : the internal "doc2html" and 
the externally visible pipeline, with "faq.html" matcher first, since it 
is more specific than "*.html" :
<sitemap>
  <pipeline name="doc2html">
    <generate src="{doc}.xml"/>
    <transform src="doc2html.xsl"/>
    <serialize type="html"/>
  </pipeline>

  <pipeline>
    <match pattern="faq.html">
      <generate src="faq.xml"/>
      <transform src="faq2doc.xsl"/>
      <import name="doc2html">
        <parameter name="doc" value="dummy"/>
      </import>
    </match>

    <match pattern="*.html">
      <import name="doc2html">
        <parameter name="doc" value="{1}"/>
      </import>
    </match>
  </pipeline>

</sitemap>

And here comes the ambiguity that I desperately try to show you : in the 
current sitemap syntax, <serialize> marks the end of the pipeline 
construction, and consequently the end of the processing of sitemap 
statements.

<import>, on the contrary, can be considered as a generator, a 
transformer or a serializer *depending on the context*. And the context 
of the two matchers above (assembled in a sitemap) *is different* from 
the context of each matcher considered separately.

So when we end the <match> for "faq.html", how can we know if pipeline 
construction is finished (with the serializer of "doc2html") or if it 
should continue with the following statements ? The answer is : we 
cannot know.

And worse, if we continue evaluating statements, the second match 
("*.html") matches, and we encounter another <import>. What happens here 
? Since the pipeline being built already has a generator, will we append 
(a second time) the transformer part of "doc2html" ?
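
The two readings can be made concrete with a toy evaluation loop. This is a sketch under stated assumptions, not the real sitemap processor: SITEMAP, build and the import_terminates flag are hypothetical names, and glob matching via Python's fnmatch stands in for the wildcard matcher.

```python
import fnmatch

# Hypothetical model of the assembled sitemap: (pattern, statements) pairs.
SITEMAP = [
    ("faq.html", [("generate", "faq.xml"),
                  ("transform", "faq2doc.xsl"),
                  ("import", "doc2html")]),
    ("*.html",   [("import", "doc2html")]),
]

def build(uri, import_terminates):
    """Return the statements visited while building the pipeline for uri."""
    visited = []
    for pattern, body in SITEMAP:
        if not fnmatch.fnmatchcase(uri, pattern):
            continue
        for kind, arg in body:
            visited.append((kind, arg))
            # Only an explicit serializer is known to end construction;
            # whether an import does is exactly the open question.
            if kind == "serialize" or (kind == "import" and import_terminates):
                return visited
    return visited

stops_at_import = build("faq.html", import_terminates=True)
falls_through = build("faq.html", import_terminates=False)
```

With import_terminates=False the second matcher is also entered for "faq.html" and a second import of "doc2html" is visited, which is the situation described above.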

That's why I say a more explicit semantic is required. The calling 
pipeline *must* explicitly say what is to be included and what is to be 
overloaded in the called pipeline. Consider the same sitemap with just 
an additional "what" attribute to <import> :

<sitemap>
  <pipeline name="doc2html">
    <generate src="{doc}.xml"/>
    <transform src="doc2html.xsl"/>
    <serialize type="html"/>
  </pipeline>

  <pipeline>
    <match pattern="faq.html">
      <generate src="faq.xml"/>
      <transform src="faq2doc.xsl"/>
      <import name="doc2html" what="transform, serialize">
        <parameter name="doc" value="dummy"/>
      </import>
    </match>

    <match pattern="*.html">
      <import name="doc2html" what="all">
        <parameter name="doc" value="{1}"/>
      </import>
    </match>
  </pipeline>
</sitemap>

I don't pretend this notation is nice, but it carries the required 
information : this simple attribute has removed all ambiguities, and we 
are able to know exactly what will happen.
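
A minimal sketch of what an explicit "what" attribute could mean, in Python. The select_parts function and the (kind, argument) representation are assumptions for illustration; only the attribute values "all" and "transform, serialize" come from the example above.

```python
# The named pipeline from the example, as hypothetical (kind, argument) pairs.
DOC2HTML = [
    ("generate", "{doc}.xml"),
    ("transform", "doc2html.xsl"),
    ("serialize", "html"),
]

def select_parts(steps, what):
    """Keep only the parts of a named pipeline listed in 'what'."""
    if what.strip() == "all":
        return list(steps)
    wanted = {name.strip() for name in what.split(",")}
    return [step for step in steps if step[0] in wanted]

faq_tail = select_parts(DOC2HTML, "transform, serialize")
full = select_parts(DOC2HTML, "all")
```

Because the selection no longer depends on surrounding statements, each matcher can be read on its own.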

Has this triggered something in your synapses ?

;-)

Sylvain

-- 
Sylvain Wallez
 Anyware Technologies                  Apache Cocoon
 http://www.anyware-tech.com           mailto:sylvain@apache.org


Re: [RT] reconsidering pipeline semantics

Posted by Stefano Mazzocchi <st...@apache.org>.

Sylvain Wallez wrote:

> >I've shown how I perceive named pipelines not as 'functions' to call,
> >but as 'small internal URI spaces' (sort of small internal virtual
> >hosts), I think this concept is *much* more powerful than the
> >'resources' we have now and it's much more block-friendly.
> >
> 
> If they are "small internal URI spaces", is it necessary to name them ?

good point.

> Isn't the regular URI-matching enough that you need an additional naming
> scheme ?

No, probably not, but in that case, we need another pipeline state.

 - [no attribute] -> similar to java 'public'
 - internal-only="true" -> similar to java 'private'
 - ?? -> similar to java 'protected'

where the pipeline is accessible from the inside and from other blocks,
but not exposed to the public URI space.

Unfortunately, this requires a different attribute, something like

 <pipeline exposure="(public|internal|block)">

where exposure="public" is implicitly considered if the attribute is not
found and 'internal-only="true"' gets deprecated and falls back to
'exposure="internal"'.

NOTE: the attribute name 'exposure' is not mandated... it's just the
first that popped into my mind.
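
The fallback rule can be sketched as follows. This is a Python illustration only: resolve_exposure is a hypothetical helper, and letting an explicit 'exposure' win over 'internal-only' when both are present is an assumption not stated above.

```python
VALID_EXPOSURES = ("public", "internal", "block")

def resolve_exposure(attrs):
    """Map <pipeline> attributes to one of the three exposure levels."""
    if "exposure" in attrs:
        value = attrs["exposure"]
        if value not in VALID_EXPOSURES:
            raise ValueError(f"unknown exposure: {value!r}")
        return value
    # Deprecated spelling falls back to the new attribute value.
    if attrs.get("internal-only") == "true":
        return "internal"
    # No attribute at all: implicitly public.
    return "public"
```

For example, resolve_exposure({"internal-only": "true"}) gives "internal", while an empty attribute set gives "public".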

> This feeling came from a sitemap reader/writer point of view. Consider
> the following :
> 
> <map:pipeline name="blah">
>   <map:generate src="dummy.xml"/>
>   <map:transform src="style.xsl"/>
>   <map:serialize/>
> </map:pipeline>
> 
> <map:pipeline>
>   <map:match pattern="explicit.html">
>     <map:generate src="explicit.xml"/>
>     <map:call pipeline="blah"/>
>   </map:match>
> 
>   <map:match pattern="*.html">
>     <map:generate src="data/{1}.xml"/>
>     <map:transform src="xml2htm.xsl"/>
>     <map:serialize/>
>   </map:match>
> 
> </map:pipeline>
> 
> What will happen when "explicit.html" is requested ? Please consider it
> as the user wanting to understand what happens to its data :
> - match "explicit.html" : matches, so we go in
> - generate "explicit.xml" : added to the pipeline.
> - "blah" pipeline is called
> - generate in "blah" : won't be used as the caller already defined one.
> - transform in "blah" : added to the pipeline.
> - serialize in "blah" : added, but can potentially be overloaded by what
> will happen later in the caller.
> - return to the caller
> - match "*.html" : also matches, so we go in also !!
> - generate "data/explicit.xml"...
> ==> What happens here ? <==

Gotcha! Your explanation of the concept shows that I was not able to
show you what I mean and I do see why now.

Let me try again (see below)

> Does this implicitly mean the "explicit.html" pipeline is finished and
> can be processed ? Or does this mean the sitemap is erroneous because we
> want to add a generator when there's already one ?
> 
> This also means that we can't reliably define what will happen when we
> close the <map:match> for "explicit.html", since this depends on what is
> written after.
> 
> That's why I'm not comfortable with implicit overloading (even less
> after writing this example) and would prefer a more explicit writing _in
> the caller_ of what we want to use from the called pipeline.

 
> >>Consider serializer overloading. The current sitemap definition says
> >>that a pipeline is terminated when a <map:serialize> or <map:read> is
> >>encountered. With the implicit overloading semantic, this rule is no
> >>longer valid as the calling pipeline _may_ or _may not_ define another
> >>serializer. And as any <map:serialize> that's present _below_ the
> >><map:call> can theoretically terminate the pipeline, this means that
> >>knowing if the called pipeline serializer is overloaded requires
> >>traversal of the entire remaining part of the sitemap, even if all
> >>remaining serializers are enclosed in <map:match> that will never match.
> >>
> >>
> >
> >Hmmm, wait, I don't get it. From an implementation point of view, I
> >don't see how having explicit or implicit overloading makes any
> >difference.
> >
> >For example, what is the difference between
> >
> > <generate .../>
> > <transform type="pipeline" src="/blah"/>
> > <serialize .../>
> >
> >and
> >
> > <generate .../>
> > <call pipeline="blah"/>
> > <serialize .../>
> >
> >???
> >
> >
> 
> Does my detailed example above show more differences ?
> 
> >The pipelines are exactly the same (they could have tons of matchers and
> >selectors and so on) and the resulting behavior is *exactly* the same.
> >There are only two differences:
> >
> > - your example places a sitemap behavior inside a sitemap component.
> >This appears to me an evident example of the use of the
> >overcomponentization anti-pattern.
> >
> 
> Why is it overcomponentization ? There are no new component contracts
> defined here, but just the implementation of existing contracts. The
> "pipeline" transformer is just a composite one since it uses
> transformers defined by the pipeline.

Yes, but you are moving a functional feature of the sitemap into a
component.

For example, we could have done:

 <map:generate type="aggregate">

instead of
 
 <map:aggregate ...>

but then we would have left the user the ability to 'remove'
functionality from the sitemap.

This is why I prefer to encode proper sitemap functionality in its
semantics, even if, implementation-wise, it would be equivalent to reuse
the existing component infrastructure.

Do you see my point?

> > - you can't reuse the 'blah' pipeline in another location. So it is
> >possible that you call a pipeline fragment which is wrongly terminated.
> >With implicit overloading, the pipeline is always correctly terminated.
> >
> >
> 
> Why can't it be reused ? A pipeline always starts with a generator and
> ends with a serializer (not considering readers here).
> 
> What the "pipeline" transformer does is build a pipeline, which must
> follow the same rules as any pipeline, and use only its transformers
> (generator and serializer are ignored). Similarly, the "pipeline"
> generator will build the pipeline and use its generator and
> transformers. And so on for the serializer.

Ok, there is a big impedance mismatch between your view of the concept
and mine and I think it's my fault.

My vision of pipeline overloading is triggered by the concept of
functional inheritance. Suppose you have a pipeline like this (named
"A")

 [A] gA -> t1A -> t2A -> sA

this is shipped and exposed by a major cocoon block. If this pipeline is
*exactly* the way I want it to be, I call it and that's it. But what if
I want to *extend* it? 

All right, you could copy it over to your own pipeline and continue, but
then you are forking. You are not using Pipeline A anymore, but your own
local copy, which, from now on, you'll have to maintain.

Following the OO patterns, I thought about using pipeline overloading to
perform pipeline inheritance. Admittedly, the use of the <map:call> tag
led to misunderstanding, so I'll not use it here again.

Now, my own pipeline is something like this

 [B] gB -> t1B -> t2B -> sB

and I see these meaningful ways of extending the [A] pipeline with the
[B] pipeline:

 - 'gB' := 'gA -> t1A -> t2A'
 - 't1B' := 't1A -> t2A'
 - 't2B' := 't1A -> t2A'
 - 'sB' := 't1A -> t2A -> sA'

In fact, you are not 'calling' a pipeline, but assembling an existing
pipeline differently, depending on the overloading information.

More explicitly:

[A] 
 <map:generate type="gA" .../>
 <map:transform type="t1A" .../>
 <map:transform type="t2A" .../>
 <map:serialize type="sA" .../>

then the various possible pipelines become

[B]
 <map:import name="A"/>
 <map:transform type="t1B" .../>
 <map:transform type="t2B" .../>
 <map:serialize type="sB" .../>

which is equivalent to

 <map:generate type="gA" .../>
 <map:transform type="t1A" .../>
 <map:transform type="t2A" .../>
 <map:transform type="t1B" .../>
 <map:transform type="t2B" .../>
 <map:serialize type="sB" .../>

or

[B]
 <map:generate type="gB" .../>
 <map:import name="A"/>
 <map:transform type="t2B" .../>
 <map:serialize type="sB" .../>

which is equivalent to

 <map:generate type="gB" .../>
 <map:transform type="t1A" .../>
 <map:transform type="t2A" .../>
 <map:transform type="t2B" .../>
 <map:serialize type="sB" .../>

or

[B]
 <map:generate type="gB" .../>
 <map:transform type="t1B" .../>
 <map:import name="A"/>
 <map:serialize type="sB" .../>

which is equivalent to

 <map:generate type="gB" .../>
 <map:transform type="t1B" .../>
 <map:transform type="t1A" .../>
 <map:transform type="t2A" .../>
 <map:serialize type="sB" .../>

or

[B]
 <map:generate type="gB" .../>
 <map:transform type="t1B" .../>
 <map:transform type="t2B" .../>
 <map:import name="A"/>

which is equivalent to

 <map:generate type="gB" .../>
 <map:transform type="t1B" .../>
 <map:transform type="t2B" .../>
 <map:transform type="t1A" .../>
 <map:transform type="t2A" .../>
 <map:serialize type="sA" .../>
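
The import rules listed earlier ('gB' := 'gA -> t1A -> t2A' and so on)
can be modelled in a few lines. What follows is only a sketch in Python,
not Cocoon code: the "import:" marker, the Pipeline class and the expand()
function are hypothetical names invented for the illustration, and the
position-based rule (first step takes the generator, last step takes the
serializer) is one reading of the list above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical marker standing in for <map:import name="..."/>.
IMPORT_PREFIX = "import:"


@dataclass(frozen=True)
class Pipeline:
    generator: str
    transformers: Tuple[str, ...]
    serializer: str


def expand(steps: List[str], library: Dict[str, Pipeline]) -> List[str]:
    """Replace each import step with the parts of the imported pipeline
    that fit its position in the importing pipeline."""
    expanded: List[str] = []
    last = len(steps) - 1
    for position, step in enumerate(steps):
        if not step.startswith(IMPORT_PREFIX):
            expanded.append(step)
            continue
        imported = library[step[len(IMPORT_PREFIX):]]
        if position == 0:
            # Import in generator position: keep the imported generator.
            expanded.append(imported.generator)
        # The imported transformers are always kept.
        expanded.extend(imported.transformers)
        if position == last:
            # Import in serializer position: keep the imported serializer.
            expanded.append(imported.serializer)
    return expanded


LIBRARY = {"A": Pipeline("gA", ("t1A", "t2A"), "sA")}

print(expand(["import:A", "t1B", "t2B", "sB"], LIBRARY))
# ['gA', 't1A', 't2A', 't1B', 't2B', 'sB']
print(expand(["gB", "t1B", "t2B", "import:A"], LIBRARY))
# ['gB', 't1B', 't2B', 't1A', 't2A', 'sA']
```

Under this reading nothing is "called" at run time: the imported pipeline
is flattened into the importing one before any component is executed.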

                                 - o -

While I admit the <call> semantic implied a jump and an execution of
another pipeline, here, we are simply specifying how a pipeline can make
use of another one in a more granular way.

NOTE: there is a functional overlap between

 <generate src="cocoon:/blah">

and 

 <import pipeline="/blah">

but I don't think this is harmful.

NOTE2: there is *NO* functional overlap between

 <transform src="cocoon:/blah">

and

 <generate src="..."/>
 <import pipeline="blah"/>

since while in the first the src of the transformer is the output of the
'blah' pipeline, in the second it is *the* 'blah' pipeline (except the
generator which is overloaded) which gets appended to the pipeline.

                                 - o -

Admittedly, this is very random thinking, but at least it will trigger
something in your synapses (hopefully :)

Ciao

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------



---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] reconsidering pipeline semantics

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Sorry Stefano, I missed your mail in the flood yesterday.

Stefano Mazzocchi wrote:

>Sylvain Wallez wrote:
>
>>Although it is ok to call named pipelines _inside_ a sitemap (that's
>>just a name change for resources), I don't like it for _inter sitemap_
>>calls, like can or will be the case for subsitemaps and blocks : up to
>>now, the input contract of the sitemap is the environment, and pipeline
>>choice is most often directed by the request URI. Does calling named
>>pipelines mean you want to add a new property to the environment, just
>>as the view and action we have today ?
>>    
>>
>
>The design should not be constrained by implementation details, Sylvain.
>Try not to think about how to make it work for now.
>  
>

Is the environment an implementation detail ? Maybe...

>I've shown how I perceive named pipelines not as 'functions' to call,
>but as 'small internal URI spaces' (sort of small internal virtual
>hosts), I think this concept is *much* more powerful than the
>'resources' we have now and it's much more block-friendly.
>

If they are "small internal URI spaces", is it necessary to name them?
Isn't regular URI matching enough, without an additional naming
scheme?

>>IMO, the called pipeline should be defined by an URI, just as what we
>>already use for the "cocoon:" pseudo-protocol. This wouldn't introduce
>>yet-another naming scheme and would keep the existing sitemap contract.
>>    
>>
>
>Yes, I do see the danger of adding another protocol for something that
>is equivalent in functionality, but I have the feeling that 'pipeline:'
>could slowly deprecate 'cocoon:' which IMO, would be a good thing in the
>longer term.
>
>[of course, I *DO*NOT* want to deprecate cocoon: , just to show
>different and more rational ways of doing things]
>
>The Cocoon: protocol is not block-friendly and we can't make it so
>without breaking back compatibility. Also, the name is too cocoon
>specific.
>  
>

You're right: it's difficult to be more cocoon-specific than "cocoon" ;-)

>>Of course, we must keep today's resources as "named pipeline snippets"
>>inside a single sitemap. To answer Peter's request, we can allow a
>>resource to be not terminated (i.e. not contain a serializer). I even
>>think the treeprocessor already handles this (needs to be verified, though).
>>    
>>
>
>  
>
>>                              --o0o--
>>
>>The second point I'm not comfortable with is implicit overloading. I
>>have the feeling the associated behaviour will be difficult to predict
>>and will make the sitemap hard to read by requiring lots of "look-ahead".
>>    
>>
>
>Again, it's the implementation bias that leads your design judgement.
>  
>

This feeling came from a sitemap reader/writer point of view. Consider 
the following :

<map:pipeline name="blah">
  <map:generate src="dummy.xml"/>
  <map:transform src="style.xsl"/>
  <map:serialize/>
</map:pipeline>

<map:pipeline>
  <map:match pattern="explicit.html">
    <map:generate src="explicit.xml"/>
    <map:call pipeline="blah"/>
  </map:match>

  <map:match pattern="*.html">
    <map:generate src="data/{1}.xml"/>
    <map:transform src="xml2htm.xsl"/>
    <map:serialize/>
  </map:match>

</map:pipeline>

What will happen when "explicit.html" is requested? Please look at it
as a user who wants to understand what happens to their data:
- match "explicit.html" : matches, so we go in
- generate "explicit.xml" : added to the pipeline.
- "blah" pipeline is called
- generate in "blah" : won't be used as the caller already defined one.
- transform in "blah" : added to the pipeline.
- serialize in "blah" : added, but can potentially be overloaded by what 
will happen later in the caller.
- return to the caller
- match "*.html" : also matches, so we go in also !!
- generate "data/explicit.xml"...
==> What happens here ? <==

Does this implicitly mean the "explicit.html" pipeline is finished and 
can be processed ? Or does this mean the sitemap is erroneous because we 
want to add a generator when there's already one ?

This also means that we can't reliably define what will happen when we 
close the <map:match> for "explicit.html", since this depends on what is 
written after.

That's why I'm not comfortable with implicit overloading (even less 
after writing this example) and would prefer a more explicit writing _in 
the caller_ of what we want to use from the called pipeline.
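
The walk-through above can be replayed as a small simulation. This is a
sketch in Python under loud assumptions: it is not how the treeprocessor
works, the data layout and the names assemble(), AmbiguousSitemap and
finish_on_match_close are invented for the illustration, "{1}" is left
unsubstituted, and the two behaviours it offers are just the two readings
asked about above (the pipeline ends when the match closes, or the second
generator is an error).

```python
from fnmatch import fnmatch
from typing import List, Tuple

Step = Tuple[str, str]  # (kind, argument), e.g. ("generate", "explicit.xml")


class AmbiguousSitemap(Exception):
    """Raised when a second generator is reached while one is already set."""


# The named "blah" pipeline and the two matchers from the example above.
NAMED_PIPELINES = {
    "blah": [("generate", "dummy.xml"), ("transform", "style.xsl"), ("serialize", "")],
}
SITEMAP = [
    ("explicit.html", [("generate", "explicit.xml"), ("call", "blah")]),
    ("*.html", [("generate", "data/{1}.xml"), ("transform", "xml2htm.xsl"), ("serialize", "")]),
]


def assemble(uri: str, finish_on_match_close: bool) -> List[Step]:
    pipeline: List[Step] = []
    for pattern, steps in SITEMAP:
        if not fnmatch(uri, pattern):
            continue
        for kind, arg in steps:
            called = kind == "call"
            for step in (NAMED_PIPELINES[arg] if called else [(kind, arg)]):
                if step[0] == "generate" and any(k == "generate" for k, _ in pipeline):
                    if called:
                        continue  # implicit overloading: the caller's generator wins
                    raise AmbiguousSitemap(f"second generator {step[1]!r} for {uri!r}")
                pipeline.append(step)
        if finish_on_match_close and pipeline and pipeline[-1][0] == "serialize":
            return pipeline  # reading 1: a serialized pipeline ends when the match closes
    return pipeline  # reading 2 never gets here for "explicit.html"


print(assemble("explicit.html", finish_on_match_close=True))
```

With finish_on_match_close=False the same request raises AmbiguousSitemap
as soon as the "*.html" matcher is entered, which is the situation marked
"What happens here ?" above.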

>>Consider serializer overloading. The current sitemap definition says
>>that a pipeline is terminated when a <map:serialize> or <map:read> is
>>encountered. With the implicit overloading semantic, this rule is no
>>longer valid as the calling pipeline _may_ or _may not_ define another
>>serializer. And as any <map:serialize> that's present _below_ the
>><map:call> can theoretically terminate the pipeline, this means that
>knowing if the called pipeline serializer is overloaded requires
>>traversal of the entire remaining part of the sitemap, even if all
>>remaining serializers are enclosed in <map:match> that will never match.
>>    
>>
>
>Hmmm, wait, I don't get it. From an implementation point of view, I
>don't see how having explicit or implicit overloading makes any
>difference.
>
>For example, what is the difference between
>
> <generate .../>
> <transform type="pipeline" src="/blah"/>
> <serialize .../>
>
>and 
>
> <generate .../>
> <call pipeline="blah"/>
> <serialize .../>
>
>???
>  
>

Does my detailed example above show more differences ?

>The pipelines are exactly the same (they could have tons of matchers and
>selectors and so on) and the resulting behavior is *exactly* the same.
>There are only two differences:
>
> - your example places a sitemap behavior inside a sitemap component.
>This appears to me an evident example of the use of the
>overcomponentization anti-pattern.
>

Why is it overcomponentization? There are no new component contracts 
defined here, but just the implementation of existing contracts. The 
"pipeline" transformer is just a composite one since it uses 
transformers defined by the pipeline.

> - you can't reuse the 'blah' pipeline in another location. So it is
>possible that you call a pipeline fragment which is wrongly terminated.
>With implicit overloading, the pipeline is always correctly terminated.
>  
>

Why can't it be reused ? A pipeline always starts with a generator and 
ends with a serializer (not considering readers here).

What the "pipeline" transformer does is build a pipeline, which must 
follow the same rules as any pipeline, and use only its transformers 
(generator and serializer are ignored). Similarly, the "pipeline" 
generator will build the pipeline and use its generator and 
transformers. And so on for the serializer.

>Comments?
>

A lot, you've read them ;)

Sylvain

-- 
Sylvain Wallez
 Anyware Technologies                  Apache Cocoon
 http://www.anyware-tech.com           mailto:sylvain@apache.org





Re: [RT] reconsidering pipeline semantics

Posted by Stefano Mazzocchi <st...@apache.org>.

Sylvain Wallez wrote:

> Although it is ok to call named pipelines _inside_ a sitemap (that's
> just a name change for resources), I don't like it for _inter sitemap_
> calls, like can or will be the case for subsitemaps and blocks : up to
> now, the input contract of the sitemap is the environment, and pipeline
> choice is most often directed by the request URI. Does calling named
> pipelines mean you want to add a new property to the environment, just
> as the view and action we have today ?

The design should not be constrained by implementation details, Sylvain.
Try not to think about how to make it work for now.

I've shown how I perceive named pipelines not as 'functions' to call,
but as 'small internal URI spaces' (sort of small internal virtual
hosts), I think this concept is *much* more powerful than the
'resources' we have now and it's much more block-friendly.

> IMO, the called pipeline should be defined by an URI, just as what we
> already use for the "cocoon:" pseudo-protocol. This wouldn't introduce
> yet-another naming scheme and would keep the existing sitemap contract.

Yes, I do see the danger of adding another protocol for something that
is equivalent in functionality, but I have the feeling that 'pipeline:'
could slowly deprecate 'cocoon:' which IMO, would be a good thing in the
longer term.

[of course, I *DO*NOT* want to deprecate cocoon: , just to show
different and more rational ways of doing things]

The Cocoon: protocol is not block-friendly and we can't make it so
without breaking back compatibility. Also, the name is too cocoon
specific.

> Of course, we must keep today's resources as "named pipeline snippets"
> inside a single sitemap. To answer Peter's request, we can allow a
> resource to be not terminated (i.e. not contain a serializer). I even
> think the treeprocessor already handles this (needs to be verified, though).

>                               --o0o--
> 
> The second point I'm not comfortable with is implicit overloading. I
> have the feeling the associated behaviour will be difficult to predict
> and will make the sitemap hard to read by requiring lots of "look-ahead".

Again, it's the implementation bias that leads your design judgement.

> Consider serializer overloading. The current sitemap definition says
> that a pipeline is terminated when a <map:serialize> or <map:read> is
> encountered. With the implicit overloading semantic, this rule is no
> longer valid as the calling pipeline _may_ or _may not_ define another
> serializer. And as any <map:serialize> that's present _below_ the
> <map:call> can theoretically terminate the pipeline, this means that
> knowing if the called pipeline serializer is overloaded requires
> traversal of the entire remaining part of the sitemap, even if all
> remaining serializers are enclosed in <map:match> that will never match.

Hmmm, wait, I don't get it. From an implementation point of view, I
don't see how having explicit or implicit overloading makes any
difference.

For example, what is the difference between

 <generate .../>
 <transform type="pipeline" src="/blah"/>
 <serialize .../>

and 

 <generate .../>
 <call pipeline="blah"/>
 <serialize .../>

???

The pipelines are exactly the same (they could have tons of matchers and
selectors and so on) and the resulting behavior is *exactly* the same.
There are only two differences:

 - your example places a sitemap behavior inside a sitemap component.
This appears to me an evident example of the use of the
overcomponentization anti-pattern.

 - you can't reuse the 'blah' pipeline in another location. So it is
possible that you call a pipeline fragment which is wrongly terminated.
With implicit overloading, the pipeline is always correctly terminated.
 
Comments?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------




Re: [RT] reconsidering pipeline semantics

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Stefano Mazzocchi wrote:

>In light of the discussion on blocks, Sylvain pointed out that cocoon
>services should be mapped to pipelines and not to resources directly.
>
>This consideration triggered a few RT that I would like to share with
>you and trigger further discussion.
>
>NOTE: this is nothing related to blocks or flow, but only at the sitemap
>semantics.
>
>                                 - o -
>
>What is a pipeline
>------------------
>
>The first and major architectural contribution that Cocoon brought in
>the web world is the ability to compose web services using the "pipe and
>filters" design pattern. (I'm using 'web services' in the original sense
>of the term: any service that is related to the web)
>
>Cocoon decided to follow an XML-oriented approach to pipelines, forcing
>everything into the XML realm and working on that from there. So, the
>Cocoon's pipelines concept is somewhat an extension to the original GoF
>"pipe and filters" pattern: in fact, the Cocoon pipeline implements both
>'pipe and filters' and 'adaptor' patterns.
>
>Why? well, this comes from the fact that the HTTP protocol is not XML
>oriented (unlike SOAP, for example). So, in order to perform XML piping,
>we need to adapt in and out from the generic octet-stream world.
>
>So, unlike the UNIX pipeline which doesn't need adaptation (since the
>STDIN/OUT streams are all octet-oriented), Cocoon needed to create ways
>to adapt to the rest of the world which is not XML oriented.
>
>For this reason, while a UNIX pipeline is composed like this
>
> input -> filter -[pipe]-> filter -[pipe]-> filter -> output
>
>a cocoon pipeline is composed by
>
> input -> adaptor -[pipe]-> filter -[pipe]-> adaptor -> output
>
>unfortunately, the above picture isn't entirely correct since the two
>adaptors can't be exchanged, thus they are, in fact, different entities:
>the first adapts an octet-based world to an XML-based world, the other
>does the opposite. They are not symmetrical. In Cocoon terminology, the
>first adapter is a generator, the second is a serializer.
>
>We call 'Cocoon pipeline' the collection of all filters (transformers)
>and adapters (generator and serializer) because there cannot be a
>pipeline without adapters.
>
>I think it's time to challenge this concept.
>
>                            - o -
>
>What are sitemap resources?
>---------------------------
>
>Let me tell you: they are a mistake, a mistake I made trying to reduce
>the sitemap verbosity and fix a problem that hadn't yet emerged at
>that time. Early optimization is the root of all evil and I see that
>now: resources overlap with pipelines.
>
>Let me show you why. Consider this sitemap snippet:
>
> <sitemap>
>  <resources>
>   <resource name="blah">
>    <generate ../>
>    <transform ../>
>    <serialize ../>
>   </resource>
>  </resources>
> 
>  <pipelines>
>   <pipeline internal-only="true">
>    <match pattern="*">
>     <call resource="blah"/>
>    </match>
>   </pipeline>
>  </pipelines>
> </sitemap>
>
>and now this
>
> <sitemap>
>  <pipelines>
>   <pipeline name="blah">
>    <generate ../>
>    <transform ../>
>    <serialize ../>
>   </pipeline>
>
>   <pipeline>
>    <match pattern="*">
>     <call pipeline="blah"/>
>    </match>
>   </pipeline>
>  </pipelines>
> </sitemap>
>
>which one is more semantically consistent? Can you say "named XSLT
>templates"?
>
>Composing pipelines
>-------------------
>
>Let me assume the above syntax gets introduced. At this point, we have
>four different ways to call a pipeline:
>
> - as a pipeline
> - as a generator
> - as a transformer
> - as a serializer
>
>let me write the code so you understand what I mean:
>
>[using a pipeline as a pipeline] (as today)
>
>   <pipeline>
>    <match pattern="*">
>     <call pipeline="blah"/>
>    </match>
>   </pipeline>
>
>nothing fancy here. Used mainly for verbosity reduction when the same
>pipeline is used in different places.
>
>[using a pipeline as a generator]
>
>   <pipeline>
>    <match pattern="*">
>     <call pipeline="blah"/>
>     <transform .../>
>     <serialize ../>
>    </match>
>   </pipeline>
>
>in this case, the 'serializer' of the called pipeline is not used and
>the output of the last transformer of the named pipeline is connected
>with the input of the transformer right after the call.
>
>This is equivalent to *overloading* the serializer of the called
>pipeline with the rest of the pipeline in place.
>
>[using a pipeline as a transformer]
>
>   <pipeline>
>    <match pattern="*">
>     <generate ../>
>     <call pipeline="blah"/>
>     <serialize ../>
>    </match>
>   </pipeline>
>
>where both the generator and the serializer of the named pipeline are
>not used.
>
>This is equivalent of *overloading* both the generator and the
>serializer of the called pipeline with the rest of the pipeline in
>place.
>
>[using a pipeline as a serializer]
>
>   <pipeline>
>    <match pattern="*">
>     <generate ../>
>     <transform ../>
>     <call pipeline="blah"/>
>    </match>
>   </pipeline>
>
>where the generator of the named pipeline is not used.
>
>This is equivalent of *overloading* the generator of the called pipeline
>with the rest of the pipeline in place.
>
>                               - o -
>
>So, here is what I propose:
>
> - add the 'pipeline' attribute to 'map:call'
> - add the 'name' attribute to 'map:pipeline'
> - deprecate the 'map:resources' element
> - deprecate 'internal-only' attribute of 'map:pipeline' 
>   [because named pipelines become implicitly internal-only]
> - allow 'map:call' to be executed in any place, performing the pipeline
>overloading behavior I explained above.
>
>What do you think?  
>  
>

Although it is ok to call named pipelines _inside_ a sitemap (that's 
just a name change for resources), I don't like it for _inter sitemap_ 
calls, like can or will be the case for subsitemaps and blocks : up to 
now, the input contract of the sitemap is the environment, and pipeline 
choice is most often directed by the request URI. Does calling named 
pipelines mean you want to add a new property to the environment, just 
as the view and action we have today ?

IMO, the called pipeline should be defined by an URI, just as what we 
already use for the "cocoon:" pseudo-protocol. This wouldn't introduce 
yet-another naming scheme and would keep the existing sitemap contract.

Of course, we must keep today's resources as "named pipeline snippets" 
inside a single sitemap. To answer Peter's request, we can allow a 
resource to be not terminated (i.e. not contain a serializer). I even 
think the treeprocessor already handles this (needs to be verified, though).

                              --o0o--

The second point I'm not comfortable with is implicit overloading. I 
have the feeling the associated behaviour will be difficult to predict 
and will make the sitemap hard to read by requiring lots of "look-ahead".

Consider serializer overloading. The current sitemap definition says 
that a pipeline is terminated when a <map:serialize> or <map:read> is 
encountered. With the implicit overloading semantic, this rule is no 
longer valid as the calling pipeline _may_ or _may not_ define another 
serializer. And as any <map:serialize> that's present _below_ the 
<map:call> can theoretically terminate the pipeline, this means that 
knowing if the called pipeline serializer is overloaded requires 
traversal of the entire remaining part of the sitemap, even if all 
remaining serializers are enclosed in <map:match> that will never match.

So I'm in favor of a more explicit semantic that clearly defines what 
the caller wants to use in the called pipeline. For this, we can use the 
existing <map:generate>, <map:transform> and <map:serialize> :

* use the full pipeline (generator & transformers & serializer). We 
already have it today :
  <map:redirect-to uri="cocoon:/pipeline_uri"/>

* use generator & transformers (ignore serializer). We already have it 
today :
  <map:generate type="file" src="cocoon:/pipeline_uri"/>

* use the transformation part (ignore generator & serializer)
  <map:transform type="pipeline" src="pipeline_uri"/>

* use the transformers & serializers (ignore generator)
  <map:serialize type="pipeline" src="pipeline_uri"/>

The first two notations, although currently in use, may be changed to 
something more consistent with the last ones which are new :
  <map:redirect-to pipeline="pipeline_uri"/>
  <map:generate type="pipeline" src="pipeline_uri"/>

Thoughts ?

Sylvain

-- 
Sylvain Wallez
 Anyware Technologies                  Apache Cocoon
 http://www.anyware-tech.com           mailto:sylvain@apache.org





Re: [RT] reconsidering pipeline semantics

Posted by Michael Melhem <mi...@fztig938.bank.dresdner.net>.

On Tue, Jul 02, 2002 at 02:54:37PM +0200, Stefano Mazzocchi wrote:
> 
> So, here is what I propose:
> 
>  - add the 'pipeline' attribute to 'map:call'
>  - add the 'name' attribute to 'map:pipeline'
>  - deprecate the 'map:resources' element
>  - deprecate 'internal-only' attribute of 'map:pipeline' 
>    [because named pipelines become implicitly internal-only]
>  - allow 'map:call' to be executed in any place, performing the pipeline
> overloading behavior I explained above.
> 
> What do you think?  

Just to add my voice to the chorus of approval....

Yes... It would be nice if map:resources could be deprecated in favour
of "callable" pipelines. IMHO this will reduce sitemap complexity while
at the same time increasing flexibility. Plus, having the ability to call
named pipelines seems almost an intuitive thing.

Regards,
Michael Melhem



Re: [RT] reconsidering pipeline semantics

Posted by "J.Pietschmann" <j3...@yahoo.de>.

Stefano Mazzocchi wrote:
> Right now, the note proposed by Eve Maler and Norm Walsh (long time
> contributors of the document-centric XML/SGML world) is a sort of 'ant
> build file for xml processing'. Indeed powerful, but limited in scope
> and many believe too document-oriented.
> 
> I wouldn't personally wait for such a standard to emerge soon. And even
> if it does, I wouldn't bet it would allow us to replace our sitemap with
> it.

I don't think it would replace Cocoon's sitemap, nor should it.
The sitemap does, in short, three different things
- declare component classes
- declare pipelines, and fill them with components
- declare mappings from URLs to pipelines (or, in general, to
   a processor for the request)
Admittedly, the sitemap also maps parameter values, cookie values
and everything else to pipelines, whether this is a good idea or
not.

General pipelining will only deal with the second, and will
only use a very simplified form of the first, if at all.


Well, back to the thread topic (pipeline semantics).
Let's start with an analogy: you have a set of related XML
documents which contain a lot of personal data.
You have several choices to deal with this:
- Extreme 1: Expand all the personal data wherever it is
   used, for example "contact: J.R.Hacker, Phone 1234567
   and D.A. User, Email dau@fubar.org"
   (use markup as needed)
   Obvious advantage: easy editing of all the docs
   Obvious disadvantage: redundancy, maintenance problems lurking
- Extreme 2: Have one document as a central repository which
   contains all of the personal data and reference them, for
   example:
     <person id="jrh0001">
       <name>
          <cn>J.R.Hacker</cn>
          <firstname>J.</firstname>
          <initial>R.</initial>
          <sn>Hacker</sn>
       </name>
       <contact>
         <phone type="home">1234567</phone>
         <phone type="mobile">99887766</phone>
       </contact>
    </person>
    <person id="dau0231">
       <name>
          <cn>D.A.User</cn>
     ...
   <!-- other doc --> "contact: <personref idref="jrh0001"/> and
    <personref idref="dau0231"/>"
   Obvious advantage: No redundancy, easier maintenance.
   Obvious disadvantage: needs discipline for data entry, needs
   tools to avoid dangling references.
- Middle ground: Use a person element which can contain either a
   full definition or a reference to a full definition:
   "contact: <person><personref idref="jrh0001"/></person> and
    <person><persondef><name><cn>D.A.User</cn></name>
      <contact><email type="unspecified">dau@fubar.org</email></contact>
    </persondef></person>"
   Advantages: Documents can be edited with full definitions if
   necessary. Definitions can be moved automatically from documents
   to the  repository and replaced by references, and vice-versa,
   as needed.
   Disadvantages: Needs tool support. <insert more here>.

Apply the principles above to pipelines. A simple pipeline is
an instance of the pipeline component, filled with other component
instances. Let's say the instances could be declared in place, or
referenced (note: no matchers or actions, only processors allowed)
  <!-- declare a few processors -->
  <processordef id="foogenerator" role="filegenerator"
    src="/foo/bar/{name}.xml">
    <param name="name"/>
  </processordef>
  <processordef id="aggregate-and-make-html" role="pipeline">
    <param name="name"/>
    <processorref idref="foogenerator">
      <with-param name="name" value="{name}"/>
    </processorref>
    <processordef role="xslt" src="ref-to-xinclude"/>
    <processorref idref="generic-xinclude"/>
    <processorref idref="doc-to-html"/>
    <processordef role="serializer"/>
  </processordef>
  <!-- map a URL to a processor -->
  <map match="foo/*.html">
    <processorref idref="aggregate-and-make-html">
      <with-param name="name" value="{1}"/>
    </processorref>
  </map>
  <!-- map another URL to a processor -->
  <map match="download/*.tar.gz">
    <processordef role="pipeline">
      <param name="name" value="{1}"/>
      <processordef role="targenerator" src="/path/to/dnld/{name}">
        <param name="name" value="{name}"/>
      </processordef>
      <processordef role="gzip">
        <param name="method" value="nocompress"/>
      </processordef>
    </processordef>
  </map>
Well, parametrization probably needs some work. I left out
the <processor> around the defs and refs inside the pipeline,
as this seems to be too verbose.
You can replace "ref" by "call" or whatever if you want to.
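
To make the def/ref idea a bit more concrete, here is a rough sketch in
Python of how references with parameters might be resolved into a tree of
concrete processors. Everything in it is an assumption made for the
illustration: the dictionary layout, the resolve() function, and the
"{match}" placeholder, which stands in for the "{1}" wildcard value of the
URL matcher.

```python
from typing import Any, Dict

Node = Dict[str, Any]

# Central repository of processor definitions, keyed by id.
DEFS: Dict[str, Node] = {
    "foogenerator": {
        "role": "filegenerator",
        "src": "/foo/bar/{name}.xml",
        "params": ["name"],
    },
    "aggregate-and-make-html": {
        "role": "pipeline",
        "params": ["name"],
        "steps": [
            {"idref": "foogenerator", "with": {"name": "{name}"}},  # reference
            {"role": "serializer"},                                 # in-place def
        ],
    },
}


def resolve(node: Node, defs: Dict[str, Node], scope: Dict[str, str]) -> Node:
    """Turn a def-or-ref node into a concrete processor description."""
    if "idref" in node:
        definition = defs[node["idref"]]  # KeyError here is a dangling reference
        # Evaluate the with-param values in the caller's scope.
        args = {k: v.format(**scope) for k, v in node.get("with", {}).items()}
        missing = set(definition.get("params", [])) - set(args)
        if missing:
            raise ValueError(f"missing parameters: {sorted(missing)}")
        # The definition body only sees its own parameters.
        return resolve(definition, defs, args)
    resolved: Node = {"role": node["role"]}
    if "src" in node:
        resolved["src"] = node["src"].format(**scope)
    if "steps" in node:
        resolved["steps"] = [resolve(step, defs, scope) for step in node["steps"]]
    return resolved


request = {"idref": "aggregate-and-make-html", "with": {"name": "{match}"}}
print(resolve(request, DEFS, {"match": "index"}))
```

A definition declared in place and one pulled from the repository end up
as the same kind of node, which is the "middle ground" of the analogy.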

Well, enough for today
J.Pietschmann



Re: [RT] reconsidering pipeline semantics

Posted by Stefano Mazzocchi <st...@apache.org>.

"J.Pietschmann" wrote:
> 
> Stefano Mazzocchi wrote:
> > In that case I agree: like I said, if you need to do your stuff without
> > Cocoon around, or without a precise way (xpipe?) to define how a
> > document is processed, document() is the way to go. That's the only
> > argument I acknowledge.
> 
> The problem appears to be that there aren't many
> (any?) stand-alone xinclude processors and XML
> pipeline processors out there, not to mention
> the lack of standardized interfaces, descriptions
> (for pipelines) and behaviour. Cocoon is breaking
> ground here, but for many purposes having to use
> full Cocoon is just too heavyweight (and too
> monolithic).

Agreed.

> What about applying to standards organisations
> for pipeline descriptions and Java interfaces
> to xinclude, pipeline, FO and SVG processors?
> Cocoon could provide a host experience and would
> make a great testbed.

http://xpipe.sourceforge.net/

http://www.w3.org/TR/xml-pipeline/

They are both much more limited than Cocoon, but they are powerful
enough to turn document() into a bad practice most of the time.

Unfortunately, the subject is politically very sensitive and it will
remain so for ages, I would assume, especially given the impact on the
web-service processes, where either you come up with something as
powerful as Cocoon's sitemap or you are doomed.

A general xml pipeline description language might just be a holy grail
or turn into a programming language.

Right now, the note proposed by Eve Maler and Norm Walsh (long time
contributors of the document-centric XML/SGML world) is a sort of 'ant
build file for xml processing'. Indeed powerful, but limited in scope
and many believe too document-oriented.

I wouldn't personally wait for such a standard to emerge soon. And even
if it does, I wouldn't bet it would allow us to replace our sitemap with
it.

So, yes, we are breaking ground (Cocoon is the *first* pipeline-based
XML engine. Period.) and I think we are even too far along the
road... which is sometimes a good thing and sometimes not.

Up to you to choose which one applies.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


Re: [RT] reconsidering pipeline semantics

Posted by Michael Wechner <mi...@wyona.org>.


J.Pietschmann wrote:

> Stefano Mazzocchi wrote:
> 
>> In that case I agree: like I said, if you need to do your stuff without
>> Cocoon around, or without a precise way (xpipe?) to define how a
>> document is processed, document() is the way to go. That's the only
>> argument I acknowledge.
> 
> 
> The problem appears to be that there aren't many
> (any?) stand-alone xinclude processors 



Well, there is
org/wyona/xml/XPSAssembler.java
org/wyona/cms/cocoon/transformation/IncludeTransformer.java

I programmed it about two years ago and integrated it into
Wyona/Cocoon two months ago. It still needs some refactoring
before I can submit it as a patch to Cocoon.

Michael


> and XML
> pipeline processors out there, not to mention
> the lack of standardized interfaces, descriptions
> (for pipelines) and behaviour. Cocoon is breaking
> ground here, but for many purposes having to use
> full Cocoon is just too heavyweight (and too
> monolithic).
> 
> What about applying to standards organisations
> for pipeline descriptions and Java interfaces
> to xinclude, pipeline, FO and SVG processors?
> Cocoon could provide a host experience and would
> make a great testbed.
> 
> J.Pietschmann
> 
> 

Re: [RT] reconsidering pipeline semantics

Posted by "J.Pietschmann" <j3...@yahoo.de>.

Stefano Mazzocchi wrote:
> In that case I agree: like I said, if you need to do your stuff without
> Cocoon around, or without a precise way (xpipe?) to define how a
> document is processed, document() is the way to go. That's the only
> argument I acknowledge.

The problem appears to be that there aren't many
(any?) stand-alone xinclude processors and XML
pipeline processors out there, not to mention
the lack of standardized interfaces, descriptions
(for pipelines) and behaviour. Cocoon is breaking
ground here, but for many purposes having to use
full Cocoon is just too heavyweight (and too
monolithic).

What about applying to standards organisations
for pipeline descriptions and Java interfaces
to xinclude, pipeline, FO and SVG processors?
Cocoon could provide a host experience and would
make a great testbed.

J.Pietschmann


Re: [RT] reconsidering pipeline semantics

Posted by Stefano Mazzocchi <st...@apache.org>.

"J.Pietschmann" wrote:
> 
> Stefano Mazzocchi wrote:
> > I personally don't like the document() function of XSLT. I normally
> > suggest to use an xinclude stage to perform aggregation when the
> > structure of the aggregation is not fixed, but that's really up to you.
> >
> I personally don't like Cocoon aggregation, basically
> because it is proprietary to Cocoon, while document()
> is a mandatory feature of all XSLT processors.

Hmmm, sounds like a philosophical point only to me, unless you really
want to use your xslt stuff in other pipeline systems (AxKit, for
example), but I never heard of anybody having that requirement.

> Well each of the three aggregation methods has its
> place:
> Cocoon aggregation: controlled by sitemap, non-intrusive
>   to XML and XSLT
> xinclude: controlled by XML source
> document(): controlled by stylesheet writer, non-intrusive
>   to XML
> 
> The downsides of document() are that it is not quite
> compatible yet with Cocoon caching, and many people have
> difficulty understanding the two-argument form and how
> relative URIs are resolved.

document() was designed when xinclude with xpointer facilities wasn't
there. I bet that if xinclude had existed before, they would not have
added this feature to xslt.

NOTE: document() is not defined in the XPath language because they
thought that didn't really fit there. document() is an XSLT extension to
XPath that mimics xinclude + xpointer.

Today, they cannot deprecate it because there is no way to describe the
multistaged pipeline execution of an xml document, thus there is no
way to specify that document() behaves like xinclude:include.

But for Cocoon we have that way so the only reason why I would use
document() for external aggregation (or external entities, which are
even worse) is when I have to move the XSLT in some pipeline engine
which is not xinclude-capable and where I don't have a predefined way to
indicate the steps of the document processing.
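
For readers who want to try an inclusion stage outside Cocoon, here is a minimal sketch using the XInclude support in today's JAXP, which postdates this thread. The class name `XIncludeStageSketch`, the file names and the sample content are assumptions made for illustration; only `DocumentBuilderFactory.setXIncludeAware` and the XInclude namespace are standard:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Illustrative sketch: an XInclude step run with plain JAXP, no Cocoon involved.
class XIncludeStageSketch {

    // Parses the file with XInclude processing switched on.
    static Document parseWithXInclude(Path file) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XInclude needs namespace awareness
        factory.setXIncludeAware(true);
        return factory.newDocumentBuilder().parse(file.toFile());
    }

    // Writes two sample files to a temp directory, resolves the include and
    // returns the text of the <title> element pulled in from news.xml.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("xinclude-sketch");
            Files.write(dir.resolve("news.xml"),
                "<news><title>Hello</title></news>"
                    .getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("page.xml"),
                ("<page xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                    + "<xi:include href=\"news.xml\"/></page>")
                    .getBytes(StandardCharsets.UTF_8));
            Document doc = parseWithXInclude(dir.resolve("page.xml"));
            return doc.getElementsByTagName("title").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The inclusion happens at parse time, so any stylesheet applied afterwards sees one merged document and needs no document() call.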

> >>In any case, aggregation seems not to be quite right.
> >
> > Why? what's the problem you are experiencing?
> >
> I have a bunch of XML files with the same DTD. They
> could be processed with the same XSLT and mapped by
> the same pipeline if I use document(). If I use
> aggregation, I'd have to use a separate pipeline and
> XSLT (because of the new document element and skipping
> the aggregated content until it's needed) for the
> currently two sources which need aggregated content.
> Using an aggregation pipeline for all documents seems
> to be a bit of an overkill, especially because the
> directories read are on mounted drives which are slow
> on occasion.
> Using xinclude doesn't change the picture much, instead
> of an aggregating pipeline I'd have to use a three stage
> pipeline (transformation of reference to xinclude,
> xinclude, final transformation), the only advantage
> being that XML content can be used for constructing
> the referencing URLs (which will probably be needed).
> With document(), I can put a <news from=".."> element
> in any source and get the content for aggregation where
> and when it is needed, no need to keep track of which source
> uses news in order to get it into the correct pipeline.
> The other advantage is that I can develop the XSLT
> locally using a standard XSLT processor, thereby getting
> short turn around times. The other variants require
> a Cocoon running somewhere, and in my experience this
> slows down the edit-test-cycle even if I edit the XSLT
> in place, mainly because I always need to reload the
> page in the browser, and problems must be looked up in
> the logs.

In that case I agree: like I said, if you need to do your stuff without
Cocoon around, or without a precise way (xpipe?) to define how a
document is processed, document() is the way to go. That's the only
argument I acknowledge.

For the rest of the arguments, I simply disagree: inclusion and
transformations are two different concerns, but, like I said before,
it's really up to you to use the one you like the most.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------




Re: [RT] reconsidering pipeline semantics

Posted by "J.Pietschmann" <j3...@yahoo.de>.

Stefano Mazzocchi wrote:
> I personally don't like the document() function of XSLT. I normally
> suggest to use an xinclude stage to perform aggregation when the
> structure of the aggregation is not fixed, but that's really up to you.
> 
I personally don't like Cocoon aggregation, basically
because it is proprietary to Cocoon, while document()
is a mandatory feature of all XSLT processors.

Well each of the three aggregation methods has its
place:
Cocoon aggregation: controlled by sitemap, non-intrusive
  to XML and XSLT
xinclude: controlled by XML source
document(): controlled by stylesheet writer, non-intrusive
  to XML

The downsides of document() are that it is not quite
compatible yet with Cocoon caching, and many people have
difficulty understanding the two-argument form and how
relative URIs are resolved.

>>In any case, aggregation seems not to be quite right.
> 
> Why? what's the problem you are experiencing?
> 
I have a bunch of XML files with the same DTD. They
could be processed with the same XSLT and mapped by
the same pipeline if I use document(). If I use
aggregation, I'd have to use a separate pipeline and
XSLT (because of the new document element and skipping
the aggregated content until it's needed) for the
currently two sources which need aggregated content.
Using an aggregation pipeline for all documents seems
to be a bit of an overkill, especially because the
directories read are on mounted drives which are slow
on occasion.
Using xinclude doesn't change the picture much, instead
of an aggregating pipeline I'd have to use a three stage
pipeline (transformation of reference to xinclude,
xinclude, final transformation), the only advantage
being that XML content can be used for constructing
the referencing URLs (which will probably be needed).
With document(), I can put a <news from=".."> element
in any source and get the content for aggregation where
and when it is needed, no need to keep track of which source
uses news in order to get it into the correct pipeline.
The other advantage is that I can develop the XSLT
locally using a standard XSLT processor, thereby getting
short turn around times. The other variants require
a Cocoon running somewhere, and in my experience this
slows down the edit-test-cycle even if I edit the XSLT
in place, mainly because I always need to reload the
page in the browser, and problems must be looked up in
the logs.
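
The local edit-test cycle described above can be tried with any JAXP-compliant XSLT processor. A minimal sketch, assuming the JDK's built-in transformer; the class name `DocumentFunctionSketch`, the in-memory document map and the sample titles are illustrative assumptions, not part of my setup. A `URIResolver` stands in for the file system so that `document(@name)` can be exercised without a running Cocoon:

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative sketch only: exercises document() outside Cocoon.
class DocumentFunctionSketch {

    // Uses the same access path as the newsbox fragment in this thread.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/files'>"
        + "<xsl:for-each select='file'>"
        + "<xsl:value-of select='document(@name)/document/header/title'/>"
        + "<xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Transforms the listing; every document(@name) call is answered from
    // the given map (href -> XML text) instead of the file system.
    static String run(String listing, Map<String, String> documents) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setURIResolver((href, base) -> {
                String xml = documents.get(href);
                // Returning null lets the processor resolve the URI itself.
                return xml == null ? null : new StreamSource(new StringReader(xml));
            });
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(listing)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> docs = new HashMap<>();
        docs.put("a.xml", "<document><header><title>Alpha</title></header></document>");
        docs.put("b.xml", "<document><header><title>Beta</title></header></document>");
        System.out.println(
            run("<files><file name='a.xml'/><file name='b.xml'/></files>", docs));
    }
}
```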

J.Pietschmann



Re: [RT] reconsidering pipeline semantics

Posted by Stefano Mazzocchi <st...@apache.org>.

"J.Pietschmann" wrote:
> 
> Stefano Mazzocchi wrote:
> >  - deprecate 'internal-only' attribute of 'map:pipeline'
> >    [because named pipelines become implicitly internal-only]
> 
> I use something like the following:
> 
>    <div class="newsbox">
>      <xsl:for-each select=
>        "document(concat('cocoon://int/news/',$pubdir))/files/file">
>        <xsl:sort select="@last-modified"/>
>        <p class="newsheadline">
>          <xsl:value-of select="document(@name)/document/header/title"/>
>        </p>
>        <p class="newsabstract">
>          <xsl:value-of select="substring(
>          document(@name)/document/header/abstract,1,42)"/>
>        </p>
>      </xsl:for-each>
>    ...

I personally don't like the document() function of XSLT. I normally
suggest to use an xinclude stage to perform aggregation when the
structure of the aggregation is not fixed, but that's really up to you.

> and in the sitemap
>    <map:pipeline internal-only="yes">
>       <map:match pattern="int/news/*">
>         <map:generate type="directory" src="/my/news/{1}"/>
>         <map:transform src="xslt/newsfilter.xsl"/>
>         <map:serialize type="xml"/>
>       </map:match>
>    </map:pipeline>
> 
> Is there a better way to solve this kind of problem?
> (I worked around the caching issue).

if you use named pipelines you could do

    <map:pipeline name="news">
       <map:match pattern="*">
         <map:generate type="directory" src="/my/news/{1}"/>
         <map:transform src="xslt/newsfilter.xsl"/>
         <map:serialize type="xml"/>
       </map:match>
    </map:pipeline>

and then call it with

 pipeline://news/*
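
How such a `pipeline://` call could be resolved is an open design question. A minimal sketch, assuming the name sits in the authority part of the URI and the rest of the path is handed to the pipeline's matcher; `NamedPipelineSketch`, `register` and `call` are hypothetical names, and a pipeline is reduced here to a function from path to result:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: resolving pipeline://<name>/<path> to a named pipeline.
class NamedPipelineSketch {

    // A "pipeline" is modelled as a function from the remaining path to a result.
    private final Map<String, Function<String, String>> pipelines = new HashMap<>();

    void register(String name, Function<String, String> pipeline) {
        pipelines.put(name, pipeline);
    }

    // Splits the URI into pipeline name and path, then invokes the pipeline.
    String call(String uri) {
        URI parsed = URI.create(uri);
        if (!"pipeline".equals(parsed.getScheme())) {
            throw new IllegalArgumentException("not a pipeline: URI: " + uri);
        }
        String name = parsed.getAuthority();
        Function<String, String> pipeline = pipelines.get(name);
        if (pipeline == null) {
            throw new IllegalArgumentException("no pipeline named " + name);
        }
        String path = parsed.getPath() == null ? "" : parsed.getPath();
        if (path.startsWith("/")) {
            path = path.substring(1); // the matcher sees the path without the leading slash
        }
        return pipeline.apply(path);
    }

    public static void main(String[] args) {
        NamedPipelineSketch sitemap = new NamedPipelineSketch();
        // Stands in for the generate/transform/serialize steps of "news".
        sitemap.register("news", path -> "/my/news/" + path);
        System.out.println(sitemap.call("pipeline://news/2002-07"));
    }
}
```

Because a named pipeline is reachable only through its name, it needs no URL pattern of its own in the public URI space, which is why `internal-only` becomes redundant.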

> How will this be solved without internal-only pipelines?
> The $pubdir is currently a parameter but there are reasons
> to have it taken from input XML in some cases. In
> any case, aggregation seems not to be quite right.

Why? what's the problem you are experiencing?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------




Re: [RT] reconsidering pipeline semantics

Posted by "J.Pietschmann" <j3...@yahoo.de>.

Stefano Mazzocchi wrote:
>  - deprecate 'internal-only' attribute of 'map:pipeline' 
>    [because named pipelines become implicitly internal-only]

I use something like the following:

   <div class="newsbox">
     <xsl:for-each select=
       "document(concat('cocoon://int/news/',$pubdir))/files/file">
       <xsl:sort select="@last-modified"/>
       <p class="newsheadline">
         <xsl:value-of select="document(@name)/document/header/title"/>
       </p>
       <p class="newsabstract">
         <xsl:value-of select="substring(
         document(@name)/document/header/abstract,1,42)"/>
       </p>
     </xsl:for-each>
   ...



and in the sitemap
   <map:pipeline internal-only="yes">
      <map:match pattern="int/news/*">
        <map:generate type="directory" src="/my/news/{1}"/>
        <map:transform src="xslt/newsfilter.xsl"/>
        <map:serialize type="xml"/>
      </map:match>
   </map:pipeline>

Is there a better way to solve this kind of problem?
(I worked around the caching issue).
How will this be solved without internal-only pipelines?
The $pubdir is currently a parameter but there are reasons
to have it taken from input XML in some cases. In
any case, aggregation seems not to be quite right.

Ideas, thoughts?

J.Pietschmann



Re: [RT] reconsidering pipeline semantics

Posted by Christian Haul <ha...@dvs1.informatik.tu-darmstadt.de>.

On 03.Jul.2002 -- 10:51 PM, Sylvain Wallez wrote:
> sitemap, and is only visible from this sitemap. The treeprocessor 
> currently "silently" extends the resource definition by allowing a 
> resource to be unterminated (i.e. no serializer). This wasn't done on 
> purpose (this is in fact a bug if resources must be terminated), but we 
> can make it official.

Cool!!! Yes, *please*!!! Now go away, action-set! (a concept I never liked)

	Chris.
-- 
C h r i s t i a n       H a u l
haul@informatik.tu-darmstadt.de
    fingerprint: 99B0 1D9D 7919 644A 4837  7D73 FEF9 6856 335A 9E08



RE: Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Carsten Ziegeler <cz...@s-und-n.de>.

Sylvain Wallez wrote:
>
> So I think we have a bug here, and I have no obvious solution :
>
> I removed the test for "!this.internal", because internal redirects
> should be considered as internal requests, and so always call
> processInternal().
> --> blank page
>
> This blank page is caused by the pipeline not being processed, so I
> added pp.process(newEnv) after processInternal()
> --> Bang ! _"Unable to get transformer handler for
> stylesheets/site2xhtml.xsl_" (comes from excalibur xslt component)
>
> ??? I'm kinda lost here and have no time now for further investigations.
>
> Carsten, this seems to be related to the changes you made on internal
> processing. Got an idea ?
>
To be honest, I don't see the purpose of this ForwardRedirector,
so it's a little bit difficult for me to say what is right and what is
wrong. (It's one of those classes adding more and more complexity.)

Anyway, if you call processInternal() the processing pipeline is assembled
correctly but not processed. If you now want to process it, you have to
change the environment (or the context) to the context of the sitemap
in which the pipeline was assembled.
For example, the SitemapSource does it this way:
            this.processingPipeline =
                this.processor.processInternal(this.environment);
            this.pipelineProcessor = this.environment.changeToLastContext();
            try {
                CocoonComponentManager.enterEnvironment(this.environment,
                    this.environment.getObjectModel(),
                    this.pipelineProcessor);
                this.processingPipeline.process(this.environment);
            } finally {
                CocoonComponentManager.leaveEnvironment();
            }
Perhaps adding this piece of code helps?
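
The pattern in the snippet above (assemble first, then enter the right context, process, and always leave) can be shown in isolation. A minimal sketch with hypothetical stand-in types; `InternalProcessingSketch`, `Pipeline`, `assemble` and `run` are not Cocoon classes or methods, and the context is reduced to a string on a stack:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-ins only; none of these types are Cocoon classes.
class InternalProcessingSketch {

    // Stack of entered sitemap contexts, innermost on top.
    static final Deque<String> contextStack = new ArrayDeque<>();
    // Records what was processed and in which context.
    static final List<String> log = new ArrayList<>();

    interface Pipeline {
        void process();
    }

    // Step 1: assemble only. Nothing is produced until process() is called,
    // which is why assembling alone yields a blank page.
    static Pipeline assemble(String uri) {
        return () -> {
            if (contextStack.isEmpty()) {
                throw new IllegalStateException("no sitemap context entered");
            }
            log.add("processed " + uri + " in " + contextStack.peek());
        };
    }

    // Step 2: enter the context the pipeline was assembled in, process it,
    // and leave the context again even if processing fails.
    static void run(String uri, String sitemapContext) {
        Pipeline pipeline = assemble(uri);
        contextStack.push(sitemapContext);
        try {
            pipeline.process();
        } finally {
            contextStack.pop();
        }
    }

    public static void main(String[] args) {
        run("internal/login", "mount/vquadrat/");
        System.out.println(log);
    }
}
```

Calling `process()` without entering the context first fails in this sketch, loosely mirroring the missing-stylesheet error reported when `pp.process(newEnv)` was added on its own.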

Carsten



Re: Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Christian Haul wrote:

>On 04.Jul.2002 -- 03:20 PM, Christian Haul wrote:
>  
>
>>On 04.Jul.2002 -- 11:50 AM, Sylvain Wallez wrote:
>>    
>>
>>>The redirector used (ForwardRedirector) handles the special "cocoon:" 
>>>protocol as internal redirects (aka "forward") which should consider the 
>>>request as internal.
>>>
>>>However, I noticed Carsten has added a processInternal() method to 
>>>Processor and I wonder if there are some side effects that break internal
>>>pipelines. Can you check if internal pipelines still work as expected
>>>(using a simple cocoon: source) ? I unfortunately have no time to do it 
>>>myself.
>>>      
>>>
>
>Mmmh, looking at the stacktrace, ForwardRedirector is in the
>(!this.internal) branch, method cocoonRedirect.
>
>            if ( !this.internal ) {
>                processingResult = actualProcessor.process(newEnv);
>            } else {
>                ProcessingPipeline pp = actualProcessor.processInternal(newEnv);
>                if (pp != null) pp.release();
>                processingResult = pp != null;
>            }
>
>Does it make sense to have external redirects handled by the
>cocoonRedirect method at all? It seems that it is invoked only if the
>URL starts with "cocoon:" and thus should be internal, right?
>  
>

Right (see below).

>Regarding the flow: It uses the existing Environment to call
>AbstractInterpreter.forwardTo(), which in turn uses that very
>Environment to obtain a Redirector. Since "internal" is a property of
>that Environment, it cannot be "true" since the original request
>wasn't internal.
>
>Redirecting from a RedirectToURINode works much the same way.
>
>So?
>  
>

So I think we have a bug here, and I have no obvious solution :

I removed the test for "!this.internal", because internal redirects 
should be considered as internal requests, and so always call 
processInternal().
--> blank page

This blank page is caused by the pipeline not being processed, so I 
added pp.process(newEnv) after processInternal()
--> Bang ! _"Unable to get transformer handler for 
stylesheets/site2xhtml.xsl_" (comes from excalibur xslt component)

??? I'm kinda lost here and have no time now for further investigations.

Carsten, this seems to be related to the changes you made on internal 
processing. Got an idea ?

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org





Re: Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Christian Haul <ha...@dvs1.informatik.tu-darmstadt.de>.

On 04.Jul.2002 -- 03:20 PM, Christian Haul wrote:
> On 04.Jul.2002 -- 11:50 AM, Sylvain Wallez wrote:
> > The redirector used (ForwardRedirector) handles the special "cocoon:" 
> > protocol as internal redirects (aka "forward") which should consider the 
> > request as internal.
> > 
> > However, I noticed Carsten has added a processInternal() method to 
> > Processor and I wonder if there are some side effects that break internal
> > pipelines. Can you check if internal pipelines still work as expected
> > (using a simple cocoon: source) ? I unfortunately have no time to do it 
> > myself.

Mmmh, looking at the stacktrace, ForwardRedirector is in the
(!this.internal) branch, method cocoonRedirect.

            if ( !this.internal ) {
                processingResult = actualProcessor.process(newEnv);
            } else {
                ProcessingPipeline pp = actualProcessor.processInternal(newEnv);
                if (pp != null) pp.release();
                processingResult = pp != null;
            }

Does it make sense to have external redirects handled by the
cocoonRedirect method at all? It seems that it is invoked only if the
URL starts with "cocoon:" and thus should be internal, right?

Regarding the flow: It uses the existing Environment to call
AbstractInterpreter.forwardTo(), which in turn uses that very
Environment to obtain a Redirector. Since "internal" is a property of
that Environment, it cannot be "true" since the original request
wasn't internal.

Redirecting from a RedirectToURINode works much the same way.

So?

	Chris.
-- 
C h r i s t i a n       H a u l
haul@informatik.tu-darmstadt.de
    fingerprint: 99B0 1D9D 7919 644A 4837  7D73 FEF9 6856 335A 9E08


Re: Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Christian Haul <ha...@dvs1.informatik.tu-darmstadt.de>.

On 04.Jul.2002 -- 11:50 AM, Sylvain Wallez wrote:
> The redirector used (ForwardRedirector) handles the special "cocoon:" 
> protocol as internal redirects (aka "forward") which should consider the 
> request as internal.
> 
> However, I noticed Carsten has added a processInternal() method to 
> Processor and I wonder if there are some side effects that break internal
> pipelines. Can you check if internal pipelines still work as expected
> (using a simple cocoon: source) ? I unfortunately have no time to do it 
> myself.

If I did everything correctly, yes. (Samples are still in disorder, so
I couldn't test with an existing sample.) In mount/vquadrat/sitemap.xmap:

  <map:pipelines>

    <map:pipeline internal-only="true">
      <map:match pattern="internal/*">
        <map:call resource="dynamic-page">
          <map:parameter name="target" value="login"/>
        </map:call>
      </map:match>
    </map:pipeline>

    <map:pipeline>
      <map:match pattern="test">
        <map:read src="cocoon://mount/vquadrat/internal/login"/>
      </map:match>
    </map:pipeline>

DEBUG   (2002-07-04) 15:16.59:824   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/DefaultComponentFactory: ComponentFactory creating new instance of org.apache.cocoon.components.pipeline.impl.CachingProcessingPipeline.
DEBUG   (2002-07-04) 15:16.59:825   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/DefaultComponentFactory: no logger attribute available, using standard logger
DEBUG   (2002-07-04) 15:16.59:825   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/ResourceLimitingPool: Created a new org.apache.cocoon.components.pipeline.impl.CachingProcessingPipeline from the object factory.
DEBUG   (2002-07-04) 15:16.59:826   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/ResourceLimitingPool: Got a org.apache.cocoon.components.pipeline.impl.CachingProcessingPipeline from the pool.
DEBUG   (2002-07-04) 15:16.59:826   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/ExcaliburComponentSelector: readers: ComponentSelector could not find the component for hint: resource
DEBUG   (2002-07-04) 15:16.59:827   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/DefaultComponentFactory: ComponentFactory creating new instance of org.apache.cocoon.reading.ResourceReader.
DEBUG   (2002-07-04) 15:16.59:827   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/DefaultComponentFactory: logger attribute is sitemap.reader.resource
DEBUG   (2002-07-04) 15:16.59:828   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/ResourceLimitingPool: Created a new org.apache.cocoon.reading.ResourceReader from the object factory.
DEBUG   (2002-07-04) 15:16.59:828   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/ResourceLimitingPool: Got a org.apache.cocoon.reading.ResourceReader from the pool.
INFO    (2002-07-04) 15:16.59:831   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/PreparableMatchNode: Matcher 'wildcard' matched prepared pattern 'mount/*/**' at jndi:/localhost/cocoon/sitemap.xmap:461:37
DEBUG   (2002-07-04) 15:16.59:832   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/InvokeContext:
Current Sitemap Parameters:
PARAM: '2' VALUE: 'internal/login'
PARAM: '1' VALUE: 'vquadrat'
PARAM: '0' VALUE: 'mount/vquadrat/internal/login'

INFO    (2002-07-04) 15:16.59:836   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/PreparableMatchNode: Matcher 'wildcard' matched prepared pattern 'internal/*' at file:/projects/haul/jakarta-tomcat-4.0.4-b3/webapps/cocoon/mount/vquadrat/sitemap.xmap:135:36

But redirecting doesn't:

  <map:pipelines>

    <map:pipeline internal-only="true">
      <map:match pattern="internal/*">
        <map:call resource="dynamic-page">
          <map:parameter name="target" value="login"/>
        </map:call>
      </map:match>
    </map:pipeline>

    <map:pipeline>
      <map:match pattern="test">
        <map:redirect-to uri="cocoon://mount/vquadrat/internal/login"/>
      </map:match>
    </map:pipeline>

INFO    (2002-07-04) 15:07.42:395   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/RedirectToURINode: Redirecting to 'cocoon://mount/vquadrat/internal/login' at file:/projects/haul/jakarta-tomcat-4.0.4-b3/webapps/cocoon/mount/vquadrat/sitemap.xmap:220:68
INFO    (2002-07-04) 15:07.42:395   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/ForwardRedirector: Redirecting to 'cocoon://mount/vquadrat/internal/login'
DEBUG   (2002-07-04) 15:07.42:397   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/EnvironmentWrapper: Setting uri (prefix=, uris=mount/vquadrat/internal/login)
DEBUG   (2002-07-04) 15:07.42:397   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/AbstractEnvironment: Set the URI Prefix (OLD=mount/vquadrat/, NEW=)
INFO    (2002-07-04) 15:07.42:398   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/PreparableMatchNode: Matcher 'wildcard' matched prepared pattern 'mount/*/**' at jndi:/localhost/cocoon/sitemap.xmap:461:37
DEBUG   (2002-07-04) 15:07.42:398   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/InvokeContext:
Current Sitemap Parameters:
PARAM: '2' VALUE: 'internal/login'
PARAM: '1' VALUE: 'vquadrat'
PARAM: '0' VALUE: 'mount/vquadrat/internal/login'

DEBUG   (2002-07-04) 15:07.42:399   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/AbstractEnvironment: Changing Cocoon context
DEBUG   (2002-07-04) 15:07.42:399   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/AbstractEnvironment:   from context(file:/projects/haul/jakarta-tomcat-4.0.4-b3/webapps/cocoon/) and prefix()
DEBUG   (2002-07-04) 15:07.42:399   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/AbstractEnvironment:   to context(mount/vquadrat/) and prefix(mount/vquadrat)
DEBUG   (2002-07-04) 15:07.42:400   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/AbstractEnvironment:   at URI mount/vquadrat/internal/login
DEBUG   (2002-07-04) 15:07.42:400   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/AbstractEnvironment: New context is file:/projects/haul/jakarta-tomcat-4.0.4-b3/webapps/cocoon/mount/vquadrat/
INFO    (2002-07-04) 15:07.42:401   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/PipelineNode: No pipeline matched request: mount/vquadrat/internal/login
WARN    (2002-07-04) 15:07.42:401   [sitemap] (/cocoon/mount/vquadrat/test) HttpProcessor[8080][0]/PipelineNode: Resource not found in pipeline at file:/projects/haul/jakarta-tomcat-4.0.4-b3/webapps/cocoon/mount/vquadrat/sitemap.xmap:143:19

Line 143 is the beginning of the externally visible pipeline.

	Chris.
-- 
C h r i s t i a n       H a u l
haul@informatik.tu-darmstadt.de
    fingerprint: 99B0 1D9D 7919 644A 4837  7D73 FEF9 6856 335A 9E08


Re: Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Carsten Ziegeler wrote:

>  
>
>>-----Original Message-----
>>From: Sylvain Wallez [mailto:sylvain.wallez@anyware-tech.com]
>>Sent: Thursday, July 04, 2002 11:50 AM
>>To: cocoon-dev@xml.apache.org
>>Subject: Re: Internal Pipelines (was: [RT] reconsidering pipeline
>>semantics)
>>
>>
>>Christian Haul wrote:
>>
>>    
>>
>>>Sylvain, 
>>>
>>>I'm still trying to fully understand flow and the treeprocessor. I
>>>have the impression, that sending a page does not access internal-only
>>>pipelines. I found that sendPage uses a redirector to actually send
>>>the page in AbstractInterpreter.java. Any idea how this can be marked
>>>as an internal request?
>>> 
>>>
>>>      
>>>
>>The redirector used (ForwardRedirector) handles the special "cocoon:" 
>>protocol as internal redirects (aka "forward") which should consider the 
>>request as internal.
>>
>>However, I noticed Carsten has added a processInternal() method to 
>>Processor and I wonder if there are some side effects that break internal
>>pipelines. Can you check if internal pipelines still work as expected
>>(using a simple cocoon: source) ? I unfortunately have no time to do it 
>>myself.
>>    
>>

<snip/>

>>BTW, Carsten, I like very much this new processInternal() as it returns 
>>a pipeline object. This should make it very easy to implement pipeline 
>>services.
>>
>Thanks.
>Actually this method is not so new at all. In former versions it was
>    boolean process(Environment environment, StreamPipeline pipeline,
>                    EventPipeline eventPipeline)

Yes, but with your change Processor handles the pipeline creation rather 
than accepting an existing one, and I think it's better this way.

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org





RE: Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Carsten Ziegeler <cz...@s-und-n.de>.


> -----Original Message-----
> From: Sylvain Wallez [mailto:sylvain.wallez@anyware-tech.com]
> Sent: Thursday, July 04, 2002 11:50 AM
> To: cocoon-dev@xml.apache.org
> Subject: Re: Internal Pipelines (was: [RT] reconsidering pipeline
> semantics)
> 
> 
> Christian Haul wrote:
> 
> >Sylvain, 
> >
> >I'm still trying to fully understand flow and the treeprocessor. I
> >have the impression, that sending a page does not access internal-only
> >pipelines. I found that sendPage uses a redirector to actually send
> >the page in AbstractInterpreter.java. Any idea how this can be marked
> >as an internal request?
> >  
> >
> 
> The redirector used (ForwardRedirector) handles the special "cocoon:" 
> protocol as internal redirects (aka "forward") which should consider the 
> request as internal.
> 
> However, I noticed Carsten has added a processInternal() method to 
> Processor and I wonder if there are some side effects that break internal 
> pipelines. Can you check if internal pipelines still work as expected 
> (using a simple cocoon: source)? I unfortunately have no time to do it 
> myself.
> 
> BTW, Carsten, I like very much this new processInternal() as it returns 
> a pipeline object. This should make it very easy to implement pipeline 
> services.
> 
Thanks.
Actually this method is not so new at all. In former versions it was
    boolean process(Environment environment, StreamPipeline pipeline,
                    EventPipeline eventPipeline)


Carsten


Re: Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Christian Haul wrote:

>Sylvain, 
>
>I'm still trying to fully understand flow and the treeprocessor. I
>have the impression, that sending a page does not access internal-only
>pipelines. I found that sendPage uses a redirector to actually send
>the page in AbstractInterpreter.java. Any idea how this can be marked
>as an internal request?

The redirector used (ForwardRedirector) handles the special "cocoon:" 
protocol as internal redirects (aka "forward") which should consider the 
request as internal.

However, I noticed Carsten has added a processInternal() method to 
Processor and I wonder if there are some side effects that break internal 
pipelines. Can you check if internal pipelines still work as expected 
(using a simple cocoon: source)? I unfortunately have no time to do it 
myself.

BTW, Carsten, I like very much this new processInternal() as it returns 
a pipeline object. This should make it very easy to implement pipeline 
services.

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org


Internal Pipelines (was: [RT] reconsidering pipeline semantics)

Posted by Christian Haul <ha...@dvs1.informatik.tu-darmstadt.de>.

Sylvain, 

I'm still trying to fully understand flow and the treeprocessor. I
have the impression, that sending a page does not access internal-only
pipelines. I found that sendPage uses a redirector to actually send
the page in AbstractInterpreter.java. Any idea how this can be marked
as an internal request?

TIA.

	Chris.
-- 
C h r i s t i a n       H a u l
haul@informatik.tu-darmstadt.de
    fingerprint: 99B0 1D9D 7919 644A 4837  7D73 FEF9 6856 335A 9E08



Re: [RT] reconsidering pipeline semantics

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Jeremy Quinn wrote:

<snip/>

> I never liked map:resource, always having to be at the end, makes them 
> far less useful for pipeline reuse!
>
> It's like going to a plumber's merchant to buy some joints and they 
> say 'yes we have elbows and tee joints, but they are all pre-assembled 
> into special shapes, you cannot buy them individually'. :P 

What if resources didn't have to be terminated, thus being arbitrary 
pipeline snippets (teasing, see below)?

<snip/>

> I think overloading is a grand idea in principle, though I fear it 
> will be confusing and difficult to understand in this context (unless 
> you are a Java architect ;) 

Same fear here, even if I'm a Java architect ;)

<snip/>

> I think the idea of having a pipeline snippet, that is able to be 
> 'called' from the middle of another pipeline is a great idea, it 
> allows much more effective componentisation and the ability to hide 
> complexity (in blocks or other pipelines/sitemaps).
>
> What I am not so happy about is being forced to put a component 
> (generator|serializer) into a pipeline, when it is obviously not going 
> to be used!
>
> IMHO calling a pipeline snippet (for whatever purpose) should be 
> equivalent to XIncluding that pipeline snippet into the calling 
> pipeline at Sitemap runtime.
>
> So:
>     <map:call pipeline="blah"/> includes pipeline components
>
> While:
>     <map:generate src="cocoon:/blah"/>,
>     <map:transform src="cocoon:/blah"/> and
>     <map:part src="cocoon:/blah"/> import pipeline output
>
> Anyway, I think this (general idea) will lead to great improvements!

Jeremy, this resonates with what I proposed at 
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=102564464204469&w=2 
(see at the bottom, I forgot to snip...), and here are some additional 
inputs:

A resource is a sitemap snippet, which is used locally in the current 
sitemap, and is only visible from this sitemap. The treeprocessor 
currently "silently" extends the resource definition by allowing a 
resource to be unterminated (i.e. no serializer). This wasn't done on 
purpose (this is in fact a bug if resources must be terminated), but we 
can make it official.

Defined this way (a pipeline snippet, without mandatory serializer), a 
resource is equivalent (minus its additional parameters) to inlining its 
content at the location of the <map:call>.
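
The inlining equivalence can be sketched in Java. This is only an 
illustrative model, not the treeprocessor's code: the class 
ResourceInliner, the "call:" prefix and the step names are all 
hypothetical, and a pipeline is reduced to a plain list of step names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model: a pipeline is a list of step names, and a step of the
// form "call:NAME" stands for <map:call resource="NAME"/>.
final class ResourceInliner {

    // Returns the pipeline with every call replaced by the called resource's steps.
    // Assumes resources do not call each other in a cycle.
    static List<String> expand(List<String> pipeline, Map<String, List<String>> resources) {
        List<String> result = new ArrayList<>();
        for (String step : pipeline) {
            if (step.startsWith("call:")) {
                List<String> body = resources.get(step.substring("call:".length()));
                if (body == null) {
                    throw new IllegalArgumentException("Unknown resource: " + step);
                }
                // Inline the snippet; it does not have to end with a serializer.
                result.addAll(expand(body, resources));
            } else {
                result.add(step);
            }
        }
        return result;
    }
}
```

With a resource "blah" made of two transformers, expanding 
[generate, call:blah, serialize] yields the four steps in order, which is 
what writing the two transformers in place of the call would give.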

Using pipelines to define other pipelines came from a different need, 
which is to identify cross-sitemap/block services. I don't like named 
pipelines, since the current scope of resource names is limited to the 
sitemap that defines them. Using names to identify cross-sitemap/block 
services will add a new contract when we already have one: the sitemap 
environment, and mainly its request URI, as used by the "cocoon:" protocol.

That's why I'm in favor of a more explicit semantic that both:
- differentiates internal snippets (a writing facility) and pipeline 
services (inter-sitemap/block contracts),
- clearly states what should be used in pipeline services, rather than 
relying on implicit overloading.

Sylvain

-- 
Sylvain Wallez
 Anyware Technologies                  Apache Cocoon
 http://www.anyware-tech.com           mailto:sylvain@apache.org


Re: [RT] reconsidering pipeline semantics

Posted by Jeremy Quinn <sh...@mac.com>.

On Tuesday, July 2, 2002, at 01:54 PM, Stefano Mazzocchi wrote:

> In light of the discussion on blocks, Sylvain pointed out that cocoon
> services should be mapped to pipelines and not to resources directly.
>

I totally agree with that suggestion.


<snip/>


>
> We call 'Cocoon pipeline' the collection of all filters (transformers)
> and adapters (generator and serializer) because there cannot be a
> pipeline without adapters.
>
> I think it's time to challenge this concept.
>

Great!


>                             - o -
>
> What are sitemap resources?
> ---------------------------
>
> Let me tell you: they are a mistake, a mistake I did trying to reduce
> the sitemap verbosity and fixing a problem that didn't yet emerged at
> that time. Early optimization is the root of all evil and I see that
> now: resources overlap with pipelines.
>

+ 1x10^6

I never liked map:resource, always having to be at the end, 
makes them far less useful for pipeline reuse!

It's like going to a plumber's merchant to buy some joints and 
they say 'yes we have elbows and tee joints, but they are all 
pre-assembled into special shapes, you cannot buy them 
individually'. :P

>

<snip/>


> Composing pipelines
> -------------------
>
> Let me assume the above syntax gets introduced. At this point, we have
> four different ways to call a pipeline:
>
>  - as a pipeline
>  - as a generator
>  - as a transformer
>  - as a serializer
>
> let me write the code so you understand what I mean:
>
> [using a pipeline as a pipeline] (as today)
>
>    <pipeline>
>     <match pattern="*">
>      <call pipeline="blah"/>
>     </match>
>    </pipeline>
>
> nothing fancy here. Used mainly for verbosity reduction when the same
> pipeline is used in different places.
>
> [using a pipeline as a generator]
>
>    <pipeline>
>     <match pattern="*">
>      <call pipeline="blah"/>
>      <transform .../>
>      <serialize ../>
>     </match>
>    </pipeline>
>

these are both the same as:

    <pipeline>
     <match pattern="*">
      <generate src="cocoon:/blah"/>
     </match>
    </pipeline>



> in this case, the 'serializer' of the called pipeline is not used and
> the output of the last tranformer of the named pipeline is connected
> with the input of the transformer right after the call.
>
> This is equivalent of *overloading* the serializer of the called
> pipeline with the rest of the pipeline in place.
>
> [using a pipeline as a transformer]
>
>    <pipeline>
>     <match pattern="*">
>      <generate ../>
>      <call pipeline="blah"/>
>      <serialize ../>
>     </match>
>    </pipeline>
>

This we cannot currently do, and it would be an excellent addition!

We have two different transformation types here (as others have 
pointed out)

	1. a sub-pipeline that generates an XSLT Stylesheet
	2. a sub-pipeline that does some transformations for you

Case 1 you do like this:

	<map:transform src="cocoon:/make-xslt"/>

Case 2 we cannot do right now and would be a great addition!


> where both the generator and the serializer of the named pipeline are
> not used.
>

so why have them in there?

> This is equivalent of *overloading* both the generator and the
> serializer of the called pipeline with the rest of the pipeline in
> place.
>
> [using a pipeline as a serializer]
>
>    <pipeline>
>     <match pattern="*">
>      <generate ../>
>      <tranform ../>
>      <call pipeline="blah"/>
>     </match>
>    </pipeline>
>
> where the generator of the named pipeline is not used.
>
> This is equivalent of *overloading* the generator of the called 
> pipeline
> with the rest of the pipeline in place.
>
>                                - o -
>
> So, here is what I propose:
>
>  - add the 'pipeline' attribute to 'map:call'
>  - add the 'name' attribute to 'map:pipeline'

I think it is interesting that you want to do this to the 
pipeline rather than individual matchers.

Hmm.

Considering you can do this in XSLT:

	<xsl:template match="blah/*" name="call-blah"/>

to make a template that can be 'called' in two different ways ....

Would it make more sense to do this instead?:

	<map:match pattern="blah/*" name="call-blah"/>

>  - deprecate the 'map:resources' element

you can replicate this with map:redirect-to (internal redirect?)

>  - deprecate 'internal-only' attribute of 'map:pipeline'
>    [because named pipelines become implicitly internal-only]

See above ....

>  - allow 'map:call' to be executed in any place, performing the 
> pipeline
> overloading behavior I explained above.
>

I think overloading is a grand idea in principle, though I fear 
it will be confusing and difficult to understand in this context 
(unless you are a Java architect ;)

I think 'underloading' ;) would make it easier to read the sitemap.

i.e. I call an internal pipeline; if that pipeline serialises, I do not.

So following on from that (maybe this is where it breaks down), 
if you call an internal pipeline which generates, its output 
replaces yours (!).

> What do you think?
>

I think the idea of having a pipeline snippet, that is able to 
be 'called' from the middle of another pipeline is a great idea, 
it allows much more effective componentisation and the ability 
to hide complexity (in blocks or other pipelines/sitemaps).


What I am not so happy about is being forced to put a component 
(generator|serializer) into a pipeline, when it is obviously not 
going to be used!

IMHO calling a pipeline snippet (for whatever purpose) should be 
equivalent to XIncluding that pipeline snippet into the calling 
pipeline at Sitemap runtime.


So:

	<map:call pipeline="blah"/> includes pipeline components

While:

	<map:generate src="cocoon:/blah"/>,
	<map:transform src="cocoon:/blah"/> and
	<map:part src="cocoon:/blah"/> import pipeline output

Anyway, I think this (general idea) will lead to great improvements!



regards Jeremy



RE: [RT] reconsidering pipeline semantics

Posted by Berin Loritsch <bl...@apache.org>.

> From: Michael Hartle [mailto:mhartle@hartle-klug.com] 
> 
> Berin Loritsch wrote:
> 
> >The problem is that the sitemap exposes everything underneath to the 
> >end user.  It would be much better if the sitemap markup EITHER 
> >supported the resource/view paradigm OR purely the pipeline paradigm.  
> >The resource would specify the pipeline fragment that starts the 
> >pipeline.  The view would specify the fragment that ends the pipeline.
> >
> Hm, currently, we are processing the pipeline and caching its output, 
> but whenever there is a request, we at least have to verify in the 
> pipeline whether the cached output can be used or not. When there is a 
> separation into resources and views as described here, we would be able 
> to (pre)generate content independently from requests based on the 
> underlying data (sources, etc) and its own updating cycles (publishing 
> every x hours, by database-triggered notification, etc) and deliver it 
> via views to the requestor and their request cycles (500 times per 
> hour), thus better decoupling both.


It satisfies 99% of all requirements.


> I guess we might also get the same effect with a pure pipeline paradigm, 
> Stefano's overloading/pipeline fragments and some sort of Notifier that 
> can be set per pipeline fragment that triggers pregeneration for the 
> cache (on database updates, etc). IIRC, some time ago someone (Berin ?) 
> spoke about the Monitor components in Excalibur and about how it could 
> be used in conjunction with Sources and the expensive 
> getLastModified()-calls; that decoupling went in the same direction, but 
> this separation between resources and views makes it easier to see.


The abstraction that lets the user get the most done is the correct
one.  What I outlined has nothing to do with the underlying architecture,
but how it is *presented* to the end user.

The pure pipeline approach is less intuitive for users exposed to
web architectures.  They are used to the resource concept.  By
mixing/matching the pipeline pattern and the resource/view concepts we
are adding conceptual mud.  Stick with one way of doing things and do it
well.



Re: [RT] reconsidering pipeline semantics

Posted by Michael Hartle <mh...@hartle-klug.com>.

Berin Loritsch wrote:

>The problem is that the sitemap exposes everything underneath to the end
>user.  It would be much better if the sitemap markup EITHER supported the
>resource/view paradigm OR purely the pipeline paradigm.  The resource
>would specify the pipeline fragment that starts the pipeline.  The view
>would specify the fragment that ends the pipeline.
>
Hm, currently, we are processing the pipeline and caching its output, 
but whenever there is a request, we at least have to verify in the 
pipeline whether the cached output can be used or not. When there is a 
separation into resources and views as described here, we would be able 
to (pre)generate content independently from requests based on the 
underlying data (sources, etc) and its own updating cycles (publishing 
every x hours, by database-triggered notification, etc) and deliver it 
via views to the requestor and their request cycles (500 times per 
hour), thus better decoupling both.

I guess we might also get the same effect with a pure pipeline paradigm, 
Stefano's overloading/pipeline fragments and some sort of Notifier that 
can be set per pipeline fragment that triggers pregeneration for the 
cache (on database updates, etc). IIRC, some time ago someone (Berin ?) 
spoke about the Monitor components in Excalibur and about how it could 
be used in conjunction with Sources and the expensive 
getLastModified()-calls; that decoupling went in the same direction, but 
this separation between resources and views makes it easier to see.
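
A minimal Java sketch of that decoupling, under loud assumptions: the 
class PregeneratedContent and its methods are hypothetical names, not 
Cocoon or Excalibur APIs, and "content" is reduced to a String. The point 
is only that regeneration runs on the data's cycle while requests do a 
plain lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: content is regenerated when the underlying data changes,
// so the request path never has to check whether the cached output is valid.
final class PregeneratedContent {
    private final Map<String, Supplier<String>> generators = new ConcurrentHashMap<>();
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Register a resource and generate its content once up front.
    void register(String name, Supplier<String> generator) {
        generators.put(name, generator);
        cache.put(name, generator.get());
    }

    // Called by a notifier (timer, database trigger, ...) on the data's own cycle.
    void sourceChanged(String name) {
        Supplier<String> generator = generators.get(name);
        if (generator != null) {
            cache.put(name, generator.get());
        }
    }

    // Called on the request cycle: a plain lookup, no validity check.
    String serve(String name) {
        return cache.get(name);
    }
}
```

However many times serve() is called, the generator only runs again when 
sourceChanged() is triggered for that resource.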

Best regards,

Michael Hartle,
Hartle & Klug GbR


RE: [RT] reconsidering pipeline semantics

Posted by Berin Loritsch <bl...@apache.org>.

Good post.  My comments interspersed (I just got back from vacation,
so maybe the points were already made):

> From: Stefano Mazzocchi [mailto:stefano@apache.org] 
> 
> In light of the discussion on blocks, Sylvain pointed out 
> that cocoon services should be mapped to pipelines and not to 
> resources directly.
> 
> This consideration triggered a few RT that I would like to 
> share with you and trigger further discussion.
> 
> NOTE: this is nothing related to blocks or flow, but only at 
> the sitemap semantics.

Defining the pipeline.


> What is a pipeline
> ------------------
> 
> We call 'Cocoon pipeline' the collection of all filters 
> (transformers) and adapters (generator and serializer) 
> because there cannot be a pipeline without adapters.
> 
> I think it's time to challenge this concept.

I agree, but not necessarily in the way that you might be thinking.

> What are sitemap resources?
> ---------------------------
> 
> Let me tell you: they are a mistake, a mistake I did trying 
> to reduce the sitemap verbosity and fixing a problem that 
> didn't yet emerged at that time. Early optimization is the 
> root of all evil and I see that
> now: resources overlap with pipelines.

I agree on the statement of early optimization.  I have not used
resources or views (with the exception of link crawling).

However, consider which is clearer to the user:

* Dynamically generating a pipeline based on runtime parameters or
  URI.
* Statically selecting the resource based on runtime parameters or
  URI.

Resources do not necessarily reduce verbosity, but they clarify the
end result.  The problem is not pipeline vs. resources, but the mixing
of the concepts.

I have had the pleasure of working on a rat's nest.  It mixed OO
design, procedural design, and a quasi finite state machine that
spanned four objects (states were controlled by static integers).
The thing functioned, it did its job (sort of).  Trying to understand
it requires the equivalent of three doctorates and 5,000,000 years
of experience.  My solution: scrap it and start again.  The new version
still has some bugs to work out, but it is *much* easier to understand.

Under the hood, Cocoon is firmly entrenched in the pipeline semantics.
This is good.  It helps make the components very flexible and powerful,
without having to learn several different ways of doing the same thing.

The problem is that the sitemap exposes everything underneath to the end
user.  It would be much better if the sitemap markup EITHER supported the
resource/view paradigm OR purely the pipeline paradigm.  The resource
would specify the pipeline fragment that starts the pipeline.  The view
would specify the fragment that ends the pipeline.

It helps with the formalization of a *component* model for web
applications.  There is still the matching/selecting section of the
sitemap, but instead of dynamically assembling a pipeline as it goes, it
uses resources, and then selects the view.

What would really help is if the component definition were in one file,
the resource definitions and view definitions were in a second file, and
lastly the matching/selection logic were in a third file.  Each of the
files would have a purpose (SOC).  We would know what we are editing with
each of the resources.

The logic, or sitemap file is what calls the resource and selects the
view.  In essence, we could set up a really simple sitemap like this:

<map:sitemap>
  <map:match pattern="*.*">
    <map:resource name="{1}"/>
    <map:view name="{2}"/>
  </map:match>
</map:sitemap>

It's really easy to understand.  There is no notion of separate
pipelines, although they can be implemented under the hood.
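
What such a sitemap would do at request time can be sketched in Java.
Everything here is an assumption made for illustration: the class
ResourceViewSitemap is hypothetical, a resource is modelled as a Supplier
and a view as a Function over Strings, and the "*.*" pattern is taken to
split the URI at its last dot.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical model: a resource starts a pipeline by producing content,
// a view ends it by rendering that content. None of this is a Cocoon API.
final class ResourceViewSitemap {
    private final Map<String, Supplier<String>> resources;
    private final Map<String, Function<String, String>> views;

    ResourceViewSitemap(Map<String, Supplier<String>> resources,
                        Map<String, Function<String, String>> views) {
        this.resources = resources;
        this.views = views;
    }

    // Mirrors <map:match pattern="*.*">: {1} names the resource, {2} the view.
    String process(String uri) {
        int dot = uri.lastIndexOf('.'); // assumption: split at the last dot
        if (dot <= 0 || dot == uri.length() - 1) {
            throw new IllegalArgumentException("No match for: " + uri);
        }
        Supplier<String> resource = resources.get(uri.substring(0, dot));
        Function<String, String> view = views.get(uri.substring(dot + 1));
        if (resource == null || view == null) {
            throw new IllegalArgumentException("Unknown resource or view: " + uri);
        }
        return view.apply(resource.get());
    }
}
```

A request for "news.html" would then run the "news" resource and hand its
output to the "html" view, with no pipeline visible to the sitemap author.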

The problem with sitemap verbosity is not so much what it represents,
but that it mixes the representation paradigms to the user.  That makes
it more complex to understand than necessary.

