Posted to dev@cocoon.apache.org by Sylvain Wallez <sy...@anyware-tech.com> on 2002/03/26 18:45:40 UTC

Re: Substractive view labels (long)

Stefano Mazzocchi wrote:

>Sylvain Wallez wrote:
>
>>Attaching a label to a component (e.g. a generator) makes this label
>>implicit for _all_ uses of that component. This is useful when the kind
>>of information represented by the label is always produced by that
>>component, since this avoids adding this label everywhere in the
>>pipeline where this component is used.
>>
>>In Cocoon's documentation sitemap, the "content" label is associated to
>>the document DTD, and this is _most often_ produced from files, hence
>>the label "content" on the "file" generator. Now this is only _most
>>often_ and not _always_ : todo.xml, faq.xml, changes.xml aren't in the
>>document DTD and need an initial transformation, and only after this
>>transformation can they have the "content" label, as in the
>>following extract from the doc sitemap :
>>
>>   <map:match pattern="body-todo.xml">
>>     <map:generate type="file-nolabel" src="xdocs/todo.xml"/>
>>     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
>>     <map:transform src="stylesheets/document2html.xsl"/>
>>     <map:serialize/>
>>   </map:match>
>>
>>Since "file" generator has a "content" label, it cannot be used here,
>>otherwise "todo2document.xsl" isn't applied when the view is requested.
>>So a solution was to define a new "file-nolabel" generator, which has
>>the *exact same definition* as the "file" generator but doesn't have the
>>label.
>>
>>The problem with this approach is that only 3 files in the whole docs
>>have this special requirement of an initial transformation, and this
>>requires creating a new component with all the associated overhead :
>>duplicate configuration, duplicate handler in the component selector and
>>duplicate object pool. As Vadim pointed out, do that in a shared Cocoon
>>installation where every user can have their own sitemap and you can buy
>>some more RAM !
>>
>>So my proposal was to allow a label defined at the component level to be
>>locally "subtracted". The above snippet would then become :
>>
>>   <map:match pattern="body-todo.xml">
>>     <map:generate type="file" src="xdocs/todo.xml" label="-content"/>
>>     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
>>     <map:transform src="stylesheets/document2html.xsl"/>
>>     <map:serialize/>
>>   </map:match>
>>
>>This avoids the overhead of declaring a new generator and clearly shows
>>that we have a local modification of the label globally attached to the
>>"file" generator.
>>
>>Is it clearer ? And if so, what do you think ?
>>
>
>Ok, perfectly clear.
>
>Now, please, tell me: why wasn't the other solution we proposed to this
>problem (that is: exit on the 'last' content view, not the first one)
>accepted ? I still think it's the most elegant solution.
>
>Sure it is harder to implement, but we never let our designs be forced by
>implementation difficulties and I don't see why we should start now.
>
Let me repeat the explanations I gave to Volker Schmitt, with some 
more details. The sitemap below will be used to illustrate them 
(high-level structural elements skipped for simplicity) :

<map:view name="content" from-label="content">
  <map:transform src="content2html.xsl"/>
</map:view>

<map:resource name="foo">
  <map:transform src="foo.xsl" label="content"/>
  <map:serialize/>
</map:resource>

<map:resource name="bar">
  <map:act type="updatedatabase"/>
  <map:transform src="bar.xsl"/>
  <map:serialize/>
</map:resource>

<map:pipeline>
  <map:match pattern="foobar">
    <map:generate src="foobar.xml" label="content"/>
    <map:act type="findtype">
      <map:call resource="{type}"/>
    </map:act>
  </map:match>
</map:pipeline>

                            ---oOo---

From a user's point of view, knowing when branching will occur can become 
a nightmare, since you have to crawl all the branches that can participate in 
handling a request (matchers, selectors, actions, resource calls, etc) 
in search of this last label.

In the above sitemap, there's a "content" label on the generator, but 
the "foo" resource also has this label. If the last label is used, you 
cannot know by reading the "foobar" pipeline if the view will start at 
the <map:generate> or not. You have to examine all possible branches 
(and in the above case, they're dynamic) to find other places where the 
same label is used.

Using the first label makes the behaviour more predictable : if a 
labelled statement in the sitemap is reached, then we *know* that the 
view starts at this statement.
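
A minimal sketch of the two policies, written in Python rather than sitemap syntax. It assumes a request can be modelled as the ordered list of components actually reached, each with its set of labels. The function `branch_index` and the two run lists are hypothetical names used only for illustration; they are not Cocoon APIs.

```python
def branch_index(executed, label, policy):
    """Return the index of the component whose output feeds the view, or None."""
    hits = [i for i, (_, labels) in enumerate(executed) if label in labels]
    if not hits:
        return None
    return hits[0] if policy == "first" else hits[-1]


# The "foobar" pipeline when the findtype action selects the "foo" resource.
foo_run = [
    ("generate foobar.xml", {"content"}),
    ("transform foo.xsl", {"content"}),
    ("serialize", set()),
]

# The same pipeline when the action selects the "bar" resource.
bar_run = [
    ("generate foobar.xml", {"content"}),
    ("transform bar.xsl", set()),
    ("serialize", set()),
]

first_foo = branch_index(foo_run, "content", "first")
last_foo = branch_index(foo_run, "content", "last")
first_bar = branch_index(bar_run, "content", "first")
last_bar = branch_index(bar_run, "content", "last")
```

Under the first-label policy both requests branch at the generator, whichever resource the action selects. Under the last-label policy the "foo" request branches at foo.xsl, which cannot be seen by reading the "foobar" pipeline alone.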

                            ---oOo---

Implementation will be difficult, as it requires the whole regular 
pipeline (the view-less one) to be built before deciding at which point 
branching should occur.

I agree that specs shouldn't be constrained by implementation details. 
However, we must be aware that this requires some big changes in the 
existing pipeline architecture to "break" the regular pipeline at a 
point. But the important point here is that we need to fully build the 
regular pipeline to know the branching point (see below).

                            ---oOo---

Corollary to the previous point, building the regular pipeline may have 
some side effects (e.g. actions) _after_ the branching label, but we 
cannot know beforehand that these actions shouldn't have been executed 
because they're not in the view.

This is illustrated in the above sitemap : the "bar" resource has an 
action that modifies the system state, but since there is no "content" 
label in the "bar" resource, the view starts from the generator, that is 
*before* the action in the sitemap flow. Should this action be executed 
when the view is requested ?

This leads to an interesting question. In "retuning sitemap design" (see 
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101057440717758&w=2), 
you classify sitemap statements in two main categories : direct 
components (generators, transformers, etc) and indirect components 
(matchers, selectors, actions, etc). How does view handling relate to 
this classification ?

In the current definition of views, there is no difference between 
direct and indirect components, and the sitemap is executed up to a 
matching label that causes a jump to the view. Components after the view 
label, both direct and indirect, aren't executed.

If we change the behaviour and branch from the last label, this means 
that *all* indirect components of the regular pipeline are to be 
executed because they perform the routing in the sitemap and their 
execution is therefore required to find the last label.

Is this really what we want ? My opinion is no : this would make 
understanding views and predicting their behaviour really difficult. Views 
are a powerful concept and many users already have difficulty 
mastering them. Turning them into black magic won't promote their use.
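
The difference between the two policies for a request routed to the "bar" resource can be sketched as follows. This is an illustrative Python model, not Cocoon code: the `run` function and the flattened list of steps are assumptions, with actions recorded when they are reached and traversal stopping at the first matching label only under the "first" policy.

```python
def run(steps, label, policy):
    """Walk the steps in order; return (branch step name, actions executed)."""
    executed_actions = []
    branch = None
    for kind, name, labels in steps:
        if kind == "action":
            executed_actions.append(name)
        elif label in labels:
            branch = name
            if policy == "first":
                break  # jump to the view; nothing after this step runs
    return branch, executed_actions


# Steps reached for a "foobar" request that the findtype action routes to "bar".
bar_request = [
    ("component", "generate foobar.xml", {"content"}),
    ("action", "findtype", set()),
    ("action", "updatedatabase", set()),
    ("component", "transform bar.xsl", set()),
]

first = run(bar_request, "content", "first")
last = run(bar_request, "content", "last")
```

Both policies branch at the generator here, but only the last-label policy has run "updatedatabase" by the time the branching point is known.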

                            ---oOo---

Conclusion (thanks to those who have read all of the above ;)

I consider the label-on-component feature a writing facility to avoid 
tedious repetition in the pipelines. The documentation sitemap (with 
"file" / "file-nolabel") clearly shows that views are attached to a 
particular DTD that exists at some places in the pipelines, and that 
attaching their labels to general-purpose components like the resource 
generator may not be a good thing : 80% of the uses of that generator 
produce the correct DTD, but we need to be able to handle the remaining 
20% without sacrificing the writing facility. That's why I suggested 
these "subtractive labels" to avoid the declaration of a new component 
and the associated overhead.

Also, I didn't find in the archive the reasons for this "move to last 
label" todo. And I wouldn't be happy if this was proposed as a 
workaround for the label-on-component problem. We should not constrain 
the definition of views by the bad side-effects of a writing facility.

                            ---oOo---

>>And please don't forget my other post about views in aggregation :)
>>
>
>Sorry, I thought that was sorted out: what's the problem again (I think
>I missed it previously).
>

One-click reminder ;)
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101683844805545&w=2

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: Substractive view labels (long)

Posted by Stefano Mazzocchi <st...@apache.org>.

Sylvain Wallez wrote:

> Stefano, the above "outside-in" sentence shed a completely new light on
> my understanding of views, and I'm now convinced that views should start
> from the last label !

That's great!
 
> The point that made me wait before answering was about the execution of
> indirect components and their possible side-effects (actions). I now
> think that side effects of indirect components are part of a pipeline
> semantics. And views should not change this semantic, but only provide a
> different mmmh... view ;) on the pipeline result.

Precisely :) the name 'view' in fact, comes exactly from this thought!

> So branching on a view should not change the actions taken in the
> pipeline. If we need different side-effects, then we should consider
> having different pipelines. What do you think ?

Absolutely! in fact, the design pattern I use for 'is this a view or
another resource' is something along the lines of:

 if this 'data I need' can be applied to an entire collection of 
 resources, I need a  view, otherwise, I need another resource.

> Ah, and since you gave us so brilliant explanations, would you please
> (finally ;) consider answering my other question about views and
> aggregation ? 

I'll do that right after this.

> In short, the question is about the behaviour associated
> to <map:part> labels ? Should they cause view branching, or only filter
> those parts that belong to the view, branching being decided by the
> labels on <map:aggregate>. My (current) opinion is filtering, but you
> may again have an explanation that will make me change my mind.
> 
> Again, see
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101683844805545 for
> more details on this.
> 
> Thanks for being such a great mind and sharing it with us.

I'm no great mind, Sylvain, I'm just overly concerned by esthetic
elegance :)

And believe me, here it is considered a value, in some other realms
surely it is not :/

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------




Re: Substractive view labels (long)

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Stefano Mazzocchi wrote:

<snip-long-discussion>

>When I was designing the view concept, I was looking at resources from
>the 'user-agent' perspective and wanted to provide a way for them to
>'unlock' the resources and have access to specific 'views' of them.
>
>This looking into the resource from the outside (looking into the
>serializer, so to speak), triggered the solution of having the pipeline
>stop at the 'last' encountered label, which is the "first" label that the
>'user-agent' would encounter if it was scanning the pipe from the
>outside-in.
>
>Why so?
>
>Well, the server uses the pipelines to *augment* the information and
>shape it until it's ready for consumption (in case of POST/PUT requests)
>or production (in case of GET/HEAD requests).
>
>The 'user-agent' looks at the pipelines from the outside-in and wants to
>connect to 'different' or less processed 'pipeline stages'.
>
>With this vision in mind, it looks very elegant to provide this 'exit
>from last label' behavior, because it is the behavior that a 'user-agent'
>would expect from decomposing the pipeline from the outside looking in.
>

Stefano, the above "outside-in" sentence shed a completely new light on 
my understanding of views, and I'm now convinced that views should start 
from the last label !

The point that made me wait before answering was about the execution of 
indirect components and their possible side-effects (actions). I now 
think that the side effects of indirect components are part of a pipeline's 
semantics, and views should not change these semantics, but only provide a 
different mmmh... view ;) on the pipeline result.

So branching on a view should not change the actions taken in the 
pipeline. If we need different side-effects, then we should consider 
having different pipelines. What do you think ?

Ah, and since you gave us such brilliant explanations, would you please 
(finally ;) consider answering my other question about views and 
aggregation ? In short, the question is about the behaviour associated 
with <map:part> labels : should they cause view branching, or only filter 
those parts that belong to the view, branching being decided by the 
labels on <map:aggregate> ? My (current) opinion is filtering, but you 
may again have an explanation that will make me change my mind.

Again, see 
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101683844805545 for 
more details on this.

Thanks for being such a great mind and sharing it with us.

Sylvain

-- 
Sylvain Wallez
 Anyware Technologies                  Apache Cocoon
 http://www.anyware-tech.com           mailto:sylvain@apache.org


Re: Substractive view labels (long)

Posted by Stefano Mazzocchi <st...@apache.org>.

Sylvain Wallez wrote:

> Let me put back the explanations I gave to Volker Schmitt, with some
> more details. The below sitemap will be used for these details
> (high-level structural elements skipped for simplicity) :
> 
> <map:view name="content" from-label="content">
>   <map:transform src="content2html.xsl"/>

you missed a <map:serialize> here, or am I wrong?

> </map:view>
> 
> <map:resource name="foo">
>   <map:transform src="foo.xsl" label="content"/>
>   <map:serialize/>
> </map:resource>
> 
> <map:resource name="bar">
>   <map:act type="updatedatabase"/>
>   <map:transform src="bar.xsl"/>
>   <map:serialize/>
> </map:resource>
> 
> <map:pipeline>
>   <map:match pattern="foobar">
>     <map:generate src="foobar.xml" label="content"/>
>     <map:act type="findtype">
>       <map:call resource="{type}"/>
>     </map:act>
>   </map:match>
> </map:pipeline>
> 
>                             ---oOo---
> 
>  From a user point of view, knowing when branching will occur can become
> a nightmare since you have to crawl all branches that can participate in
> a request handling (matchers, selectors, actions, resource calls, etc)
> in search for this last label.

The same is already true for views connected to the 'last' component,
since you don't know which component that is until the pipeline has been
executed completely.

> In the above sitemap, there's a "content" label on the generator, but
> the "foo" resource also has this label. If the last label is used, you
> cannot know by reading the "foobar" pipeline if the view will start at
> the <map:generate> or not. You have to examine all possible branches
> (and in the above case, they're dynamic) to find other places where the
> same label is used.
> 
> Using the first label makes the behaviour more predictible : if a
> labelled statement in the sitemap is reached, then we *know* that the
> view starts at this statement.

Granted, although we all agree (as you rightly point out below) that
this behavior leads to 'inelegant' uses of the sitemap semantics with
the creation of different components (and their pools) just because of
different view behaviors.

>                             ---oOo---
> 
> Implementation will be difficult, as it requires the whole regular
> pipeline (the view-less one) to be built before deciding at which point
> should occur branching.
> 
> I agree that specs shouldn't be constrained by implementations details.
> However, we must be aware that this requires some big changes in the
> existing pipeline architecture to "break" the regular pipeline at a
> point. But the important point here is that we need to fully build the
> regular pipeline to know the branching point (see below).

Ok

>                             ---oOo---
> 
> Corollary to the previous point, building the regular pipeline may have
> some side effects (e.g. actions) _after_ the branching label, but we
> cannot know beforehand that these actions shouldn't have been executed
> because they're not in the view.

Ok

> This is illustrated in the above sitemap : the "bar" resource has an
> action that modifies the system state, but since there is no "content"
> label in the "bar" resource, the view starts from the generator, that is
> *before* the action in the sitemap flow. Should this action be executed
> when the view is requested ?

Yes, it must be, in order to determine "which" label to exit from. Even
with 'subtractive view labels' implemented, it is perfectly legal to
have something like the above (with no explicit subtraction), meaning
that if the action returns 'bar', the 'content' view is associated with
the first generator, but if the action returns 'foo', the 'content' view
is associated with the transformer of the 'foo' resource (see more on
this below).

> This leads to an interesting question. In "retuning sitemap design" (see
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101057440717758&w=2),
> you classify sitemap statements in two main categories : direct
> components (generators, transfomers, etc) and indirect components
> (matchers, selectors, actions, etc). How does view handling relate to
> this classification ?

Views are orthogonal to pipelines, they aren't designed to stop the
execution of the pipeline somewhere for performance reasons, but just to
provide access to internal points of the pipeline in an explicit and
well-determined way.

> In the current definition of views, there is no difference between
> direct and indirect components, and the sitemap is executed up to a
> matching label that causes a jump to the view. Components after the view
> label, both direct and indirect, aren't executed.

Eh, I know this: in fact the original design of the views was supposed
to stop execution at the last encountered label, not the first one!

> If we change the behaviour and branch from the last label, this means
> that *all* indirect components of the regular pipeline are to be
> executed because they perform the routing in the sitemap and there
> execution is therefore required to find the last label.
> 
> Is this really what we want?

Yes, I see no way around this.

> My opinion is no : this would make
> understanding views and predict there behaviour really difficult. 

Yes, this is a valid concern, but we must make sure to implement the
best solution for the problem and I personally believe that the 'exit
from last label' behavior is the cleanest way to implement this, even if
this, admittedly, makes view behavior less explicit.

> Views
> are a powerful concept and many users already have difficulties to
> master them. Turning them to black magic won't promote their use.

Absolutely, but view semantics in the sitemap were not intended to be
verbose and explicit, but rather implicit, so that view behavior could
be easily inherited by subsitemaps (see below).

>                             ---oOo---
> 
> Conclusion (thanks for those who have read all of the above ;)
> 
> I consider the label-on-component feature a writing facility to avoid
> tedious repetition in the pipelines. 

Yes, it was placed there for that reason and for more (again, see
below).

> The documentation sitemap (with
> "file" / "file-nolabel") clearly shows that views are attached to a
> particular DTD that exists at some places in the pipelines, and that
> attaching their labels to general-purpose components like the resource
> generator may not be a good thing : 80% of the uses of that generator
> produce the correct DTD, but we need to be able to handle the remaining
> 20% without sacrificing the writing facility. That's why I suggested
> these "substractive labels" to avoid the declaration of a new component
> and the associated overhead.
> 
> Also, I didn't find in the archive the reasons for this "move to last
> label" todo. And I wouldn't be happy if this was proposed as a
> workaround for the label-on-component problem. We should not constrain
> the definition of views by the bad side-effects of a writing facility.

Ok, these are all great points and must be addressed in full detail.

                             ---oOo---

Views, by design, must apply to entire collections of pipelines. They
must be general enough to have a significant meaning if projected on top
of every pipeline.

For this reason, Views are defined *externally* from the pipelines and
can be seen as 'generator-less' pipelines, where the role of the generator
is played by the part of the pipeline on top of which the view is projected.

It is the view's responsibility to indicate *from where* the view should
connect to the pipeline. There are two different ways for the view to
indicate the "exit point" of the pipeline:

 1) positional: 

       first -> right after the generator
       last -> right before the serializer

 2) indirect (labelled):

       from a first occurrence of the specified label

the 'first' positional is the easiest to implement, but it's useful only
on trivial cases.

the 'last' positional is much more important and requires the entire
pipeline to be executed, but it can be considered as a 'serializer'
substitution by another more complex serializer (the view's
generator-less pipeline).

the 'labelled' indirect location connects the input of the view (which
globally is a consumer of SAX events) to the output of the component
which has that label, either explicitly written, or inherited from the
component definition (either in the current sitemap, or in the closest
sitemap parent).

If a pipeline has one and only one label (either explicit or implicit),
there is no problem.

We must address the cases where more than one component has the exact
same label.
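
The exit-point rules above can be sketched as follows. This is a rough Python model under stated assumptions: `COMPONENT_LABELS`, `effective_labels`, `positional_exit` and `labelled_candidates` are hypothetical names, the "xslt" type name is a placeholder for the default transformer, and the pipeline is reduced to its generator and transformers because the view takes the place of the serializer.

```python
# Hypothetical component declarations: type name -> labels attached when the
# component is declared (the "implicit" labels).
COMPONENT_LABELS = {"file": {"content"}, "xslt": set()}


def effective_labels(step):
    """Labels written on the pipeline step plus those inherited from its type."""
    return set(step.get("labels", ())) | COMPONENT_LABELS.get(step["type"], set())


def positional_exit(pipeline, position):
    """'first' is right after the generator, 'last' right before the serializer."""
    return 0 if position == "first" else len(pipeline) - 1


def labelled_candidates(pipeline, label):
    """Every step whose output could feed a view declared with from-label."""
    return [i for i, step in enumerate(pipeline) if label in effective_labels(step)]


# body-todo.xml written with the plain "file" generator.
body_todo = [
    {"type": "file", "src": "xdocs/todo.xml"},
    {"type": "xslt", "src": "stylesheets/todo2document.xsl", "labels": ["content"]},
    {"type": "xslt", "src": "stylesheets/document2html.xsl"},
]

first_pos = positional_exit(body_todo, "first")
last_pos = positional_exit(body_todo, "last")
candidates = labelled_candidates(body_todo, "content")
```

Here `candidates` holds two indexes, the generator (implicit label) and todo2document.xsl (explicit label), which is exactly the multi-label case that has to be resolved one way or the other.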

                             ---oOo---

If a view is connected to a label, the pipeline must have these labels
attached by the pipeline writer since it's not the view's responsibility,
in this case, to provide a specific positional element (it doesn't make
sense to have a positional view call to the 'second' or 'third'
component in the pipeline!).

The choice I made in the original design was to attach those labels
implicitly to the components at instantiation time. This made the system
more implicit and views harder to understand from the sitemap directly,
but made the system much less verbose and view properties easier for
subsitemaps to 'inherit'.

In fact, this was the main reason rather than verbosity: I was afraid of
people *not getting* the view concept at first, so I made it possible
for them to 'inherit' their capabilities without having to write
anything more than what they were writing.

And I think this worked very well, especially since I was able to
implement the command line functionality without even letting people
know I was using specific 'views' on top of their samples (and they
didn't even know what views were).

                             ---oOo---

When I was designing the view concept, I was looking at resources from
the 'user-agent' perspective and wanted to provide a way for them to
'unlock' the resources and have access to specific 'views' of them.

Looking into the resource from the outside like this (looking into the
serializer, so to speak) triggered the solution of having the pipeline
stop at the 'last' encountered label, which is the "first" label that the
'user-agent' would encounter if it were scanning the pipe from the
outside-in.

Why so?

Well, the server uses the pipelines to *augment* the information and
shape it until it's ready for consumption (in case of POST/PUT requests)
or production (in case of GET/HEAD requests).

The 'user-agent' looks at the pipelines from the outside-in and wants to
connect to 'different' or less processed 'pipeline stages'.

With this vision in mind, it looks very elegant to provide this 'exit
from last label' behavior, because it is the behavior that a 'user-agent'
would expect from decomposing the pipeline from the outside looking in.
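
The equivalence between "last label in pipeline order" and "first label met from the outside in" can be sketched in a few lines of Python. Both function names and the stage list are hypothetical, for illustration only.

```python
def exit_from_outside_in(stages, label):
    """Scan from the serializer side back towards the generator."""
    for index in range(len(stages) - 1, -1, -1):
        _, labels = stages[index]
        if label in labels:
            return index
    return None


def last_label(stages, label):
    """Scan in pipeline order and keep the last stage carrying the label."""
    hits = [i for i, (_, labels) in enumerate(stages) if label in labels]
    return hits[-1] if hits else None


stages = [
    ("generate todo.xml", {"content"}),
    ("transform todo2document.xsl", {"content"}),
    ("transform document2html.xsl", set()),
]

outside_in = exit_from_outside_in(stages, "content")
in_order = last_label(stages, "content")
```

Both scans select todo2document.xsl, the stage a 'user-agent' would reach first when peeling the pipeline back from the serializer.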

                             ---oOo---

I completely understand that implementing this is harder and somewhat
less efficient than the 'exit from first label' behavior, but we all
agree that the latter behavior is a bad one and must be eliminated.

So, there are two proposals so far on the table:

 1) Stefano's "exit from last label"
 2) Sylvain's "subtractive labels"

My proposal doesn't require any change in the sitemap semantics, while
Sylvain's requires pipelines to explicitly 'remove' labels that are
implicitly defined at the component definition level.

So, taking the current example:

   <map:match pattern="body-todo.xml">
     <map:generate type="file-nolabel" src="xdocs/todo.xml"/>
     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
     <map:transform src="stylesheets/document2html.xsl"/>
     <map:serialize/>
   </map:match>

which uses the ugly hack of having a label-less generator type, in my
proposal the above would be rewritten as:

   <map:match pattern="body-todo.xml">
     <map:generate type="file" src="xdocs/todo.xml"/>
     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
     <map:transform src="stylesheets/document2html.xsl"/>
     <map:serialize/>
   </map:match>

while in Sylvain's it would be rewritten as:

   <map:match pattern="body-todo.xml">
     <map:generate type="file" src="xdocs/todo.xml" label="-content"/>
     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
     <map:transform src="stylesheets/document2html.xsl"/>
     <map:serialize/>
   </map:match>

Sylvain's point is that 'subtractive labels' are easier to understand
since they are more explicit and their implementation is easier because
those labels indicate to the sitemap engine what to do.

Let me rewrite the above sample:

 <map:resource name="foo">
   <map:transform src="foo.xsl" label="content"/>
   <map:serialize/>
 </map:resource>

 <map:resource name="bar">
   <map:act type="updatedatabase"/>
   <map:transform src="bar.xsl"/>
   <map:serialize/>
 </map:resource>

 <map:pipeline>
   <map:match pattern="foobar">
     <map:generate src="foobar.xml"/>
     <map:act type="findtype">
       <map:call resource="{type}"/>
     </map:act>
   </map:match>
 </map:pipeline>

where I removed the explicit label 'content' from the pipeline
generator.

Here, the *wanted* behavior is to have the 'content' view associated to
the generator for the 'bar' type and to the transformer for the 'foo'
type.

If we explicitly subtract the label from the generator, the behavior for
the 'bar' type becomes undefined.

This is to show that 'subtractive labels' might do more harm than
good in those admittedly complex examples where understanding view
behavior is already hard due to the implicit-ness of labels.
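
A small Python sketch of this case, under stated assumptions: the generator inherits an implicit 'content' label from its declaration, a `label` attribute is a whitespace-separated list of tokens, and a leading "-" removes a label as in Sylvain's proposal. The parsing rule and all function names are hypothetical, not Cocoon behavior.

```python
def effective_labels(component_labels, label_attr=""):
    """Apply the tokens of a label attribute on top of the implicit labels."""
    labels = set(component_labels)
    for token in label_attr.split():
        if token.startswith("-"):
            labels.discard(token[1:])  # subtractive label
        else:
            labels.add(token)
    return labels


def last_labelled(run, label):
    """Index of the last step carrying the label, or None if no step does."""
    hits = [
        i for i, (_, implicit, attr) in enumerate(run)
        if label in effective_labels(implicit, attr)
    ]
    return hits[-1] if hits else None


FILE_LABELS = {"content"}  # implicit label on the generator's declaration


def foobar_run(resource, generator_attr):
    """Steps reached for a 'foobar' request routed to the given resource."""
    if resource == "foo":
        transform = ("transform foo.xsl", set(), "content")
    else:
        transform = ("transform bar.xsl", set(), "")
    return [("generate foobar.xml", FILE_LABELS, generator_attr), transform]


foo_plain = last_labelled(foobar_run("foo", ""), "content")
bar_plain = last_labelled(foobar_run("bar", ""), "content")
foo_subtracted = last_labelled(foobar_run("foo", "-content"), "content")
bar_subtracted = last_labelled(foobar_run("bar", "-content"), "content")
```

Without subtraction, 'foo' exits at the transformer and 'bar' at the generator, which is the wanted behavior. With `-content` on the generator, 'foo' is unchanged but 'bar' has no labelled step left, so the view has nowhere to connect.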

So, in the end, I don't think that my proposal is an attempt to defend a
poor choice of labelling, but rather the way the view system was
designed from day one (but was not implemented because of technical
difficulties, the same ones that Sylvain is facing right now, and I
totally agree they are rather complex).

But I don't think the proposed subtractive labels make the system any
more elegant and the implementation any less hard.

But, of course, this is only my very personal perception of the problem
so I'll be very interested in seeing what others think about this.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


RE: Substractive view labels (long)

Posted by Vadim Gritsenko <va...@verizon.net>.

> From: Sylvain Wallez [mailto:sylvain.wallez@anyware-tech.com]

... 

> Corollary to the previous point, building the regular pipeline may have
> some side effects (e.g. actions) _after_ the branching label, but we
> cannot know beforehand that these actions shouldn't have been executed
> because they're not in the view.

I'm convinced that views should start with the first label, especially
after reading the paragraph above.


Vadim

...

