Posted to dev@cocoon.apache.org by Nicola Ken Barozzi <ba...@nicolaken.com> on 2002/02/15 03:45:22 UTC

[RT] Cocoon subcomponent object model (was: Re: is cocoon symmetry a holy grail?)

From: "Vadim Gritsenko" <va...@verizon.net>

> I have another one, it provides different functionality but it features
> similar approach. As I don't have a name for this (multiplexer?), here
> is the diagram:
>
>                   - pipeline1 -
>                  /              \
> request -> A -> X - pipeline2 - X -> C -> response
>                  \              /
>                   - pipelineN -
>
> Explanation:
> 1. Request goes in
> 2. Pipeline is being constructed from A, X, C
> 3. SAX events passed from the A to X, where they are dispatched (same as
> separator) to several other pipelines
> 4. SAX events passed from these events reassembled into the one SAX
> stream by the same instance of X component
> 5. Result passed down the original pipeline to the C
> 6. C spits out the response

Oh my, I saw this at ApacheCon more than a year ago; we knew it was going
to come up again! ;-)

The first comment that comes to mind is that, IMHO, to have better
performance you need good control over what is happening, and that leads to
KISS. The whole concept of making pipelines split, recombine and branch
could make it difficult to maintain control.

But the concept is intriguing. IMHO it could be transformed into another
concept, a sub-component object model.

We have been seeing the picture from a sitemap POV, but never talked about
helping the developer in writing the components themselves.

So, since it's 3:26 and I can't get to sleep, here's my first RT.

===============================
Cocoon Sub-Component Object Model
===============================

Abstract
-------------

This RT describes a finer grained object model for Cocoon that is meant
to attain a better separation of concerns and usability.

Description
-----------------

Cocoon has a macro object model based on the pipeline metaphor.

Each Cocoon "object" is a pipeline component and can be of three major
kinds:
1. Generator: initiates the XML pipeline by converting generic data into XML.
2. Transformer: filters the XML events.
3. Serializer: converts the resulting XML into something useful for the
client.
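
To make the three roles concrete, here is a minimal sketch in plain SAX. It is
not Cocoon code: the class names (PipelineSketch, UpperCaseTransformer,
TextSerializer) are invented for illustration, and the wiring done by hand in
run() stands in for what the sitemap would declare.

```java
import java.util.List;
import java.util.Locale;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Illustrative only: these names are made up and are not Cocoon's interfaces.
class PipelineSketch {

    /** Generator: converts generic data (a list of words) into SAX events. */
    static void generate(List<String> words, ContentHandler next) throws SAXException {
        Attributes none = new AttributesImpl();
        next.startDocument();
        next.startElement("", "words", "words", none);
        for (String w : words) {
            next.startElement("", "word", "word", none);
            next.characters(w.toCharArray(), 0, w.length());
            next.endElement("", "word", "word");
        }
        next.endElement("", "words", "words");
        next.endDocument();
    }

    /** Transformer: filters the events, here by upper-casing character data. */
    static class UpperCaseTransformer extends XMLFilterImpl {
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            char[] upper = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
            super.characters(upper, 0, upper.length); // forward to the next component
        }
    }

    /** Serializer: turns the resulting events into text for the client. */
    static class TextSerializer extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            out.append('<').append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(qName).append('>');
        }
    }

    /** Wires generator -> transformer -> serializer by hand. */
    static String run(List<String> words) {
        UpperCaseTransformer transformer = new UpperCaseTransformer();
        TextSerializer serializer = new TextSerializer();
        transformer.setContentHandler(serializer);
        try {
            generate(words, transformer);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return serializer.out.toString();
    }
}
```

Note that the transformer sees every event that flows through it, which is the
coarse granularity discussed in the rest of this RT.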

The pipelines are defined in a sitemap that specifies the order, parameters
and conditions of pipeline components.

This componentization is useful because it enforces separation of concerns
between content providers, graphic-layout designers, developers and site
administrators.

Cocoon1 made life easy for the first two and quite hard for the last, who
had the data he is responsible for scattered in all three kinds of
components. The sitemap of Cocoon2 changed this and put things where they
belong.

My opinion is that developers are not yet properly taken into account.
While the other three have a componentization which is sufficient for their
part of the work, developers suffer from the lack of one. Usually a developer
has to write a component, and doesn't have a (sub) component model to deal
with.

Ok, that's not really true: there are XSPs.
But in many respects they are not sufficient:
- XSPs are hard to write.
- XSPs mix (declarative) XML and (procedural) Java in an unmaintainable and
  undebuggable tangle.
- XSPs cannot help with writing transformers.
- XSPs must have their main tag.
- XSPs do not automagically scale well (no automatic pooling or brokering).
- XSPs have slooow startup and are not good for dynamic pages that change
  often.
- XSPs are a nightmare to debug (just try ;-) ).
- XSPs have the 64k limit.
- XSP taglibs are hard to understand, write and maintain.

Also, Cocoon components have no scope: each one sees and filters all incoming
events (security: I don't want sensitive tags passing through a transformer
that is useful but not completely known).

Cocoon doesn't have context scoping for session or global values.

As you can see, these remarks are not few, but IMHO they all come from
two simple shortcomings of Cocoon:
- The coexistence of Java and XML is a key problem.
- The current component model is too coarse-grained to help pipeline
component writers.

A finer grained object model could also have the notion of context
variables.

None of this has anything to do with, or necessarily endangers, the
existence of:
- XSP syntax.
- The current level of object abstractions for other roles.

How can we solve this?
Here are some possibilities.

A----------------------------
First we have to slightly change the notion of Cocoon pipeline components by
introducing scope.
Pipeline components need not access <all> SAX events, only those that pertain
to them. This also means that the pipeline could eventually be evaluated in
parallel, improving scalability for processor-intensive or high-latency
pages.

For example, let's say that we have this XML:
<page>
<longquery name="account"/>
<query name="username"/>
</page>
Let's say that another file (the developer's sitemap) states that
query tags must be processed by foo.sql.QueryTransformer and
longquery tags by acme.sql.BankTransformer.
As the SAX events arrive, the start page tag is sent directly to the
serializer.
Then acme.sql.BankTransformer is given only the longquery tag and starts
processing in a non-blocking fashion.
This means that SAX events can keep flowing and, in parallel,
foo.sql.QueryTransformer can start processing its tag.
Now the pipeline has to wait for the first transformer to finish, because
enclosing tags like page cannot be processed in a non-blocking fashion. When
both finish, their output events are emitted in order, followed by the final
page tag.

As you can see, transformers that take longer to run (for instance because of
the latency of DBs and the like) can be run this way in a non-blocking
fashion, speeding up total response time.
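
A rough sketch of the idea, assuming each scoped transformer can be modelled
as a function from one element to an XML fragment. Everything here is
hypothetical (ScopedDispatchSketch, Element, processPage are made-up names,
and the document is simplified to a flat list of children of page); it only
shows elements being dispatched by tag, processed concurrently, and
reassembled in document order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch: none of these names exist in Cocoon.
class ScopedDispatchSketch {

    /** One child of <page>, reduced to its tag name and "name" attribute. */
    static final class Element {
        final String tag;
        final String name;

        Element(String tag, String name) {
            this.tag = tag;
            this.name = name;
        }

        String asXml() {
            return "<" + tag + " name=\"" + name + "\"/>";
        }
    }

    /**
     * Gives each child only to the transformer registered for its tag, runs
     * the transformers concurrently, and emits the results in document order.
     */
    static String processPage(List<Element> children,
                              Map<String, Function<Element, String>> scoped) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (Element child : children) {
                Function<Element, String> transformer = scoped.get(child.tag);
                if (transformer == null) {
                    // Out of every transformer's scope: passes through untouched.
                    pending.add(CompletableFuture.completedFuture(child.asXml()));
                } else {
                    // In scope: start processing without blocking the event flow.
                    pending.add(CompletableFuture.supplyAsync(() -> transformer.apply(child), pool));
                }
            }
            // The enclosing page element can only be closed once every fragment is in.
            StringBuilder out = new StringBuilder("<page>");
            for (CompletableFuture<String> fragment : pending) {
                out.append(fragment.join());
            }
            return out.append("</page>").toString();
        } finally {
            pool.shutdown();
        }
    }
}
```

With a slow handler registered for longquery and a fast one for query, total
time should be close to that of the slowest handler rather than the sum, while
the output order stays that of the source document.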

B----------------------------
A global context-aware object broker could also be inserted in the scheme.
This doesn't really change the framework, it's just a useful addition.

C----------------------------
Now let's explain how a finer-grained object model can be devised.
First of all, it must be possible to specify a pipeline component as a sum
of smaller components, possibly just by writing XML-described "glue".
It's like:
pipeline component : pipeline = smaller component : pipeline component
Which basically means that these smaller components are a second level of
indirection with regard to the pipeline.

What guided the specification of the pipeline components?
The fact that they had to
- Interface XML with other streams.
- Transform XML.

Basically, they had to:
- detokenize
- make-change grammar
- retokenize

So it's all about interfacing generic streams to XML, so as to be able to
transform them the XML way with Transformers.
In our case it's about interfacing XML to Java, to be able to transform it
with Java objects (beans, EJBs, etc.).
This means that we could:
1. Convert XML tokens into something meaningful to Java: variables and data
structures.
2. Call Java methods on them to get results.
3. Convert the Java data structures back into XML tokens.

The great thing is that phase 1 is usually quite long and cumbersome to
write, but is essentially the same code over and over: the usual "if"s in the
SAX event handlers.
I think that a basic set of "(De)Tokenizers" can cover 95% of cases. A
commonly used one would, for example, store a variable with the same name as
the tag it is in when that tag has certain parents.
Phase 2 is where the real "coding" takes place.
(omitted)
Phase 3 is easy to write, and it's the only part of XSPs which really works.

As you can see, XSPs don't make phase 1 reusable and are cumbersome in
phase 2. This is because they mix the two into a single phase, putting Java
code directly on the page with the <xsp:logic> tag.

Here the separation is done by relegating the interaction of Java and XML to
the simple and reusable contract of (De)Tokenizers.
In this way the coding can be done in Java and simply mapped to XML with
reusable components.
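
As a very rough sketch of the three phases, assuming plain SAX and nothing
Cocoon-specific: the names below (DetokenizerSketch, ChildTextDetokenizer,
greet, retokenize) and the user/first/last document shape are all made up for
illustration. The detokenizer is the "store a variable named after the tag
when it has a certain parent" kind described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: none of these names exist in Cocoon.
class DetokenizerSketch {

    /** Phase 1: stores each element's text under the element's name, for children of `parent` only. */
    static class ChildTextDetokenizer extends DefaultHandler {
        final Map<String, String> variables = new HashMap<>();
        private final String parent;
        private final Deque<String> open = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();

        ChildTextDetokenizer(String parent) {
            this.parent = parent;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            open.push(qName);
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            open.pop();
            // After the pop, the top of the stack is the parent of the closed element.
            if (parent.equals(open.peek())) {
                variables.put(qName, text.toString().trim());
            }
            text.setLength(0);
        }
    }

    /** Phase 2: ordinary Java, with no XML in sight. */
    static String greet(String first, String last) {
        return "Hello, " + first + " " + last;
    }

    /** Phase 3: turns the Java result back into SAX events. */
    static void retokenize(String greeting, ContentHandler out) throws SAXException {
        out.startElement("", "greeting", "greeting", new AttributesImpl());
        out.characters(greeting.toCharArray(), 0, greeting.length());
        out.endElement("", "greeting", "greeting");
    }

    /** Runs the three phases over one document and feeds the result to `out`. */
    static void run(String xml, ContentHandler out) {
        try {
            ChildTextDetokenizer detokenizer = new ChildTextDetokenizer("user");
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), detokenizer);
            Map<String, String> vars = detokenizer.variables;
            retokenize(greet(vars.get("first"), vars.get("last")), out);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only greet() is specific to the task at hand; the detokenizer could be reused
for any component that needs the children of some element as variables, which
is the point of keeping the phases apart.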

Seen globally, the pipeline should work this way:
- (Generation) Tokenize and make SAX events from streams
- Filter events and dispatch to Transformers
- For each new Transformer (in parallel if necessary and requested):
  - De-tokenize events and convert to Java
  - Call methods
  - Retokenize
- Serialize SAX events to stream

An illustrative image is attached to this mail.

Stefano, could you please lend me your asbestos garments, you don't need
them anymore AFAIK ;-)

--
Nicola Ken Barozzi                 krysalis.org@nicolaken.com
            - verba volant, scripta manent -
   (discussions get forgotten, just code remains)
---------------------------------------------------------------------

Re: [RT] Cocoon subcomponent object model (was: Re: is cocoon symmetry a holy grail?)

Posted by Stefano Mazzocchi <st...@apache.org>.

Nicola Ken Barozzi wrote:

> > And this should be KISS?
> 
> Ok, so let me explain it from a sitemap view.
> 
> You can add a parameter parallel="true" to make queries run simultaneously.
> 
> Simple enough?

In order to use something you have to understand how it works.

For a cocoon pipeline, the concept is simple.

With the system you are proposing I see this simplicity greatly reduced,
no matter how simple the semantics that turn it on are.

> > > B----------------------------
> > > A global context-aware object broker could also be inserted in the scheme.
> > > This doesn't really change the framework, it's just a useful addition.
> >
> > Don't forget we can still use Alberto's X:Forge for this.
> 
> Where is it? <hint, hint, nudge, nudge ;-)>

Where it has always been.

http://opensource.bibop.it/

> > > C----------------------------
> > > Now let's explain how a finer-grained object model can be devised.
> > > First of all it must be capable of specifying a pipeline component as a sum
> > > of smaller components possibly only by writing XML described "glue".
> > > I's like:
> ...
> > What you propose is similar to X:Forge and to DXML that Ricardo was
> > working on (I say 'was' because I can't reach him anymore :( but was
> > much simpler:
> 
> Similar yes. I still remember the very interesting brainstorming I had with
> them on these.
> 
> > I'd rather attach X:Forge to Cocoon (at the generation level) than
> > having to write something so complex at the pipeline level.
> 
> X:Forge is ok for me, but where is it? Can you put it in the scratchpad?

You are the one proposing this, not me :)

Supply a patch :)
 
> > I think the tokenize/detokenize part are really far from my view of
> > keeping it simple and go into a deep mess.
> 
> Ok. Could be true.
> Only live code can be really discussed, we'll continue this at that time.

Ok

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Cocoon subcomponent object model (was: Re: is cocoon symmetry a holy grail?)

Posted by Nicola Ken Barozzi <ba...@nicolaken.com>.

From: "Stefano Mazzocchi" <st...@apache.org>

> Nicola Ken Barozzi wrote:
>
> > My opinion is that developers are not yet taken correctly into account.
> > While the other three have a componentization which is sufficient for their
> > part of work, developers suffer for the lack of it. Usually a developer has
> > to write a component, and doesn't have a (sub) component model to deal with.
>
> ??? I disagree, you can write avalon-aware sitemap components by hand
> (and many people do just as easy as they write servlets... or even
> better).

Yes, this is right, but I would also like a *sub*Component model.
Components to make Cocoon Components, so as to minimize the impedance
mismatch between Java and XML.
With Ricardo, we agreed that intermingling Java and XML is somehow bad, and
that they should be separated.
Generators don't do it, and have a fixed schema.

> > Also, Cocoon components do not have scope and filter all events coming
> > in (security: I don't want sensitive tags passing in a transformer that is
> > useful but not completely known).
>
> please, let's get real here: I *strongly* doubt you'll ever use a filter
> in your pipeline that you don't trust. Security is ok, but at this
> granularity becomes a nightmare (and a serious performance limitation)

I tend to agree, but I'm no sys-admin.
This concern came from a boss you know well... ;-)

> > A----------------------------
> > First we have to change slightly the notion of cocoon pipeline components
> > introducing scope.
> > Pipeline components need not access <all> SAX events but only what pertains
> > them. This also means that the pipeline coulde be evaluated eventually in
> > parallel
> > fashion, improving scalability in heavy processor intensive or high latency
> > pages.
> >
> > For example let's say that we have this XML:
> > <page>
> > <longquery name="account"/>
> > <query name="username"/>
> > <page>
> > Let's say that in another file (the developer's sitemap) is written that
> > query tags must be processed by the foo.sql.QueryTransformer and the
> > longquery tags by acme.sql.BankTransformer.
> > As SAX events come into action the start page tag is directly sent to the
> > serializer.
> > Then the acme.sql.BankTransformer is given only the longquery tag and starts
> > processing in a non-blocking fashion.
> > This means that SAX events can continue and parallely
> > foo.sql.QueryTransformer can start processing his tag.
> > Now the pipeline has to wait for the first transformer to finish because
> > embedded tags link page cannot be processen in non-blocking fashion. When
> > they finish their output events are outputted in order and finally the last
> > page tag.
> >
> > As you can see if there are transformers that take longer to perform (also
> > because of latency of DBs and likes) they can be performed this way in
> > non-blocking fashion, speeding up total response time.
>
> And this should be KISS?

Ok, so let me explain it from a sitemap view.

You can add a parameter parallel="true" to make queries run simultaneously.

Simple enough?

> > B----------------------------
> > A global context-aware object broker could also be inserted in the scheme.
> > This doesn't really change the framework, it's just a useful addition.
>
> Don't forget we can still use Alberto's X:Forge for this.

Where is it? <hint, hint, nudge, nudge ;-)>

> > C----------------------------
> > Now let's explain how a finer-grained object model can be devised.
> > First of all it must be capable of specifying a pipeline component as a sum
> > of smaller components possibly only by writing XML described "glue".
> > I's like:
...
> What you propose is similar to X:Forge and to DXML that Ricardo was
> working on (I say 'was' because I can't reach him anymore :( but was
> much simpler:

Similar yes. I still remember the very interesting brainstorming I had with
them on these.

> I'd rather attach X:Forge to Cocoon (at the generation level) than
> having to write something so complex at the pipeline level.

X:Forge is ok for me, but where is it? Can you put it in the scratchpad?

> I think the tokenize/detokenize part are really far from my view of
> keeping it simple and go into a deep mess.

Ok. Could be true.
Only live code can be really discussed, we'll continue this at that time.

--
Nicola Ken Barozzi                 krysalis.org@nicolaken.com
            - verba volant, scripta manent -
   (discussions get forgotten, just code remains)
---------------------------------------------------------------------



Re: [RT] Cocoon subcomponent object model (was: Re: is cocoon symmetry a holy grail?)

Posted by Stefano Mazzocchi <st...@apache.org>.

Sorry, took me a while to respond.

Nicola Ken Barozzi wrote:
> 
> From: "Vadim Gritsenko" <va...@verizon.net>
> 
> > I have another one, it provides different functionality but it features
> > similar approach. As I don't have a name for this (multiplexer?), here
> > is the diagram:
> >
> >                   - pipeline1 -
> >                  /              \
> > request -> A -> X - pipeline2 - X -> C -> response
> >                  \              /
> >                   - pipelineN -
> >
> > Explanation:
> > 1. Request goes in
> > 2. Pipeline is being constructed from A, X, C
> > 3. SAX events passed from the A to X, where they are dispatched (same as
> > separator) to several other pipelines
> > 4. SAX events passed from these events reassembled into the one SAX
> > stream by the same instance of X component
> > 5. Result passed down the original pipeline to the C
> > 6. C spits out the response
> 
> Oh my, I've seen this in ApacheCon two more that a year ago, we knew it was
> going to come out again! ;-)
> 
> The first comment that come to me is that IMHO, to have better performance
> you need to have good control over what is happening, and that leads to
> KISS. The whole concept of making pipelines split, recombine and branch
> could make it difficult to maintain control.

I agree here.
 
> But the concept is intriguing. IMHO it could be transformed in another
> concept, a sub-component object model.
> 
> We have been seeing the picture from a sitemap POV, but never talked about
> helping the developer in writing the components themselves.
> 
> So, since it's 3:26 and I can't get sleep, here's my first RT.

:)
 
> ===============================
> Cocoon Sub-Component Object Model
> ===============================
> 
> Abstract
> -------------
> 
> This RT describes a finer grained object model for Cocoon that is meant
> to attain a better separation of concerns and usability.
> 
> Description
> -----------------
> 
> Cocoon has a macro object model based on the pipeline metaphore.
> 
> Each Cocoon "object" is a pipeline component and can be of three major
> kinds:
> 1.Generator: initiates the XML pipeline by converting generic data into XML.
> 2.Transformer: filters the XML events.
> 3. Serializer converts the resulting XML into something useful for the
> client.
> 
> The pipelines are defined in a sitemap that specifies order, parameters and
> condition of pipeline components.
> 
> This componentization is useful because it enforces separation of concerns
> between content providers, graphic-layout designers, developers and site
> administrators.
> 
> Cocoon1 made life easy for the first two and quite hard for the last, who
> had the data he is responsible for scattered in all three kinds of
> components. The sitemap of Cocoon2 changed this and put things where they
> belong.
> 
> My opinion is that developers are not yet taken correctly into account.
> While the other three have a componentization which is sufficient for their
> part of work, developers suffer for the lack of it. Usually a developer has
> to write a component, and doesn't have a (sub) component model to deal with.

??? I disagree, you can write avalon-aware sitemap components by hand
(and many people do, just as easily as they write servlets... or even
better).
 
> Ok, it's not really true, there are XSPs.
> But in many respects there are not sufficient:
> XSPs are hard to write
> XSPs mix (declarative) XML and (procedural) Java in an unmaintainable and
> undebuggable tangle
> XSPs cannot aid writing transformers
> XSPs must have their main tag
> XSPs do not automagically scale well (no automantic pooling or brokering)
> XSPs have slooow startup and are not good for dynamic pages that change
> often
> XSPs are a nightmare to debug (just try ;-) )
> XSPs have the 64k limit
> XSP taglibs are hard to understand, write and maintain

I agree on all these limitations, but if you don't have varying XML content
in your pages (say it pretty much all comes from data stores), I'd
suggest you write your generators directly, where you have the nice
component model that Avalon gives you.
 
> Also, Cocoon components do not have scope and filter all events coming
> in (security: I don't want sensitive tags passing in a transformer that is
> useful but not completely known).

please, let's get real here: I *strongly* doubt you'll ever use a filter
in your pipeline that you don't trust. Security is ok, but at this
granularity becomes a nightmare (and a serious performance limitation)
 
> Cocoon doesn't have context scoping for session or global values.
> 
> As you can see these remarks are not in a small number, but come all from
> simple shortcomings of Cocoon IMHO:
> - The coexistence between Java and XML is a key problem.
> - The current component model is too coarse grained to help pipeline
> component writers.
> 
> A finer grained object model could also have the notion of context
> variables.
> 
> These have nothing to do with and do not endanger necessarily the existence
> of:
> - XSP syntax.
> - Current level of object abstractions for other roles.
> 
> How can we solve this?
> Here are some possibility
> 
> A----------------------------
> First we have to change slightly the notion of cocoon pipeline components
> introducing scope.
> Pipeline components need not access <all> SAX events but only what pertains
> them. This also means that the pipeline coulde be evaluated eventually in
> parallel
> fashion, improving scalability in heavy processor intensive or high latency
> pages.
> 
> For example let's say that we have this XML:
> <page>
> <longquery name="account"/>
> <query name="username"/>
> <page>
> Let's say that in another file (the developer's sitemap) is written that
> query tags must be processed by the foo.sql.QueryTransformer and the
> longquery tags by acme.sql.BankTransformer.
> As SAX events come into action the start page tag is directly sent to the
> serializer.
> Then the acme.sql.BankTransformer is given only the longquery tag and starts
> processing in a non-blocking fashion.
> This means that SAX events can continue and parallely
> foo.sql.QueryTransformer can start processing his tag.
> Now the pipeline has to wait for the first transformer to finish because
> embedded tags link page cannot be processen in non-blocking fashion. When
> they finish their output events are outputted in order and finally the last
> page tag.
> 
> As you can see if there are transformers that take longer to perform (also
> because of latency of DBs and likes) they can be performed this way in
> non-blocking fashion, speeding up total response time.

And this should be KISS?
 
> B----------------------------
> A global context-aware object broker could also be inserted in the scheme.
> This doesn't really change the framework, it's just a useful addition.

Don't forget we can still use Alberto's X:Forge for this.
 
> C----------------------------
> Now let's explain how a finer-grained object model can be devised.
> First of all it must be capable of specifying a pipeline component as a sum
> of smaller components possibly only by writing XML described "glue".
> I's like:
> pipeline component : pipeline = smaller component : pipeline component
> Which basically means that these smaller components are a second level of
> indirection with regerds to the pipeline.
> 
> What guided the specification of the pipeline components?
> The fact that they had to
> - Interface XML with other streams.
> - Transform XML.
> 
> Basically They had to
> - detokenize
> - make-change grammar
> - retokenize
> 
> So it's all about interfacing generic streams to XML so to be able to
> transform them the XML way with Transformers.
> In our case it's about interfacing XML to Java to be able to transform it
> with Java Objects (beans, EJB, etc.).
> This means that we could:
> 1 Change XML tokens with something meaningful to Java: variables and data
> structures.
> 2 Call Java methods on them to have results.
> 3 Retransform Java data structures into XML tokens.
> 
> The great thing is that phase 1 is usually quite long and cumbersome to
> write but is essentially the same code over and over, the usual "if"s in the
> SAX event handlers.
> I think that a basic set of "(De)Tokenizers" can be used in 95% of cases. A
> very used one would for example store a variable with the same name of the
> tag it's in when it has certain parents.
> Phase 2 is where the real "coding" takes place.
> (omissis)
> Phase 3 is easy to write, and it's the only part of XSPs which really works.
> 
> As you can see XSPs don't have phase 1 reusable and are cumbersome with
> phase 2. This is because the mix them into a same phase, putting Java
> code directly on the page with the <xsp:logic> tag.
> 
> Here the separation is done by relegating the interaction of Java and XML to
> the simple and reusable contract of (De)Tokenizers.
> In this way the coding can be done in Java and simply mapped to XML with
> reusable components.
> 
> Seeing this globally the pipeline should work this way:
> - (Generation) Tokenize and make SAX events from streams
> - Filter events and dispatch to Transformers
> - For each new Transformer (in parallel if necessary and requested)
> - de-tokenize events and convert to Java.
> - Call methods
> - Retokenize
> - Serialize SAX events to stream
> 
> With this mail is also an illustrative image.
> 
> Stefano, could you please lend me your asbestos garments, you don't need
> them anymore AFAIK ;-)

What you propose is similar to X:Forge and to DXML that Ricardo was
working on (I say 'was' because I can't reach him anymore :( ) but was
much simpler:

I'd rather attach X:Forge to Cocoon (at the generation level) than
having to write something so complex at the pipeline level.

I think the tokenize/detokenize parts are really far from my view of
keeping it simple, and lead into a deep mess.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
