Posted to dev@cocoon.apache.org by Carsten Ziegeler <cz...@apache.org> on 2006/07/11 22:20:05 UTC

[2.2] New processor interface/approach

Some time ago we discussed at length that our core interface, the
Processor, is not the best interface we ever invented. The processor
should be the main entry point to the Cocoon processing engine.

As I need a simple way of adding my own processor in my project, I
started to create new interfaces and implementations which should make
the whole thing much easier (to use and implement).

Before I go into the details, please note that these changes do not
affect the usual Cocoon user at all. They should just clean up the core,
which projects hardly ever use directly. With changes like these, that
will hopefully change.

I committed the new stuff as a prototype to the whiteboard
http://svn.apache.org/repos/asf/cocoon/whiteboard/processor/

The code is not tested yet, as I would like to discuss things first
before implementing them completely.

OK, the new processor interface is very simple: it has a single
method (process) which gets an HttpServletRequest and an
HttpServletResponse. That's it - this makes integrating Cocoon in any
other web environment very easy and I can't think of any simpler
interface :) Basically, integrating Cocoon in another framework is then:
a) get the Cocoon Spring bean container from the servlet context, b) get
the processor bean from the Spring container and c) invoke the processor.

There might be the need to receive sax events instead of getting the
generated content from cocoon in a stream. So you can pass in a sub
interface of HttpServletResponse - the SAXAwareHttpServletResponse - and
in this case the processor sends sax events to the response instead of
writing to the output stream - I'm not sure if this makes sense, it was
just an idea.
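
To make the idea concrete, here is a minimal sketch of how a processor could choose between the two output modes. It is not the whiteboard code: Response, SaxAwareResponse and HelloProcessor are hypothetical stand-ins (the real interfaces would build on HttpServletResponse), reduced to JDK types so the sketch compiles on its own. The stream branch does no XML escaping; it is only meant to show the dispatch.

```java
import java.io.PrintWriter;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

/** Hypothetical stand-in for HttpServletResponse: only a character stream. */
interface Response {
    PrintWriter getWriter();
}

/** Hypothetical stand-in for SAXAwareHttpServletResponse. */
interface SaxAwareResponse extends Response {
    ContentHandler getContentHandler();
}

/** Toy processor: sends SAX events if the response accepts them, text otherwise. */
class HelloProcessor {
    void process(String uri, Response response) throws SAXException {
        if (response instanceof SaxAwareResponse) {
            // The caller asked for events: never touch the output stream.
            ContentHandler handler = ((SaxAwareResponse) response).getContentHandler();
            char[] text = uri.toCharArray();
            handler.startDocument();
            handler.startElement("", "hello", "hello", new AttributesImpl());
            handler.characters(text, 0, text.length);
            handler.endElement("", "hello", "hello");
            handler.endDocument();
        } else {
            // Plain response: write the serialized form to the stream.
            PrintWriter writer = response.getWriter();
            writer.write("<hello>" + uri + "</hello>");
            writer.flush();
        }
    }
}
```

The caller decides the output format simply by the type of response it passes in, so the processor keeps a single process method.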

Currently I have two Processor implementations. One of them is the
sitemap processor embedding the TreeProcessor - so this is the usual stuff.

The other processor implementation uses the mount table approach. It
reads in an XML configuration file for the mounted sitemaps and then
forwards the request to a sitemap processor. So instead of having a
mount in the main sitemap, you no longer have a main sitemap with this
approach but jump directly into the correct sitemap based on the mount
table configuration. This can be enhanced later on to jump to a
completely different processor which might not use a sitemap at all.
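
As a rough illustration of the mount table idea (not the prototype in the whiteboard), the sketch below dispatches on the longest matching URI prefix. The Processor interface here is a deliberately simplified, hypothetical one that maps a URI to a response body instead of taking servlet request and response objects, and mounts are registered in code rather than read from an XML file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified processor: takes a URI, returns the response body. */
interface Processor {
    String process(String uri);
}

/** Forwards each request to the processor mounted under the longest matching prefix. */
class MountTableProcessor implements Processor {
    private final Map<String, Processor> mounts = new LinkedHashMap<>();

    void mount(String uriPrefix, Processor target) {
        mounts.put(uriPrefix, target);
    }

    @Override
    public String process(String uri) {
        String best = null;
        for (String prefix : mounts.keySet()) {
            if (uri.startsWith(prefix) && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No processor mounted for " + uri);
        }
        // Hand the target only the part of the URI below its mount point.
        return mounts.get(best).process(uri.substring(best.length()));
    }
}
```

Because the table only knows the Processor interface, a target does not have to be sitemap-based at all.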

So, WDYT?

Carsten
-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: [2.2] New processor interface/approach

Posted by Bertrand Delacretaz <bd...@apache.org>.

On 7/11/06, Carsten Ziegeler <cz...@apache.org> wrote:

> ...Basically integrating Cocoon in another framework is then
> a) get the cocoon spring bean container from the servlet context, b) get
> the processor bean from the spring container and c) invoke the processor...

I like this, and pluggable Processors look useful for integrating
Cocoon in various ways.

-Bertrand

Re: [2.2] New processor interface/approach

Posted by Sylvain Wallez <sy...@apache.org>.

Daniel Fagerstrom wrote:
> Carsten Ziegeler skrev:
>> Sylvain Wallez wrote:
>>  
>>> However, a few remarks:
>>> - how is Processor different from Servlet?
>>>     
>> The interface is similar. The difference is configuration and who
>> manages the component. If a processor is a component we can use all nice
>> features of spring for that and I think the configuration is nicer.
>> Things like changing the configuration at runtime should be easier as
>> well as you don't have to modify the web.xml.
>>   
> Servlet is just an interface, no one prevents you from managing a
> component that implements Servlet in Spring. And as you rather often
> is going to need a Servlet context or configuration within a
> Processor, you end up pretty close to Servlet anyway.
>
> The advantage to use the Servlet interface is that we give the users
> greater flexibility in reusing the various controllers that we might
> implement.

Just for the record, here's the Servlet.init() method I use to have
Spring dependency injection in servlets declared in web.xml. Works like
a charm!

    public void init(ServletConfig servletConfig) throws ServletException {
        super.init(servletConfig);
        // Get the application context
        ApplicationContext ctx = (ApplicationContext) getServletContext().getAttribute(
                WebApplicationContext.ROOT_WEB_APPLICATION_CONTEXT_ATTRIBUTE);

        // Create an autowire-capable factory
        DefaultListableBeanFactory factory = new DefaultListableBeanFactory(ctx);

        factory.autowireBeanProperties(this,
                AutowireCapableBeanFactory.AUTOWIRE_BY_NAME, true);
    }


Sylvain

-- 
Sylvain Wallez - http://bluxte.net

Re: [2.2] New processor interface/approach

Posted by Carsten Ziegeler <cz...@apache.org>.

Daniel Fagerstrom wrote:
> Here I also would like to advertise a neat pattern (the whiteboard 
> pattern) that we use in the OSGi-blocks architecture, that we maybe 
> could use without OSGi in 2.2.
> 
> So, in the block architecture we set up a BlockServlet (or other 
> controllers) as a managed component by using the declarative services, 
> this could be done in Spring instead. Anyway, the configuration contains 
> a default mount path that can be overridden by the configuration 
> service. Then the component that corresponds to the MountTableProcessor 
> can ask the service manager for all services that implements Servlet and 
> that also has a mount path property. And it can mount these servlets 
> without any global configurations at all. It also allows for dynamic 
> installation/uninstallation/update of the servlets, but that is another 
> story ;)
> 
> Maybe we could use the same pattern with Spring managed processors. Is 
> there any way to ask the container for all beans that implements a 
> certain interface? It would make it possible to just deploy a block 
> without needing to update any mount tables.
> 
Yes, that's possible. You can add bean factory post processors (which
are Spring beans as well) which get notified about all beans, or you can
access all beans that implement a certain interface by asking the bean
factory for them.
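
In Spring, the second option corresponds to ListableBeanFactory.getBeansOfType(Class), which returns a map from bean name to bean instance. The sketch below imitates that lookup with a toy container so it runs without Spring; ToyContainer, Mountable and WhiteboardMounter are hypothetical names invented for this illustration, and the mount path property is assumed to be exposed through a getter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical contract: a bean that wants to be mounted says where. */
interface Mountable {
    String getMountPath();
}

/** Toy stand-in for a bean container; this is not Spring. */
class ToyContainer {
    private final Map<String, Object> beans = new LinkedHashMap<>();

    void register(String name, Object bean) {
        beans.put(name, bean);
    }

    /** Same idea as Spring's ListableBeanFactory.getBeansOfType(Class). */
    <T> Map<String, T> getBeansOfType(Class<T> type) {
        Map<String, T> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : beans.entrySet()) {
            if (type.isInstance(entry.getValue())) {
                result.put(entry.getKey(), type.cast(entry.getValue()));
            }
        }
        return result;
    }
}

/** Builds a mount table (mount path -> bean name) without any global configuration. */
class WhiteboardMounter {
    static Map<String, String> discover(ToyContainer container) {
        Map<String, String> mountTable = new LinkedHashMap<>();
        for (Map.Entry<String, Mountable> entry
                : container.getBeansOfType(Mountable.class).entrySet()) {
            mountTable.put(entry.getValue().getMountPath(), entry.getKey());
        }
        return mountTable;
    }
}
```

Deploying a new block then only means registering one more bean; the mount table is derived from whatever the container holds.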

Using the whiteboard pattern here sounds really good. I hope to come up
with an implementation in the next few days.

Carsten
-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: [2.2] New processor interface/approach

Posted by Daniel Fagerstrom <da...@nada.kth.se>.

Carsten Ziegeler skrev:
> Daniel Fagerstrom wrote:
>   
>> Are you using so many sitemap processors so that becomes a problem? We 
>> have something like that today. The tree processor is configured as a 
>> component, or will a Spring bean configuration be much more verbose?
>>     
> No, the spring bean config might look the same (more or less). Perhaps
> we could even use spring templates to simplify (don't know yet, I have
> to look into that).
> Anyway, my idea is to not use mount (= sub sitemaps) anymore. So I guess
> I will end up with several sitemap processors. I think we should remove
> the concept of sub sitemaps completely (this would make our code much
> cleaner) as you never really know which sitemap processes what. And
> with this new processor concept we don't need it anymore.
>   
OK, that explains your concern and the need for the MountTableProcessor. 
I agree about getting rid of the mounts. They have mainly been used for 
modularization, and with the current development we get better 
mechanisms for that. Now, it is not the mounts in themselves that create 
complexity, it is mounts combined with component declarations in 
(sub)sitemaps that complicate things.

>> Irrespectively of if the bean configuration becomes verbose or not, it 
>> will help us towards better modularization. Each block, or rather each 
>> sample block, can contribute its sitemap as a sitemap processor that is 
>> configured in its bean configuration. In this way we get rid of the 
>> sitemap copying part of the deployer.
>>     
> Yes, definitely true. So let's try this way.
>   
Cool!

Here I also would like to advertise a neat pattern (the whiteboard 
pattern) that we use in the OSGi-blocks architecture, that we maybe 
could use without OSGi in 2.2.

So, in the block architecture we set up a BlockServlet (or other 
controllers) as a managed component by using the declarative services, 
this could be done in Spring instead. Anyway, the configuration contains 
a default mount path that can be overridden by the configuration 
service. Then the component that corresponds to the MountTableProcessor 
can ask the service manager for all services that implements Servlet and 
that also has a mount path property. And it can mount these servlets 
without any global configurations at all. It also allows for dynamic 
installation/uninstallation/update of the servlets, but that is another 
story ;)

Maybe we could use the same pattern with Spring managed processors. Is 
there any way to ask the container for all beans that implements a 
certain interface? It would make it possible to just deploy a block 
without needing to update any mount tables.

/Daniel

Re: [2.2] New processor interface/approach

Posted by Carsten Ziegeler <cz...@apache.org>.

Daniel Fagerstrom wrote:
> Are you using so many sitemap processors so that becomes a problem? We 
> have something like that today. The tree processor is configured as a 
> component, or will a Spring bean configuration be much more verbose?
No, the Spring bean config might look the same (more or less). Perhaps
we could even use Spring templates to simplify it (don't know yet, I have
to look into that).
Anyway, my idea is to not use mounts (= sub sitemaps) anymore. So I guess
I will end up with several sitemap processors. I think we should remove
the concept of sub sitemaps completely (this would make our code much
cleaner) as you never really know which sitemap processes what. And
with this new processor concept we don't need it anymore.

> 
> Irrespectively of if the bean configuration becomes verbose or not, it 
> will help us towards better modularization. Each block, or rather each 
> sample block, can contribute its sitemap as a sitemap processor that is 
> configured in its bean configuration. In this way we get rid of the 
> sitemap copying part of the deployer.
Yes, definitely true. So let's try this way.

Carsten
-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: [2.2] New processor interface/approach

Posted by Daniel Fagerstrom <da...@nada.kth.se>.

Carsten Ziegeler skrev:
> Daniel Fagerstrom wrote:
>   
>> Don't think this is the best way to do it. The current 
>> MountTableProcessor means that you only can mount SitemapProcessor and 
>> no other kinds of controllers, as it explicitly manage the creation of 
>> the SitemapProcessors. IMO it is better to let the component container 
>> (Spring) create the SitemapProcessor (and other processors), then the 
>> MountTableProcessor instead contains an association between uri prefixes 
>> and component ids. And this makes it usable together with any processor 
>> as its not the concern of the MountTableProcessor to create the 
>> processor anymore.
>>     
> Ah, yes, you're right of course. I forgot to mention that I wanted to
> change the
> MountTableProcessor later on to exactly provide that feature.
>   
Great.

>> Also it makes the path to the sitemap for a specific SitemapProcessor a 
>> concern for the configuration of the SitemapProcessor rather than the 
>> configuration of the MountTableProcessor. Much better SoC IMO.
>>     
> Yes, true - unfortunately the configuration gets a little bit
> complicated as you then have to add each and every sitemap processor as
> a bean in spring instead of using a generic approach.
>   
Are you using so many sitemap processors so that becomes a problem? We 
have something like that today. The tree processor is configured as a 
component, or will a Spring bean configuration be much more verbose?

Irrespectively of if the bean configuration becomes verbose or not, it 
will help us towards better modularization. Each block, or rather each 
sample block, can contribute its sitemap as a sitemap processor that is 
configured in its bean configuration. In this way we get rid of the 
sitemap copying part of the deployer.

/Daniel

Re: [2.2] New processor interface/approach

Posted by Carsten Ziegeler <cz...@apache.org>.

Daniel Fagerstrom wrote:
> Don't think this is the best way to do it. The current 
> MountTableProcessor means that you only can mount SitemapProcessor and 
> no other kinds of controllers, as it explicitly manage the creation of 
> the SitemapProcessors. IMO it is better to let the component container 
> (Spring) create the SitemapProcessor (and other processors), then the 
> MountTableProcessor instead contains an association between uri prefixes 
> and component ids. And this makes it usable together with any processor 
> as its not the concern of the MountTableProcessor to create the 
> processor anymore.
Ah, yes, you're right of course. I forgot to mention that I wanted to
change the MountTableProcessor later on to provide exactly that feature.

> Also it makes the path to the sitemap for a specific SitemapProcessor a 
> concern for the configuration of the SitemapProcessor rather than the 
> configuration of the MountTableProcessor. Much better SoC IMO.
Yes, true - unfortunately the configuration gets a little bit
complicated as you then have to add each and every sitemap processor as
a bean in spring instead of using a generic approach.

Carsten
-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: [2.2] New processor interface/approach

Posted by Daniel Fagerstrom <da...@nada.kth.se>.

Carsten Ziegeler skrev:
> Sylvain Wallez wrote:
>   
>> However, a few remarks:
>> - how is Processor different from Servlet?
>>     
> The interface is similar. The difference is configuration and who
> manages the component. If a processor is a component we can use all nice
> features of spring for that and I think the configuration is nicer.
> Things like changing the configuration at runtime should be easier as
> well as you don't have to modify the web.xml.
>   
Servlet is just an interface; no one prevents you from managing a 
component that implements Servlet in Spring. And as you are rather often 
going to need a Servlet context or configuration within a Processor, you 
end up pretty close to Servlet anyway.

The advantage of using the Servlet interface is that we give the users 
greater flexibility in reusing the various controllers that we might 
implement.

But as said before, we can evaluate the consequences when you have 
implemented a little bit more. It shouldn't be much work to change to 
Servlet, if that is a better idea.

>> - SitemapProcessorFactory is pretty close to
>> ServletContext.getRequestDispatcher()
>>     
> No, not exactly. The dispatcher is based on the request URI whereas the
> sitemap processor factory gives you a processor for a sitemap specified by
> the sitemap location which might be totally different from the incoming URI.
>   
I don't think this is the best way to do it. The current 
MountTableProcessor means that you can only mount SitemapProcessors and 
no other kinds of controllers, as it explicitly manages the creation of 
the SitemapProcessors. IMO it is better to let the component container 
(Spring) create the SitemapProcessor (and other processors); the 
MountTableProcessor then only contains an association between URI 
prefixes and component ids. This makes it usable together with any 
processor, as it's no longer the concern of the MountTableProcessor to 
create the processor.

It also makes the path to the sitemap for a specific SitemapProcessor a 
concern for the configuration of the SitemapProcessor rather than the 
configuration of the MountTableProcessor. Much better SoC IMO.

/Daniel

Re: [2.2] New processor interface/approach

Posted by Carsten Ziegeler <cz...@apache.org>.

Sylvain Wallez wrote:
> 
> However, a few remarks:
> - how is Processor different from Servlet?
The interface is similar. The difference is configuration and who
manages the component. If a processor is a component we can use all nice
features of spring for that and I think the configuration is nicer.
Things like changing the configuration at runtime should be easier as
well as you don't have to modify the web.xml.

> - SitemapProcessorFactory is pretty close to
> ServletContext.getRequestDispatcher()
No, not exactly. The dispatcher is based on the request URI whereas the
sitemap processor factory gives you a processor for a sitemap specified by
the sitemap location, which might be totally different from the incoming URI.

> IMO, we should always have a single ContentHandler object, and test if
> it's also instanceof LexicalHandler. And only in the rare occasions
> where we have two separate objects, then use a wrapper that implements
> both interfaces.
Agreed.


Carsten
-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: [2.2] New processor interface/approach

Posted by Sylvain Wallez <sy...@apache.org>.

Carsten Ziegeler wrote:
> Some time ago we discussed very lengthy that our core interface, the
> Processor, is not the best interface we ever invented. The processor
> should be the main entrance to the Cocoon processing engine.
>   

<snip/>

> So, WDYT?
>   

Interesting.

However, a few remarks:
- how is Processor different from Servlet?
- SitemapProcessorFactory is pretty close to
ServletContext.getRequestDispatcher()

Also, I would avoid the Consumer interface at all costs. In most cases
where a LexicalHandler is present, it is the same object as the
ContentHandler, but of course it doesn't implement the Cocoon-defined
interface, thus requiring a wrapper to be used.

IMO, we should always have a single ContentHandler object, and test if
it's also instanceof LexicalHandler. And only in the rare occasions
where we have two separate objects, then use a wrapper that implements
both interfaces.
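
The check itself only needs the standard SAX interfaces shipped with the JDK. The following is a small sketch of it; the SaxHandlers helper class and its method names are made up for this example and are not part of Cocoon.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.ext.LexicalHandler;

/** Hypothetical helper showing the "single object plus instanceof" approach. */
final class SaxHandlers {
    private SaxHandlers() {
    }

    /** Returns the lexical view of a handler, or null if it has none. */
    static LexicalHandler lexicalHandlerOf(ContentHandler handler) {
        return (handler instanceof LexicalHandler) ? (LexicalHandler) handler : null;
    }

    /**
     * Forwards a comment only when the handler can receive lexical events.
     * Returns true if the comment was delivered.
     */
    static boolean comment(ContentHandler handler, String text) throws SAXException {
        LexicalHandler lexical = lexicalHandlerOf(handler);
        if (lexical == null) {
            return false;
        }
        char[] chars = text.toCharArray();
        lexical.comment(chars, 0, chars.length);
        return true;
    }
}
```

With the JDK classes, org.xml.sax.helpers.DefaultHandler yields null here, while org.xml.sax.ext.DefaultHandler2 implements both interfaces and is returned as its own LexicalHandler.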

Sylvain

-- 
Sylvain Wallez - http://bluxte.net

Re: [2.2] New processor interface/approach

Posted by Carsten Ziegeler <cz...@apache.org>.

Sylvain Wallez wrote:
> Daniel Fagerstrom wrote:
>> Often you are going to need the servlet configuration and/or context
>> and then you are back on the Servlet interface again. OTH, having a
>> Servlet as a managed component is slightly complicated as one need to
>> keep track on both its life cycle as a managed component and as a
>> servlet. I guess we will see which way that is best to go when you
>> have implemented a little bit more.
> 
> IMO that should be servlets, which allow for many other things (and not
> only Cocoon) to be integrated in a container. Being able to manage
> servlets means we're able to manage anything and can also benefit
> from/share with other people's work in this domain.
> 
Hmm, not sure if this is necessary. IMHO it should not be the concern of
Cocoon to manage other servlets. So if you want to integrate something
else which is a servlet, you can use the request dispatcher and leave the
rest to the servlet container.

My idea with the processor interface was to say that Cocoon provides you
with usable *components*. You can just look up these components and use
them, which is in the end easier than using servlets - the difference is
very subtle, I agree. But it might be that we later need an interface
which differs from the servlet one in some aspects.

Carsten
-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: [2.2] New processor interface/approach

Posted by Sylvain Wallez <sy...@apache.org>.

Daniel Fagerstrom wrote:
> Often you are going to need the servlet configuration and/or context
> and then you are back on the Servlet interface again. OTH, having a
> Servlet as a managed component is slightly complicated as one need to
> keep track on both its life cycle as a managed component and as a
> servlet. I guess we will see which way that is best to go when you
> have implemented a little bit more.

IMO that should be servlets, which allow for many other things (and not
only Cocoon) to be integrated in a container. Being able to manage
servlets means we're able to manage anything and can also benefit
from/share with other people's work in this domain.

Sylvain

-- 
Sylvain Wallez - http://bluxte.net

Re: [2.2] New processor interface/approach

Posted by Daniel Fagerstrom <da...@nada.kth.se>.

Carsten Ziegeler skrev:
> Some time ago we discussed very lengthy that our core interface, the
> Processor, is not the best interface we ever invented. The processor
> should be the main entrance to the Cocoon processing engine.
>
> As I need a simple way of adding my own processor in my project, I
> started to create new interfaces and implementations which should make
> the whole thing much easier (to use and implement).
>   
Sounds good.

> Before I go into the details, please note that these changes do not
> affect the usual Cocoon user at all. It should just clean up the core
> which is usually never used by any project. But with changes like these,
> this hopefully will change.
>
> I committed the new stuff as a prototype to the whiteboard
> http://svn.apache.org/repos/asf/cocoon/whiteboard/processor/
>
> The code is not tested yet, as I would like to discuss the things first,
> before completely implementing them.
>
> Ok, the new processor interface is very simply as it just has one single
> method (process) which gets a HttpServletRequest and a
> HttpServletResponse. That's it - this makes integrating Cocoon in any
> other web environment very easy and I can't think of any simpler
> interface :) Basically integrating Cocoon in another framework is then
> a) get the cocoon spring bean container from the servlet context, b) get
> the processor bean from the spring container and c) invoke the processor.
>   
Often you are going to need the servlet configuration and/or context and 
then you are back on the Servlet interface again. OTOH, having a Servlet 
as a managed component is slightly complicated as one needs to keep track 
of both its life cycle as a managed component and as a servlet. I guess 
we will see which way is best to go when you have implemented a 
little bit more.

> There might be the need to receive sax events instead of getting the
> generated content from cocoon in a stream. So you can pass in a sub
> interface of HttpServletResponse - the SAXAwareHttpServletResponse - and
> in this case the processor sends sax events to the response instead of
> writing to the output stream - I'm not sure if this makes sense, it was
> just an idea.
>   
A good idea IMO. I considered going that way while working on the VPCs, 
but back then it seemed to be too much work as we hadn't started to move 
towards the servlet set of interfaces. Instead we have a 
VirtualProcessingPipeline that contains the methods:

ordinary pipelines:
    public boolean process(Environment environment)
VPC generators (and Cocoon protocol):
    public boolean process(Environment environment, XMLConsumer consumer)
VPC transformers:
    public XMLConsumer getXMLConsumer(Environment environment, XMLConsumer consumer)
VPC pipelines:
    public XMLConsumer getXMLConsumer(Environment environment)

I have never been happy with that solution, and if we want to add XML 
pull processing to it we get a combinatorial explosion of the number 
of methods. Putting the knowledge of the input and output format into 
the request and response probably results in much more manageable and 
possibly reusable code.

> Currently I have two Processor implementations. One of them is the
> sitemap processor embedding the TreeProcessor - so this is the usual stuff.
>
> The other processor implementation uses the mount table approach. It
> reads in an xml configuration file for the mounted sitemaps and then
> forwards the request to a sitemap processor. So instead of having a
> mount in the main sitemap, you have no main sitemap with this approach
> anymore but directly jump into the correct sitemap based on the mount
> table configuration. This can be enhanced later on to jump to a
> completely different processor which might not use a sitemap at all.
>   
For the generalizations that you suggest it is better to use the blocks 
architecture. Right now it is tied to OSGi and cannot be used in 2.2. 
But there are not that many dependencies, so it would not be that much 
work to make it usable in 2.2. For this there would be a dispatcher 
processor that depending on the uri prefix, dispatches to the relevant 
BlockServlet.

> So, WDYT?
>   
Looks good.

/Daniel

Re: [2.2] New processor interface/approach

Posted by Carsten Ziegeler <cz...@apache.org>.

Peter Hunsberger wrote:

> 
> Interesting.  Sounds perhaps useful without the dual implementations
> as just simplifying the core, but the ability to plug in your own base
> processor also sounds interesting; if we ever go to the point where
> all our processing is completely database driven I take it that, in
> theory, this would basically let us plug in something to jump straight
> to a single fixed pipeline (or two) without the rest of the sitemap
> processing overhead?
> 
Yes, that's possible and one of the ideas behind this. It seems that if
you're developing web applications with Cocoon, you always end up with
one single pipeline for 95% of your app. In this case you could just
provide a processor creating this pipeline and that's it.

I'm also about to make a suggestion for a simpler pipeline API in the
next few days.

Carsten

-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: [2.2] New processor interface/approach

Posted by Peter Hunsberger <pe...@gmail.com>.

On 7/11/06, Carsten Ziegeler <cz...@apache.org> wrote:
<snip/>
> I committed the new stuff as a prototype to the whiteboard
> http://svn.apache.org/repos/asf/cocoon/whiteboard/processor/

<snip/>
>
> Currently I have two Processor implementations. One of them is the
> sitemap processor embedding the TreeProcessor - so this is the usual stuff.
>
> The other processor implementation uses the mount table approach. It
> reads in an xml configuration file for the mounted sitemaps and then
> forwards the request to a sitemap processor. So instead of having a
> mount in the main sitemap, you have no main sitemap with this approach
> anymore but directly jump into the correct sitemap based on the mount
> table configuration. This can be enhanced later on to jump to a
> completely different processor which might not use a sitemap at all.
>
> So, WDYT?

Interesting.  Sounds perhaps useful without the dual implementations
as just simplifying the core, but the ability to plug in your own base
processor also sounds interesting; if we ever get to the point where
all our processing is completely database driven I take it that, in
theory, this would basically let us plug in something to jump straight
to a single fixed pipeline (or two) without the rest of the sitemap
processing overhead?

-- 
Peter Hunsberger