
Posted to dev@cocoon.apache.org by Carsten Ziegeler <cz...@apache.org> on 2008/07/11 15:19:58 UTC

[Corona] Pipeline API

Hi,

I'm currently looking for a nice and simple pipeline api to be 
integrated with Apache Sling :)
And of course I had a quick look at Corona (as everything else I found 
was not what I was searching for) which would be the preferred way of
implementing pipelines :)

Now, I only need the naked pipeline stuff - and most important here are
the interfaces and perhaps a simple pipeline implementation (without 
caching).

There are some points I would like to discuss:
a) Simple API separated from the implementation
I think it makes sense to put all API stuff into one single package, 
these are only a handful of classes - perhaps there might be an 
additional util package.
The implementations of the various components should go into a different 
module as they are not needed by everyone. At least they should be in a 
different package for modularization purposes.
I would also package the whole caching stuff into its own module.

b) Actions should not be part of the pipeline api
I think we discussed this some time ago :) Removing actions from the 
pipeline stuff does not really hurt - they are invoked before the 
pipeline, so it shouldn't be too hard to build custom code which 
collects actions, assembles the pipeline, invokes the actions and then 
the pipeline.

c) Pre and post processing
As the pipeline interfaces are not tied to sax or any other model (which 
is ok), there is no explicit notion of indicating that the processing 
starts or is finished - the latter is especially interesting for 
cleanup. So I think we should add these two lifecycle methods to the 
pipeline component interface.

d) Splitting setup and execute
I would like to split the Pipeline#execute method into two, one for 
initialisation and one (without arguments) for executing.
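
Points (b), (c) and (d) could be sketched together roughly as follows. This
is only an illustration under assumptions: all type names (Action,
PipelineComponent, SimplePipeline, PipelineRunner) and signatures are
hypothetical and not taken from Corona's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names below are hypothetical; this is not Corona's actual API.

// (b) Actions live outside the pipeline API.
interface Action {
    void act(Map<String, Object> parameters);
}

// (c) Components get explicit start and end-of-processing hooks.
interface PipelineComponent {
    void setup(Map<String, Object> parameters);
    void finish();
}

// (d) Initialisation and execution are separate calls.
class SimplePipeline {
    private final List<PipelineComponent> components = new ArrayList<>();
    private boolean ready;

    void addComponent(PipelineComponent component) {
        components.add(component);
    }

    void setup(Map<String, Object> parameters) {
        for (PipelineComponent component : components) {
            component.setup(parameters);
        }
        ready = true;
    }

    void execute() {
        if (!ready) {
            throw new IllegalStateException("setup() must be called before execute()");
        }
        try {
            // Model-specific processing (SAX or otherwise) would happen here.
        } finally {
            // The cleanup hook runs even if processing fails.
            for (PipelineComponent component : components) {
                component.finish();
            }
            ready = false;
        }
    }
}

// Custom calling code: invoke the collected actions, then the pipeline.
class PipelineRunner {
    static void run(List<Action> actions, SimplePipeline pipeline) {
        Map<String, Object> parameters = new HashMap<>();
        for (Action action : actions) {
            action.act(parameters);
        }
        pipeline.setup(parameters);
        pipeline.execute();
    }
}
```

In this sketch the pipeline knows nothing about actions; PipelineRunner is
the "custom code" that ties the two together.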

WDYT?

Regards
Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] Pipeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Reinhard Pötz wrote:
> Carsten Ziegeler wrote:
>> Reinhard Pötz wrote:
>>>
>>> currently corona-pipeline (pipeline API, pipeline impls, SAX 
>>> components) only has a dependency on commons-logging. Is this good 
>>> enough for your needs?
>> :) Ok, that should be fine.
>>
>> I've moved the action to the sitemap module and split up execute into 
>> setup and execute. I think we should specify in the contract that a 
>> modifiable map is passed to the components during setup. Any objections?
> 
> Currently sitemap components get passed two maps:
> 
>  1. the input parameters (a map of all objects that you want to pass
>     to a pipeline)
> 
>  2. the configuration parameters (to provide a way to pass parameters
>     from the sitemap to the component -> for API usage those parameters
>     are usually passed by constructor parameters or by setters).
> 
> Are you talking about passing the input parameters as parameters of the 
> setup() method?
> 
> void setup(Map<String, Object> inputParameters)
> 
> I'd be fine with this.
> 
Yes, this is what I did.
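
To illustrate why the contract should say "modifiable": a minimal sketch in
which one component leaves a value in the map for a later one. Only the
setup(Map<String, Object> inputParameters) signature comes from this thread;
ForwardingPipeline, EncodingDetector, Serializer and the "encoding" key are
made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Signature taken from the thread; the surrounding types are assumptions.
interface PipelineComponent {
    void setup(Map<String, Object> inputParameters);
}

class ForwardingPipeline {
    private final List<PipelineComponent> components = new ArrayList<>();

    void addComponent(PipelineComponent component) {
        components.add(component);
    }

    // The same map instance is handed to every component, in order.
    void setup(Map<String, Object> inputParameters) {
        for (PipelineComponent component : components) {
            component.setup(inputParameters);
        }
    }
}

// Writes a value that later components can pick up.
class EncodingDetector implements PipelineComponent {
    public void setup(Map<String, Object> inputParameters) {
        inputParameters.put("encoding", "UTF-8");
    }
}

// Reads what an earlier component left in the map.
class Serializer implements PipelineComponent {
    String encoding;

    public void setup(Map<String, Object> inputParameters) {
        encoding = (String) inputParameters.get("encoding");
    }
}
```

With a plain HashMap this works; with a map wrapped by
Collections.unmodifiableMap the put() in EncodingDetector would throw an
UnsupportedOperationException.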

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] Pipeline API

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Andreas Hartmann wrote:
> Just for my understanding:
> 
> Is the context (runtime/environment) information used by the pipeline 
> itself, or only by the pipeline components? From an SoC point of view 
> I'd assume that the pipeline implementation is independent from the 
> execution environment. I'd imagine it to handle only pipelining aspects:
> 
> * wiring components together
> * passing events
> * caching
> * … ?

I agree here. Thanks, Andreas, for reminding me of my own thoughts (similar to yours) that I had 
at ApacheCon when Reinhard was explaining Corona's design.

Now I can join the discussion. :-)

> If this is the case, wouldn't it be more appropriate to pass the context 
> information to the pipeline components without bothering the pipeline 
> object itself? Or is it intended that the pipeline is seen as a black 
> box after it has been set up, to simplify the parameterization of all 
> its components?

Even if that's the case, shouldn't this "black box" pipeline be just one of the possible implementations? 
Then it would be a pipeline implementation that sets up components and passes what's needed. In some 
cases that would make perfect sense.

-- 
Best regards,
Grzegorz Kossakowski

Re: [Corona] Pipeline API

Posted by Andreas Hartmann <an...@apache.org>.

Hi Cocoon devs,

I'd be very interested in a pipeline API for Sling, so I'd like to 
understand this discussion. Please excuse any unqualified remarks :)

Carsten Ziegeler wrote:
>> A listener/callback approach would be cleaner for the API, but more 
>> complex.
>> Question is if it needs to be part of the API at all.
>>
> We're not talking about component configuration here, we are talking 
> about providing runtime/environment information for a pipeline run. 
> Think of the request/response object for the servlet case. I think it's 
> more convenient to pass this information through the api, than providing 
> some callback. And then the question is: how does a component get the 
> object it can call as a callback?

Just for my understanding:

Is the context (runtime/environment) information used by the pipeline 
itself, or only by the pipeline components? From an SoC point of view 
I'd assume that the pipeline implementation is independent from the 
execution environment. I'd imagine it to handle only pipelining aspects:

* wiring components together
* passing events
* caching
* … ?

If this is the case, wouldn't it be more appropriate to pass the context 
information to the pipeline components without bothering the pipeline 
object itself? Or is it intended that the pipeline is seen as a black 
box after it has been set up, to simplify the parameterization of all 
its components?

Thanks for any clarification!

-- Andreas



-- 
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

Re: [Corona] Pipeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Joerg Heinicke wrote:
> 
> A listener/callback approach would be cleaner for the API, but more complex.
> Question is if it needs to be part of the API at all.
> 
We're not talking about component configuration here, we are talking 
about providing runtime/environment information for a pipeline run. 
Think of the request/response object for the servlet case. I think it's 
more convenient to pass this information through the api, than providing 
some callback. And then the question is: how does a component get the 
object it can call as a callback?

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] Pipeline API

Posted by Reinhard Pötz <re...@apache.org>.

Joerg Heinicke wrote:
> Torsten Curdt <tcurdt <at> apache.org> writes:
> 
>> The question is whether that configuration is needed in a generic form in
>> the API. (I doubt it.) As I would expect it to be implementation
>> specific, a configuration callback that sets up the pipeline might be a
>> way around this?
> 
> I guess we hold the same position on this one: setup and cleanup are usually
> implementation specific and should therefore not be part of the API. Even for
> the finish() method it might be necessary to pass a context or parameters.
> Passing it already in setup() might be an option, but then you force the
> component to have special handling for thread-safety.
> 
> Carsten Ziegeler <cziegeler <at> apache.org> writes:
> 
>> I have now added a finish method which is called by the pipeline implementation.
>> This keeps me free from any configuration hassles with the various 
>> containers. Some want to use spring, some others something different.
> 
> But that's exactly what these containers are there for.
> 
>> And perhaps someone doesn't want to use a container at all, just 
>> instantiate the objects, run the pipeline and that's it.
> 
> By just instantiating the objects you know exactly with which implementation you
> work - and which setup and finish method you are supposed to call.
> 
>> Therefore I really think that these lifecycle methods belong to the api.
> 
> That's what I don't agree with :-)
> 
>> I see no other reliable way of closing resources.
> 
> A listener/callback approach would be cleaner for the API, but more complex.
> Question is if it needs to be part of the API at all.

Can you provide an example of how your ideas would materialize as Java code?

-- 
Reinhard Pötz                           Managing Director, {Indoqa} GmbH
                          http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member                  reinhard@apache.org
________________________________________________________________________

Re: [Corona] Pipeline API

Posted by Joerg Heinicke <jo...@gmx.de>.

Torsten Curdt <tcurdt <at> apache.org> writes:

> The question is whether that configuration is needed in a generic form in
> the API. (I doubt it.) As I would expect it to be implementation
> specific, a configuration callback that sets up the pipeline might be a
> way around this?

I guess we hold the same position on this one: setup and cleanup are usually
implementation specific and should therefore not be part of the API. Even for
the finish() method it might be necessary to pass a context or parameters.
Passing it already in setup() might be an option, but then you force the
component to have special handling for thread-safety.

Carsten Ziegeler <cziegeler <at> apache.org> writes:

> I have now added a finish method which is called by the pipeline implementation.
> This keeps me free from any configuration hassles with the various 
> containers. Some want to use spring, some others something different.

But that's exactly what these containers are there for.

> And perhaps someone doesn't want to use a container at all, just 
> instantiate the objects, run the pipeline and that's it.

By just instantiating the objects you know exactly with which implementation you
work - and which setup and finish method you are supposed to call.

> Therefore I really think that these lifecycle methods belong to the api.

That's what I don't agree with :-)

> I see no other reliable way of closing resources.

A listener/callback approach would be cleaner for the API, but more complex.
Question is if it needs to be part of the API at all.
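
One way the listener idea might look in Java is sketched below. This is a
guess at the shape only: PipelineListener, ListenablePipeline and
ResourceHoldingComponent are hypothetical names, and nothing here is
confirmed to exist in Corona.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener type, made up for this sketch.
interface PipelineListener {
    void pipelineFinished();
}

class ListenablePipeline {
    private final List<PipelineListener> listeners = new ArrayList<>();

    void addListener(PipelineListener listener) {
        listeners.add(listener);
    }

    // The Runnable stands in for the actual pipeline processing.
    void execute(Runnable work) {
        try {
            work.run();
        } finally {
            // Listeners are notified whether or not processing failed.
            for (PipelineListener listener : listeners) {
                listener.pipelineFinished();
            }
        }
    }
}

// A component holding a resource opts in to cleanup by implementing the
// listener; components without resources need no lifecycle methods at all.
class ResourceHoldingComponent implements PipelineListener {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    public void pipelineFinished() {
        open = false; // release the resource here
    }
}
```

The extra complexity is visible: somebody has to register the component as
a listener, which a finish() method on the component interface avoids.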

Joerg

Re: [Corona] Pipeline API

Posted by Torsten Curdt <tc...@apache.org>.

On Jul 15, 2008, at 18:33, Carsten Ziegeler wrote:

> Peter Hunsberger wrote:
>> On Tue, Jul 15, 2008 at 5:42 AM, Reinhard Pötz  
>> <re...@apache.org> wrote:
>>> Are you talking about passing the input parameters as parameters  
>>> of the
>>> setup() method?
>>>
>>> void setup(Map<String, Object> inputParameters)
>>>
>>> I'd be fine with this.
>>>
>> I hate seeing Maps used as dumping grounds for randomly typed  
>> objects.
>> Could you use something that gives a little more strong typing?
>> Perhaps more like a ServletContext though I don't think I'd go that
>> far in this case?
> I agree that strong typing would be great - but the pipeline api
> does not define any concrete key/object for the map. So this is
> use-case specific. Therefore I think a map is the best we can come up with.

The question is whether that configuration is needed in a generic form in
the API. (I doubt it.) As I would expect it to be implementation
specific, a configuration callback that sets up the pipeline might be a
way around this?

Just my 2 cents

cheers
--
Torsten

Re: [Corona] Pipeline API

Posted by Peter Hunsberger <pe...@gmail.com>.

On Thu, Jul 17, 2008 at 11:22 AM, Andreas Hartmann <an...@apache.org> wrote:
> Hi Peter,
>
> Peter Hunsberger wrote:
>>
>> On Thu, Jul 17, 2008 at 9:22 AM, Carsten Ziegeler <cz...@apache.org>
>> wrote:
>>>
>>> Andreas Hartmann wrote:
>>
>>>> I don't think that the calling code has to know the actual components,
>>>> but rather the environment-specific interfaces of the components. It
>>>> only makes sense to pass an environment to a pipeline component if the
>>>> component is designed to use this environment.
>>
>> Yeah, but if you've got really generic code this can be hard to figure
>> out....
>
> IMO there are two types of components which have to be generic:
>
> * the pipeline implementation
> * multi-purpose pipeline components
>
> The application itself is not generic, it knows which environment
> information to provide to the application-specific components.

Well yes and no.  In our case the application is generic, but that's
not the issue;  you're right that the application knows what to pass
in to the adapter (for any given application), the point is that
multiple applications can exist and all use the same adapter but pass
in different things.  For example, a servlet may have a Map of request
parameters but a CLI app may have an Array of command line parameters.

> For
> multi-purpose pipeline components (e.g., an event logger), an adapter class
> could be used (e.g., to pass an application-specific output stream to log
> the events to).
>
>

Yup...

[…]
>
>> I keep wondering if this is perhaps an Adapter type pattern?  You've
>> potentially got a lot of different types of applications that each
>> could have different requirements for setting up pipelines.
>> Similarly, you've potentially got a lot of different types of
>> pipelines (in particular since Corona isn't just SAX).   So what you
>> do is define some adapter that gets passed around and leave it up to
>> the adapter to manage the use case specifics,
>>
>> Adapter adp = new MyUseCaseAdapter();
>> adp.setRequest(  req );
>>   .
>>   .
>>   .
>>  for (Iterator i = pipeline.getComponents().iterator(); … ) {
>>    PipelineComponent c = (PipelineComponent) i.next();
>>      c.setup(adp);
>>    }
>
> I'm not quite sure what c.setup() would look like. Perhaps like this?
>
> class XsltTransformer {
>  public void setup(Adapter adp) {
>    setXsltParams(adp.getXsltParams());
>  }
> }
>
> In this case, the Adapter implementation would have to provide methods for
> various types of pipeline components …
>
>> And then in the component:
>>
>> Object myConfigParam = adp.getParam( NAME );
>
> But wouldn't this require a contract between the application and the
> components based on parameter names? I thought this is what we wanted to
> avoid, otherwise we could just use the Map suggested by Carsten …

Well yes, but that's a loose example.  I'd rather have specific setter
/ getters that the adapter knows how to handle.  But if someone (eg.
Carsten) has pipelines that use maps with named parameters that's
fine,  one should be able to write (or extend) a generic adapter to
handle the specific expectations of a given pipeline.

>
> But I guess I'm misunderstanding something :)
>
>> has no knowledge of how the param (or params) was passed into the adapter.
>>
>> The app is responsible for setting up the adapter with app specific
>> data and the adapter has multiple standard methods for allowing this
>> (and can be extended for new use cases).  The adapter is responsible
>> for passing it into the components in some more standardized way.
>> Think of the adapter as a data class with extra logic for converting
>> use case specific data into generic data.
>
> Hmm, what does the term "generic data" refer to? Would you mind giving an
> example? TIA!

In this case it would be up to the pipeline.  Like in the example
above, where someone  has a pipeline expecting a Map.

>
>> As such, the adapter can also be responsible for more than
>> initialization data, it can become the use case specific way of
>> communicating between the app and the components.  Instead of using
>> marker interfaces to define the use case specific responsibilities you
>> end up with the adapter having multiple methods for different use
>> cases as needed.
>
> As I understand it, one would have to provide application-specific adapters
> to "environmentalize" multi-purpose components; I don't see a way how this
> could be handled by a single adapter class for the whole application …

You do end up extending adapters for specific applications, but in
general you don't need a lot of them for any given application.
Instead of adding new adapters for new use cases you end up adding
methods to the adapters for the new use cases. The main reason to add
new adapters is to reduce unnecessary work; you don't want the adapter
converting  data into generic formats if it's not needed, but most of
the time this can be avoided by having the getters do the conversion
and not the setters.
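
A rough sketch of "getters do the conversion, not the setters", using the
servlet Map versus CLI array case from above. UseCaseAdapter and its methods
are hypothetical, and the name=value format for command line arguments is an
assumption made just for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical adapter: setters only store the use case specific data,
// the getter converts it into a generic form on first use.
class UseCaseAdapter {
    private Map<String, String[]> requestParameters; // servlet-style input
    private String[] commandLine;                    // CLI-style input
    private Map<String, String> converted;           // cached generic form

    void setRequestParameters(Map<String, String[]> requestParameters) {
        this.requestParameters = requestParameters;
        this.converted = null; // no conversion work here
    }

    void setCommandLine(String[] commandLine) {
        this.commandLine = commandLine;
        this.converted = null; // no conversion work here
    }

    // Conversion happens lazily; if no component asks, nothing is converted.
    Map<String, String> getParams() {
        if (converted == null) {
            converted = convert();
        }
        return converted;
    }

    private Map<String, String> convert() {
        Map<String, String> result = new LinkedHashMap<>();
        if (requestParameters != null) {
            for (Map.Entry<String, String[]> entry : requestParameters.entrySet()) {
                String[] values = entry.getValue();
                if (values != null && values.length > 0) {
                    result.put(entry.getKey(), values[0]); // first value only
                }
            }
        }
        if (commandLine != null) {
            for (String arg : commandLine) {
                int eq = arg.indexOf('=');
                if (eq > 0) { // assumed "name=value" format
                    result.put(arg.substring(0, eq), arg.substring(eq + 1));
                }
            }
        }
        return result;
    }
}
```

A component calling getParams() sees the same generic map whether the
application was a servlet or a command line tool.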

> I'll think a bit more about this, sorry if my remarks are rubbish :)
>

This is such an abstract discussion I don't see how any of this could
be rubbish! Go back to your previous email on the Visitor pattern, you
summarized with:

> In this case the application can provide a layer of classes to adapt the standard pipeline components to the application-specific environment.
> The marker interface check would allow to use other components as well - they just wouldn't be "environmentalized".

I think we're chasing the same basic ideas; it's a problem of figuring
out where to do the work. I'm suggesting, in part, that we avoid the
Marker interfaces by adding use case specific methods to the "extra
layer of classes".   Basically take the code you run as a result of
finding a Marker interface and move it into a use case specific method
in the Adapter class.  Think of it this way, any time you would have
to recognize the existence of a Marker interface you instead call a
method in the adapter (that may or may not do anything depending on
the use case at hand).
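
As a sketch of that last point, under the assumption that the adapter
exposes one accessor per use case: the base Adapter answers "nothing" and a
use case specific subclass overrides it, so no instanceof check on a marker
interface is needed. Request, WebappAdapter, RequestAwareComponent and
ComponentSetup are hypothetical names for illustration.

```java
import java.util.List;

// Hypothetical stand-in for a servlet-style request.
class Request {
    final String path;

    Request(String path) {
        this.path = path;
    }
}

// Base adapter: the use case specific method exists but does nothing.
class Adapter {
    Request getRequest() {
        return null;
    }
}

// Web use case: the same method now returns real data.
class WebappAdapter extends Adapter {
    private final Request request;

    WebappAdapter(Request request) {
        this.request = request;
    }

    @Override
    Request getRequest() {
        return request;
    }
}

interface PipelineComponent {
    void setup(Adapter adp);
}

// The component asks the adapter instead of implementing a marker interface.
class RequestAwareComponent implements PipelineComponent {
    String seenPath = "(none)";

    public void setup(Adapter adp) {
        Request request = adp.getRequest();
        if (request != null) {
            seenPath = request.path;
        }
    }
}

// Internal setup loop: no instanceof checks, every component gets the adapter.
class ComponentSetup {
    static void setupAll(List<PipelineComponent> components, Adapter adp) {
        for (PipelineComponent component : components) {
            component.setup(adp);
        }
    }
}
```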

This might all be overkill for the problem at hand, but I find it a
powerful way to build really generic code...

-- 
Peter Hunsberger

Re: [Corona] Pipeline API

Posted by Andreas Hartmann <an...@apache.org>.

Hi Peter,

Peter Hunsberger wrote:
> On Thu, Jul 17, 2008 at 9:22 AM, Carsten Ziegeler <cz...@apache.org> wrote:
>> Andreas Hartmann wrote:
> 
>>> I don't think that the calling code has to know the actual components,
>>> but rather the environment-specific interfaces of the components. It
>>> only makes sense to pass an environment to a pipeline component if the
>>> component is designed to use this environment.
> 
> Yeah, but if you've got really generic code this can be hard to figure out....

IMO there are two types of components which have to be generic:

* the pipeline implementation
* multi-purpose pipeline components

The application itself is not generic, it knows which environment 
information to provide to the application-specific components. For 
multi-purpose pipeline components (e.g., an event logger), an adapter 
class could be used (e.g., to pass an application-specific output stream 
to log the events to).
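
The event logger case might look something like this. It is only a sketch:
EventLogger, AppEnvironment and AppEventLogger are invented names, with
AppEnvironment standing in for whatever the application's environment
object really is.

```java
import java.io.OutputStream;
import java.io.PrintStream;

// Multi-purpose component: it knows nothing about the application.
class EventLogger {
    private PrintStream out = System.out;

    void setOutput(OutputStream stream) {
        out = new PrintStream(stream, true);
    }

    void event(String name) {
        out.println("event: " + name);
    }
}

// Hypothetical application environment.
class AppEnvironment {
    private final OutputStream logStream;

    AppEnvironment(OutputStream logStream) {
        this.logStream = logStream;
    }

    OutputStream getLogStream() {
        return logStream;
    }
}

// Application-side adapter class: wires the generic logger to the
// application-specific output stream, so the logger itself stays generic.
class AppEventLogger extends EventLogger {
    AppEventLogger(AppEnvironment environment) {
        setOutput(environment.getLogStream());
    }
}
```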

[…]

> I keep wondering if this is perhaps an Adapter type pattern?  You've
> potentially got a lot of different types of applications that each
> could have different requirements for setting up pipelines.
> Similarly, you've potentially got a lot of different types of
> pipelines (in particular since Corona isn't just SAX).   So what you
> do is define some adapter that gets passed around and leave it up to
> the adapter to manage the use case specifics,
> 
> Adapter adp = new MyUseCaseAdapter();
> adp.setRequest(  req );
>    .
>    .
>    .
>   for (Iterator i = pipeline.getComponents().iterator(); … ) {
>     PipelineComponent c = (PipelineComponent) i.next();
>       c.setup(adp);
>     }

I'm not quite sure what c.setup() would look like. Perhaps like this?

class XsltTransformer {
   public void setup(Adapter adp) {
     setXsltParams(adp.getXsltParams());
   }
}

In this case, the Adapter implementation would have to provide methods 
for various types of pipeline components …

> And then in the component:
> 
> Object myConfigParam = adp.getParam( NAME );

But wouldn't this require a contract between the application and the 
components based on parameter names? I thought this is what we wanted to 
avoid, otherwise we could just use the Map suggested by Carsten …

But I guess I'm misunderstanding something :)

> has no knowledge of how the param (or params) was passed into the adapter.
> 
> The app is responsible for setting up the adapter with app specific
> data and the adapter has multiple standard methods for allowing this
> (and can be extended for new use cases).  The adapter is responsible
> for passing it into the components in some more standardized way.
> Think of the adapter as a data class with extra logic for converting
> use case specific data into generic data.

Hmm, what does the term "generic data" refer to? Would you mind giving 
an example? TIA!

> As such, the adapter can also be responsible for more than
> initialization data, it can become the use case specific way of
> communicating between the app and the components.  Instead of using
> marker interfaces to define the use case specific responsibilities you
> end up with the adapter having multiple methods for different use
> cases as needed.

As I understand it, one would have to provide application-specific 
adapters to "environmentalize" multi-purpose components; I don't see a 
way how this could be handled by a single adapter class for the whole 
application …

I'll think a bit more about this, sorry if my remarks are rubbish :)

-- Andreas


-- 
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

Re: [Corona] Pipeline API

Posted by Peter Hunsberger <pe...@gmail.com>.

On Thu, Jul 17, 2008 at 5:27 PM, Steven Dolg <st...@gmx.at> wrote:
> Peter Hunsberger wrote:
>>

<snip/>

> AFAIK an adapter is used to adapt one interface or class to another.
> So what does the adapter adapt - what's the adapted class/interface?

Well, given that the entire discussion is abstract, that's sort of hard
to answer... However, I sort of covered a bit of this: if Corona is
going to support non-SAX pipelines then you have to assume sources and
sinks that you'll want to mix and match.  In addition you can assume
multiple types of applications, for example Servlets and Command Line
(and I'm sure others).  Whenever I see three distinct orthogonal class
dimensions like this it seems that you end up with a multitude of
constructors or initialization methods, each taking a different set
of classes. I'm basically proposing that we end up with a
single set of Adapter classes instead.  The exterior classes (being
adapted from) come from the application specific environments and
may be Servlet context, pipeline config, command line params or even
the output of other pipelines (I've got a secret agenda here).  The
interior classes (those we are adapting to) are yet to be defined but are
basically going to configure the pipeline components.

>
> Since c.setup(adp) belongs to the general PipelineComponent, the
> MyUseCaseAdapter has to be passed as Adapter (exactly as described above).
> So this way providing parameters might become easier, but IMO reading
> parameters (inside the component) is just the same as using a map (also
> exactly as described above).
> Except you want to go for a downcast... ;-)

Umm no, I'm trying to move away from something like a Map; the
example was just because Carsten was already using a Map and there
should be no reason that the Adapter shouldn't make it easy for him to
retain most of his existing code.

>
> Furthermore MyUseCaseAdapter would have to account for every possible
> configuration requirement of every component that might be in the pipeline.
> This could become quite messy if the pipeline is assembled dynamically.
>

Not sure how you figure this?  If there is a mess the adapter makes it
easier to handle, not worse, since you're basically converting to a
canonical form instead of having something like N x M transforms.

>
> The other thing I'm not sure about:
> Why would I want to iterate over the pipeline components?

That was from the previous code as an example of pipeline
configuration (I think).  I'm guessing this is similar to Cocoon where
all the setup methods are run for the pipeline.

> Somehow I think it should not be necessary that client (or sitemap) code
> directly accesses the components of a pipeline after it is assembled.
> That's also the reason, why there is no method to expose the components on
> the pipeline interface.

Yea, the implementation sketched out is part of the internal Corona
code, it's not something the client / app code should see.

Gotta run, hope that doesn't confuse things even more...

-- 
Peter Hunsberger

Re: [Corona] Pipeline API

Posted by Steven Dolg <st...@gmx.at>.

Peter Hunsberger wrote:
> On Thu, Jul 17, 2008 at 9:22 AM, Carsten Ziegeler <cz...@apache.org> wrote:
>   
>> Andreas Hartmann wrote:
>>     
>
>   
>>> I don't think that the calling code has to know the actual components,
>>> but rather the environment-specific interfaces of the components. It
>>> only makes sense to pass an environment to a pipeline component if the
>>> component is designed to use this environment.
>>>       
>
> Yeah, but if you've got really generic code this can be hard to figure out....
>
>   
>> Agreed.
>>
>>     
>>> Maybe I can try to come up with a more generic example:
>>>
>>> public interface WebappPipelineComponent extends PipelineComponent {
>>>  void setRequest(Request request);
>>> }
>>>
>>> Client code inside a web application:
>>>
>>> public void parameterizeComponents(Request req, Pipeline pipeline) {
>>>  for (Iterator i = pipeline.getComponents().iterator(); … ) {
>>>    PipelineComponent c = (PipelineComponent) i.next();
>>>    if (c instanceof WebappPipelineComponent) {
>>>      WebappPipelineComponent wpc = (…) c;
>>>      wpc.setRequest(req);
>>>    }
>>>  }
>>> }
>>>
>>> The pipeline is executed in a specific environment. The actual
>>> pipeline object itself is oblivious of the environment information, but
>>> the pipeline components are directly dependent on the environment.
>>>       
>> Hmm, yes this would work, but :) this would make it harder to have a
>> reusable pipeline implementation that frees my application from passing the
>> information to the components.
>> Currently the app creates a map, passes it to the pipeline implementation
>> and this implementation passes the map on to the components.
>> With the approach above, I would need a custom pipeline implementation to do
>> this. Furthermore there might be a lot of marker interfaces to test.
>> Actually I'm not sure which approach is nicer :)
>>     
>
> I keep wondering if this is perhaps an Adapter type pattern?  You've
> potentially got a lot of different types of applications that each
> could have different requirements for setting up pipelines.
> Similarly, you've potentially got a lot of different types of
> pipelines (in particular since Corona isn't just SAX).   So what you
> do is define some adapter that gets passed around and leave it up to
> the adapter to manage the use case specifics,
>
> Adapter adp = new MyUseCaseAdapter();
> adp.setRequest(  req );
>    .
>    .
>    .
>   for (Iterator i = pipeline.getComponents().iterator(); … ) {
>     PipelineComponent c = (PipelineComponent) i.next();
>       c.setup(adp);
>     }
>
> And then in the component:
>
> Object myConfigParam = adp.getParam( NAME );
>
> has no knowledge of how the param (or params) was passed into the adapter.
>   
AFAIK an adapter is used to adapt one interface or class to another.
So what does the adapter adapt - what's the adapted class/interface?

Since c.setup(adp) belongs to the general PipelineComponent, the 
MyUseCaseAdapter has to be passed as Adapter (exactly as described above).
So this way providing parameters might become easier, but IMO reading 
parameters (inside the component) is just the same as using a map (also 
exactly as described above).
Except you want to go for a downcast... ;-)

Furthermore MyUseCaseAdapter would have to account for every possible 
configuration requirement of every component that might be in the pipeline.
This could become quite messy if the pipeline is assembled dynamically.


The other thing I'm not sure about:
Why would I want to iterate over the pipeline components?
Somehow I think it should not be necessary that client (or sitemap) code 
directly accesses the components of a pipeline after it is assembled.
That's also the reason, why there is no method to expose the components 
on the pipeline interface.

The Law of Demeter is surely something to argue about, but IMO it is 
usually better to follow it - unless you have a good reason, which I 
fail to see here.

> The app is responsible for setting up the adapter with app specific
> data and the adapter has multiple standard methods for allowing this
> (and can be extended for new use cases).  The adapter is responsible
> for passing it into the components in some more standardized way.
> Think of the adapter as a data class with extra logic for converting
> use case specific data into generic data.
> As such, the adapter can also be responsible for more than
> initialization data, it can become the use case specific way of
> communicating between the app and the components.  Instead of using
> marker interfaces to define the use case specific responsibilities you
> end up with the adapter having multiple methods for different use
> cases as needed.
>
>

Re: [Corona] Pipeline API

Posted by Peter Hunsberger <pe...@gmail.com>.

On Thu, Jul 17, 2008 at 9:22 AM, Carsten Ziegeler <cz...@apache.org> wrote:
> Andreas Hartmann wrote:

>>
>> I don't think that the calling code has to know the actual components,
>> but rather the environment-specific interfaces of the components. It
>> only makes sense to pass an environment to a pipeline component if the
>> component is designed to use this environment.
>

Yeah, but if you've got really generic code this can be hard to figure out....

> Agreed.
>
>> Maybe I can try to come up with a more generic example:
>>
>> public interface WebappPipelineComponent extends PipelineComponent {
>>  void setRequest(Request request);
>> }
>>
>> Client code inside a web application:
>>
>> public void parameterizeComponents(Request req, Pipeline pipeline) {
>>  for (Iterator i = pipeline.getComponents().iterator(); … ) {
>>    PipelineComponent c = (PipelineComponent) i.next();
>>    if (c instanceof WebappPipelineComponent) {
>>      WebappPipelineComponent wpc = (…) c;
>>      wpc.setRequest(req);
>>    }
>>  }
>> }
>>
>> The pipeline is executed in a specific environment. The actual
>> pipeline object itself is oblivious of the environment information, but
>> the pipeline components are directly dependent on the environment.
>
> Hmm, yes this would work, but :) this would make it harder to have a
> reusable pipeline implementation that frees my application from passing the
> information to the components.
> Currently the app creates a map, passes it to the pipeline implementation
> and this implementation passes the map on to the components.
> With the approach above, I would need a custom pipeline implementation to do
> this. Furthermore there might be a lot of marker interfaces to test.
> Actually I'm not sure which approach is nicer :)

I keep wondering if this is perhaps an Adapter type pattern?  You've
potentially got a lot of different types of applications that each
could have different requirements for setting up pipelines.
Similarly, you've potentially got a lot of different types of
pipelines (in particular since Corona isn't just SAX).   So what you
do is define some adapter that gets passed around and leave it up to
the adapter to manage the use case specifics,

Adapter adp = new MyUseCaseAdapter();
adp.setRequest(  req );
   .
   .
   .
  for (Iterator i = pipeline.getComponents().iterator(); … ) {
    PipelineComponent c = (PipelineComponent) i.next();
      c.setup(adp);
    }

And then in the component:

Object myConfigParam = adp.getParam( NAME );

The component has no knowledge of how the param (or params) was passed
into the adapter.

The app is responsible for setting up the adapter with app specific
data and the adapter has multiple standard methods for allowing this
(and can be extended for new use cases).  The adapter is responsible
for passing it into the components in some more standardized way.
Think of the adapter as a data class with extra logic for converting
use case specific data into generic data.
As such, the adapter can also be responsible for more than
initialization data, it can become the use case specific way of
communicating between the app and the components.  Instead of using
marker interfaces to define the use case specific responsibilities you
end up with the adapter having multiple methods for different use
cases as needed.
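
A minimal sketch of the adapter idea described above. All type and method names here (SetupAdapter, AdaptedComponent, WebRequestAdapter, StylesheetComponent, the "style" parameter) are hypothetical and chosen for illustration only; none of them are taken from Corona. The point is just that the application talks to the use-case specific adapter, while the component only sees a generic getParam() contract.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic contract seen by the components.
interface SetupAdapter {
    Object getParam(String name);
}

// Hypothetical component contract: setup receives the adapter.
interface AdaptedComponent {
    void setup(SetupAdapter adapter);
}

// Use-case specific adapter: the application feeds it request data,
// and it exposes that data as generic named parameters.
class WebRequestAdapter implements SetupAdapter {
    private final Map<String, Object> params = new HashMap<>();

    // Stand-in for setRequest(req): copies request parameters into the adapter.
    void setRequestParameters(Map<String, String> requestParameters) {
        params.putAll(requestParameters);
    }

    @Override
    public Object getParam(String name) {
        return params.get(name);
    }
}

// A component that does not know where the parameter came from.
class StylesheetComponent implements AdaptedComponent {
    String stylesheet;

    @Override
    public void setup(SetupAdapter adapter) {
        stylesheet = (String) adapter.getParam("style");
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        WebRequestAdapter adapter = new WebRequestAdapter();
        adapter.setRequestParameters(Map.of("style", "plain.xsl"));

        List<AdaptedComponent> components = new ArrayList<>();
        StylesheetComponent component = new StylesheetComponent();
        components.add(component);

        // The generic loop only passes the adapter along.
        for (AdaptedComponent c : components) {
            c.setup(adapter);
        }
        System.out.println(component.stylesheet); // prints "plain.xsl"
    }
}
```

A different application would supply a different SetupAdapter implementation and leave the loop and the components unchanged.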

-- 
Peter Hunsberger

Re: [Corona] PIpeline API

Posted by Steven Dolg <st...@gmx.at>.

Carsten Ziegeler wrote:
> Andreas Hartmann wrote:
>> Carsten Ziegeler wrote:
>>> Bertrand Delacretaz wrote:
>>>> On Wed, Jul 16, 2008 at 5:14 AM, Sylvain Wallez 
>>>> <sy...@apache.org> wrote:
>>>>
>>>>> ...Can't we just give the parameters they need
>>>>> individually to every pipeline component, thus allowing per-component
>>>>> strongly typed and well defined contracts?..
>>>>
>>>> I'm also catching up on this thread but I tend to agree with the above
>>>> suggestion, i.e. something like
>>>>
>>>>   Transformer t = new XsltTransformer();
>>>>   t.setXslt("cocoon:/mytransform.xsl");
>>>>   t.setRequestParameters(request.getParameterMap());
>>>>   ...
>>>>
>>> Ok, in this case you can't use "Transformer" as the class type. This 
>>> would be:
>>> XsltTransformer t = ...
>>>
>>> Now, it seems that we are still mixing up things here. As I said, 
>>> I'm not talking about configuration of the components. The 
>>> stylesheet from above is a configuration.
>>> We're talking about information about the current environment for 
>>> executing the already configured pipeline.
>>
>> IIUC the request parameters in the above example would belong to the 
>> execution environment, wouldn't they?
>>
>> I'd imagine something like this:
>>
>> public void parameterizeTransformers(Request req, Pipeline pipeline) {
>>   for (Iterator i = pipeline.getTransformers().iterator(); … ) {
>>     Transformer t = (Transformer) i.next();
>>     if (t instanceof WebappXsltTransformer) {
>>       WebappXsltTransformer xsltTr = (WebappXsltTransformer) t;
>>       if (xsltTr.useRequestParameters()) {
>>         xsltTr.setXsltParams(req.getParameterMap());
>>       }
>>     }
>>   }
>> }
>>
> Now all these examples assume that the calling code knows the components.
> For my use case - and it's the same with the Cocoon sitemap - I've a
> description of a pipeline (think of the sitemap) which has just the name
> of the pipeline components to chain. A generic code instantiates these
> pipeline components - through a service registry - and does not know 
> anything about these components apart from the fact that they are 
> pipeline components.
I completely agree here.
Specific situations might allow specific solutions, e.g. providing 
listeners/callbacks while creating the individual components.
However a more general solution that also supports more general 
scenarios like the one described above (components are created by a 
service/factory/etc.) should also be available.

I wouldn't mind having a general purpose setup/teardown (or 
preExecution/postExecution, etc.) method defined and called by the 
pipeline API itself.
Might not be a good example, but JUnit provides such methods for unit 
tests and does not delegate this to another framework/container that 
might or might not be available, since this is a genuine requirement for 
certain tasks to be solved with the framework.

However the listener approach also appears to be appealing to me. 
Although I'm not entirely sure this would be as easy to use as the 
lifecycle methods proposed before (especially when a pipeline is built 
by the sitemap engine or service/factory).
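
To make the lifecycle idea concrete, here is a rough sketch of a pipeline that calls setup/cleanup itself, in the spirit of JUnit's setUp/tearDown. The interface and class names (LifecycleComponent, LifecyclePipeline, RecordingComponent) are assumptions for illustration and not the Corona API. The only contract shown is that cleanup() runs for every component whose setup() completed, even when execution fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component contract with explicit lifecycle methods.
interface LifecycleComponent {
    void setup();
    void execute();
    void cleanup();
}

class LifecyclePipeline {
    private final List<LifecycleComponent> components = new ArrayList<>();

    void add(LifecycleComponent component) {
        components.add(component);
    }

    // Sets up all components, executes them, and always cleans up
    // the components whose setup() completed.
    void run() {
        List<LifecycleComponent> initialized = new ArrayList<>();
        try {
            for (LifecycleComponent c : components) {
                c.setup();
                initialized.add(c);
            }
            for (LifecycleComponent c : components) {
                c.execute();
            }
        } finally {
            for (LifecycleComponent c : initialized) {
                c.cleanup();
            }
        }
    }
}

// Test double that records which lifecycle methods were called.
class RecordingComponent implements LifecycleComponent {
    private final List<String> log;
    private final boolean failOnExecute;

    RecordingComponent(List<String> log, boolean failOnExecute) {
        this.log = log;
        this.failOnExecute = failOnExecute;
    }

    @Override
    public void setup() {
        log.add("setup");
    }

    @Override
    public void execute() {
        if (failOnExecute) {
            throw new IllegalStateException("execution failed");
        }
        log.add("execute");
    }

    @Override
    public void cleanup() {
        log.add("cleanup");
    }
}
```

With this shape a component that opens a resource in setup() can rely on cleanup() to release it, without depending on an external container.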

Steven
>
> Carsten
>

Re: [Corona] PIpeline API

Posted by Ralph Goers <Ra...@dslextreme.com>.

Carsten Ziegeler wrote:
>
> Ok, this all depends on what you consider configuration vs execution 
> information. If you look at the current Cocoon sitemap components 
> they've only a little configuration (everything that can be configured 
> in the "components" section of the sitemap). Most information is 
> passed in as execution information like the source to read from or any 
> additional parameter.
>
> The component configuration can be easily done by a spring bean 
> configuration. The execution information is the interesting part as 
> this is different between each pipeline run.
You've seen the sitemaps and configuration we used. Often we would code 
something like src="${request:bankid}/somedata.xml" in the xconf (Spring 
configuration). On every execution bankid would be dynamically resolved 
so that every request could be on behalf of different banks. Of course, 
the ability to do this is highly dependent on the component's (or the 
sitemap's) ability to perform variable resolution. We were often forced 
to do something like src="cocoon://${request:bankid}/somedata.xml" 
just to get around this. 
Wouldn't liberal use of this concept solve the problem?
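
As an illustration of the variable resolution mentioned above, the following sketch expands ${request:name} placeholders against a map of request parameters. RequestVariableResolver is a hypothetical helper written for this example; it is not the Cocoon input module mechanism, and the choice to leave unknown names untouched is an assumption.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: expands ${request:name} placeholders in a string.
class RequestVariableResolver {
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{request:([^}]+)\\}");

    // Replaces each ${request:name} with the matching request parameter.
    // Placeholders without a matching parameter are left as they are.
    static String resolve(String template, Map<String, String> requestParameters) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = requestParameters.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}

class ResolverDemo {
    public static void main(String[] args) {
        String src = RequestVariableResolver.resolve(
                "${request:bankid}/somedata.xml", Map.of("bankid", "acme"));
        System.out.println(src); // prints "acme/somedata.xml"
    }
}
```

Resolving on every execution is what lets the same configured component serve a different bank per request.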
>
> So, some pseudo code could look like this:
>
> // read configuration of pipeline from "somewhere"
> String type = // the component type
> Map m = // the execution information
>
> Transformer t = springContext.getBean(type);
> t.setup(m);
>
> Now, my configuration of the pipeline is actually stored in a JCR :) 
> but this is comparable with using the sitemap.xmap. Of course it 
> should contain a "src" information for the transformer to work.
Are you just getting the pipeline via WebDAV? If so that isn't really 
any different. If you are using the JCR API that could probably just be 
abstracted into a JCR protocol handler.

Am I missing something?

Ralph

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Ralph Goers wrote:
>>
> Can you show an example?
> 
> Can they be declared as Spring beans and wired with the appropriate 
> parameters, or at least beans that know how to dynamically obtain the 
> correct information and then be referenced in the sitemap?
> 
Ok, this all depends on what you consider configuration vs execution 
information. If you look at the current Cocoon sitemap components 
they've only a little configuration (everything that can be configured 
in the "components" section of the sitemap). Most information is passed 
in as execution information like the source to read from or any 
additional parameter.

The component configuration can be easily done by a spring bean 
configuration. The execution information is the interesting part as this 
is different between each pipeline run.

So, some pseudo code could look like this:

// read configuration of pipeline from "somewhere"
String type = // the component type
Map m = // the execution information

Transformer t = springContext.getBean(type);
t.setup(m);

Now, my configuration of the pipeline is actually stored in a JCR :) but 
this is comparable with using the sitemap.xmap. Of course it should 
contain a "src" information for the transformer to work.

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Ralph Goers <Ra...@dslextreme.com>.


Carsten Ziegeler wrote:
>
> Now all these examples assume that the calling code knows the components.
> For my use case - and it's the same with the Cocoon sitemap - I've a
> description of a pipeline (think of the sitemap) which has just the name
> of the pipeline components to chain. A generic code instantiates these
> pipeline components - through a service registry - and does not know 
> anything about these components apart from the fact that they are 
> pipeline components.
>
Can you show an example?

Can they be declared as Spring beans and wired with the appropriate 
parameters, or at least beans that know how to dynamically obtain the 
correct information and then be referenced in the sitemap?

Ralph

Re: [Corona] PIpeline API

Posted by Andreas Hartmann <an...@apache.org>.

Carsten Ziegeler wrote:

[…]

>> Client code inside a web application:
>>
>> public void parameterizeComponents(Request req, Pipeline pipeline) {
>>   for (Iterator i = pipeline.getComponents().iterator(); … ) {
>>     PipelineComponent c = (PipelineComponent) i.next();
>>     if (c instanceof WebappPipelineComponent) {
>>       WebappPipelineComponent wpc = (…) c;
>>       wpc.setRequest(req);
>>     }
>>   }
>> }
>>
>> The pipeline is executed in a specific environment. The actual
>> pipeline object itself is oblivious of the environment information, but
>> the pipeline components are directly dependent on the environment.
>
> Hmm, yes this would work, but :) this would make it harder to have a 
> reusable pipeline implementation that frees my application from passing 
> the information to the components.

Does it make a big difference if the information is passed to the 
pipeline or to the components? To get rid of the boilerplate loop above, 
the visitor pattern could be used:

public class Webapp implements PipelineEnvironment {

   public void preparePipelineForExecution() {
     // let the pipeline visit its components
     this.pipeline.setEnvironment(this);
   }

   /**
    * @see PipelineEnvironment.environmentalize(…)
    */
   public void environmentalize(PipelineComponent c) {
     if (c instanceof WebappPipelineComponent) {
       WebappPipelineComponent wpc = (…) c;
       wpc.setRequest(req);
     }
   }

}


> Currently the app creates a map, passes it to the pipeline 
> implementation and this implementation passes the map on to the components.
> With the approach above, I would need a custom pipeline implementation 
> to do this.

Hmm, why would you need a custom pipeline implementation? Wouldn't this 
be generic enough:

public class PipelineImpl implements Pipeline {

   public void setEnvironment(PipelineEnvironment env) {
     for (Iterator i = getComponents().iterator(); … ) {
       PipelineComponent c = (…) i.next();
       env.environmentalize(c);
     }
   }

}


> Furthermore there might be a lot of marker interfaces to test.

The marker interfaces wouldn't be generic, but application-specific, so 
there would only be one needed per application.

I imagine using application-specific wrappers for "standard" pipeline 
components, e.g.

public class WebappXsltTransformerWrapper implements Transformer,
   WebappPipelineComponent {

   private XsltTransformer delegate;

   public void setRequest(Request req) {
     if (this.useRequestParams()) {
       this.delegate.setXsltParams(req.getParameterMap());
     }
   }
}

In this case the application can provide a layer of classes to adapt the 
standard pipeline components to the application-specific environment. 
The marker interface check would allow other components to be used as 
well - they just wouldn't be "environmentalized".

-- Andreas



-- 
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Andreas Hartmann wrote:
 > <SNIP/>
> 
> The DSL interpreter would use reflection to call the setStylesheet() and 
> setXsltParams() methods of the XsltTransformer. A resolver service would 
> be used for parameter expansion, e.g. for input module calls in Cocoon.
Yes, sure, reflection tricks could of course be used instead. This would 
also detect typos in the parameters, as the corresponding set method 
would not be available.

> About the first aspect:
> 
> I don't think that the calling code has to know the actual components,
> but rather the environment-specific interfaces of the components. It
> only makes sense to pass an environment to a pipeline component if the
> component is designed to use this environment. 
Agreed.

> Maybe I can try to come
> up with a more generic example:
> 
> public interface WebappPipelineComponent extends PipelineComponent {
>   void setRequest(Request request);
> }
> 
> Client code inside a web application:
> 
> public void parameterizeComponents(Request req, Pipeline pipeline) {
>   for (Iterator i = pipeline.getComponents().iterator(); … ) {
>     PipelineComponent c = (PipelineComponent) i.next();
>     if (c instanceof WebappPipelineComponent) {
>       WebappPipelineComponent wpc = (…) c;
>       wpc.setRequest(req);
>     }
>   }
> }
> 
> The pipeline is executed in a specific environment. The actual
> pipeline object itself is oblivious of the environment information, but
> the pipeline components are directly dependent on the environment.
Hmm, yes this would work, but :) this would make it harder to have a 
reusable pipeline implementation that frees my application from passing 
the information to the components.
Currently the app creates a map, passes it to the pipeline 
implementation and this implementation passes the map on to the components.
With the approach above, I would need a custom pipeline implementation 
to do this. Furthermore there might be a lot of marker interfaces to test.
Actually I'm not sure which approach is nicer :)

Carsten


-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Andreas Hartmann <an...@apache.org>.

Hi Carsten,

Carsten Ziegeler wrote:

[…]

>> I'd imagine something like this:
>>
>> public void parameterizeTransformers(Request req, Pipeline pipeline) {
>>   for (Iterator i = pipeline.getTransformers().iterator(); … ) {
>>     Transformer t = (Transformer) i.next();
>>     if (t instanceof WebappXsltTransformer) {
>>       WebappXsltTransformer xsltTr = (WebappXsltTransformer) t;
>>       if (xsltTr.useRequestParameters()) {
>>         xsltTr.setXsltParams(req.getParameterMap());
>>       }
>>     }
>>   }
>> }
>>
> Now all these examples assume that the calling code knows the components.

Maybe we have to differentiate between two aspects of execution environment:

1) Information that is not contained in the pipeline description, e.g. 
the request in a web application

2) Parameters which are part of the pipeline description DSL (like 
<map:parameter/> in Cocoon)

The second aspect could be handled by the pipeline DSL interpreter in a 
generic way. I imagine something like this:

   match wildcard(pattern: "*") :
   generate xml(uri: "context://{1}.xml") >
   transform xslt(stylesheet: "{request-param:style}", xsltParams: …) >
   serialize xml;

The DSL interpreter would use reflection to call the setStylesheet() and 
setXsltParams() methods of the XsltTransformer. A resolver service would 
be used for parameter expansion, e.g. for input module calls in Cocoon.
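
A rough sketch of how such a DSL interpreter might map named parameters onto setters via reflection. Only the setter name setStylesheet() comes from the text above; DemoXsltTransformer and ReflectiveConfigurer are hypothetical, and the sketch assumes every parameter is a String and that parameter names are non-empty.

```java
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical component with a conventional String setter.
class DemoXsltTransformer {
    String stylesheet;

    public void setStylesheet(String stylesheet) {
        this.stylesheet = stylesheet;
    }
}

class ReflectiveConfigurer {
    // For every entry "name" -> value, calls the public method setName(String).
    // A misspelled parameter name fails because no matching setter exists.
    static void apply(Object component, Map<String, String> params) {
        for (Map.Entry<String, String> entry : params.entrySet()) {
            String name = entry.getKey();
            String setter = "set" + Character.toUpperCase(name.charAt(0))
                    + name.substring(1);
            try {
                Method method = component.getClass().getMethod(setter, String.class);
                method.invoke(component, entry.getValue());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException(
                        "Cannot set parameter '" + name + "' via " + setter + "()", e);
            }
        }
    }
}

class ReflectionDemo {
    public static void main(String[] args) {
        DemoXsltTransformer transformer = new DemoXsltTransformer();
        ReflectiveConfigurer.apply(transformer, Map.of("stylesheet", "style.xsl"));
        System.out.println(transformer.stylesheet); // prints "style.xsl"
    }
}
```

Parameter expansion (the resolver service mentioned above) would run on each value before the setter is invoked; that step is left out here.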

About the first aspect:

I don't think that the calling code has to know the actual components,
but rather the environment-specific interfaces of the components. It
only makes sense to pass an environment to a pipeline component if the
component is designed to use this environment. Maybe I can try to come
up with a more generic example:

public interface WebappPipelineComponent extends PipelineComponent {
   void setRequest(Request request);
}

Client code inside a web application:

public void parameterizeComponents(Request req, Pipeline pipeline) {
   for (Iterator i = pipeline.getComponents().iterator(); … ) {
     PipelineComponent c = (PipelineComponent) i.next();
     if (c instanceof WebappPipelineComponent) {
       WebappPipelineComponent wpc = (…) c;
       wpc.setRequest(req);
     }
   }
}

The pipeline is executed in a specific environment. The actual
pipeline object itself is oblivious of the environment information, but
the pipeline components are directly dependent on the environment.

You gave this example in a subsequent mail:

   String type = // the component type
   Map m = // the execution information

   Transformer t = springContext.getBean(type);
   t.setup(m);

IIUC this code would be part of the "pipeline executor" implementation. 
I assume that m only contains information in the sense of aspect 1 
described above, because the DSL params wouldn't have to be passed to 
the pipeline. If this is the case, there has to be client code like this:

   Map m = new HashMap();
   m.put("request", this.request);
   pipelineExecutor.setEnvironment(m);

Passing the environment to the pipeline is only necessary if there might 
be some pipeline components which could be interested in the request 
object. Wouldn't it then make sense to pass this information directly to 
these components, like in the parameterizeComponents() method above?

-- Andreas



-- 
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Andreas Hartmann wrote:
> Carsten Ziegeler wrote:
>> Bertrand Delacretaz wrote:
>>> On Wed, Jul 16, 2008 at 5:14 AM, Sylvain Wallez <sy...@apache.org> 
>>> wrote:
>>>
>>>> ...Can't we just give the parameters they need
>>>> individually to every pipeline component, thus allowing per-component
>>>> strongly typed and well defined contracts?..
>>>
>>> I'm also catching up on this thread but I tend to agree with the above
>>> suggestion, i.e. something like
>>>
>>>   Transformer t = new XsltTransformer();
>>>   t.setXslt("cocoon:/mytransform.xsl");
>>>   t.setRequestParameters(request.getParameterMap());
>>>   ...
>>>
>> Ok, in this case you can't use "Transformer" as the class type. This 
>> would be:
>> XsltTransformer t = ...
>>
>> Now, it seems that we are still mixing up things here. As I said, I'm 
>> not talking about configuration of the components. The stylesheet from 
>> above is a configuration.
>> We're talking about information about the current environment for 
>> executing the already configured pipeline.
> 
> IIUC the request parameters in the above example would belong to the 
> execution environment, wouldn't they?
> 
> I'd imagine something like this:
> 
> public void parameterizeTransformers(Request req, Pipeline pipeline) {
>   for (Iterator i = pipeline.getTransformers().iterator(); … ) {
>     Transformer t = (Transformer) i.next();
>     if (t instanceof WebappXsltTransformer) {
>       WebappXsltTransformer xsltTr = (WebappXsltTransformer) t;
>       if (xsltTr.useRequestParameters()) {
>         xsltTr.setXsltParams(req.getParameterMap());
>       }
>     }
>   }
> }
> 
Now all these examples assume that the calling code knows the components.
For my use case - and it's the same with the Cocoon sitemap - I've a
description of a pipeline (think of the sitemap) which has just the name
of the pipeline components to chain. Generic code instantiates these
pipeline components - through a service registry - and does not know 
anything about these components apart from the fact that they are 
pipeline components.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Andreas Hartmann <an...@apache.org>.

Carsten Ziegeler wrote:
> Bertrand Delacretaz wrote:
>> On Wed, Jul 16, 2008 at 5:14 AM, Sylvain Wallez <sy...@apache.org> 
>> wrote:
>>
>>> ...Can't we just give the parameters they need
>>> individually to every pipeline component, thus allowing per-component
>>> strongly typed and well defined contracts?..
>>
>> I'm also catching up on this thread but I tend to agree with the above
>> suggestion, i.e. something like
>>
>>   Transformer t = new XsltTransformer();
>>   t.setXslt("cocoon:/mytransform.xsl");
>>   t.setRequestParameters(request.getParameterMap());
>>   ...
>>
> Ok, in this case you can't use "Transformer" as the class type. This 
> would be:
> XsltTransformer t = ...
> 
> Now, it seems that we are still mixing up things here. As I said, I'm 
> not talking about configuration of the components. The stylesheet from 
> above is a configuration.
> We're talking about information about the current environment for 
> executing the already configured pipeline.

IIUC the request parameters in the above example would belong to the 
execution environment, wouldn't they?

I'd imagine something like this:

public void parameterizeTransformers(Request req, Pipeline pipeline) {
   for (Iterator i = pipeline.getTransformers().iterator(); … ) {
     Transformer t = (Transformer) i.next();
     if (t instanceof WebappXsltTransformer) {
       WebappXsltTransformer xsltTr = (WebappXsltTransformer) t;
       if (xsltTr.useRequestParameters()) {
         xsltTr.setXsltParams(req.getParameterMap());
       }
     }
   }
}

IMO passing the execution environment to a pipeline component depends 
very much on the nature of the environment, and also on the nature of 
the individual components. I'm not sure if it should be attempted to 
find a generic API for this. Wouldn't it be sufficient to let the client 
code handle this? I guess this is what Torsten and Jörg had in mind when 
they suggested a callback mechanism:


public class ServletPipelineInvoker {

   protected Pipeline createPipeline() {
     Pipeline pipeline = new NonCachingPipeline();
     pipeline.addListener(this);
     return pipeline;
   }

   /**
    * Called before pipeline execution.
    * @see PipelineListener.contextualizePipeline(Pipeline)
    */
   public void contextualizePipeline(Pipeline pipeline) {
     parameterizeTransformers(this.request, pipeline);
   }

}


-- Andreas


-- 
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Bertrand Delacretaz wrote:
> On Wed, Jul 16, 2008 at 5:14 AM, Sylvain Wallez <sy...@apache.org> wrote:
> 
>> ...Can't we just give the parameters they need
>> individually to every pipeline component, thus allowing per-component
>> strongly typed and well defined contracts?..
> 
> I'm also catching up on this thread but I tend to agree with the above
> suggestion, i.e. something like
> 
>   Transformer t = new XsltTransformer();
>   t.setXslt("cocoon:/mytransform.xsl");
>   t.setRequestParameters(request.getParameterMap());
>   ...
> 
Ok, in this case you can't use "Transformer" as the class type. This 
would be:
XsltTransformer t = ...

Now, it seems that we are still mixing up things here. As I said, I'm 
not talking about configuration of the components. The stylesheet from 
above is a configuration.
We're talking about information about the current environment for 
executing the already configured pipeline.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Bertrand Delacretaz <bd...@apache.org>.

On Wed, Jul 16, 2008 at 5:14 AM, Sylvain Wallez <sy...@apache.org> wrote:

> ...Can't we just give the parameters they need
> individually to every pipeline component, thus allowing per-component
> strongly typed and well defined contracts?..

I'm also catching up on this thread but I tend to agree with the above
suggestion, i.e. something like

  Transformer t = new XsltTransformer();
  t.setXslt("cocoon:/mytransform.xsl");
  t.setRequestParameters(request.getParameterMap());
  ...

looks clearer than

  Transformer t = new XsltTransformer(opaqueMapOfParameters);

And the second form could be added later, if needed for some other reason.

I didn't even look at Corona lately, no idea how things work
currently, just trying to explain how I'd like things to be.

-Bertrand

Re: [Corona] PIpeline API

Posted by Sylvain Wallez <sy...@apache.org>.

Carsten Ziegeler wrote:
> Peter Hunsberger wrote:
>> On Tue, Jul 15, 2008 at 5:42 AM, Reinhard Pötz <re...@apache.org> 
>> wrote:
>>
>>> Are you talking about passing the input parameters as parameters of the
>>> setup() method?
>>>
>>> void setup(Map<String, Object> inputParameters)
>>>
>>> I'd be fine by this.
>>>
>>
>> I hate seeing Maps used as dumping grounds for randomly typed objects.
>> Could you use something that gives a little more strong typing?
>> Perhaps more like a ServletContext though I don't think I'd go that
>> far in this case?
>>
> I agree that strong typing would be great - but the pipeline api does 
> not define any concrete key/object for the map. So this is use-case 
> specific. Therefore I think a map is the best we can come up with.

Trying to catch up on this discussion. What parameters are we talking 
about exactly? Are these pipeline parameters?

What's the need for that? Can't we just give the parameters they need 
individually to every pipeline component, thus allowing per-component 
strongly typed and well defined contracts?

Sylvain

-- 
Sylvain Wallez - http://bluxte.net

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Peter Hunsberger wrote:
> On Tue, Jul 15, 2008 at 5:42 AM, Reinhard Pötz <re...@apache.org> wrote:
> 
>> Are you talking about passing the input parameters as parameters of the
>> setup() method?
>>
>> void setup(Map<String, Object> inputParameters)
>>
>> I'd be fine by this.
>>
> 
> I hate seeing Maps used as dumping grounds for randomly typed objects.
> Could you use something that gives a little more strong typing?
> Perhaps more like a ServletContext though I don't think I'd go that
> far in this case?
> 
I agree that strong typing would be great - but the pipeline api does 
not define any concrete key/object for the map. So this is use-case 
specific. Therefore I think a map is the best we can come up with.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Peter Hunsberger <pe...@gmail.com>.

On Tue, Jul 15, 2008 at 5:42 AM, Reinhard Pötz <re...@apache.org> wrote:

> Are you talking about passing the input parameters as parameters of the
> setup() method?
>
> void setup(Map<String, Object> inputParameters)
>
> I'd be fine by this.
>

I hate seeing Maps used as dumping grounds for randomly typed objects.
Could you use something that gives a little more strong typing?
Perhaps more like a ServletContext though I don't think I'd go that
far in this case?
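
One possible middle ground between a raw Map and a fixed context class is a map with typed keys, sketched below. ParamKey and TypedParameters are hypothetical names invented for this example; nothing like this is claimed to exist in Corona. The keys stay use-case specific, but each key carries the type of its value, so callers do not cast.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical typed key: the type parameter records the value type.
final class ParamKey<T> {
    final String name;
    final Class<T> type;

    ParamKey(String name, Class<T> type) {
        this.name = name;
        this.type = type;
    }
}

// A parameter container that is still generic but checked per key.
class TypedParameters {
    private final Map<ParamKey<?>, Object> values = new HashMap<>();

    <T> void put(ParamKey<T> key, T value) {
        // cast() rejects values of the wrong runtime type.
        values.put(key, key.type.cast(value));
    }

    // Returns null when the key is absent.
    <T> T get(ParamKey<T> key) {
        return key.type.cast(values.get(key));
    }
}

class TypedParametersDemo {
    // Keys are defined once by the use case and shared as constants.
    static final ParamKey<String> BANK_ID = new ParamKey<>("bankId", String.class);
    static final ParamKey<Integer> MAX_ITEMS = new ParamKey<>("maxItems", Integer.class);

    public static void main(String[] args) {
        TypedParameters params = new TypedParameters();
        params.put(BANK_ID, "acme");
        params.put(MAX_ITEMS, 10);

        String bankId = params.get(BANK_ID);   // no cast needed
        int maxItems = params.get(MAX_ITEMS);
        System.out.println(bankId + " " + maxItems); // prints "acme 10"
    }
}
```

Keys compare by identity here, so two ParamKey instances with the same name are distinct entries; that is a deliberate simplification in this sketch.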

-- 
Peter Hunsberger

Re: [Corona] PIpeline API

Posted by Reinhard Pötz <re...@apache.org>.

Carsten Ziegeler wrote:
> Reinhard Pötz wrote:
>>
>> currently corona-pipeline (pipeline API, pipeline impls, SAX 
>> components) only has a dependency on commons-logging. Is this good 
>> enough for your needs?
> :) Ok, that should be fine.
> 
> I've moved the action to the sitemap module and split up execute into 
> setup and execute. I think we should specify in the contract that a 
> modifiable map is passed to the components during setup. Any objections?

Currently sitemap components get passed two maps:

  1. the input parameters (a map of all objects that you want to pass
     to a pipeline)

  2. the configuration parameters (to provide a way to pass parameters
     from the sitemap to the component -> for API usage those parameters
     are usually passed by constructor parameters or by setters).

Are you talking about passing the input parameters as parameters of the 
setup() method?

void setup(Map<String, Object> inputParameters)

I'd be fine with this.

> I'm wondering if we need the ErrorThrowingAction and the CustomException?

That is only needed for some integration tests in corona-sample where 
they should be moved to.

> I'll add a setup/cleanup method to the components next.

ok

-- 
Reinhard Pötz                           Managing Director, {Indoqa} GmbH
                          http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member                  reinhard@apache.org
________________________________________________________________________

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Reinhard Pötz wrote:
> 
> currently corona-pipeline (pipeline API, pipeline impls, SAX components) 
> only has a dependency on commons-logging. Is this good enough for your 
> needs?
:) Ok, that should be fine.

I've moved the action to the sitemap module and split up execute into 
setup and execute. I think we should specify in the contract that a 
modifiable map is passed to the components during setup. Any objections?
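
For illustration, the split could look roughly like the sketch below. The method names setup and execute follow the discussion; everything else (SetupComponent, TwoPhasePipeline, ContentTypeReporter, the "contentType" key) is a hypothetical stand-in and not the actual Corona code. Because the map is modifiable, a component can put information into it during setup and the caller can read it before calling execute().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical component contract: setup receives a modifiable map.
interface SetupComponent {
    void setup(Map<String, Object> parameters);
}

class TwoPhasePipeline {
    private final List<SetupComponent> components = new ArrayList<>();
    private boolean ready;

    void addComponent(SetupComponent component) {
        components.add(component);
    }

    // Phase 1: the same modifiable map is handed to every component.
    void setup(Map<String, Object> parameters) {
        for (SetupComponent c : components) {
            c.setup(parameters);
        }
        ready = true;
    }

    // Phase 2: takes no arguments; everything was provided during setup.
    void execute() {
        if (!ready) {
            throw new IllegalStateException("setup() must be called before execute()");
        }
        // Actual processing is omitted in this sketch.
    }
}

// A component that reports something back to the caller during setup.
class ContentTypeReporter implements SetupComponent {
    @Override
    public void setup(Map<String, Object> parameters) {
        parameters.put("contentType", "text/xml");
    }
}

class TwoPhaseDemo {
    public static void main(String[] args) {
        TwoPhasePipeline pipeline = new TwoPhasePipeline();
        pipeline.addComponent(new ContentTypeReporter());

        Map<String, Object> parameters = new HashMap<>();
        pipeline.setup(parameters);
        // The caller can inspect what components added before executing.
        System.out.println(parameters.get("contentType")); // prints "text/xml"
        pipeline.execute();
    }
}
```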

I'm wondering if we need the ErrorThrowingAction and the CustomException?

I'll add a setup/cleanup method to the components next.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Reinhard Pötz <re...@apache.org>.

Carsten Ziegeler wrote:
> Reinhard Pötz wrote:
>> I agree with you that the package structure should be cleaned up. It's 
>> also a good idea to create a 'corona-pipeline-sax' module that 
>> contains the SAX based components. I'm not so sure if we should really 
>> move the pipeline implementations into their own modules. This seems 
>> to be too much modularization for my taste. (The corona-pipeline.jar, 
>> that currently contains the SAX components, is about 70kb only.)
> Yes, I'm not concerned about the jar size itself, I'm concerned about 
> dependencies. The ideal solution would be to have an api jar which has 
> no other dependencies, perhaps one to a logging framework if we provide 
> utility or abstract classes.

Currently corona-pipeline (pipeline API, pipeline impls, SAX components) 
only has a dependency on commons-logging. Is this good enough for your 
needs?

>>> b) Actions should not be part of the pipeline api
>>> I think we discussed this some time ago :) Removing actions from the 
>>> pipeline stuff does not really hurt - they are invoked before the 
>>> pipeline, so it shouldn't be too hard to build custom code which 
>>> collects actions, assembles the pipeline, invokes the actions and 
>>> then the pipeline.
>>
>> no objection ;-)
>>
>>> c) Pre and post processing
>>> As the pipeline interfaces are not tied to sax or any other model 
>>> (which is ok), there is no explicit notion of indicating that the 
>>> processing starts or is finished - the latter is especially 
>>> interesting for cleanup. So I think we should add these two lifecycle 
>>> methods to the pipeline component interface.
>>
>> I don't see any problem either. Being curious, what are your use cases?
> I've some pipeline components that open/close resources, like a JCR 
> session for instance. If the contract includes that the pre and post 
> processing methods are always called, the post processing method acts 
> like kind of a disposal method where I can close my session.
> I've other - in this sense rather obscure - use cases where the pipeline 
> components need to pass information back to the caller before the 
> processing starts. So the caller calls the pre processing method where 
> the pipeline component adds something to the map, the caller can pick it 
> up and then start the processing based on this information.

thanks

>>> d) Splitting setup and execute
>>> I would like to split the Pipeline#execute method into two, one for 
>>> initialisation and one (without arguments) for executing.
>>
>> I was thinking about this myself because we need this separation also 
>> to optimize conditional GET operations when the servlet URLs are 
>> involved.
> Great :)
> 
> So how do we proceed? Do you mind if I start with the minor changes?

of course not :-)

Corona comes with integration tests that you can run by invoking

mvn clean install -P it -Dhtmlunit.base-url=http://localhost:8888

from the root directory. Please make sure that they run through after 
your refactorings. Many thanks in advance!

-- 
Reinhard Pötz                           Managing Director, {Indoqa} GmbH
                          http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member                  reinhard@apache.org
________________________________________________________________________

Re: [Corona] PIpeline API

Posted by Reinhard Pötz <re...@apache.org>.

Carsten Ziegeler wrote:
> Joerg Heinicke wrote:
>> Carsten Ziegeler <cziegeler <at> apache.org> writes:
>>
>>>>> c) Pre and post processing
>>>>> As the pipeline interfaces are not tied to sax or any other model 
>>>>> (which is ok), there is no explicit notion of indicating that the 
>>>>> processing starts or is finished - the latter is especially 
>>>>> interesting for cleanup. So I think we should add these two 
>>>>> lifecycle methods to the pipeline component interface.
>>>> I don't see any problem either. Being curious, what are your use cases?
>>> I've some pipeline components that open/close resources, like a JCR 
>>> session for instance. If the contract includes that the pre and post 
>>> processing methods are always called, the post processing method acts 
>>> like kind of a disposal method where I can close my session.
>>> I've other - in this sense rather obscure - use cases where the 
>>> pipeline components need to pass information back to the caller 
>>> before the processing starts. So the caller calls the pre processing 
>>> method where the pipeline component adds something to the map, the 
>>> caller can pick it up and then start the processing based on this 
>>> information.
>>
>> Isn't that a rather specific use case? It feels kinda "wrong" when
>> lifecycle methods will now "pollute" the pipeline interfaces rather
>> than letting pipeline components implement lifecycle interfaces.
>> Doesn't Spring show how to do these kind of things with templates or
>> aspects?
>>
>> And is there actually something like "start" and "end" in a pipeline?
>> How do you determine that? It's rather easy with SAX since there is
>> startDocument() and endDocument() but in general?
>>
> Hmm, someone starts the pipeline and at some point in time it ends :) As 
> we have an execute() method on the pipeline, the contract is that if 
> execute() returns, the pipeline is finished. Therefore it's easy to call 
> the corresponding methods inside execute() - regardless of which model is 
> used between the pipeline components.
> We have the pre-execution method already - that's the setup(Map) method.
> I have now added a finish() method which is called by the pipeline 
> implementation.
> This keeps me free from any configuration hassles with the various 
> containers. Some want to use spring, some others something different. 
> And perhaps someone doesn't want to use a container at all

that's IMO the main reason for adding the finish() method.

>, just 
> instantiate the objects, run the pipeline and that's it.
> Therefore I really think that these lifecycle methods belong to the api.
> I see no other reliable way of closing resources.

agreed

-- 
Reinhard Pötz                           Managing Director, {Indoqa} GmbH
                          http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member                  reinhard@apache.org
________________________________________________________________________

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Joerg Heinicke wrote:
> Carsten Ziegeler <cziegeler <at> apache.org> writes:
> 
>>>> c) Pre and post processing
>>>> As the pipeline interfaces are not tied to sax or any other model 
>>>> (which is ok), there is no explicit notion of indicating that the 
>>>> processing starts or is finished - the latter is especially 
>>>> interesting for cleanup. So I think we should add these two lifecycle 
>>>> methods to the pipeline component interface.
>>> I don't see any problem either. Being curious, what are your use cases?
>> I've some pipeline components that open/close resources, like a JCR 
>> session for instance. If the contract includes that the pre and post 
>> processing methods are always called, the post processing method acts 
>> like kind of a disposal method where I can close my session.
>> I've other - in this sense rather obscure - use cases where the pipeline 
>> components need to pass information back to the caller before the 
>> processing starts. So the caller calls the pre processing method where 
>> the pipeline component adds something to the map, the caller can pick it 
>> up and then start the processing based on this information.
> 
> Isn't that a rather specific use case? It feels kinda "wrong" when lifecycle
> methods will now "pollute" the pipeline interfaces rather than letting pipeline
> components implement lifecycle interfaces. Doesn't Spring show how to do these
> kind of things with templates or aspects?
> 
> And is there actually something like "start" and "end" in a pipeline? How do you
> determine that? It's rather easy with SAX since there is startDocument() and
> endDocument() but in general?
> 
Hmm, someone starts the pipeline and at some point in time it ends :) As 
we have an execute() method on the pipeline, the contract is that if 
execute() returns, the pipeline is finished. Therefore it's easy to call 
the corresponding methods inside execute() - regardless of which model is 
used between the pipeline components.
We have the pre-execution method already - that's the setup(Map) method.
I have now added a finish() method which is called by the pipeline 
implementation.
This keeps me free from any configuration hassles with the various 
containers. Some want to use spring, some others something different. 
And perhaps someone doesn't want to use a container at all, just 
instantiate the objects, run the pipeline and that's it.
Therefore I really think that these lifecycle methods belong to the api.
I see no other reliable way of closing resources.
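
To make the contract concrete, here is a minimal sketch of how execute() 
could guarantee the finish() call without any container. Only setup(Map), 
execute() and finish() are named in this thread; the type names, the 
process() placeholder and the exact signatures are assumptions made for 
illustration and are not taken from the Corona sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical component contract. Only setup(Map) and finish() are named
// in this thread; process() is a placeholder for the real work.
interface SketchComponent {
    void setup(Map<String, Object> parameters);
    void process();
    void finish();
}

// Hypothetical pipeline: execute() returning (or throwing) marks the end,
// so finish() is called from a finally block.
class SketchPipeline {
    private final List<SketchComponent> components = new ArrayList<>();

    void add(SketchComponent component) {
        components.add(component);
    }

    void setup(Map<String, Object> parameters) {
        for (SketchComponent component : components) {
            component.setup(parameters);
        }
    }

    void execute() {
        try {
            for (SketchComponent component : components) {
                component.process();
            }
        } finally {
            // Runs on normal return and on failure alike.
            for (SketchComponent component : components) {
                component.finish();
            }
        }
    }
}

// Helper that records which lifecycle methods were called, and in which order.
class RecordingComponent implements SketchComponent {
    final List<String> events = new ArrayList<>();
    private final boolean failInProcess;

    RecordingComponent(boolean failInProcess) {
        this.failInProcess = failInProcess;
    }

    public void setup(Map<String, Object> parameters) {
        events.add("setup");
    }

    public void process() {
        events.add("process");
        if (failInProcess) {
            throw new IllegalStateException("simulated failure");
        }
    }

    public void finish() {
        events.add("finish");
    }
}
```

A caller would just instantiate the objects, call setup(...) and then 
execute(). A more careful version would probably also guard each finish() 
call so that one failing component cannot prevent the others from cleaning 
up.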

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Joerg Heinicke <jo...@gmx.de>.

Carsten Ziegeler <cziegeler <at> apache.org> writes:

> >> c) Pre and post processing
> >> As the pipeline interfaces are not tied to sax or any other model 
> >> (which is ok), there is no explicit notion of indicating that the 
> >> processing starts or is finished - the latter is especially 
> >> interesting for cleanup. So I think we should add these two lifecycle 
> >> methods to the pipeline component interface.
> > 
> > I don't see any problem either. Being curious, what are your use cases?
> I've some pipeline components that open/close resources, like a JCR 
> session for instance. If the contract includes that the pre and post 
> processing methods are always called, the post processing method acts 
> like kind of a disposal method where I can close my session.
> I've other - in this sense rather obscure - use cases where the pipeline 
> components need to pass information back to the caller before the 
> processing starts. So the caller calls the pre processing method where 
> the pipeline component adds something to the map, the caller can pick it 
> up and then start the processing based on this information.

Isn't that a rather specific use case? It feels kinda "wrong" when lifecycle
methods will now "pollute" the pipeline interfaces rather than letting pipeline
components implement lifecycle interfaces. Doesn't Spring show how to do these
kind of things with templates or aspects?

And is there actually something like "start" and "end" in a pipeline? How do you
determine that? It's rather easy with SAX since there is startDocument() and
endDocument() but in general?

It's not an objection, just some thoughts ...

Joerg

Re: [Corona] PIpeline API

Posted by Carsten Ziegeler <cz...@apache.org>.

Reinhard Pötz wrote:
> I agree with you that the package structure should be cleaned up. It's 
> also a good idea to create a 'corona-pipeline-sax' module that contains 
> the SAX based components. I'm not so sure if we should really move the 
> pipeline implementations into their own modules. This seems to be too 
> much modularization for my taste. (The corona-pipeline.jar, which 
> currently contains the SAX components, is only about 70 KB.)
Yes, I'm not concerned about the jar size itself, I'm concerned about 
dependencies. The ideal solution would be an api jar which has no other 
dependencies, except perhaps one on a logging framework if we provide 
utility or abstract classes.

> 
>> b) Actions should not be part of the pipeline api
>> I think we discussed this some time ago :) Removing actions from the 
>> pipeline stuff does not really hurt - they are invoked before the 
>> pipeline, so it shouldn't be too hard to build custom code which 
>> collects actions, assembles the pipeline, invokes the actions and then 
>> the pipeline.
> 
> no objection ;-)
> 
>> c) Pre and post processing
>> As the pipeline interfaces are not tied to sax or any other model 
>> (which is ok), there is no explicit notion of indicating that the 
>> processing starts or is finished - the latter is especially 
>> interesting for cleanup. So I think we should add these two lifecycle 
>> methods to the pipeline component interface.
> 
> I don't see any problem either. Being curious, what are your use cases?
I have some pipeline components that open/close resources, such as a JCR 
session. If the contract guarantees that the pre- and post-processing 
methods are always called, the post-processing method acts as a kind of 
disposal method where I can close my session.
I also have other use cases - rather obscure in this sense - where the 
pipeline components need to pass information back to the caller before 
the processing starts. The caller calls the pre-processing method, the 
pipeline component adds something to the map, and the caller can pick it 
up and then start the processing based on this information.
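
Both use cases can be sketched in a few lines. The FakeSession class below 
is only a stand-in for a real resource such as a JCR session (it is not the 
javax.jcr API), and the component name and the map key are hypothetical; 
the point is only where the resource is opened, where information goes back 
to the caller, and where the resource is closed.

```java
import java.util.Map;

// Stand-in for a resource such as a JCR session; NOT the real javax.jcr API.
class FakeSession {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    void logout() {
        open = false;
    }
}

// Hypothetical pipeline component that owns a resource for one pipeline run.
class SessionComponent {
    private FakeSession session;

    // Pre-processing: open the resource and pass information back to the
    // caller through the shared map (the key is an invented example).
    void setup(Map<String, Object> parameters) {
        session = new FakeSession();
        parameters.put("sketch.content-type", "text/xml");
    }

    // Placeholder for the real work, which needs the open session.
    void process() {
        if (session == null || !session.isOpen()) {
            throw new IllegalStateException("setup() was not called");
        }
    }

    // Post-processing: acts as the disposal method and closes the session.
    void finish() {
        if (session != null) {
            session.logout();
        }
    }

    // Exposed only so that the sketch can be checked from outside.
    FakeSession session() {
        return session;
    }
}
```

The caller would read "sketch.content-type" from the map after setup(...) 
and before starting the processing, and can rely on the session being 
closed once finish() has run.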

> 
>> d) Splitting setup and execute
>> I would like to split the Pipeline#execute method into two, one for 
>> initialisation and one (without arguments) for executing.
> 
> I was thinking about this myself because we need this separation also to 
> optimize conditional GET operations when the servlet URLs are involved.
Great :)

So how do we proceed? Do you mind if I start with the minor changes?

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: [Corona] PIpeline API

Posted by Reinhard Pötz <re...@apache.org>.

Carsten Ziegeler wrote:
> Hi,
> 
> I'm currently looking for a nice and simple pipeline api to be 
> integrated with Apache Sling :)
> And of course I had a quick look at Corona (as everything else I found 
> was not what I was searching for) which would be the preferred way of
> implementing pipelines :)

great!

> Now, I only need the naked pipeline stuff - and most important here are
> the interfaces and perhaps a simple pipeline implementation (without 
> caching).
> 
> There are some points I would like to discuss:
> a) Simple API separated from the implementation
> I think it makes sense to put all API stuff into one single package, 
> these are only a handful of classes - perhaps there might be an 
> additional util package.
> The implementations of the various components should go into a different 
> module as they are not needed by everyone. At least they should be in a 
> different package for modularization purposes.
> I would also package the whole caching stuff into its own module.

I agree with you that the package structure should be cleaned up. It's 
also a good idea to create a 'corona-pipeline-sax' module that contains 
the SAX based components. I'm not so sure if we should really move the 
pipeline implementations into their own modules. This seems to be too 
much modularization for my taste. (The corona-pipeline.jar, which 
currently contains the SAX components, is only about 70 KB.)

> b) Actions should not be part of the pipeline api
> I think we discussed this some time ago :) Removing actions from the 
> pipeline stuff does not really hurt - they are invoked before the 
> pipeline, so it shouldn't be too hard to build custom code which 
> collects actions, assembles the pipeline, invokes the actions and then 
> the pipeline.

no objection ;-)
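
For illustration, the custom code described in b) could look roughly like 
the following sketch: it collects the actions, takes an already assembled 
pipeline, invokes the actions and then the pipeline. All type and method 
names here are hypothetical and only show that the pipeline itself does not 
need to know about actions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical action contract: runs before the pipeline and may add
// entries to the shared parameters.
interface SketchAction {
    void act(Map<String, Object> parameters);
}

// Hypothetical minimal pipeline view; it knows nothing about actions.
interface ActionFreePipeline {
    void setup(Map<String, Object> parameters);
    void execute();
}

// Custom code living outside the pipeline api.
class SketchInvoker {
    private final List<SketchAction> actions = new ArrayList<>();

    // Collect the actions in the order they should run.
    void addAction(SketchAction action) {
        actions.add(action);
    }

    // Invoke all actions first, then set up and execute the pipeline.
    void invoke(ActionFreePipeline pipeline, Map<String, Object> parameters) {
        for (SketchAction action : actions) {
            action.act(parameters);
        }
        pipeline.setup(parameters);
        pipeline.execute();
    }
}
```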

> c) Pre and post processing
> As the pipeline interfaces are not tied to sax or any other model (which 
> is ok), there is no explicit notion of indicating that the processing 
> starts or is finished - the latter is especially interesting for 
> cleanup. So I think we should add these two lifecycle methods to the 
> pipeline component interface.

I don't see any problem either. Being curious, what are your use cases?

> d) Splitting setup and execute
> I would like to split the Pipeline#execute method into two, one for 
> initialisation and one (without arguments) for executing.

I was thinking about this myself because we need this separation also to 
optimize conditional GET operations when the servlet URLs are involved.
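
A rough sketch of why the split helps with conditional GET: once setup and 
execute are separate, the caller can ask the initialised pipeline for its 
last modification time and skip execute() altogether. The getLastModified() 
method, the class names and the handling of the If-Modified-Since value are 
assumptions for illustration and not the actual Corona API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical pipeline with separate initialisation and execution.
class TwoPhasePipeline {
    private final long lastModified;
    private boolean ready;
    private boolean executed;

    TwoPhasePipeline(long lastModified) {
        this.lastModified = lastModified;
    }

    void setup(Map<String, Object> parameters) {
        ready = true;
    }

    // Assumed helper: only meaningful after setup().
    long getLastModified() {
        if (!ready) {
            throw new IllegalStateException("setup() must be called first");
        }
        return lastModified;
    }

    // Takes no arguments; everything it needs was passed to setup().
    void execute() {
        if (!ready) {
            throw new IllegalStateException("setup() must be called first");
        }
        executed = true;
    }

    boolean wasExecuted() {
        return executed;
    }
}

class ConditionalGetSketch {
    // Returns the HTTP status code; ifModifiedSince < 0 means the header
    // was absent from the request.
    static int handle(TwoPhasePipeline pipeline, long ifModifiedSince) {
        pipeline.setup(new HashMap<String, Object>());
        if (ifModifiedSince >= 0 && pipeline.getLastModified() <= ifModifiedSince) {
            return 304; // Not Modified: the expensive execute() is skipped
        }
        pipeline.execute();
        return 200;
    }
}
```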

-- 
Reinhard Pötz                           Managing Director, {Indoqa} GmbH
                          http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member                  reinhard@apache.org
________________________________________________________________________

Re: [Corona] PIpeline API

Posted by Peter Hunsberger <pe...@gmail.com>.

On Fri, Jul 11, 2008 at 8:19 AM, Carsten Ziegeler <cz...@apache.org> wrote:
> Hi,
>
> I'm currently looking for a nice and simple pipeline api to be integrated
> with Apache Sling :)
> And of course I had a quick look at Corona (as everything else I found was
> not what I was searching for) which would be the preferred way of
> implementing pipelines :)
>
[snip] Corona refactoring ideas [/snip]

>
> WDYT?
>

This sounds like what I'm looking for in one project we have. It makes
sense to me....

-- 
Peter Hunsberger