You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@cocoon.apache.org by Grzegorz Kossakowski <gr...@tuffmail.com> on 2008/03/22 14:29:22 UTC

JNet integration doubts

Hello,

I've played with JNet for a while trying to integrate it with SSF and run into many troubles.

First of all, I'm not sure if I understand whole concept correctly. Do I understand correctly that
JNet provides SourceURLStreamHandlerFactory class that acts just like a bridge supporting legacy
Source implementations? Should we consider URLStreamHandlerFactory and URLStreamHandler as general
replacements for SourceFactory and Source interfaces?

If a long-term goal is to drop Source and SourceFactory interfaces what about extension like
ModifiableSource, MoveableSource, PostableSource? How can they be supported by URLConnection and
friends?

                                             --- o0o ---

Another problem is with the implementation. There is a problem with installing
SourceURLStreamHandlerFactory because:
  a) it must be installed before ServletFactoryBean is being used at Spring initialization phase
  b) it must be installed after ApplicationContext is created because SourceFactories are components
that must be initialized by Spring container.

I have no clue how to solve this problem. Any ideas?

-- 
Grzegorz Kossakowski

Re: JNet integration doubts

Posted by Vadim Gritsenko <va...@reverycodes.com>.

On Mar 25, 2008, at 3:10 AM, Carsten Ziegeler wrote:
> The abstraction we introduced with all these sub interfaces from  
> source looked great in the beginning, but today I'm not sure that  
> you really need this. Traversing over http urls is not working for  
> instance; if you want to traverse of files, well use the file api etc.

Just to give an example, WebDAV, FTP, XML:DB all are traversable and  
all do not implement File API.

I don't even think it is possible to extend File APIs with your own  
file systems (java.io.FileSystem is package private).

Vadim

Re: JNet integration doubts

Posted by Carsten Ziegeler <cz...@apache.org>.

I'll try to respond in more details to this during the week :)
But as a first quick answer: jnet can be considered alpha, so it might 
have some rough edges especially when it comes to integration.
I think it makes more sense to move the excalibur sourceresolver support 
into an optional module for jnet and keep jnet completly free from such 
references.
The abstraction we introduced with all these sub interfaces from source 
looked great in the beginning, but today I'm not sure that you really 
need this. Traversing over http urls is not working for instance; if you 
want to traverse of files, well use the file api etc.

More during the week

Carsten

Grzegorz Kossakowski wrote:
> Grzegorz Kossakowski pisze:
>>> AFAIU, you call
>>>
>>> Installer.setURLStreamHandlerFactory(new SourceURLStreamHandlerFactory());
>>>
>>> at the startup of your application.
>>>
>>> Then you can use the SourceFactoriesManager to install and uninstall
>>> source factories.
>> Yes, but when and where should I call SourceFactoriesManager to install SourceFactories?
>> That's the main problem here.
> 
> Ok, somehow "solved" and committed. The stuff I committed should be considered as experimental (even
> though it works...) so don't be surprised seeing lots of hacks.
> 
> After playing with JNet idea for a while I'm more and more doubtful about the direction we have
> taken. I really like Source, SourceFactory interfaces, they are clean, focused and obvious to use
> contrary to the URL machinery from Java API. Look at what I committed, there is no way to release
> underlying Source object if InputStream was not been obtained.
> 
> Moreover, if you need some advanced functionality (e.g. traversable source) you still need switch
> back to Excalibur interfaces. Same goes for modifiable, postable etc.
> 
> I'm going to invest my energy into implementation of my original idea of providing default
> SourceResolver for SSF internal needs so we can release SSF 1.1.0 ASAP. I'll wait with JNet
> integration until someone (Carsten?) else chimes in and explains how everything should be glued.
> 
> Abstract description explaining what are _real_ benefits of integrating JNet into SSF and Cocoon
> (Corona?) in general would be good. I really need to get some roadmap if I'm going to continue.
> 
> Thanks for listening and have happy post-Easter time!
> 


-- 
Carsten Ziegeler
cziegeler@apache.org

Re: JNet integration

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Reinhard Poetz pisze:
> Reinhard Poetz wrote:
>> What about the developer? He could assemble pipelines this way:
>>
>> URL baseUrl = new URL("file:///C:/temp/");
>> Pipeline pipeline = new NonCachingPipeline();
>> pipeline.addComponent(new FileGenerator(baseUrl, "xyz.xml");
>> pipeline.addComponent(new XSLTTransformer(baseUrl, "xyz.xslt");
>> pipeline.addComponent(new XMLSerializer());
>> pipeline.invoke(new InvocationImpl(System.out));
> 
> uuups, small correction:
> 
> URL baseUrl = new URL("file:///C:/temp/");
> Pipeline pipeline = new NonCachingPipeline();
> pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml"));
> pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt"));
> pipeline.addComponent(new XMLSerializer());
> pipeline.invoke(new InvocationImpl(System.out));

Hmm, getting back to more complicated scenarios. Do you think that baseURL should be scoped (with
"call" scope) Spring bean?

Another question is weather we should still support "meta-protocols" like blockcontext: one.
Currently it works following way:
  * if you ask for "blockcontext:" or "blockcontext:/" then you will get instance of
BlockContextSource that implements TraversableSource (to list all blocks) and always returns null if
it's asked for InputSource.
  * if you ask for blockcontext:/block_name/directory then casual FileSource is returned pointing to
the filesystem path of block's root directory.

In second case, blockcontext: works as a "meta-protocol" because its Factory never returns its
instance. This leads to the issue that getURI()/getExternalForm() returns path beginning with
"file:/" instead of "blockcontext:" which is bad IMHO.
I think one should always expect the same protocol used in canonical representation of newly created
URL in order to avoid confusion.

WDYT?

-- 
Grzegorz

Re: JNet integration

Posted by Sylvain Wallez <sy...@apache.org>.

Reinhard Poetz wrote:
> Reinhard Poetz wrote:
>> What about the developer? He could assemble pipelines this way:
>>
>> URL baseUrl = new URL("file:///C:/temp/");
>> Pipeline pipeline = new NonCachingPipeline();
>> pipeline.addComponent(new FileGenerator(baseUrl, "xyz.xml");
>> pipeline.addComponent(new XSLTTransformer(baseUrl, "xyz.xslt");
>> pipeline.addComponent(new XMLSerializer());
>> pipeline.invoke(new InvocationImpl(System.out));
>
> uuups, small correction:
>
> URL baseUrl = new URL("file:///C:/temp/");
> Pipeline pipeline = new NonCachingPipeline();
> pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml"));
> pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt"));
> pipeline.addComponent(new XMLSerializer());
> pipeline.invoke(new InvocationImpl(System.out));

Or even using method chaining

new NonCachingPipeline()
    .setBaseURL(new URL("file:///C:/temp/"))
    .setGenerator(new FileGenerator("xyz.xml"))
    .addTransformer(new XSLTransformer("xyz.xslt"))
    .setSerializer(new XMLSerializer(new StreamResult(System.out)))
    .process();

Sylvain

-- 
Sylvain Wallez - http://bluxte.net

Re: JNet integration

Posted by Reinhard Poetz <re...@apache.org>.

Reinhard Poetz wrote:
> What about the developer? He could assemble pipelines this way:
> 
> URL baseUrl = new URL("file:///C:/temp/");
> Pipeline pipeline = new NonCachingPipeline();
> pipeline.addComponent(new FileGenerator(baseUrl, "xyz.xml");
> pipeline.addComponent(new XSLTTransformer(baseUrl, "xyz.xslt");
> pipeline.addComponent(new XMLSerializer());
> pipeline.invoke(new InvocationImpl(System.out));

uuups, small correction:

URL baseUrl = new URL("file:///C:/temp/");
Pipeline pipeline = new NonCachingPipeline();
pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml"));
pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt"));
pipeline.addComponent(new XMLSerializer());
pipeline.invoke(new InvocationImpl(System.out));

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration

Posted by Reinhard Poetz <re...@apache.org>.

Grzegorz Kossakowski wrote:
> Carsten Ziegeler pisze:
>> Reinhard Poetz wrote:
>>> 
>>> Once again, my goal is that if you use e.g. Corona in its simplest form,
>>> I don't want to make everybody and his dog depend on 
>>> JNet/SourceResolve/Source. E.g. see the FileGenerator. Using the URL 
>>> object is enough for simple use cases of a pipeline API.
>>> 
>>> Yes, I understand that when it comes to caching pipelines, you need more,
>>> but not everybody needs caching pipelines. For that purpose there could
>>> be a CacheableFileGenerator, etc.
>>> 
>>> If you are right and it is difficult or even impossible to remove the 
>>> dependencies on source/sourceresolve/xmlutils/jnet, then be it. I 
>>> withdraw my example Url("servlet:...") from above. When we can switch to
>>> sourceresolve 3.0, the dependency graph will get smaller anyway.
>>> 
>>> The main benefit from using URLs (instead of the SourceResolver) comes 
>>> from simple use cases, e.g. you need a pipeline in your Java application
>>> that reads in some XML file, performs some transformations and finally
>>> creates a PDF document. FWIW, using URLs should be all that you need.
>>> 
>> I totally agree with Reinhard; for most uses cases getting an input stream
>> (or sax events) via a url is totally sufficient. With the source interface
>> we created another abstraction like the request/response abstraction in the
>> cocoon environment which seems to be nice and great but in the end is not
>> really needed, creates problems in other places etc.
> 
> I agree that our Environment abstraction was awkward - it introduced
> abstraction that never was a real abstraction and mostly duplicated existing,
> quite nice servlet API.
> 
> At least now I fail to see coherent, nice to use standard Java API that
> Excalibur's Source and SourceFactory duplicates. As for now I can only see
> obstacles like:
> 
> new URL("blabla/foo");
> 
> will fail with java.net.MalformedURLException: no protocol: blabla/foo so one
> must use:
> 
> new URL(baseURL, "blabla/foo");
> 
> Who will judge if given path is relative and requires baseURL instance? How
> one will get this baseURL instance?

What about the developer? He could assemble pipelines this way:

URL baseUrl = new URL("file:///C:/temp/");
Pipeline pipeline = new NonCachingPipeline();
pipeline.addComponent(new FileGenerator(baseUrl, "xyz.xml");
pipeline.addComponent(new XSLTTransformer(baseUrl, "xyz.xslt");
pipeline.addComponent(new XMLSerializer());
pipeline.invoke(new InvocationImpl(System.out));

Any need for a Source object?

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration

Posted by Ralph Goers <Ra...@dslextreme.com>.

I had to create a class at work for handling some files. I started with 
an input stream. What I needed, though, required caching and being able 
to check whether the file was still valid. In this case I soon realized 
that I would have to reinvent the Excalibur Source interface since I had 
to cache the Validity (or something like it) along with the information 
about the file.  In the end it made far more sense to just use the 
Source interface. I ended up extending the Excalibur Source 
implementations or creating my own though, as my Validity checking 
didn't match any of the existing implementations exactly.

The point is, if you are planning on caching your files and checking 
whether they are valid or not just using java.net, etc. isn't going to 
be sufficient.

Ralph

Grzegorz Kossakowski wrote:
> Carsten Ziegeler pisze:
>   
>
> I agree that our Environment abstraction was awkward - it introduced abstraction that never was a
> real abstraction and mostly duplicated existing, quite nice servlet API.
>
> At least now I fail to see coherent, nice to use standard Java API that Excalibur's Source and
> SourceFactory duplicates. As for now I can only see obstacles like:
>
>   new URL("blabla/foo");
>
> will fail with java.net.MalformedURLException: no protocol: blabla/foo
> so one must use:
>
>   new URL(baseURL, "blabla/foo");
>
> Who will judge if given path is relative and requires baseURL instance? How one will get this
> baseURL instance?
>
> Guys, it's non-sense...
>
>   
>

Re: JNet integration

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Carsten Ziegeler pisze:
> Reinhard Poetz wrote:
>>
>>
>> Once again, my goal is that if you use e.g. Corona in its simplest
>> form, I don't want to make everybody and his dog depend on
>> JNet/SourceResolve/Source. E.g. see the FileGenerator. Using the URL
>> object is enough for simple use cases of a pipeline API.
>>
>> Yes, I understand that when it comes to caching pipelines, you need
>> more, but not everybody needs caching pipelines. For that purpose
>> there could be a CacheableFileGenerator, etc.
>>
>> If you are right and it is difficult or even impossible to remove the
>> dependencies on source/sourceresolve/xmlutils/jnet, then be it. I
>> withdraw my example Url("servlet:...") from above. When we can switch
>> to sourceresolve 3.0, the dependency graph will get smaller anyway.
>>
>> The main benefit from using URLs (instead of the SourceResolver) comes
>> from simple use cases, e.g. you need a pipeline in your Java
>> application that reads in some XML file, performs some transformations
>> and finally creates a PDF document. FWIW, using URLs should be all
>> that you need.
>>
> I totally agree with Reinhard; for most uses cases getting an input
> stream (or sax events) via a url is totally sufficient. With the source
> interface we created another abstraction like the request/response
> abstraction in the cocoon environment which seems to be nice and great
> but in the end is not really needed, creates problems in other places etc.

I agree that our Environment abstraction was awkward - it introduced abstraction that never was a
real abstraction and mostly duplicated existing, quite nice servlet API.

At least now I fail to see coherent, nice to use standard Java API that Excalibur's Source and
SourceFactory duplicates. As for now I can only see obstacles like:

  new URL("blabla/foo");

will fail with java.net.MalformedURLException: no protocol: blabla/foo
so one must use:

  new URL(baseURL, "blabla/foo");

Who will judge if given path is relative and requires baseURL instance? How one will get this
baseURL instance?

Guys, it's non-sense...

> Let's forget jnet for a second and see if the java net api can be
> sufficient. The only other use case might really be caching. You need a
> way to find out if a resource might have changed or not, but I think
> that should be possible.
> Using java net api for Corona makes totally sense to me; it keeps it
> simple and small.

Yep, the idea sounds great - that's why I started to dig into JNet. As usual, the devil is in a detail.

-- 
Grzegorz

Re: JNet integration

Posted by Carsten Ziegeler <cz...@apache.org>.

Reinhard Poetz wrote:
> 
> 
> Once again, my goal is that if you use e.g. Corona in its simplest form, 
> I don't want to make everybody and his dog depend on 
> JNet/SourceResolve/Source. E.g. see the FileGenerator. Using the URL 
> object is enough for simple use cases of a pipeline API.
> 
> Yes, I understand that when it comes to caching pipelines, you need 
> more, but not everybody needs caching pipelines. For that purpose there 
> could be a CacheableFileGenerator, etc.
> 
> If you are right and it is difficult or even impossible to remove the 
> dependencies on source/sourceresolve/xmlutils/jnet, then be it. I 
> withdraw my example Url("servlet:...") from above. When we can switch to 
> sourceresolve 3.0, the dependency graph will get smaller anyway.
> 
> The main benefit from using URLs (instead of the SourceResolver) comes 
> from simple use cases, e.g. you need a pipeline in your Java application 
> that reads in some XML file, performs some transformations and finally 
> creates a PDF document. FWIW, using URLs should be all that you need.
> 
I totally agree with Reinhard; for most uses cases getting an input 
stream (or sax events) via a url is totally sufficient. With the source 
interface we created another abstraction like the request/response 
abstraction in the cocoon environment which seems to be nice and great 
but in the end is not really needed, creates problems in other places etc.
Let's forget jnet for a second and see if the java net api can be 
sufficient. The only other use case might really be caching. You need a 
way to find out if a resource might have changed or not, but I think 
that should be possible.
Using java net api for Corona makes totally sense to me; it keeps it 
simple and small.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: JNet integration

Posted by Reinhard Poetz <re...@apache.org>.

Ralph Goers wrote:
> 
> I think you are out of your mind. (Not seriously).
> I have to tell you, Cocoon without caching pipelines would suck so bad 
> with performance problems you would give it the boot in very short 
> order. Even without "Cocoon", as soon as you start doing anything 
> serious caching will become necessary.

Sure, caching is important but this doesn't mean that we can't provide a *basic* 
pipeline API that works with URLs only 
(http://marc.info/?l=xml-cocoon-dev&m=120646488429681&w=2). If you need more, 
you can always build a layer on top of it, e.g. by using

Source source =
   (Source) new URL("file:///C:/Temp/foo.xml").getContent(Source.class)

or using a SourceResolver.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration

Posted by Ralph Goers <Ra...@dslextreme.com>.

I think you are out of your mind. (Not seriously). 

I have to tell you, Cocoon without caching pipelines would suck so bad 
with performance problems you would give it the boot in very short 
order. Even without "Cocoon", as soon as you start doing anything 
serious caching will become necessary.

I'll give you a trivial example. I wrote my own I18n implementation for 
use with JSF and used Excalibur Source to read an XML properties file 
containing the keys and values. The first implementation checked to see 
if the file was valid for every key that was read. This didn't perform 
well at all and I changed my Validity so that the file validity was only 
checked once per Request. This made it so the overhead of this utility 
was not noticeable. Now imagine that instead of just checking the 
validity I had been actually reading the file for every key!

Ralph

Reinhard Poetz wrote:
>
>
> Once again, my goal is that if you use e.g. Corona in its simplest 
> form, I don't want to make everybody and his dog depend on 
> JNet/SourceResolve/Source. E.g. see the FileGenerator. Using the URL 
> object is enough for simple use cases of a pipeline API.
>
> Yes, I understand that when it comes to caching pipelines, you need 
> more, but not everybody needs caching pipelines. For that purpose 
> there could be a CacheableFileGenerator, etc.
>
> If you are right and it is difficult or even impossible to remove the 
> dependencies on source/sourceresolve/xmlutils/jnet, then be it. I 
> withdraw my example Url("servlet:...") from above. When we can switch 
> to sourceresolve 3.0, the dependency graph will get smaller anyway.
>
> The main benefit from using URLs (instead of the SourceResolver) comes 
> from simple use cases, e.g. you need a pipeline in your Java 
> application that reads in some XML file, performs some transformations 
> and finally creates a PDF document. FWIW, using URLs should be all 
> that you need.
>

Re: Layered software designs

Posted by Carsten Ziegeler <cz...@apache.org>.

David Crossley wrote:
> Reinhard Poetz wrote:
>> David Crossley wrote:
>>> Reinhard Poetz wrote:
>>>> A simple scenario could be:
>>>>
>>>>  Pipeline API  +  java.net.URL   +  XML-SAX components
>>>>
>>>> A more advanced scenario could consist of
>>>>
>>>>  Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine
>>> Is "sourceresolve" where the Apache XML Commons Resolver
>>> is hooked up?
>> no, I was talking about the Excalibur Sourceresolve component.
> 
> Yes, i know. However that might involve the Catalog Entity Resolver too.
> This configuration used to be done in Cocoon, then we
> moved it to Excalibur so that it would be more widely used.
And now we have a better version back in Cocoon for 2.2 :)

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: Layered software designs

Posted by David Crossley <cr...@apache.org>.

Reinhard Poetz wrote:
> David Crossley wrote:
> >Reinhard Poetz wrote:
> >>A simple scenario could be:
> >>
> >>  Pipeline API  +  java.net.URL   +  XML-SAX components
> >>
> >>A more advanced scenario could consist of
> >>
> >>  Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine
> >
> >Is "sourceresolve" where the Apache XML Commons Resolver
> >is hooked up?
> 
> no, I was talking about the Excalibur Sourceresolve component.

Yes, i know. However that might involve the Catalog Entity Resolver too.
This configuration used to be done in Cocoon, then we
moved it to Excalibur so that it would be more widely used.

> >I would be concerned if our base offering enabled
> >mis-use of net resources, e.g. processing an xml file
> >which declares a DTD, causes an extra network trip if
> >we don't have the xml entity resolver.
> 
> Resolving XML entities is important and we will definitly offer solutions 
> for it in the future too.
> 
> Corona, Steven's and my proposal of a Cocoon reimplementation, is about 
> doing things differently so that Cocoon becomes easily embeddable and 
> reuseable from within as many environments as possible. We think that a 
> "layered design" is the key to achive this goal. When you put all layers 
> together, the result (= a web application framework) should be nearly[1] as 
> powerful as that what we have today.
> 
> For that purpose Steven and I have also started to reimplement existing 
> concepts instead of doing everything from the scratch. First, we believe 
> that many of the existing concepts are good as they are and second, this 
> makes it easier for others to chime in because if you see Corona as a black 
> box, it should (more or less) provide the same results as 2.x.
> 
> HTH

Yes it does.

-David

> [1] The only exceptions are things that we want to remove, e.g. sub 
> sitemaps etc. -> see the Micro-Cocoon discussion: 
> http://marc.info/?l=xml-cocoon-dev&m=119903256406947&w=2
> 
> -- 
> Reinhard P?tz                            Managing Director, {Indoqa} GmbH
>                           http://www.indoqa.com/en/people/reinhard.poetz/
> 
> Member of the Apache Software Foundation
> Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
> _________________________________________________________________________

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

David Crossley wrote:
> Reinhard Poetz wrote:
>> A simple scenario could be:
>>
>>   Pipeline API  +  java.net.URL   +  XML-SAX components
>>
>> A more advanced scenario could consist of
>>
>>   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine
> 
> Is "sourceresolve" where the Apache XML Commons Resolver
> is hooked up?

no, I was talking about the Excalibur Sourceresolve component.

> I would be concerned if our base offering enabled
> mis-use of net resources, e.g. processing an xml file
> which declares a DTD, causes an extra network trip if
> we don't have the xml entity resolver.

Resolving XML entities is important and we will definitly offer solutions for it 
in the future too.

Corona, Steven's and my proposal of a Cocoon reimplementation, is about doing 
things differently so that Cocoon becomes easily embeddable and reuseable from 
within as many environments as possible. We think that a "layered design" is the 
key to achive this goal. When you put all layers together, the result (= a web 
application framework) should be nearly[1] as powerful as that what we have today.

For that purpose Steven and I have also started to reimplement existing concepts 
instead of doing everything from the scratch. First, we believe that many of the 
existing concepts are good as they are and second, this makes it easier for 
others to chime in because if you see Corona as a black box, it should (more or 
less) provide the same results as 2.x.

HTH

[1] The only exceptions are things that we want to remove, e.g. sub sitemaps 
etc. -> see the Micro-Cocoon discussion: 
http://marc.info/?l=xml-cocoon-dev&m=119903256406947&w=2

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by David Crossley <cr...@apache.org>.

Reinhard Poetz wrote:
> 
> A simple scenario could be:
> 
>   Pipeline API  +  java.net.URL   +  XML-SAX components
>
> A more advanced scenario could consist of
> 
>   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine

Is "sourceresolve" where the Apache XML Commons Resolver
is hooked up?

I would be concerned if our base offering enabled
mis-use of net resources, e.g. processing an xml file
which declares a DTD, causes an extra network trip if
we don't have the xml entity resolver.

-David

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Rainer Pruy wrote:
> 
> Reinhard Poetz schrieb:
>> Dev at weitling wrote:
>>> But (maybe I have missed some mails) how do you want to make this 
>>> Pipeline API? E.g. a SAX-based pipeline is something different than image
>>> data running through several filters. How do you want to prevent the use
>>> of a SAX-events generating component together with an image data 
>>> transformer? What about something like it's used in clipboards: each 
>>> component offers a list of importable and exportable formats?
>> See http://marc.info/?l=xml-cocoon-dev&m=120611990603725&w=2 Do the
>> explanations that this mail answer your question?
>> 
> 
> They answer a number of questions.
> 
> Nevertheless, the question of connection "type" is still open.
> 
> Ok, the answer given is "it is up to the components to ensure compatibility".
> 
> 
> However, what does this mean practically? Will there be different
> "Interfaces" for different type of data to e processed. (SAX events will use
> SAX interfaces, image data will use others)

yes, that's the idea (and IIRC it is already implemented this way Corona).

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Reinhard Poetz schrieb:
> Dev at weitling wrote:
>> But (maybe I have missed some mails) how do you want to make this
>> Pipeline API?
>> E.g. a SAX-based pipeline is something different than image data
>> running through several filters. How do you want to prevent the use of
>> a SAX-events generating component together with an image data
>> transformer? What about something like it's used in clipboards: each
>> component offers a list of importable and exportable formats?
> 
> See http://marc.info/?l=xml-cocoon-dev&m=120611990603725&w=2
> Do the explanations that this mail answer your question?
> 

They answer a number of questions.

Nevertheless, the question of connection "type" is still open.

Ok, the answer given is "it is up to the components to ensure compatibility".

However, what does this mean practically?
Will there be different "Interfaces" for different type of data to e processed.
(SAX events will use SAX interfaces, image data will use others)

Or will there be some "tagging" where a component be asked for a list of "types" it will be able to process?
(An image "resizer" will state "PNG JPEG and GIF" as being acceptable inputs for consumer side and "PNG" the type being produced, or
something along?) And compatibility will be assumed if a linked pair of producer and consumer share a common "tag" string?

Thus answering the "type" question might effect back to the API definitions...

Rainer

Re: Layered software designs

Posted by Dev at weitling <de...@weitling.net>.


Reinhard Poetz wrote:
> Dev at weitling wrote:
>> But (maybe I have missed some mails) how do you want to make this 
>> Pipeline API?
>> E.g. a SAX-based pipeline is something different than image data 
>> running through several filters. How do you want to prevent the use 
>> of a SAX-events generating component together with an image data 
>> transformer? What about something like it's used in clipboards: each 
>> component offers a list of importable and exportable formats?
>
> See http://marc.info/?l=xml-cocoon-dev&m=120611990603725&w=2
> Do the explanations that this mail answer your question?

Quote from the text:
"We hope that this gives you enough hints in order to understand Corona. 
We are
very interested in further discussions. So fire at will!"

My Answer: Great work!
And yes, it answered my questions :-)

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Dev at weitling wrote:
> But (maybe I have missed some mails) how do you want to make this 
> Pipeline API?
> E.g. a SAX-based pipeline is something different than image data running 
> through several filters. How do you want to prevent the use of a 
> SAX-events generating component together with an image data transformer? 
> What about something like it's used in clipboards: each component offers 
> a list of importable and exportable formats?

See http://marc.info/?l=xml-cocoon-dev&m=120611990603725&w=2
Do the explanations that this mail answer your question?

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Dev at weitling <de...@weitling.net>.


Reinhard Poetz wrote:
> Carsten Ziegeler wrote:
>> The question is now if we need support for caching in the low level 
>> apis or if it is possible to have a layered approach - which would 
>> make the entry barrier much easier.
>
> Yes, this layered approach is what I'm aiming for. All the reactions 
> in this thread make me think that everybody, who has commented on this 
> mailing list so far (except Carsten), believes that we want to throw 
> away good things that have profed to be useful in many situations.
> Rest assured, that's not the case. Carsten and I only want to break up 
> this all-or-nothing situation that we (still) have now.
>
> What I want to see is a concise pipeline API that comes with only 
> little overhead in terms of its learning curve and its dependencies on 
> third-party software. Usually this means that we stick with standard 
> APIs as much as possible - and I think this rule applies for our 
> situation too.
>
> This means that the user of the API only needs to learn as little as 
> possible. When he wants more, we offer additional modules that help 
> him. Since he has a concrete need, the motiviation to learn something 
> new is much higher than when he has to learn everything right from the 
> beginning.
>
> If you want to learn how this whole concept *might* apply for a next 
> generation Cocoon, have a look at Steven's and my "Exploring Corona" 
> mail from last week 
> (http://marc.info/?l=xml-cocoon-dev&m=120611990603725&w=2).
>
> The idea of Corona is having a concise core that doesn't have any 
> dependencies on a particular component container (Spring, OSGi, etc.), 
> source resolving mechanisms or environment (http, java  only, etc.) or 
> even the type of the components (XML-SAX event stream, XML-Stax event 
> stream, binary streams, etc.) that are linked together in a pipeline.
>
> A simple scenario could be:
>
>   Pipeline API  +  java.net.URL   +  XML-SAX components
>
>
> A more advanced scenario could consist of
>
>   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap 
> Engine
>
>
> or maybe you need the full stack that corresponds to Cocoon Core 2.2 - 
> here you are:
>
>   Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  
> +  Spring
>                                        XML-SAX
>                                      componnents
>
>
> This layered approach makes Cocoon easily embeddable in any Java 
> application and Cocoon's learning curve becomes more gradual.
>
> Is such a situation only appealing to Carsten, Steven and me?

Your mail makes things a little bit more clear.
But (maybe I have missed some mails) how do you want to make this 
Pipeline API?
E.g. a SAX-based pipeline is something different than image data running 
through several filters. How do you want to prevent the use of a 
SAX-events generating component together with an image data transformer? 
What about something like it's used in clipboards: each component offers 
a list of importable and exportable formats?

Just my 2 Räppli :-)
Florian

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Steven Dolg wrote:
> 
> 
> Joerg Heinicke schrieb:
>> On 26.03.2008 09:14, Reinhard Poetz wrote:
>>
>>> What I want to see is a concise pipeline API that comes with only 
>>> little overhead in terms of its learning curve and its dependencies 
>>> on third-party software. Usually this means that we stick with 
>>> standard APIs as much as possible - and I think this rule applies for 
>>> our situation too.
>>
>> See, one thing that I just don't get (and already asked) is how the 
>> pipeline API has anything to do with uri resolving. For me the latter 
>> (using java.net or source resolve) is an implementation detail. Our 
>> current pipeline interface [1] has no relationship to uri resolving. 
>> It only has a reference to SourceValidity because of caching [2].
>>
>> If all this discussion is about removing this method (and the related 
>> getKeyForEventPipeline()) to get rid of this dependency I'm ok with 
>> it. The caching concern could be implemented in a separate Cacheable 
>> interface which should then also be decoupled from uri resolving 
>> (which seems to point to Peter's approach [3]).
>>
>> Joerg
>>
>> [1] 
>> http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/components/pipeline/ProcessingPipeline.html 
>>
>> [2] 
>> http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/components/pipeline/ProcessingPipeline.html#getValidityForEventPipeline() 
>>
>> [3] http://marc.info/?l=xml-cocoon-dev&m=120654005017297&w=4
>>
> Just as a general observation:
> I'm starting to believe that I do not understand (anymore) what this is 
> all about.
> We're jumping from high-level concepts ("caching", "layering") straight 
> down to the lowest level ("it's just method a in class B") and then 
> right back.
> We're arguing that a certain feature is already existing and working, 
> while talking about a rewrite-experiment that definitely does not have 
> this feature.
> 
> 
> But back to caching:
> Caching appears to be incredibly important. Even to the point where "no 
> caching" means "not acceptable".
> On the other hand, when trying to find out what is really necessary and 
> wanted, there isn't much left. Suddenly it's "just an implementation 
> detail", "not really important", "makes no difference for the user".
> 
> Forgive me for being blunt, but all this appears to me like "I need that 
> feature. I do not care how it is implemented or how it works. I just 
> have to have it."
> I did not expect this kind of discussion on a dev list. (I even have a 
> hard time accepting this from a paying customer).
> 
> But since it's just a not very important implementation detail, I added 
> a (simple) caching approach to Corona. (I hope to get a patch ready today)
> Perhaps this is all just too abstract and far fetched without any common 
> basis (iow. code).

Steven,

one of the greatest strengths of community-driven opensource development (I 
don't talk about projects like Spring, Eclipse or many other company-driven 
projects but about those projects here at the Apache Software Foundation) is the 
diversity of the people who participate. At the same time it is also one of its 
weaknesses because things need time to grow.

In addition, we people here have different development and language skills and, 
don't forget that we both have already had many discussions about what a 
reimplementation is about and how we intend to implement this layered design.

I'm happy that people are catching up and participate in the discussion on 
different levels. I'm sure that we will become more focused in the future - and 
I think that you are right that code will help. At least people will finally 
come up with their requirements in more detail ("I need caching" is too 
general!) finally so that terms like "layered design" and "caching" become more 
conrete.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Rainer Pruy wrote:
> Hi, I was off the net for some time and while catching up this discussion I
> also got the feeling of being somehow lost a bit in the different aspects of
> the discussions....
> 
>> From what I see the starting point was
> the (technical) question of how to get rid of a unwanted dependency.
>> From that it changed to the conceptual question of how to provide flexible
>> support for arbitrary URI strings.
> (out-of-the-box support for "standard" protocols and an easy extensibility
> for more complex needs)
> 
> If I did get it right, the main difference between the two approaches in
> discussion (URL vs. SourceResolver) is support for caching.
> 
> It was noted (and I fully agree) that adding caching later on will require
> support from the lower level. (e.g. caching a file will need info about
> modification times. caching a resource accessed via http might need access to
> the expires header for example. This is not a place for talking about cache
> *implementations* or what kind of caching is suited for what layer, it is
> just a question of what information will any kind of caching require from
> lower levels and will the intended implementation provide such info).
> 
> The standard URL implementation does not provide related methods (for
> accessing cache control data). On the other hand URL has the benefit of being
> plain standard and familiar to a large community of developers.
> 
> This to answer the initial question of "can we drop SourceResolver support", 
> we must answer the question "Can URL be extended to support "cachable"
> implementations of protocols?" and if true "Is it possible to override
> standard (non-cachable) implementations of protocols with cachable ones?"
> 
> If both question can be answered with YES, then it will be possible to
> implement the uri interpreting layer using plain Java URL and to drop use of
> source resolver.

Thanks for sharing your thoughts. I think we have to get our hands dirty now to 
figure out all the details.

> This will raise questions of semantic differences remaining. E.g. was the
> default protocol implementation used for "path" values without protocol
> different with different locations (e.g. per component implementation?)
> Probably, it will then be necessary to add some glue code to keep semantics,
> or point out differences for migration guidelines.

The question is also, *where* we will introduce this change (if we do it at 
all). Corona doesn't impose any constraints in terms of backwards-compatibility, 
whereas we can't drop important contracts in 2.2.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Hi,
I was off the net for some time
and while catching up this discussion I also got the feeling of being somehow lost a bit in the different aspects of the discussions....

>From what I see the starting point was
the (technical) question of how to get rid of a unwanted dependency.
>From that it changed to the conceptual question of how to provide flexible support for arbitrary URI strings.
(out-of-the-box support for "standard" protocols and an easy extensibility for more complex needs)

If I did get it right, the main difference between the two approaches in discussion (URL vs. SourceResolver)
is support for caching.

It was noted (and I fully agree) that adding caching later on will require support from the lower level.
(e.g. caching a file will need info about modification times. caching a resource accessed via http might need access to the expires
header for example. This is not a place for talking about cache *implementations* or what kind of caching is suited for what layer, it
is just a question of what information will any kind of caching require from lower levels and will the intended implementation provide
such info).

The standard URL implementation does not provide related methods (for accessing cache control data).
On the other hand URL has the benefit of being plain standard and familiar to a large community of developers.

This to answer the initial question of "can we drop SourceResolver support",
we must answer the question
"Can URL be extended to support "cachable" implementations of protocols?"
and if true "Is it possible to override standard (non-cachable) implementations of protocols with cachable ones?"

If both question can be answered with YES, then it will be possible to implement the uri interpreting layer using plain Java URL
and to drop use of source resolver.

This will raise questions of semantic differences remaining.
E.g. was the default protocol implementation used for "path" values without protocol different with different locations (e.g. per
component implementation?) Probably, it will then be necessary to add some glue code to keep semantics, or point out differences for
migration guidelines.

Rainer

Joerg Heinicke schrieb:
> On 27.03.2008 02:25, Steven Dolg wrote:
> 
>>>> What I want to see is a concise pipeline API that comes with only
>>>> little overhead in terms of its learning curve and its dependencies
>>>> on third-party software. Usually this means that we stick with
>>>> standard APIs as much as possible - and I think this rule applies
>>>> for our situation too.
>>>
>>> See, one thing that I just don't get (and already asked) is how the
>>> pipeline API has anything to do with uri resolving. For me the latter
>>> (using java.net or source resolve) is an implementation detail. Our
>>> current pipeline interface [1] has no relationship to uri resolving.
>>> It only has a reference to SourceValidity because of caching [2].
>>>
>>> If all this discussion is about removing this method (and the related
>>> getKeyForEventPipeline()) to get rid of this dependency I'm ok with
>>> it. The caching concern could be implemented in a separate Cacheable
>>> interface which should then also be decoupled from uri resolving
>>> (which seems to point to Peter's approach [3]).
>>>
>> Just as a general observation:
>> I'm starting to believe that I do not understand (anymore) what this
>> is all about.
>> We're jumping from high-level concepts ("caching", "layering")
>> straight down to the lowest level ("it's just method a in class B")
>> and then right back.
> 
> I have had the same problem from the beginning [1] and expressed it in
> my question above again. What is this discussion about - if uri
> resolving has nothing to do with the pipeline API? And what do we win
> when replacing source resolve with java.net except that we have one less
> dependency - when nobody really gets in contact with uri resolving
> anyway? I have only received half an answer to only the second question
> (standard API). Only then I started to look at the relationship between
> uri resolving and pipeline API and found only this one reference to the
> source resolve package.
> 
> Joerg
> 
> [1] http://marc.info/?l=xml-cocoon-dev&m=120649777119480&w=4

-- 
Rainer Pruy
Geschäftsführer

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt
Tel: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Handelsregister: Frankfurt am Main, HRA 31151

Re: Layered software designs

Posted by Joerg Heinicke <jo...@gmx.de>.

On 27.03.2008 02:25, Steven Dolg wrote:

>>> What I want to see is a concise pipeline API that comes with only 
>>> little overhead in terms of its learning curve and its dependencies 
>>> on third-party software. Usually this means that we stick with 
>>> standard APIs as much as possible - and I think this rule applies for 
>>> our situation too.
>>
>> See, one thing that I just don't get (and already asked) is how the 
>> pipeline API has anything to do with uri resolving. For me the latter 
>> (using java.net or source resolve) is an implementation detail. Our 
>> current pipeline interface [1] has no relationship to uri resolving. 
>> It only has a reference to SourceValidity because of caching [2].
>>
>> If all this discussion is about removing this method (and the related 
>> getKeyForEventPipeline()) to get rid of this dependency I'm ok with 
>> it. The caching concern could be implemented in a separate Cacheable 
>> interface which should then also be decoupled from uri resolving 
>> (which seems to point to Peter's approach [3]).
>>
> Just as a general observation:
> I'm starting to believe that I do not understand (anymore) what this is 
> all about.
> We're jumping from high-level concepts ("caching", "layering") straight 
> down to the lowest level ("it's just method a in class B") and then 
> right back.

I have had the same problem from the beginning [1] and expressed it in 
my question above again. What is this discussion about - if uri 
resolving has nothing to do with the pipeline API? And what do we win 
when replacing source resolve with java.net except that we have one less 
dependency - when nobody really gets in contact with uri resolving 
anyway? I have only received half an answer to only the second question 
(standard API). Only then I started to look at the relationship between 
uri resolving and pipeline API and found only this one reference to the 
source resolve package.

Joerg

[1] http://marc.info/?l=xml-cocoon-dev&m=120649777119480&w=4

Re: Layered software designs

Posted by Steven Dolg <st...@gmx.at>.

Joerg Heinicke schrieb:
> On 26.03.2008 09:14, Reinhard Poetz wrote:
>
>> What I want to see is a concise pipeline API that comes with only 
>> little overhead in terms of its learning curve and its dependencies 
>> on third-party software. Usually this means that we stick with 
>> standard APIs as much as possible - and I think this rule applies for 
>> our situation too.
>
> See, one thing that I just don't get (and already asked) is how the 
> pipeline API has anything to do with uri resolving. For me the latter 
> (using java.net or source resolve) is an implementation detail. Our 
> current pipeline interface [1] has no relationship to uri resolving. 
> It only has a reference to SourceValidity because of caching [2].
>
> If all this discussion is about removing this method (and the related 
> getKeyForEventPipeline()) to get rid of this dependency I'm ok with 
> it. The caching concern could be implemented in a separate Cacheable 
> interface which should then also be decoupled from uri resolving 
> (which seems to point to Peter's approach [3]).
>
> Joerg
>
> [1] 
> http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/components/pipeline/ProcessingPipeline.html 
>
> [2] 
> http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/components/pipeline/ProcessingPipeline.html#getValidityForEventPipeline() 
>
> [3] http://marc.info/?l=xml-cocoon-dev&m=120654005017297&w=4
>
Just as a general observation:
I'm starting to believe that I do not understand (anymore) what this is 
all about.
We're jumping from high-level concepts ("caching", "layering") straight 
down to the lowest level ("it's just method a in class B") and then 
right back.
We're arguing that a certain feature is already existing and working, 
while talking about a rewrite-experiment that definitely does not have 
this feature.

But back to caching:
Caching appears to be incredibly important. Even to the point where "no 
caching" means "not acceptable".
On the other hand, when trying to find out what is really necessary and 
wanted, there isn't much left. Suddenly it's "just an implementation 
detail", "not really important", "makes no difference for the user".

Forgive me for being blunt, but all this appears to me like "I need that 
feature. I do not care how it is implemented or how it works. I just 
have to have it."
I did not expect this kind of discussion on a dev list. (I even have a 
hard time accepting this from a paying customer).

But since it's just a not very important implementation detail, I added 
a (simple) caching approach to Corona. (I hope to get a patch ready today)
Perhaps this is all just too abstract and far fetched without any common 
basis (iow. code).

Steven

Re: Layered software designs

Posted by Joerg Heinicke <jo...@gmx.de>.

On 26.03.2008 09:14, Reinhard Poetz wrote:

> What I want to see is a concise pipeline API that comes with only little 
> overhead in terms of its learning curve and its dependencies on 
> third-party software. Usually this means that we stick with standard 
> APIs as much as possible - and I think this rule applies for our 
> situation too.

See, one thing that I just don't get (and already asked) is how the 
pipeline API has anything to do with uri resolving. For me the latter 
(using java.net or source resolve) is an implementation detail. Our 
current pipeline interface [1] has no relationship to uri resolving. It 
only has a reference to SourceValidity because of caching [2].

If all this discussion is about removing this method (and the related 
getKeyForEventPipeline()) to get rid of this dependency I'm ok with it. 
The caching concern could be implemented in a separate Cacheable 
interface which should then also be decoupled from uri resolving (which 
seems to point to Peter's approach [3]).

Joerg

[1] 
http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/components/pipeline/ProcessingPipeline.html
[2] 
http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/components/pipeline/ProcessingPipeline.html#getValidityForEventPipeline()
[3] http://marc.info/?l=xml-cocoon-dev&m=120654005017297&w=4

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Grzegorz Kossakowski wrote:
> Ralph Goers pisze:
>> 
>> Reinhard Poetz wrote:
>>> Pipeline API  +  java.net.URL   +  XML-SAX components
>>> 
>>> 
>>> A more advanced scenario could consist of
>>> 
>>> Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine
>>> 
>>> 
>>> or maybe you need the full stack that corresponds to Cocoon Core 2.2 - 
>>> here you are:
>>> 
>>> Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine +
>>> Spring XML-SAX componnents
>>> 
>>> 
>>> This layered approach makes Cocoon easily embeddable in any Java 
>>> application and Cocoon's learning curve becomes more gradual.
>>> 
>>> Is such a situation only appealing to Carsten, Steven and me?
>> Appealing? yes.  Actually implementable in Java so that it isn;t even more
>> complicated than what we have? I don't know.
> 
> +1.
> 
> I've got my hands dirty with URL stuff from Java and I fail to see how we can
> avoid using Source interface, especially when caching comes in.
> 
> The issue is not about lack of the will but more about lack of ability, at
> least at this stage. Examine current
> Source/SourcaFactory/CocoonSourceResolver usage patterns to see what I mean.

Why do you want to give up even before you start? Because things _seem_ to be 
difficult?

At the current stage we haven't even defined what we expect from caching. Please 
let's first put all our requirements on the table.

My (and that of many others) *main requirement is a layered software design* 
that makes Cocoon easily useable and embeddable in many different environments. 
Given, caching is important, but compared to that it is of subordinate relevance.

Maybe we have to go back to the old Source/SourceResolver solution because 
everything is impossible as some of us think, but let's decide this latter when 
we are sure that there is no better alternative.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Ralph Goers pisze:
> 
> 
> Reinhard Poetz wrote:
>>
>>   Pipeline API  +  java.net.URL   +  XML-SAX components
>>
>>
>> A more advanced scenario could consist of
>>
>>   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap
>> Engine
>>
>>
>> or maybe you need the full stack that corresponds to Cocoon Core 2.2 -
>> here you are:
>>
>>   Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine 
>> +  Spring
>>                                        XML-SAX
>>                                      componnents
>>
>>
>> This layered approach makes Cocoon easily embeddable in any Java
>> application and Cocoon's learning curve becomes more gradual.
>>
>> Is such a situation only appealing to Carsten, Steven and me?
> Appealing? yes.  Actually implementable in Java so that it isn;t even
> more complicated than what we have? I don't know.

+1.

I've got my hands dirty with URL stuff from Java and I fail to see how we can avoid using Source
interface, especially when caching comes in.

The issue is not about lack of the will but more about lack of ability, at least at this stage.
Examine current Source/SourcaFactory/CocoonSourceResolver usage patterns to see what I mean.

-- 
Grzegorz

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Yes, using Cocoon pipelines in Ant is one of my long-time favorits ;-)

Reinhard

Bruce Atherton wrote:
> I'm with you, too.
> 
> Just as an example, I think it might be useful to use Corona as the 
> basis for a new Ant task called Pipeline. That task could do any number 
> text transformations to a set of files as part of a build process. Here, 
> caching is a non-issue for the most part since the point of using a 
> pipeline would be to process each file only once. Rather than either 
> java.net.URL or SourceResolve, we'd probably want to feed the pipeline 
> based on Ant resources[1], just as the existing XSLT task is fed.
> 
> It may be that SourceResolver and an AntResourceSource is the best way 
> to solve the problem, but on a cursory glance it sure looks like it is 
> difficult to separate the banana from the gorilla[2] that is Avalon.
> 
> [1] http://ant.apache.org/manual/CoreTypes/resources.html
> [2] http://www.ddj.com/architect/184408251
> 
> Torsten Curdt wrote:
>> On Mar 26, 2008, at 14:44, Ralph Goers wrote:
>>> Reinhard Poetz wrote:
>>>>
>>>>  Pipeline API  +  java.net.URL   +  XML-SAX components
>>>>
>>>>
>>>> A more advanced scenario could consist of
>>>>
>>>>  Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap 
>>>> Engine
>>>>
>>>>
>>>> or maybe you need the full stack that corresponds to Cocoon Core 2.2 
>>>> - here you are:
>>>>
>>>>  Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  
>>>> +  Spring
>>>>                                       XML-SAX
>>>>                                     componnents
>>>>
>>>>
>>>> This layered approach makes Cocoon easily embeddable in any Java 
>>>> application and Cocoon's learning curve becomes more gradual.
>>>>
>>>> Is such a situation only appealing to Carsten, Steven and me?
>>
>> Just lurking but I am with you guys.
>>
>>>>
>>> Appealing? yes.  Actually implementable in Java so that it isn;t even 
>>> more complicated than what we have? I don't know.
>>
>>
>> IMO this would simplify a lot as it separates concerns and the inner 
>> guts can be used in other projects without the pain of dependencies we 
>> have right now. People have been asking for this for years. I really 
>> think think this is the right direction.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Bruce Atherton <br...@callenish.com>.

I'm with you, too.

Just as an example, I think it might be useful to use Corona as the 
basis for a new Ant task called Pipeline. That task could do any number 
text transformations to a set of files as part of a build process. Here, 
caching is a non-issue for the most part since the point of using a 
pipeline would be to process each file only once. Rather than either 
java.net.URL or SourceResolve, we'd probably want to feed the pipeline 
based on Ant resources[1], just as the existing XSLT task is fed.

It may be that SourceResolver and an AntResourceSource is the best way 
to solve the problem, but on a cursory glance it sure looks like it is 
difficult to separate the banana from the gorilla[2] that is Avalon.

[1] http://ant.apache.org/manual/CoreTypes/resources.html
[2] http://www.ddj.com/architect/184408251

Torsten Curdt wrote:
> On Mar 26, 2008, at 14:44, Ralph Goers wrote:
>> Reinhard Poetz wrote:
>>>
>>>  Pipeline API  +  java.net.URL   +  XML-SAX components
>>>
>>>
>>> A more advanced scenario could consist of
>>>
>>>  Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap 
>>> Engine
>>>
>>>
>>> or maybe you need the full stack that corresponds to Cocoon Core 2.2 
>>> - here you are:
>>>
>>>  Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  
>>> +  Spring
>>>                                       XML-SAX
>>>                                     componnents
>>>
>>>
>>> This layered approach makes Cocoon easily embeddable in any Java 
>>> application and Cocoon's learning curve becomes more gradual.
>>>
>>> Is such a situation only appealing to Carsten, Steven and me?
>
> Just lurking but I am with you guys.
>
>>>
>> Appealing? yes.  Actually implementable in Java so that it isn;t even 
>> more complicated than what we have? I don't know.
>
>
> IMO this would simplify a lot as it separates concerns and the inner 
> guts can be used in other projects without the pain of dependencies we 
> have right now. People have been asking for this for years. I really 
> think think this is the right direction.
>
> cheers
> -- 
> Torsten

Re: Layered software designs

Posted by Torsten Curdt <tc...@apache.org>.

On Mar 26, 2008, at 14:44, Ralph Goers wrote:
> Reinhard Poetz wrote:
>>
>>  Pipeline API  +  java.net.URL   +  XML-SAX components
>>
>>
>> A more advanced scenario could consist of
>>
>>  Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap  
>> Engine
>>
>>
>> or maybe you need the full stack that corresponds to Cocoon Core  
>> 2.2 - here you are:
>>
>>  Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap  
>> Engine  +  Spring
>>                                       XML-SAX
>>                                     componnents
>>
>>
>> This layered approach makes Cocoon easily embeddable in any Java  
>> application and Cocoon's learning curve becomes more gradual.
>>
>> Is such a situation only appealing to Carsten, Steven and me?

Just lurking but I am with you guys.

>>
> Appealing? yes.  Actually implementable in Java so that it isn;t  
> even more complicated than what we have? I don't know.


IMO this would simplify a lot as it separates concerns and the inner  
guts can be used in other projects without the pain of dependencies we  
have right now. People have been asking for this for years. I really  
think think this is the right direction.

cheers
--
Torsten

Re: Layered software designs

Posted by Ralph Goers <Ra...@dslextreme.com>.

Steven Dolg wrote:
>
>
> Ralph Goers schrieb:
>>
>> Appealing? yes.  Actually implementable in Java so that it isn;t even 
>> more complicated than what we have? I don't know.
>>
>
> Just curious - do you have doubts, that this is achievable 
> specifically with Java, or generally with any language?
Taking 5 or 6 concepts (interfaces?) and trying to tie them together can 
get quite messy in Java. C++ could use multiple inheritence. While that 
would work it also brings along all the problems all the problems 
multiple inheritence introduces.  I'm sure there are other languages 
that might tie these concepts better, but off hand I can't think of one.

I'm also sure there are some interesting ways to do this in Java. 
Whether they end up being complicated (either to maintain or use) 
remains to be seen.

I'm all for letting whoever wants to move forward on this and see what 
they come up with. But I won't be willing to accept it until it supports 
caching.

Ralph

Re: Layered software designs

Posted by Steven Dolg <st...@gmx.at>.


Ralph Goers schrieb:
>
>
> Reinhard Poetz wrote:
>>
>>   Pipeline API  +  java.net.URL   +  XML-SAX components
>>
>>
>> A more advanced scenario could consist of
>>
>>   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap 
>> Engine
>>
>>
>> or maybe you need the full stack that corresponds to Cocoon Core 2.2 
>> - here you are:
>>
>>   Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  
>> +  Spring
>>                                        XML-SAX
>>                                      componnents
>>
>>
>> This layered approach makes Cocoon easily embeddable in any Java 
>> application and Cocoon's learning curve becomes more gradual.
>>
>> Is such a situation only appealing to Carsten, Steven and me?
> Appealing? yes.  Actually implementable in Java so that it isn;t even 
> more complicated than what we have? I don't know.
>

Just curious - do you have doubts, that this is achievable specifically 
with Java, or generally with any language?

Re: Layered software designs

Posted by Ralph Goers <Ra...@dslextreme.com>.


Reinhard Poetz wrote:
>
>   Pipeline API  +  java.net.URL   +  XML-SAX components
>
>
> A more advanced scenario could consist of
>
>   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap 
> Engine
>
>
> or maybe you need the full stack that corresponds to Cocoon Core 2.2 - 
> here you are:
>
>   Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  
> +  Spring
>                                        XML-SAX
>                                      componnents
>
>
> This layered approach makes Cocoon easily embeddable in any Java 
> application and Cocoon's learning curve becomes more gradual.
>
> Is such a situation only appealing to Carsten, Steven and me?
Appealing? yes.  Actually implementable in Java so that it isn;t even 
more complicated than what we have? I don't know.

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Rainer Pruy wrote:
> Reinhard Poetz schrieb:
>> The idea of Corona is having a concise core that doesn't have any 
>> dependencies on a particular component container (Spring, OSGi, etc.), 
>> source resolving mechanisms or environment (http, java  only, etc.) or even
>> the type of the components (XML-SAX event stream, XML-Stax event stream,
>> binary streams, etc.) that are linked together in a pipeline.
>> 
>> A simple scenario could be:
>> 
>> Pipeline API  +  java.net.URL   +  XML-SAX components
>> 
>> 
>> A more advanced scenario could consist of
>> 
>> Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine
>> 
>> 
>> or maybe you need the full stack that corresponds to Cocoon Core 2.2 - here
>> you are:
>> 
>> Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  + 
>> Spring XML-SAX componnents
>> 
>> 
>> This layered approach makes Cocoon easily embeddable in any Java 
>> application and Cocoon's learning curve becomes more gradual.
>> 
>> Is such a situation only appealing to Carsten, Steven and me?
>> 
> 
> It's appealing to me also.
> 
> However, I'm not sure I did get the "layers" correctly.

Note: All my answers are related to Corona. Here we go:

> I did see:
> 
> Pipeline API: responsible for composing different components, introduces the
> notion of Producer/Consumer and first and last component

the pipeline api will be enough to execute pipelines too.

> Sitemap API: responsible for executing pipelines

no, the sitemap is a builder that builds pipelines and is put on top of the 
pipeline API.

> Spring:  responsible for setting up layers and identifying implementations to
> "wildcard" functionality (not really a layer for itself?)

Corona has a SpringComponentProvider which is used by the sitemap to lookup its 
components. Spring manages the components but is completly hidden behind the 
ComponentProvider interface. This makes it very easy to use alternative 
containers (e.g. OSGi).

> URI API (had no better term at hand):  responsible for interpreting protocol
> strings for resource access. Currently two implementations in dicussion: URL
> and SourceResolver

yes. And as Sylvain and you pointed out, we maybe don't even need the 
Sourceresolver anymore.

> Pipeline Components? really a layer or just implementations of Pipeline API

Steven came up with the idea of having components that belong to different 
layers. The first type is useable in every environment. This could be a 
SAX-based FileGenerator or a SAX-based XSLTTransformer. There is no need for any 
ties to the ServletAPI. Then there might be a second group of components that 
are based on e.g. the Stax-API.

A third group of components has relations to the ServletAPI in order to produce 
useful results.

In short, the idea is that the pipeline API is completly independant from any 
environments. It can "only" connect components that belong together.

> Can we line out the intended or existing "layers"? I do feel this would help
> focussing the discussion.

I hope that my explanations above are already helpful. Steven and I will work on 
Corona next week again and we hope that we can show how this "layered design" 
can be applied in code.

> We then can have a track reflecting actual layers, a track exploring
> interaction among layers (e.g caching, configuration...), and a track
> pondering implementation aspects.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Rainer Pruy <Ra...@Acrys.COM>.


Reinhard Poetz schrieb:
> Carsten Ziegeler wrote:
>> The question is now if we need support for caching in the low level
>> apis or if it is possible to have a layered approach - which would
>> make the entry barrier much easier.
> 
> Yes, this layered approach is what I'm aiming for. All the reactions in
> this thread make me think that everybody, who has commented on this
> mailing list so far (except Carsten), believes that we want to throw
> away good things that have profed to be useful in many situations.
> Rest assured, that's not the case. Carsten and I only want to break up
> this all-or-nothing situation that we (still) have now.
> 
> What I want to see is a concise pipeline API that comes with only little
> overhead in terms of its learning curve and its dependencies on
> third-party software. Usually this means that we stick with standard
> APIs as much as possible - and I think this rule applies for our
> situation too.
> 
> This means that the user of the API only needs to learn as little as
> possible. When he wants more, we offer additional modules that help him.
> Since he has a concrete need, the motiviation to learn something new is
> much higher than when he has to learn everything right from the beginning.
> 
> If you want to learn how this whole concept *might* apply for a next
> generation Cocoon, have a look at Steven's and my "Exploring Corona"
> mail from last week
> (http://marc.info/?l=xml-cocoon-dev&m=120611990603725&w=2).
> 
> The idea of Corona is having a concise core that doesn't have any
> dependencies on a particular component container (Spring, OSGi, etc.),
> source resolving mechanisms or environment (http, java  only, etc.) or
> even the type of the components (XML-SAX event stream, XML-Stax event
> stream, binary streams, etc.) that are linked together in a pipeline.
> 
> A simple scenario could be:
> 
>   Pipeline API  +  java.net.URL   +  XML-SAX components
> 
> 
> A more advanced scenario could consist of
> 
>   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine
> 
> 
> or maybe you need the full stack that corresponds to Cocoon Core 2.2 -
> here you are:
> 
>   Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  + 
> Spring
>                                        XML-SAX
>                                      componnents
> 
> 
> This layered approach makes Cocoon easily embeddable in any Java
> application and Cocoon's learning curve becomes more gradual.
> 
> Is such a situation only appealing to Carsten, Steven and me?
> 

It's appealing to me also.

However, I'm not sure I did get the "layers" correctly.

I did see:

Pipeline API: responsible for composing different components, introduces the notion of Producer/Consumer and first and last component

Sitemap API: responsible for executing pipelines

Spring:  responsible for setting up layers and identifying implementations to "wildcard" functionality
(not really a layer for itself?)

URI API (had no better term at hand):  responsible for interpreting protocol strings for resource access.
	Currently two implementations in dicussion: URL and SourceResolver

Pipeline Components? really a layer or just implementations of Pipeline API


Can we line out the intended or existing "layers"?
I do feel this would help focussing the discussion.

We then can have a track reflecting actual layers, a track exploring interaction among layers (e.g caching, configuration...), and a
track pondering implementation aspects.

Rainer

Re: Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Vadim Gritsenko wrote:
> On Mar 26, 2008, at 9:14 AM, Reinhard Poetz wrote:
>> A simple scenario could be:
>>
>>  Pipeline API  +  java.net.URL   +  XML-SAX components
>>
>>
>> A more advanced scenario could consist of
>>
>>  Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine
> 
> Do you really need both URL and SourceResolver? As Sylvain described it, 
> with java.net.URL we should be able to implement replacement for all 
> features of existing SourceResolver. I just don't think it is worth it, 
> if in the end we are going to have a mix of two paradigms.

I had written this before I read Sylvain's mail. Even better, if we don't need 
the SourceResolver at all.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: Layered software designs

Posted by Vadim Gritsenko <va...@reverycodes.com>.

On Mar 26, 2008, at 9:14 AM, Reinhard Poetz wrote:
> A simple scenario could be:
>
>  Pipeline API  +  java.net.URL   +  XML-SAX components
>
>
> A more advanced scenario could consist of
>
>  Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap  
> Engine

Do you really need both URL and SourceResolver? As Sylvain described  
it, with java.net.URL we should be able to implement replacement for  
all features of existing SourceResolver. I just don't think it is  
worth it, if in the end we are going to have a mix of two paradigms.

Vadim

Layered software designs

Posted by Reinhard Poetz <re...@apache.org>.

Carsten Ziegeler wrote:
> The question is now if we need support for caching in the low level apis 
> or if it is possible to have a layered approach - which would make the 
> entry barrier much easier.

Yes, this layered approach is what I'm aiming for. All the reactions in this 
thread make me think that everybody, who has commented on this mailing list so 
far (except Carsten), believes that we want to throw away good things that have 
profed to be useful in many situations.
Rest assured, that's not the case. Carsten and I only want to break up this 
all-or-nothing situation that we (still) have now.

What I want to see is a concise pipeline API that comes with only little 
overhead in terms of its learning curve and its dependencies on third-party 
software. Usually this means that we stick with standard APIs as much as 
possible - and I think this rule applies for our situation too.

This means that the user of the API only needs to learn as little as possible. 
When he wants more, we offer additional modules that help him. Since he has a 
concrete need, the motiviation to learn something new is much higher than when 
he has to learn everything right from the beginning.

If you want to learn how this whole concept *might* apply for a next generation 
Cocoon, have a look at Steven's and my "Exploring Corona" mail from last week 
(http://marc.info/?l=xml-cocoon-dev&m=120611990603725&w=2).

The idea of Corona is having a concise core that doesn't have any dependencies 
on a particular component container (Spring, OSGi, etc.), source resolving 
mechanisms or environment (http, java  only, etc.) or even the type of the 
components (XML-SAX event stream, XML-Stax event stream, binary streams, etc.) 
that are linked together in a pipeline.

A simple scenario could be:

   Pipeline API  +  java.net.URL   +  XML-SAX components

A more advanced scenario could consist of

   Pipeline API  +  Sourceresolve  +  XML-SAX components  +  Sitemap Engine

or maybe you need the full stack that corresponds to Cocoon Core 2.2 - here you are:

   Pipeline API  +  Sourceresolve  +  HTTP-enabled  +  Sitemap Engine  +  Spring
                                        XML-SAX
                                      componnents

This layered approach makes Cocoon easily embeddable in any Java application and 
Cocoon's learning curve becomes more gradual.

Is such a situation only appealing to Carsten, Steven and me?

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration

Posted by Sylvain Wallez <sy...@apache.org>.

Carsten Ziegeler wrote:
> It seems that this topic is even more interesting than the environment 
> abstraction threads we had a while back :)
>
> Ok, first of all, the sourceresolver is a very cool component and I 
> really like it (and it was also me who started this component years 
> ago :) ). As the java api lacks such kind of an abstraction, its good 
> to have it.

Does it? It seems to me the URLConnection class provides the same level 
of abstraction and functionnality than a Source.

> Other approaches like the Spring resource stuff are not as good as 
> they are not very flexible.
> So, it doesn't make sense to drop this stuff for Cocoon - seeing 
> Cocoon as the "big" framework for building web applications.
>
> But on the other side we are talking about making Cocoon more modular 
> and easier to understand. The Corona work is a great start and for me 
> it makes totally sense to base the core api just on the java net 
> stuff. It's much easier to understand and use. When we were doing 
> Cocoon trainings, it was always a little bit difficult to get the 
> point across that people should use the source resolver and not the 
> file api or the java net api. Not a big deal, but still it adds more 
> new stuff on top of the core api than you might need.
>
> While the source resolver, source factory and source interfaces are of 
> great use, I'm not so sure about the extensions of the source 
> interface like modifiable source or traversable source. I rarely 
> needed them through the years, but your mileage might vary.
>
> Caching is very important and the java api does not provide a very 
> good way to support caching in general. The source abstaction was 
> built with caching in mind.

Hmm... If this is just about Source.getSourceValidity which is 
optionally implemented (non-caching source can simply return null), we 
could equally well have a CachableURLConnection.

> If you look at the pipeline caching we have in cocoon it is rather 
> complicated (it can cache partial pipelines etc.), and during the 
> years we switched for a time based/event based caching of complete 
> pipelines. But again your mileage may vary.
>
> The question is now if we need support for caching in the low level 
> apis or if it is possible to have a layered approach - which would 
> make the entry barrier much easier.

The main need for the Source abstraction was to circumvent the 
staticness of URLStreamHandlerFactory. Otherwise, pretty much everything 
we need is there:
- extensible framework for cacheable, modifiable, traversable, 
whatever-able sources with URLConnection sub-classes,
- integrated parsing/conversion features with the getContent() method
- source-resolving using the URL(URL context, String spec) constructor

The staticness problems of URLStreamHandlerFactory are still there in 
the JVM, but the OSGi/Equinox guys have found creative ways to 
circumvent them on most JVMs. If we're happy with this solution, it 
seems to me the Source abstraction is one of the many "cocoon-isms" we 
should consider to gently deprecate just as the copy of the servlets 
that we have in our environment abstraction.

Fewer new concepts to learn for newcomers, fewer specific concepts that 
hinder the integration of Cocoon with other environments.

My 0.02 euros (which are worth their $ these days!)

Sylvain

-- 
Sylvain Wallez - http://bluxte.net

Re: JNet integration

Posted by Peter Hunsberger <pe...@gmail.com>.

On Wed, Mar 26, 2008 at 4:46 AM, Carsten Ziegeler <cz...@apache.org> wrote:

<snip/>

>  The question is now if we need support for caching in the low level apis
>  or if it is possible to have a layered approach - which would make the
>  entry barrier much easier.
>

We've ended up layering our own caching on top (or maybe under,
depending on your POV) of much of the Cocoon caching.  This is because
we mainly use a single generator and serve most of our content from a
database.  In our case different database resources have very
different cache lifetimes, we have very granular resources with very
aggressive caching (very simplified example; metadata and data is
pulled from the same pipeline.)  Basically I've found a couple of
things:

- Cache key construction is the thing that matters.  We allow
polymorphic behavior in the cache keys so that multiple resources can
resolve to the same resource.  It is up to the cache key to determine
what underlying resource is being accessed and not the URI.

- Since we allow cache key polymorphisim we cannot use a Hashmap since
a hash code look up doesn't allow two objects with different hash
values to occupy the same object space. Instead we use a TreeMap that
tests equality explicitly, a seperate equality test can use whatever
rules it needs.

- all of the call context is made available for the cache key
construction, we make no assumptions on how the cache key is going
manage itself, there is no such thing as different types of cache keys
any cache key might be based on time to live, events, URI, request
parameters and attributes, or whatever.

- cache keys can register other cache keys that they are dependent on
and invalidating a cache key causes resources defendant on it to also
be invalidated.  The important thing here is that all caching
decisions are contained in a single place (the cache key).

- all resource access goes through the same path.  Non cached
resources simply do not get a cache key, if no cache key is built the
attempt to retrieve the resource from the cache is skipped (duh).

Don't know if this helps any, but for us, having made this design
decision our caching is completely decoupled form URI resolving (and
in fact the rest of the Cocoon infrastructure).

-- 
Peter Hunsberger

Re: JNet integration

Posted by Carsten Ziegeler <cz...@apache.org>.

It seems that this topic is even more interesting than the environment 
abstraction threads we had a while back :)

Ok, first of all, the sourceresolver is a very cool component and I 
really like it (and it was also me who started this component years ago 
:) ). As the java api lacks such kind of an abstraction, its good to 
have it.
Other approaches like the Spring resource stuff are not as good as they 
are not very flexible.
So, it doesn't make sense to drop this stuff for Cocoon - seeing Cocoon 
as the "big" framework for building web applications.

But on the other side we are talking about making Cocoon more modular 
and easier to understand. The Corona work is a great start and for me it 
makes totally sense to base the core api just on the java net stuff. 
It's much easier to understand and use. When we were doing Cocoon 
trainings, it was always a little bit difficult to get the point across 
that people should use the source resolver and not the file api or the 
java net api. Not a big deal, but still it adds more new stuff on top of 
the core api than you might need.

While the source resolver, source factory and source interfaces are of 
great use, I'm not so sure about the extensions of the source interface 
like modifiable source or traversable source. I rarely needed them 
through the years, but your mileage might vary.

Caching is very important and the java api does not provide a very good 
way to support caching in general. The source abstaction was built with 
caching in mind.
If you look at the pipeline caching we have in cocoon it is rather 
complicated (it can cache partial pipelines etc.), and during the years 
we switched for a time based/event based caching of complete pipelines. 
But again your mileage may vary.

The question is now if we need support for caching in the low level apis 
or if it is possible to have a layered approach - which would make the 
entry barrier much easier.

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: JNet integration

Posted by Carsten Ziegeler <cz...@apache.org>.

Ralph Goers wrote:
> 
> 
> Carsten Ziegeler wrote:
>>
>> Hmm, I don't think so. Imagine a pipeline java api just taking a uri 
>> for the sources used in the pipeline. That's simple and easy.
>> Now, you can use the source resolver on top of that, resolve your 
>> sources and you get a uri from your source that you can put into the 
>> pipeline api.
>> That's neither a mess nor does it require more java coding.
>>
> That sounds good in theory, but the proof is in when you actually try to 
> do it with caching enabled. As I said, I'm not really too interested in 
> the non-caching use case as I view that as the minority use case. 
> Furthermore, the non-caching use case can always be dealt with by using 
> the caching use case and just turning off the cache.
:) Sure, I have many use cases for pipelines where I don't need caching 
at all - like some processing pipelines that are not used for creating 
web responses.

> 
> So you build this pipeline API that only uses java.net. Now you want to 
> build pipelines that cache. Does the Source now have to show through the 
> caching version of the pipeline API?  If it does you will have a real 
> mess as now users of pipelines have to determine whether they are 
> caching or non-caching just to determine what methods they can use.
> 
Hmm, ok, I have not thought about this very deeply, so I don't have an 
answer yet and perhaps there is no good answer and it might turn out 
that all this is not a good idea.

But :) without looking further into it, I have the feeling that it 
should be possible to build a clean layered architecture that solves all 
our problems. And perhaps it turns out that it might make more sense to 
use the sourceresolver throughout all of these layers.

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: JNet integration

Posted by Ralph Goers <Ra...@dslextreme.com>.

Carsten Ziegeler wrote:
>
> Hmm, I don't think so. Imagine a pipeline java api just taking a uri 
> for the sources used in the pipeline. That's simple and easy.
> Now, you can use the source resolver on top of that, resolve your 
> sources and you get a uri from your source that you can put into the 
> pipeline api.
> That's neither a mess nor does it require more java coding.
>
That sounds good in theory, but the proof is in when you actually try to 
do it with caching enabled. As I said, I'm not really too interested in 
the non-caching use case as I view that as the minority use case. 
Furthermore, the non-caching use case can always be dealt with by using 
the caching use case and just turning off the cache.

So you build this pipeline API that only uses java.net. Now you want to 
build pipelines that cache. Does the Source now have to show through the 
caching version of the pipeline API?  If it does you will have a real 
mess as now users of pipelines have to determine whether they are 
caching or non-caching just to determine what methods they can use.

Ralph

Re: JNet integration

Posted by Dev at weitling <de...@weitling.net>.


Carsten Ziegeler wrote:
> Joerg Heinicke wrote:
>> That's why I wonder where a pipeline writer gets into uri resolving 
>> at all? Why does he have to care about URLs or Sources? That's the 
>> component developer's concern.

> The pipeline writer needs to know how uri resolving works. 

Still: Why?

> He needs to know what input values are allowed, what relative values 
> mean etc.

Have to think about this part.

Regards,
Florian

RE: JNet integration

Posted by Ard Schrijvers <a....@onehippo.com>.

> > To be honest, I don't care about caching or how complex it 
> is. It has 
> > to work and it does it nicely in Cocoon. If your name isn't 
> Ard ( ;-) 

:-)

> > ) you usually don't need to know more details. And that's 
> what it is 
> > as Vadim pointed out: implementation details.
> > 
> My name is not Ard but I care; knowing how the caching works 
> and fine tuning it (by changing parameters or choosing the 
> right strategy) makes a big difference.

I agree that you should care Carsten, but....you have been a Cocoon
committer for the last zillion years: should a fresh new cocoon user
from the start know how to do everything correct to have the best of
Cocoon's caching? 

I do not agree the former Cocoon caching was ultra complex (to use!!): I
started working with Cocoon, and only after some months I noticed:
heeey, what is that caching vs noncaching doing in the top of my
pipelines :-)  So...how hard and complex is it? I used it without ever
noticing it

And in the end, when you want to push Cocoon to its limits, and you hit
some performance issues, then you can take a look how the caching
actually works: that is IMHO the way how it should be implemented, and,
how it used to be implemented

-Ard

> 
> Carsten
>

Re: JNet integration

Posted by Carsten Ziegeler <cz...@apache.org>.

Joerg Heinicke wrote:
> CachingFileGenerator was something Reinhard came up with - I only 
> explained why I considered this as starting point for a "mess" which you 
> disagreed with.
> 
>> But I will turn around the questions :) What caching do you need? The 
>> ultra complex caching we currently have which can cache partial 
>> pipelines? Or just the result of a pipeline?
> 
> To be honest, I don't care about caching or how complex it is. It has to 
> work and it does it nicely in Cocoon. If your name isn't Ard ( ;-) ) you 
> usually don't need to know more details. And that's what it is as Vadim 
> pointed out: implementation details.
> 
My name is not Ard but I care; knowing how the caching works and fine 
tuning it (by changing parameters or choosing the right strategy) makes 
a big difference.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: JNet integration

Posted by Joerg Heinicke <jo...@gmx.de>.

On 26.03.2008 09:08, Carsten Ziegeler wrote:

>> and want to add caching to it and therefore have to switch from URL to 
>> Source and from FileGenerator to CachingFileGenerator - sorry, but 
>> that's a mess since this means pretty much rewriting the application 
>> for adding caching. Why is this page so slow? Oh, sorry we forgot this 
>> pipeline to switch to caching ...
> Ah, no no, I'm not saying this :) There should only be one file generator.

CachingFileGenerator was something Reinhard came up with - I only 
explained why I considered this as starting point for a "mess" which you 
disagreed with.

> But I will turn around the questions :) What caching do you need? The 
> ultra complex caching we currently have which can cache partial 
> pipelines? Or just the result of a pipeline?

To be honest, I don't care about caching or how complex it is. It has to 
work and it does it nicely in Cocoon. If your name isn't Ard ( ;-) ) you 
usually don't need to know more details. And that's what it is as Vadim 
pointed out: implementation details.

Joerg

Re: JNet integration

Posted by Vadim Gritsenko <va...@reverycodes.com>.

On Mar 26, 2008, at 12:01 PM, Steven Dolg wrote:
> Vadim Gritsenko schrieb:
>> On Mar 26, 2008, at 11:09 AM, Steven Dolg wrote:
>>>
>>> Carsten Ziegeler schrieb:
>>>> The pipeline writer needs to know how uri resolving works. He  
>>>> needs to know what input values are allowed, what relative values  
>>>> mean etc.
>>>>
>>>> But I will turn around the questions :) What caching do you need?  
>>>> The ultra complex caching we currently have which can cache  
>>>> partial pipelines? Or just the result of a pipeline?
>>> I believe this is a very important question.
>>
>> Not really. Partial pipeline caching is implemented completely by  
>> the pipeline using all already existing APIs which were used by  
>> regular caching. From the perspective of pipeline class users, or  
>> from the perspective of classes which pipeline uses, there is no  
>> difference between using 'partial-response-caching' and 'complete- 
>> response-caching' pipelines. The distinction is only in private  
>> implementation details.
> I completely agree.
> But aren't we the ones talking about those "private implementation  
> details"...
> And on the other hand: If partial vs. complete caching makes no  
> difference for the user/caller, why bother maintaining both?

It means no difference in the code, but for the user it has a real  
benefit - user does not have to be an expert in cocoon caching,  
partial response caching will give him better performance without him  
expending an effort into optimizing his sitemap.

Vadim

Re: JNet integration

Posted by Steven Dolg <st...@gmx.at>.


Vadim Gritsenko schrieb:
> On Mar 26, 2008, at 11:09 AM, Steven Dolg wrote:
>>
>> Carsten Ziegeler schrieb:
>>> The pipeline writer needs to know how uri resolving works. He needs 
>>> to know what input values are allowed, what relative values mean etc.
>>>
>>> But I will turn around the questions :) What caching do you need? 
>>> The ultra complex caching we currently have which can cache partial 
>>> pipelines? Or just the result of a pipeline?
>> I believe this is a very important question.
>
> Not really. Partial pipeline caching is implemented completely by the 
> pipeline using all already existing APIs which were used by regular 
> caching. From the perspective of pipeline class users, or from the 
> perspective of classes which pipeline uses, there is no difference 
> between using 'partial-response-caching' and 
> 'complete-response-caching' pipelines. The distinction is only in 
> private implementation details.
I completely agree.
But aren't we the ones talking about those "private implementation 
details"...
And on the other hand: If partial vs. complete caching makes no 
difference for the user/caller, why bother maintaining both?
>
> Vadim
>
>
>> I'm sure use cases that require the "ultra complex" solution and use 
>> cases that will work with a more simple approach are easily found.
>> So the question becomes "What makes the most sense?" and "How can all 
>> this still be easy to *use*?".
>> After all I'm convinced that at the end of the day, the way Cocoon 
>> feels for someone who uses it or - maybe even more important - for 
>> someone who thinks about using it, is all that matters.
>>
>> Still this shouldn't result in a total mess for all who work on 
>> Cocoon itself.
>> But I'm sure there are way*s* (yes, more than just one ;-) ) to have 
>> all this.
>>
>> And I believe a layered approach is a good candicate to achieve this.
>> But as I said above: I'm sure there is more than just one way.
>>>
>>> Carsten
>

Re: JNet integration

Posted by Vadim Gritsenko <va...@reverycodes.com>.

On Mar 26, 2008, at 11:09 AM, Steven Dolg wrote:
>
> Carsten Ziegeler schrieb:
>> The pipeline writer needs to know how uri resolving works. He needs  
>> to know what input values are allowed, what relative values mean etc.
>>
>> But I will turn around the questions :) What caching do you need?  
>> The ultra complex caching we currently have which can cache partial  
>> pipelines? Or just the result of a pipeline?
> I believe this is a very important question.

Not really. Partial pipeline caching is implemented completely by the  
pipeline using all already existing APIs which were used by regular  
caching. From the perspective of pipeline class users, or from the  
perspective of classes which pipeline uses, there is no difference  
between using 'partial-response-caching' and 'complete-response- 
caching' pipelines. The distinction is only in private implementation  
details.

Vadim


> I'm sure use cases that require the "ultra complex" solution and use  
> cases that will work with a more simple approach are easily found.
> So the question becomes "What makes the most sense?" and "How can  
> all this still be easy to *use*?".
> After all I'm convinced that at the end of the day, the way Cocoon  
> feels for someone who uses it or - maybe even more important - for  
> someone who thinks about using it, is all that matters.
>
> Still this shouldn't result in a total mess for all who work on  
> Cocoon itself.
> But I'm sure there are way*s* (yes, more than just one ;-) ) to have  
> all this.
>
> And I believe a layered approach is a good candicate to achieve this.
> But as I said above: I'm sure there is more than just one way.
>>
>> Carsten

Re: JNet integration

Posted by Steven Dolg <st...@gmx.at>.


Carsten Ziegeler schrieb:
> The pipeline writer needs to know how uri resolving works. He needs to 
> know what input values are allowed, what relative values mean etc.
>
> But I will turn around the questions :) What caching do you need? The 
> ultra complex caching we currently have which can cache partial 
> pipelines? Or just the result of a pipeline?
I believe this is a very important question.

I'm sure use cases that require the "ultra complex" solution and use 
cases that will work with a more simple approach are easily found.
So the question becomes "What makes the most sense?" and "How can all 
this still be easy to *use*?".
After all I'm convinced that at the end of the day, the way Cocoon feels 
for someone who uses it or - maybe even more important - for someone who 
thinks about using it, is all that matters.

Still this shouldn't result in a total mess for all who work on Cocoon 
itself.
But I'm sure there are way*s* (yes, more than just one ;-) ) to have all 
this.

And I believe a layered approach is a good candicate to achieve this.
But as I said above: I'm sure there is more than just one way.
>
> Carsten

Re: JNet integration

Posted by Carsten Ziegeler <cz...@apache.org>.

Joerg Heinicke wrote:
> 
> I'm not sure if we are talking about the same. If I take Reinhard's 
> sample [1]
> 
> URL baseUrl = new URL("file:///C:/temp/");
> Pipeline pipeline = new NonCachingPipeline();
> pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml"));
> pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt"));
> pipeline.addComponent(new XMLSerializer());
> pipeline.invoke(new InvocationImpl(System.out));

Ok, up to here we are talking about the same :) Adding the 
sourceresolver and sources to the above example and let the sources 
return the urls should be easy. That's more or less what you do today.

> 
> and want to add caching to it and therefore have to switch from URL to 
> Source and from FileGenerator to CachingFileGenerator - sorry, but 
> that's a mess since this means pretty much rewriting the application for 
> adding caching. Why is this page so slow? Oh, sorry we forgot this 
> pipeline to switch to caching ...
Ah, no no, I'm not saying this :) There should only be one file generator.
Let's use the above example: you setup the pipeline as shown above and 
then do a:
Pipeline cachingPipeline = new CachingPipeline(pipeline);

Et voila! That's all you have to do.

Now, as I answered to Ralphs mail, I don't know if this is possible or 
how to do it right now. So it might turn out that it's not or it's not 
worth doing it.

> That's why I wonder where a pipeline writer gets into uri resolving at 
> all? Why does he have to care about URLs or Sources? That's the 
> component developer's concern.
The pipeline writer needs to know how uri resolving works. He needs to 
know what input values are allowed, what relative values mean etc.

But I will turn around the questions :) What caching do you need? The 
ultra complex caching we currently have which can cache partial 
pipelines? Or just the result of a pipeline?

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: JNet integration

Posted by Joerg Heinicke <jo...@gmx.de>.

On 26.03.2008 08:04, Carsten Ziegeler wrote:

>>> What's the advantage of giving our components the responsibility to 
>>> deal with strings that represent sources?
>>
>> Isn't that what components are there for? Encapsulating commonly used 
>> functionality? Why should I extract that to the pipeline *instance* 
>> level? So writing pipelines is becoming a mess!
>>
> Hmm, I don't think so. Imagine a pipeline java api just taking a uri for 
> the sources used in the pipeline. That's simple and easy.
> Now, you can use the source resolver on top of that, resolve your 
> sources and you get a uri from your source that you can put into the 
> pipeline api.
> That's neither a mess nor does it require more java coding.

I'm not sure if we are talking about the same. If I take Reinhard's 
sample [1]

URL baseUrl = new URL("file:///C:/temp/");
Pipeline pipeline = new NonCachingPipeline();
pipeline.addComponent(new FileGenerator(new URL(baseUrl, "xyz.xml"));
pipeline.addComponent(new XSLTTransformer(new URL(baseUrl, "xyz.xslt"));
pipeline.addComponent(new XMLSerializer());
pipeline.invoke(new InvocationImpl(System.out));

and want to add caching to it and therefore have to switch from URL to 
Source and from FileGenerator to CachingFileGenerator - sorry, but 
that's a mess since this means pretty much rewriting the application for 
adding caching. Why is this page so slow? Oh, sorry we forgot this 
pipeline to switch to caching ...

Why can't high-level pipeline components use low-level components like 
SourceResolver? Then it is up to the component to use either URLs or 
Sources. That would be like implementing CacheableProcessingComponent or 
not.

That's why I wonder where a pipeline writer gets into uri resolving at 
all? Why does he have to care about URLs or Sources? That's the 
component developer's concern.

I really wonder if I miss something. All this seems to be too obvious to 
me ...

Joerg

[1] http://marc.info/?l=xml-cocoon-dev&m=120646488429681&w=4

Re: JNet integration

Posted by Carsten Ziegeler <cz...@apache.org>.

Joerg Heinicke wrote:
> On 26.03.2008 03:34, Reinhard Poetz wrote:
> 
>>> I never had the need to implement a Source and for the mentioned 
>>> simple cases I wonder where you have to cope with them at all? Cocoon 
>>> used to be a framework for non-Java developers ... even if we 
>>> introduce a pipeline API as in the examples in this thread why do I 
>>> need to care about Urls or Sources at all? Why should it be different 
>>> then map:generate with its src attribtue? And when I read 
>>> CacheableFileGenerator something tells me this approach is wrong.
>>
>> What's the advantage of giving our components the responsibility to 
>> deal with strings that represent sources?
> 
> Isn't that what components are there for? Encapsulating commonly used 
> functionality? Why should I extract that to the pipeline *instance* 
> level? So writing pipelines is becoming a mess!
> 
Hmm, I don't think so. Imagine a pipeline java api just taking a uri for 
the sources used in the pipeline. That's simple and easy.
Now, you can use the source resolver on top of that, resolve your 
sources and you get a uri from your source that you can put into the 
pipeline api.
That's neither a mess nor does it require more java coding.

Carsten

-- 
Carsten Ziegeler
cziegeler@apache.org

Re: JNet integration

Posted by Joerg Heinicke <jo...@gmx.de>.

On 26.03.2008 03:34, Reinhard Poetz wrote:

>> I never had the need to implement a Source and for the mentioned 
>> simple cases I wonder where you have to cope with them at all? Cocoon 
>> used to be a framework for non-Java developers ... even if we 
>> introduce a pipeline API as in the examples in this thread why do I 
>> need to care about Urls or Sources at all? Why should it be different 
>> then map:generate with its src attribtue? And when I read 
>> CacheableFileGenerator something tells me this approach is wrong.
> 
> What's the advantage of giving our components the responsibility to deal 
> with strings that represent sources?

Isn't that what components are there for? Encapsulating commonly used 
functionality? Why should I extract that to the pipeline *instance* 
level? So writing pipelines is becoming a mess!

Joerg

Re: JNet integration

Posted by Reinhard Poetz <re...@apache.org>.

Joerg Heinicke wrote:
> On 25.03.2008 10:53, Reinhard Poetz wrote:
> 
>> Once again, my goal is that if you use e.g. Corona in its simplest 
>> form, I don't want to make everybody and his dog depend on 
>> JNet/SourceResolve/Source. E.g. see the FileGenerator. Using the URL 
>> object is enough for simple use cases of a pipeline API.
>>
>> Yes, I understand that when it comes to caching pipelines, you need 
>> more, but not everybody needs caching pipelines. For that purpose 
>> there could be a CacheableFileGenerator, etc.
>>
>> If you are right and it is difficult or even impossible to remove the 
>> dependencies on source/sourceresolve/xmlutils/jnet, then be it. I 
>> withdraw my example Url("servlet:...") from above. When we can switch 
>> to sourceresolve 3.0, the dependency graph will get smaller anyway.
>>
>> The main benefit from using URLs (instead of the SourceResolver) comes 
>> from simple use cases, e.g. you need a pipeline in your Java 
>> application that reads in some XML file, performs some transformations 
>> and finally creates a PDF document. FWIW, using URLs should be all 
>> that you need.
> 
> Hmm, I don't see the advantages of dropping the Source abstractions. Why 
> giving up all the "good things" just to remove one dependency? 

I'm not talking about giving up the Source abstraction (as long as there is no 
equivalent replacement in the Java Standard API) but to put it on top of 
java.net.URL.

> What are 
> the downsides of the Source abstraction? 

When we are able to introduce a pipeline API, we don't need it for simple use 
cases. The concept of URLs is well-known in the Java world, the Source 
abstraction is not.

> I never had the need to 
> implement a Source and for the mentioned simple cases I wonder where you 
> have to cope with them at all? Cocoon used to be a framework for 
> non-Java developers ... even if we introduce a pipeline API as in the 
> examples in this thread why do I need to care about Urls or Sources at 
> all? Why should it be different then map:generate with its src 
> attribtue? And when I read CacheableFileGenerator something tells me 
> this approach is wrong.

What's the advantage of giving our components the responsibility to deal with 
strings that represent sources?

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration

Posted by Joerg Heinicke <jo...@gmx.de>.

On 25.03.2008 10:53, Reinhard Poetz wrote:

> Once again, my goal is that if you use e.g. Corona in its simplest form, 
> I don't want to make everybody and his dog depend on 
> JNet/SourceResolve/Source. E.g. see the FileGenerator. Using the URL 
> object is enough for simple use cases of a pipeline API.
> 
> Yes, I understand that when it comes to caching pipelines, you need 
> more, but not everybody needs caching pipelines. For that purpose there 
> could be a CacheableFileGenerator, etc.
> 
> If you are right and it is difficult or even impossible to remove the 
> dependencies on source/sourceresolve/xmlutils/jnet, then be it. I 
> withdraw my example Url("servlet:...") from above. When we can switch to 
> sourceresolve 3.0, the dependency graph will get smaller anyway.
> 
> The main benefit from using URLs (instead of the SourceResolver) comes 
> from simple use cases, e.g. you need a pipeline in your Java application 
> that reads in some XML file, performs some transformations and finally 
> creates a PDF document. FWIW, using URLs should be all that you need.

Hmm, I don't see the advantages of dropping the Source abstractions. Why 
giving up all the "good things" just to remove one dependency? What are 
the downsides of the Source abstraction? I never had the need to 
implement a Source and for the mentioned simple cases I wonder where you 
have to cope with them at all? Cocoon used to be a framework for 
non-Java developers ... even if we introduce a pipeline API as in the 
examples in this thread why do I need to care about Urls or Sources at 
all? Why should it be different then map:generate with its src 
attribtue? And when I read CacheableFileGenerator something tells me 
this approach is wrong.

Joerg

Re: JNet integration

Posted by Reinhard Poetz <re...@apache.org>.

Grzegorz Kossakowski wrote:
> Reinhard Poetz pisze:
>> Are there any other use cases for releasing a source than the SitemapSource
>> (cocoon:/ protocol)?
> 
> Hmmm. CachingSource has non-trivial release() method as well. Anyway, I agree
> that most Sources do not need to be released at all.
> 
>> What's the problem with that? If you are happy with that what the URL 
>> object can do for you, you don't need to depend on any external stuff. If
>> you want more, you have to add some more dependencies to your code.
>> 
>> This sounds to me very familiar: If I want to use "advanced" logging, I 
>> have to add e.g. log4j. If I'm happy with that what the JDK offers, I don't
>> have to do anything.
>> 
>> What's so special in the case of Excalibur source?
> 
> I agree with you reasoning but I have a feeling that JDK API does not have
> its counterparts for the most basic functionality found in
> Source/SourceFactory:
> 
> * exists() - no counterpart * getInputStream() - openInputStream() * getURI()
> - toExternalForm() ???? (Javadocs suggest it's not a counterpart but practice
> suggests something else...) * getLastModified() - no counterpart
> 
> Dropping usage of JDK API only to resolve relative URI into absolute form
> feels strange. You will need to do that no matter where, in Corona (think
> caching pipelines), in SSF and anywhere else you do something non-trivial
> with Sources.
> 
>>> I'm going to invest my energy into implementation of my original idea of 
>>> providing default SourceResolver for SSF internal needs so we can release
>>> SSF 1.1.0 ASAP. I'll wait with JNet integration until someone (Carsten?)
>>> else chimes in and explains how everything should be glued.
>> I don't understand this. From a quick glance at your code I see that there
>> we are able to set the servlet context in the SSF without depending on
>> Excalibur sourceresolve or Excalibur source.
>> 
>> Why and what exactly do you want to change?
> 
> Current way of installing JNet through init() method of dummy Spring bean is
> a very, very dirt hack. Morever, since there is no way to resolve
> blockcontext: path into absolute ones I still need to obtain underlying
> Source instance. If it's the case, I don't see how all these hacks pay off.
 >
>>> Abstract description explaining what are _real_ benefits of integrating
>>> JNet into SSF and Cocoon (Corona?) in general would be good.
>> With JNet being set up correctly, Corona doesn't depend on any third-party
>> library. E.g. if you want to create a simple pipeline, you don't have to
>> provide a SourceResolver - using URLs us enough.
> 
> Yep, until caching comes in. Or until you want to log path of file being
> processed in /absolute/ form. ;-)
> 
>>> I really need to get some roadmap if I'm going to continue.
>> I think that the main goal is making SSF implementation useable for the 
>> usage without Cocoon core (2.2) and IMHO without having to setup a 
>> SourceResolver. A test case for this is when you can do
>> 
>> URL u = new URL("servlet:otherService:/a/b/c");
>> 
>> from within Corona and you get the expected inputstream afterwards.
>> 
> 
> I think little bit more should be expected. See above...


Once again, my goal is that if you use e.g. Corona in its simplest form, I don't 
want to make everybody and his dog depend on JNet/SourceResolve/Source. E.g. see 
the FileGenerator. Using the URL object is enough for simple use cases of a 
pipeline API.

Yes, I understand that when it comes to caching pipelines, you need more, but 
not everybody needs caching pipelines. For that purpose there could be a 
CacheableFileGenerator, etc.

If you are right and it is difficult or even impossible to remove the 
dependencies on source/sourceresolve/xmlutils/jnet, then be it. I withdraw my 
example Url("servlet:...") from above. When we can switch to sourceresolve 3.0, 
the dependency graph will get smaller anyway.

The main benefit from using URLs (instead of the SourceResolver) comes from 
simple use cases, e.g. you need a pipeline in your Java application that reads 
in some XML file, performs some transformations and finally creates a PDF 
document. FWIW, using URLs should be all that you need.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Reinhard Poetz pisze:
> Are there any other use cases for releasing a source than the
> SitemapSource (cocoon:/ protocol)?

Hmmm. CachingSource has non-trivial release() method as well. Anyway, I agree that most Sources do
not need to be released at all.

> 
> What's the problem with that? If you are happy with that what the URL
> object can do for you, you don't need to depend on any external stuff.
> If you want more, you have to add some more dependencies to your code.
> 
> This sounds to me very familiar: If I want to use "advanced" logging, I
> have to add e.g. log4j. If I'm happy with that what the JDK offers, I
> don't have to do anything.
> 
> What's so special in the case of Excalibur source?

I agree with you reasoning but I have a feeling that JDK API does not have its counterparts for the
most basic functionality found in Source/SourceFactory:

  * exists() - no counterpart
  * getInputStream() - openInputStream()
  * getURI() - toExternalForm() ???? (Javadocs suggest it's not a counterpart but practice suggests
something else...)
  * getLastModified() - no counterpart

Dropping usage of JDK API only to resolve relative URI into absolute form feels strange. You will
need to do that no matter where, in Corona (think caching pipelines), in SSF and anywhere else you
do something non-trivial with Sources.

>> I'm going to invest my energy into implementation of my original idea of
>> providing default SourceResolver for SSF internal needs so we can
>> release SSF
>> 1.1.0 ASAP. I'll wait with JNet integration until someone (Carsten?) else
>> chimes in and explains how everything should be glued.
> 
> I don't understand this. From a quick glance at your code I see that
> there we are able to set the servlet context in the SSF without
> depending on Excalibur sourceresolve or Excalibur source.
> 
> Why and what exactly do you want to change?

Current way of installing JNet through init() method of dummy Spring bean is a very, very dirt hack.
Morever, since there is no way to resolve blockcontext: path into absolute ones I still need to
obtain underlying Source instance. If it's the case, I don't see how all these hacks pay off.

>> Abstract description explaining what are _real_ benefits of
>> integrating JNet
>> into SSF and Cocoon (Corona?) in general would be good. 
> 
> With JNet being set up correctly, Corona doesn't depend on any
> third-party library. E.g. if you want to create a simple pipeline, you
> don't have to provide a SourceResolver - using URLs us enough.

Yep, until caching comes in. Or until you want to log path of file being processed in /absolute/
form. ;-)

>> I really need to get
>> some roadmap if I'm going to continue.
> 
> I think that the main goal is making SSF implementation useable for the
> usage without Cocoon core (2.2) and IMHO without having to setup a
> SourceResolver. A test case for this is when you can do
> 
> URL u = new URL("servlet:otherService:/a/b/c");
> 
> from within Corona and you get the expected inputstream afterwards.
> 

I think little bit more should be expected. See above...

-- 
Grzegorz

Re: JNet integration

Posted by Reinhard Poetz <re...@apache.org>.

Grzegorz Kossakowski wrote:
> Grzegorz Kossakowski pisze:
>>> AFAIU, you call
>>> 
>>> Installer.setURLStreamHandlerFactory(new
>>> SourceURLStreamHandlerFactory());
>>> 
>>> at the startup of your application.
>>> 
>>> Then you can use the SourceFactoriesManager to install and uninstall 
>>> source factories.
>> Yes, but when and where should I call SourceFactoriesManager to install
>> SourceFactories? That's the main problem here.
> 
> Ok, somehow "solved" and committed. The stuff I committed should be
> considered as experimental (even though it works...) so don't be surprised
> seeing lots of hacks.
> 
> After playing with JNet idea for a while I'm more and more doubtful about the
> direction we have taken. I really like Source, SourceFactory interfaces, they
> are clean, focused and obvious to use contrary to the URL machinery from Java
> API. Look at what I committed, there is no way to release underlying Source
> object if InputStream was not been obtained.

Are there any other use cases for releasing a source than the SitemapSource 
(cocoon:/ protocol)?

> Moreover, if you need some advanced functionality (e.g. traversable source)
> you still need switch back to Excalibur interfaces. Same goes for modifiable,
> postable etc.

What's the problem with that? If you are happy with that what the URL object can 
do for you, you don't need to depend on any external stuff. If you want more, 
you have to add some more dependencies to your code.

This sounds to me very familiar: If I want to use "advanced" logging, I have to 
add e.g. log4j. If I'm happy with that what the JDK offers, I don't have to do 
anything.

What's so special in the case of Excalibur source?

> I'm going to invest my energy into implementation of my original idea of
> providing default SourceResolver for SSF internal needs so we can release SSF
> 1.1.0 ASAP. I'll wait with JNet integration until someone (Carsten?) else
> chimes in and explains how everything should be glued.

I don't understand this. From a quick glance at your code I see that there we 
are able to set the servlet context in the SSF without depending on Excalibur 
sourceresolve or Excalibur source.

Why and what exactly do you want to change?

> Abstract description explaining what are _real_ benefits of integrating JNet
> into SSF and Cocoon (Corona?) in general would be good. 

With JNet being set up correctly, Corona doesn't depend on any third-party 
library. E.g. if you want to create a simple pipeline, you don't have to provide 
a SourceResolver - using URLs us enough.

> I really need to get
> some roadmap if I'm going to continue.

I think that the main goal is making SSF implementation useable for the usage 
without Cocoon core (2.2) and IMHO without having to setup a SourceResolver. A 
test case for this is when you can do

URL u = new URL("servlet:otherService:/a/b/c");

from within Corona and you get the expected inputstream afterwards.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration doubts

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Grzegorz Kossakowski pisze:
>>
>> AFAIU, you call
>>
>> Installer.setURLStreamHandlerFactory(new SourceURLStreamHandlerFactory());
>>
>> at the startup of your application.
>>
>> Then you can use the SourceFactoriesManager to install and uninstall
>> source factories.
> 
> Yes, but when and where should I call SourceFactoriesManager to install SourceFactories?
> That's the main problem here.

Ok, somehow "solved" and committed. The stuff I committed should be considered as experimental (even
though it works...) so don't be surprised seeing lots of hacks.

After playing with JNet idea for a while I'm more and more doubtful about the direction we have
taken. I really like Source, SourceFactory interfaces, they are clean, focused and obvious to use
contrary to the URL machinery from Java API. Look at what I committed, there is no way to release
underlying Source object if InputStream was not been obtained.

Moreover, if you need some advanced functionality (e.g. traversable source) you still need switch
back to Excalibur interfaces. Same goes for modifiable, postable etc.

I'm going to invest my energy into implementation of my original idea of providing default
SourceResolver for SSF internal needs so we can release SSF 1.1.0 ASAP. I'll wait with JNet
integration until someone (Carsten?) else chimes in and explains how everything should be glued.

Abstract description explaining what are _real_ benefits of integrating JNet into SSF and Cocoon
(Corona?) in general would be good. I really need to get some roadmap if I'm going to continue.

Thanks for listening and have happy post-Easter time!

-- 
Grzegorz

Re: JNet integration doubts

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Reinhard Poetz pisze:
> Grzegorz Kossakowski wrote:
>> Hello,
>>
>> I've played with JNet for a while trying to integrate it with SSF and run
>> into many troubles.
>>
>> First of all, I'm not sure if I understand whole concept correctly. Do I
>> understand correctly that JNet provides SourceURLStreamHandlerFactory
>> class
>> that acts just like a bridge supporting legacy Source implementations?
>> Should
>> we consider URLStreamHandlerFactory and URLStreamHandler as general
>> replacements for SourceFactory and Source interfaces?
>>
>> If a long-term goal is to drop Source and SourceFactory interfaces
>> what about
>> extension like ModifiableSource, MoveableSource, PostableSource? How
>> can they
>> be supported by URLConnection and friends?
>>
>> --- o0o ---
>>
>> Another problem is with the implementation. There is a problem with
>> installing SourceURLStreamHandlerFactory because: a) it must be installed
>> before ServletFactoryBean is being used at Spring initialization phase
>> b) it
>> must be installed after ApplicationContext is created because
>> SourceFactories
>> are components that must be initialized by Spring container.
>>
>> I have no clue how to solve this problem. Any ideas?
> 
> 
> AFAIU, you call
> 
> Installer.setURLStreamHandlerFactory(new SourceURLStreamHandlerFactory());
> 
> at the startup of your application.
> 
> Then you can use the SourceFactoriesManager to install and uninstall
> source factories.

Yes, but when and where should I call SourceFactoriesManager to install SourceFactories?
That's the main problem here.

-- 
Grzegorz Kossakowski

Re: JNet integration doubts

Posted by Reinhard Poetz <re...@apache.org>.

Grzegorz Kossakowski wrote:
> Hello,
> 
> I've played with JNet for a while trying to integrate it with SSF and run
> into many troubles.
> 
> First of all, I'm not sure if I understand whole concept correctly. Do I
> understand correctly that JNet provides SourceURLStreamHandlerFactory class
> that acts just like a bridge supporting legacy Source implementations? Should
> we consider URLStreamHandlerFactory and URLStreamHandler as general 
> replacements for SourceFactory and Source interfaces?
> 
> If a long-term goal is to drop Source and SourceFactory interfaces what about
> extension like ModifiableSource, MoveableSource, PostableSource? How can they
> be supported by URLConnection and friends?
> 
> --- o0o ---
> 
> Another problem is with the implementation. There is a problem with
> installing SourceURLStreamHandlerFactory because: a) it must be installed
> before ServletFactoryBean is being used at Spring initialization phase b) it
> must be installed after ApplicationContext is created because SourceFactories
> are components that must be initialized by Spring container.
> 
> I have no clue how to solve this problem. Any ideas?


AFAIU, you call

Installer.setURLStreamHandlerFactory(new SourceURLStreamHandlerFactory());

at the startup of your application.

Then you can use the SourceFactoriesManager to install and uninstall source 
factories.

HTH

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration doubts

Posted by Vadim Gritsenko <va...@reverycodes.com>.

On Apr 3, 2008, at 2:58 PM, Grzegorz Kossakowski wrote:
> Reinhard Poetz pisze:
>> Why do we have to replace the blockcontext: protocol at all?
>
> Take a look at its current source code. There is no such a thing  
> like "blockcontext:" protocol implementation at the moment.
>
> In my [RT] mail I explained how we could possibly to stop cheating  
> pretending there is a blockcontext protocol and

A:

> replace it with blockcontext expression that would better reflect  
> current implementation.

B:

> Another possibility (suggested by you) is to provide real  
> implementation of blockcontext: protocol and use blockcontext  
> protocol in base URLs for blocks. I cannot comment on this solution  
> because I haven't enough free time to check all implications.  
> Remember: you will put blockcontext into ServletContext that is  
> rather general interface. I don't say there is any problem, I'm only  
> saying I haven't checked if there is none.
>
> I prefer (only for now, as a quick solution) first way because there  
> is not much room for discussion, brainstorming and general research  
> which is quite opposite to URL-em-them-all approach. I really would  
> like to fix SSF ASAP and let the discussion/research on URL go in  
> parallel.

I've read your RT and I agree with conclusion that approach taken  
there - to convert String (blockcontext:) --> SourceResolver -->  
Source --> and back into String (file:) - it definitely smells bad.

But, I don't think plugging dependency to expressions block (A above)  
is the good idea. I'd rather prefer B: make blockcontext a regular  
protocol, and treat context path parameter as regular source without  
any special treatment. I'd expect any of supported source  
implementations to work there, be it http, webdav, or xmldb, or even  
blockcontext.

Vadim

Re: JNet integration doubts

Posted by Reinhard Poetz <re...@apache.org>.

Grzegorz Kossakowski wrote:
>> Why do we have to replace the blockcontext: protocol at all?
> 
> Take a look at its current source code. There is no such a thing like 
> "blockcontext:" protocol implementation at the moment.

Therefore I'm asking ...

> In my [RT] mail I explained how we could possibly to stop cheating 
> pretending there is a blockcontext protocol and replace it with 
> blockcontext expression that would better reflect current implementation.

making the blockcontext protocol an expression is just another big hack IMO.

> 
> Another possibility (suggested by you) is to provide real implementation 
> of blockcontext: protocol and use blockcontext protocol in base URLs for 
> blocks. I cannot comment on this solution because I haven't enough free 
> time to check all implications. Remember: you will put blockcontext into 
> ServletContext that is rather general interface. I don't say there is 
> any problem, I'm only saying I haven't checked if there is none.

Steven and I made the servlet protocol based on JNet work for Corona yesterday 
(not committed yet). Then we had a quick look into the resolution of the 
blockcontext protocol and we don't see any problem why it should not work.

> I prefer (only for now, as a quick solution) first way because there is 
> not much room for discussion, brainstorming and general research which 
> is quite opposite to URL-em-them-all approach. I really would like to 
> fix SSF ASAP and let the discussion/research on URL go in parallel.

Making blockcontext: a real protocol seems to be the simplest and by far the 
most elegant and most obvious solution. I will give this a try either today or 
at the Hackathon and find out if it is really as simple as expected.

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________

Re: JNet integration doubts

Posted by Grzegorz Kossakowski <gr...@tuffmail.com>.

Reinhard Poetz pisze:
> Grzegorz Kossakowski wrote:
>> Hello,
>>
>> I've played with JNet for a while trying to integrate it with SSF and run
>> into many troubles.
>>
>> First of all, I'm not sure if I understand whole concept correctly. Do I
>> understand correctly that JNet provides SourceURLStreamHandlerFactory 
>> class
>> that acts just like a bridge supporting legacy Source implementations? 
>> Should
>> we consider URLStreamHandlerFactory and URLStreamHandler as general 
>> replacements for SourceFactory and Source interfaces?
>>
>> If a long-term goal is to drop Source and SourceFactory interfaces 
>> what about
>> extension like ModifiableSource, MoveableSource, PostableSource? How 
>> can they
>> be supported by URLConnection and friends?
>>
>> --- o0o ---
>>
>> Another problem is with the implementation. There is a problem with
>> installing SourceURLStreamHandlerFactory because: a) it must be installed
>> before ServletFactoryBean is being used at Spring initialization phase 
>> b) it
>> must be installed after ApplicationContext is created because 
>> SourceFactories
>> are components that must be initialized by Spring container.
>>
>> I have no clue how to solve this problem. Any ideas?
> 
> Why do we have to replace the blockcontext: protocol at all?

Take a look at its current source code. There is no such a thing like "blockcontext:" protocol 
implementation at the moment.

In my [RT] mail I explained how we could possibly to stop cheating pretending there is a 
blockcontext protocol and replace it with blockcontext expression that would better reflect current 
implementation.

Another possibility (suggested by you) is to provide real implementation of blockcontext: protocol 
and use blockcontext protocol in base URLs for blocks. I cannot comment on this solution because I 
haven't enough free time to check all implications. Remember: you will put blockcontext into 
ServletContext that is rather general interface. I don't say there is any problem, I'm only saying I 
haven't checked if there is none.

I prefer (only for now, as a quick solution) first way because there is not much room for 
discussion, brainstorming and general research which is quite opposite to URL-em-them-all approach. 
I really would like to fix SSF ASAP and let the discussion/research on URL go in parallel.

-- 
Grzegorz Kossakowski

Re: JNet integration doubts

Posted by Reinhard Poetz <re...@apache.org>.

Grzegorz Kossakowski wrote:
> Hello,
> 
> I've played with JNet for a while trying to integrate it with SSF and run
> into many troubles.
> 
> First of all, I'm not sure if I understand whole concept correctly. Do I
> understand correctly that JNet provides SourceURLStreamHandlerFactory class
> that acts just like a bridge supporting legacy Source implementations? Should
> we consider URLStreamHandlerFactory and URLStreamHandler as general 
> replacements for SourceFactory and Source interfaces?
> 
> If a long-term goal is to drop Source and SourceFactory interfaces what about
> extension like ModifiableSource, MoveableSource, PostableSource? How can they
> be supported by URLConnection and friends?
> 
> --- o0o ---
> 
> Another problem is with the implementation. There is a problem with
> installing SourceURLStreamHandlerFactory because: a) it must be installed
> before ServletFactoryBean is being used at Spring initialization phase b) it
> must be installed after ApplicationContext is created because SourceFactories
> are components that must be initialized by Spring container.
> 
> I have no clue how to solve this problem. Any ideas?

Why do we have to replace the blockcontext: protocol at all?

-- 
Reinhard Pötz                            Managing Director, {Indoqa} GmbH
                           http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member, PMC Chair        reinhard@apache.org
_________________________________________________________________________