Posted to dev@cocoon.apache.org by Ugo Cei <u....@cbim.it> on 2002/07/03 18:32:21 UTC

Source vs. Generator

A long long time ago ... [1] someone here or on cocoon-users did a good 
job of explaining the pros & cons of Sources vs. Generators. 
Unfortunately, I can't find it again on MARC, via Google, or in my 
archived folders.

If the original author or someone else can give me a pointer, I'd be 
forever grateful :)

	Ugo

[1]: Don McLean, American Pie

-- 
Ugo Cei - Consorzio di Bioingegneria e Informatica Medica
P.le Volontari del Sangue, 2 - 27100 Pavia - Italy
Phone: +39.0382.525100 - E-mail: u.cei@cbim.it


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: avoiding convoluted Cc mess (was: Re: Source vs. Generator)

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

David Crossley wrote:

>Help. We are going to end up with a massive mess of Cc email
>back and forth between -docs -dev -users. This concern was
>raised by various people during the original vote on -dev.
>http://marc.theaimsgroup.com/?t=102068666400002
>
>So how are we going to deal with this?
>
>Sorry Sylvain, not trying to admonish you. It is just that
>yours was the first occurrence.
>

No problem. I cc'ed because the thread started on cocoon-dev but is more 
relevant to docs.

Here are some simple rules that should let us limit cross-posts:
- we should assume that all developers are also subscribed to dev,
- any docs-related thread on either -users or -dev should be moved to 
-docs. The original list isn't cc'ed, but a short reply announcing the 
thread move should be sent.

How does this sound?

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org

avoiding convoluted Cc mess (was: Re: Source vs. Generator)

Posted by David Crossley <cr...@indexgeo.com.au>.

Help. We are going to end up with a massive mess of Cc email
back and forth between -docs -dev -users. This concern was
raised by various people during the original vote on -dev.
http://marc.theaimsgroup.com/?t=102068666400002

So how are we going to deal with this?

Sorry Sylvain, not trying to admonish you. It is just that
yours was the first occurrence.
--David

--------------
> From: Sylvain Wallez <sy...@anyware-tech.com>
> Reply-To: cocoon-docs@xml.apache.org
> To: cocoon-dev@xml.apache.org
> Cc: cocoon-docs@xml.apache.org
> Subject: Re: Source vs. Generator
--------------

Re: Source vs. Generator

Posted by Vadim Gritsenko <va...@verizon.net>.

Ugo Cei wrote:

> A long long time ago Carsten Ziegeler wrote:
>
>> Nearly all generators could be rewritten as sources, for
>> example the RequestGenerator could be written as a "request:"
>> protocol. But does this make sense - I would say: "No". I think a 
>> protocol makes sense if several, different sources
>> (documents, pieces of information) can be obtained using this
>> protocol. For example using an FTP protocol you can fetch
>> several files from the FTP server.
>> A request protocol for example addresses only one piece of
>> information, the request.
>
>
> After more than three months, I ran into a scenario that might 
> justify the implementation of a RequestSource.
>
> Say you have an HTML form with a textarea field, where the user is 
> allowed to paste an HTML (not XHTML) fragment, maybe because he is 
> using some rich text editor that outputs a bunch of invalid HTML, like 
> the MS rich text editor for IE or Mozilla's ComposIte [1].
>
> Moreover, say that you want to take this text and convert it to 
> well-formed XML with JTidy. You could do it all with some custom 
> action or XSP page or custom generator. But wouldn't it be much easier 
> to just write something like:
>
> <map:generate type="html" src="request://parameters/parametername"/>
>
> ?
>
> Is there an easier and more elegant alternative? And if there isn't 
> one, if I wrote this kind of Source, would it be useful to someone 
> else besides me?


I was planning for something similar, but working on 
request/session/application parameters/attributes, with integration with 
xscript variables and webapps.contexts, and (possibly) with JXPath support.

PS These ideas go back to 
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=102760517302812&w=2

Vadim


>     Ugo
>
> [1]: http://composite.mozdev.org/index.html




Re: Source vs. Generator

Posted by Nicola Ken Barozzi <ni...@apache.org>.

Ugo Cei wrote:
> A long long time ago Carsten Ziegeler wrote:
> 
>> Nearly all generators could be rewritten as sources, for
>> example the RequestGenerator could be written as a "request:"
>> protocol. But does this make sense - I would say: "No". I think a 
>> protocol makes sense if several, different sources
>> (documents, pieces of information) can be obtained using this
>> protocol. For example using an FTP protocol you can fetch
>> several files from the FTP server.
>> A request protocol for example addresses only one piece of
>> information, the request.
> 
> 
> After more than three months, I ran into a scenario that might 
> justify the implementation of a RequestSource.
> 
> Say you have an HTML form with a textarea field, where the user is 
> allowed to paste an HTML (not XHTML) fragment, maybe because he is using 
> some rich text editor that outputs a bunch of invalid HTML, like the MS 
> rich text editor for IE or Mozilla's ComposIte [1].
> 
> Moreover, say that you want to take this text and convert it to 
> well-formed XML with JTidy. You could do it all with some custom action 
> or XSP page or custom generator. But wouldn't it be much easier to just 
> write something like:
> 
> <map:generate type="html" src="request://parameters/parametername"/>
> 
> ?
> 
> Is there an easier and more elegant alternative? And if there isn't 
> one, if I wrote this kind of Source, would it be useful to someone else 
> besides me?

A Source is a way of obtaining a piece of information.
A Generator is a way of converting it to XML.

Since all Generators need to obtain a piece of information first, it's 
only logical that every Generator should work from data gotten from a 
Source.

Sometimes it won't be practical, but conceptually it stands, and it adds 
a great deal of flexibility.

As for the Request, we are used to it being an Object per se, but it's 
really an Object *holder*, a reference.
So it's a way of obtaining Objects, not only an Object itself, so it 
should be repackaged as a Source.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------



Re: Source vs. Generator

Posted by Andy Lewis <aj...@ascii27.net>.

I could see uses for it, very similar to what you suggest. However, I was instead looking into
achieving the same thing with the Request Generator, the Fragment Extractor and internal
pipelines...

> A long long time ago Carsten Ziegeler wrote:
>> Nearly all generators could be rewritten as sources, for
>> example the RequestGenerator could be written as a "request:"
>> protocol. But does this make sense - I would say: "No". I
>> think a protocol makes sense if several, different sources
>> (documents, pieces of information) can be obtained using this
>> protocol. For example using an FTP protocol you can fetch
>> several files from the FTP server.
>> A request protocol for example addresses only one piece of
>> information, the request.
>
> After more than three months, I ran into a scenario that might justify the implementation
> of a RequestSource.
>
> Say you have an HTML form with a textarea field, where the user is  allowed to paste an HTML
> (not XHTML) fragment, maybe because he is using  some rich text editor that outputs a bunch of
> invalid HTML, like the MS  rich text editor for IE or Mozilla's ComposIte [1].
>
> Moreover, say that you want to take this text and convert it to
> well-formed XML with JTidy. You could do it all with some custom action  or XSP page or custom
> generator. But wouldn't it be much easier to just  write something like:
>
> <map:generate type="html" src="request://parameters/parametername"/>
>
> ?
>
> Is there an easier and more elegant alternative? And if there isn't one, if I wrote this
> kind of Source, would it be useful to someone else besides me?
>
> 	Ugo
>
> [1]: http://composite.mozdev.org/index.html
>
> --
> Ugo Cei - http://www.beblogging.com/blog/


-- 
"The heights of genius are only measurable by the depths of stupidity."




Re: Source vs. Generator

Posted by Ugo Cei <u....@cbim.it>.

A long long time ago Carsten Ziegeler wrote:
> Nearly all generators could be rewritten as sources, for
> example the RequestGenerator could be written as a "request:"
> protocol. But does this make sense - I would say: "No". I 
> think a protocol makes sense if several, different sources
> (documents, pieces of information) can be obtained using this
> protocol. For example using an FTP protocol you can fetch
> several files from the FTP server.
> A request protocol for example addresses only one piece of
> information, the request.

After more than three months, I ran into a scenario that might 
justify the implementation of a RequestSource.

Say you have an HTML form with a textarea field, where the user is 
allowed to paste an HTML (not XHTML) fragment, maybe because he is using 
some rich text editor that outputs a bunch of invalid HTML, like the MS 
rich text editor for IE or Mozilla's ComposIte [1].

Moreover, say that you want to take this text and convert it to 
well-formed XML with JTidy. You could do it all with some custom action 
or XSP page or custom generator. But wouldn't it be much easier to just 
write something like:

<map:generate type="html" src="request://parameters/parametername"/>

?

Is there an easier and more elegant alternative? And if there isn't 
one, if I wrote this kind of Source, would it be useful to someone else 
besides me?
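
To make the idea concrete, here is a minimal Java sketch of how a URI like
the one above might be split into a scope and a parameter name. Nothing
here is Cocoon API: the class RequestUriSketch, the lookup method, and the
layout of the "request://" URI are all assumptions for illustration, and a
plain Map stands in for the request parameters.

```java
import java.net.URI;
import java.util.Map;

public class RequestUriSketch {

    /**
     * Returns the value addressed by a URI such as request://parameters/name,
     * or null if the parameter is absent. The URI layout is hypothetical.
     */
    public static String lookup(String uri, Map<String, String> parameters) {
        URI parsed = URI.create(uri);
        if (!"request".equals(parsed.getScheme())) {
            throw new IllegalArgumentException("Not a request URI: " + uri);
        }
        // In request://parameters/body the authority is "parameters"
        // and the path is "/body".
        String scope = parsed.getAuthority();
        String path = parsed.getPath();
        if (!"parameters".equals(scope) || path == null || path.length() < 2) {
            throw new IllegalArgumentException("Unsupported request URI: " + uri);
        }
        return parameters.get(path.substring(1));
    }

    public static void main(String[] args) {
        // The map plays the role of the submitted form fields.
        Map<String, String> form = Map.of("body", "<p>Hello<br>world");
        System.out.println(lookup("request://parameters/body", form));
    }
}
```

The string returned by lookup is what an HTML-cleaning generator would then
receive as its input.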

	Ugo

[1]: http://composite.mozdev.org/index.html

-- 
Ugo Cei - http://www.beblogging.com/blog/


Re: Source vs. Generator

Posted by Jeremy Quinn <sh...@mac.com>.

On Thursday, July 4, 2002, at 10:37 AM, Sylvain Wallez wrote:

>
>> Ok, can someone add more "confusion"? ;)
>>
>
> Maybe Jeremy could tell us that slash-edit allows reading 
> stylesheets from the request ;)
>

Don't knock it, sounds like a great idea ;)


regards Jeremy



Re: Source vs. Generator

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Carsten Ziegeler wrote:

>And to add more complexity, we have an XMLizer! What's that?
>
>Now, as Sylvain explains, a Source delivers data in any format,
>which might not be XML. To classify the format of the data,
>mime-types are used, so a Source can return the mime-type
>of the data, for example "text/html" etc.
>
>The XMLizer is a (guess what) configurable Avalon component
>which knows how to convert some mime-types to XML, for example
>it can convert "text/html" to XHTML (using jtidy).
>  
>

Damn, I missed that one !

>The Cocoon SourceResolver checks (as Sylvain has explained) in
>the toSAX(Source) method whether the Source is itself XMLizable,
>that is, whether the Source can convert itself (or its data)
>to an XML representation. If this is not the case, the 
>XMLizer is used. It gets the mime-type from the source and
>checks whether it has a mapping from this mime-type to XML. If so,
>this mapping is used.
>If the source does not return a mime-type, it's assumed that
>the source already delivers XML and the XML parser is used
>on the content.
>
>So, if you look at this scenario, there is actually no need
>for special generators like the SWFGenerator mentioned below,
>because the XMLizer could be extended by this conversion and
>then the usual FileGenerator could be used as well.
>  
>

But this works only if the source is capable of determining the 
mime-type, which isn't always obvious (consider for example the BlobSource).

>And now the final attack:
>Nearly all generators could be rewritten as sources, for
>example the RequestGenerator could be written as a "request:"
>protocol. But does this make sense - I would say: "No". I 
>think a protocol makes sense if several, different sources
>(documents, pieces of information) can be obtained using this
>protocol. For example using an FTP protocol you can fetch
>several files from the FTP server.
>A request protocol for example addresses only one piece of
>information, the request.
>

Agreed. Moreover, writing a source makes sense only if the data it 
provides can be used for purposes other than feeding a generator. An XSL 
stylesheet can be fetched from an FTP server, but I can hardly imagine 
fetching a stylesheet from the request!

>Ok, can someone add more "confusion"? ;)
>

Maybe Jeremy could tell us that slash-edit allows reading stylesheets 
from the request ;)

>And yes, Sylvain, this should really go in the docs!
>  
>

Ok. I'll assemble this in an xdoc.

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org





RE: Source vs. Generator

Posted by Carsten Ziegeler <cz...@s-und-n.de>.

And to add more complexity, we have an XMLizer! What's that?

Now, as Sylvain explains, a Source delivers data in any format,
which might not be XML. To classify the format of the data,
mime-types are used, so a Source can return the mime-type
of the data, for example "text/html".

The XMLizer is a (guess what) configurable Avalon component
which knows how to convert some mime-types to XML; for example,
it can convert "text/html" to XHTML (using jtidy).

The Cocoon SourceResolver checks (as Sylvain has explained) in
the toSAX(Source) method whether the Source is itself XMLizable,
that is, whether the Source can convert itself (or its data)
to an XML representation. If this is not the case, the 
XMLizer is used. It gets the mime-type from the source and
checks whether it has a mapping from this mime-type to XML. If so,
this mapping is used.
If the source does not return a mime-type, it's assumed that
the source already delivers XML and the XML parser is used
on the content.

So, if you look at this scenario, there is actually no need
for special generators like the SWFGenerator mentioned below,
because the XMLizer could be extended by this conversion and
then the usual FileGenerator could be used as well.
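
The lookup order described above can be sketched in Java roughly as
follows. This is only an illustration: XmlizerRouting, Route and route are
invented names, not Cocoon or Avalon API, and the fallback for a mime-type
that has no mapping is an assumption, since nothing above says what
happens in that case.

```java
import java.util.Set;

public class XmlizerRouting {

    /** The three ways XML can be obtained from a source, per the description above. */
    public enum Route { SOURCE_ITSELF, MIME_TYPE_CONVERTER, XML_PARSER }

    /**
     * Decides how a source would be turned into XML.
     *
     * @param sourceIsXmlizable    true if the source can emit XML on its own
     * @param mimeType             mime-type reported by the source, or null
     * @param convertibleMimeTypes mime-types for which a converter is configured
     */
    public static Route route(boolean sourceIsXmlizable,
                              String mimeType,
                              Set<String> convertibleMimeTypes) {
        if (sourceIsXmlizable) {
            return Route.SOURCE_ITSELF;
        }
        if (mimeType != null && convertibleMimeTypes.contains(mimeType)) {
            return Route.MIME_TYPE_CONVERTER;
        }
        // No mime-type: the content is assumed to be XML already.
        // Mime-type without a mapping: also parsed as XML here (assumption).
        return Route.XML_PARSER;
    }

    public static void main(String[] args) {
        Set<String> configured = Set.of("text/html");
        System.out.println(route(false, "text/html", configured));
        System.out.println(route(false, null, configured));
        System.out.println(route(true, "text/html", configured));
    }
}
```

Adding a Flash-to-XML conversion would then amount to adding one more
entry to the configured set, which is the point made about the
SWFGenerator.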

And now the final attack:
Nearly all generators could be rewritten as sources, for
example the RequestGenerator could be written as a "request:"
protocol. But does this make sense - I would say: "No". I 
think a protocol makes sense if several, different sources
(documents, pieces of information) can be obtained using this
protocol. For example using an FTP protocol you can fetch
several files from the FTP server.
A request protocol for example addresses only one piece of
information, the request.

Ok, can someone add more "confusion"? ;)

And yes, Sylvain, this should really go in the docs!

Carsten

> -----Original Message-----
> From: Sylvain Wallez [mailto:sylvain.wallez@anyware-tech.com]
> Sent: Thursday, July 04, 2002 10:03 AM
> To: cocoon-dev@xml.apache.org
> Cc: cocoon-docs@xml.apache.org
> Subject: Re: Source vs. Generator
> 
> 
> Ugo Cei wrote:
> 
> > A long long time ago ... [1] someone here or on cocoon-users made a 
> > good job of explaining the pros & cons of Sources vs. Generators. 
> > Unfortunately, I can't find it again either on MARC, via Google or in 
> > my archived folders.
> >
> > If the original author or someone else can give me a pointer, I'd be 
> > forever grateful :)
> 
> 
> Couldn't find the post you mention, as searching "source generator" gives 
> so many results! Anyway, here are some (hopefully clear) explanations.
> 
>                                  -- o --
> 
> A source is a means to access data, and makes no assumption about what kind 
> of data it is. It provides the low-level plumbing to access the data. 
> That's why we have FileSource, URLSource, BlobSource (for SQL blobs), 
> SlideSource (for webdav), etc. There is absolutely no assumption that 
> data accessed using a Source is XML, nor that it can be converted to XML 
> (e.g. gif images).
> 
> Sources were added to Cocoon, and later to Avalon because the URL 
> protocol extension mechanism provided by the JDK is very limited, as it 
> requires some parameters to be set at the JVM level and only allows 
> JVM-wide protocol definitions, which is obviously bad for application 
> servers.
> 
>                                  -- o --
> 
> On the other side, a generator produces XML from a given environment. We 
> have some generators that rely on sources, others that don't. The 
> RequestGenerator for example produces some XML, but doesn't rely on a 
> Source.
> 
> Source-related generators take a Source as input and generate XML from 
> the data given by this Source. The best known is the FileGenerator, 
> which feeds an XML parser with the Source data. This makes the important 
> assumption that the data is XML text. But there are also other 
> source-related generators that accept non-XML data: the SWFGenerator 
> reads a Flash document from a source and converts it to XML.
> 
> So a source-related generator should be considered as a parser (in the 
> general meaning of the word), that takes data from a source and produces 
> XML.
> 
>                                  -- o --
> 
> Now that things are clear, let's add some confusion ;)
> 
> Some sources provide access to native XML datasources, such as xmldb. 
> For these sources, there is no need for a parser since the data can 
> already be accessed in its XML form (i.e. DOM or SAX). Moreover, it 
> would be a waste of time to ask the source to serialize XML data to 
> textual form and then feed a parser. So these Sources implement the 
> XMLizable interface to provide direct access to XML data.
> 
> Cocoon's SourceResolver takes care of that and provides a toSAX(Source) 
> method that decides to use a parser or not depending on the XMLizable 
> nature of the Source. Of course, if the source isn't natively XML, this 
> assumes it provides XML in its textual form.
> 
>                                  -- o --
> 
> As a final note, Sources have a wider usage range than generators. Cocoon 
> uses sources everywhere it needs to access data, since sources provide 
> independence from the access means. So sources are used to read XSL 
> stylesheets, the sitemap, XSP source files, images (in readers), etc. 
> And don't forget the nice Cocoon source that gives access to data 
> produced dynamically by Cocoon itself.
> 
> 
> Hope this helps.
> 
> Seems like this explanation is a good candidate for an xdoc. Where 
> should it be located: in user concepts, or developer concepts?
> 
> 
> Sylvain
> 
> -- 
> Sylvain Wallez
>   Anyware Technologies                  Apache Cocoon
>   http://www.anyware-tech.com           mailto:sylvain@apache.org
> 
> 
>

Re: Source vs. Generator

Posted by Diana Shannon <sh...@apache.org>.

On Thursday, July 4, 2002, at 04:03  AM, Sylvain Wallez wrote:

> Seems like this explanation is a good candidate for an xdoc. Where 
> should it be located: in user concepts, or developer concepts?

For now, let's start with an xdoc in users/concepts.

Down the road, IMHO it should be relocated. I just raised the visibility 
of Gerhard's proposed TOC for a refactored guide. It's now an xdoc 
(src/documentation/xdoc/plan/proposed-toc.xml). This means refactoring 
existing core documents of all three current guides: CTWIG, user, and 
developer into a single doc. More on this soon.

Why don't you check out the TOC and find a place for it? If there's no 
obvious location, please patch as necessary. And remember also to update 
the release branch. As we are close to release, as of last night, both 
branches were synced (as far as xdocs). Please note I still have work to 
do (e.g. properly flagging all 2.1 docs, etc.)

Thanks for the great addition!

Diana

Re: Source vs. Generator

Posted by Sylvain Wallez <sy...@anyware-tech.com>.

Ugo Cei wrote:

> A long long time ago ... [1] someone here or on cocoon-users made a 
> good job of explaining the pros & cons of Sources vs. Generators. 
> Unfortunately, I can't find it again either on MARC, via Google or in 
> my archived folders.
>
> If the original author or someone else can give me a pointer, I'd be 
> forever grateful :)

Couldn't find the post you mention, as searching "source generator" gives 
so many results! Anyway, here are some (hopefully clear) explanations.

                                 -- o --

A source is a means to access data, and makes no assumption about what kind 
of data it is. It provides the low-level plumbing to access the data. 
That's why we have FileSource, URLSource, BlobSource (for SQL blobs), 
SlideSource (for webdav), etc. There is absolutely no assumption that 
data accessed using a Source is XML, nor that it can be converted to XML 
(e.g. gif images).

Sources were added to Cocoon, and later to Avalon, because the URL 
protocol extension mechanism provided by the JDK is very limited, as it 
requires some parameters to be set at the JVM level and only allows 
JVM-wide protocol definitions, which is obviously bad for application 
servers.
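
To illustrate the difference, here is a small Java sketch of a protocol
registry that belongs to one application instead of the whole JVM. It is
not the Cocoon or Avalon implementation: SourceRegistrySketch, register
and resolve are made-up names, and resolving to a String is a
simplification. The JDK mechanism referred to above is presumably the
java.protocol.handler.pkgs system property, or
URL.setURLStreamHandlerFactory, which can be called at most once per JVM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SourceRegistrySketch {

    // Maps a protocol name such as "blob" to a function that resolves
    // the rest of the URI. Each registry instance has its own table.
    private final Map<String, Function<String, String>> resolvers = new HashMap<>();

    public void register(String protocol, Function<String, String> resolver) {
        resolvers.put(protocol, resolver);
    }

    public String resolve(String uri) {
        int colon = uri.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("No protocol in: " + uri);
        }
        Function<String, String> resolver = resolvers.get(uri.substring(0, colon));
        if (resolver == null) {
            throw new IllegalArgumentException("Unknown protocol in: " + uri);
        }
        return resolver.apply(uri.substring(colon + 1));
    }

    public static void main(String[] args) {
        // Two applications in the same JVM give the same protocol
        // different meanings without interfering with each other.
        SourceRegistrySketch appA = new SourceRegistrySketch();
        SourceRegistrySketch appB = new SourceRegistrySketch();
        appA.register("blob", path -> "app A reads " + path);
        appB.register("blob", path -> "app B reads " + path);
        System.out.println(appA.resolve("blob:images/logo"));
        System.out.println(appB.resolve("blob:images/logo"));
    }
}
```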

                                 -- o --

On the other side, a generator produces XML from a given environment. We 
have some generators that rely on sources, others that don't. The 
RequestGenerator for example produces some XML, but doesn't rely on a 
Source.

Source-related generators take a Source as input and generate XML from 
the data given by this Source. The best known is the FileGenerator, 
which feeds an XML parser with the Source data. This makes the important 
assumption that the data is XML text. But there are also other 
source-related generators that accept non-XML data: the SWFGenerator 
reads a Flash document from a source and converts it to XML.

So a source-related generator should be considered as a parser (in the 
general meaning of the word), that takes data from a source and produces 
XML.
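
As a toy example of a "parser in the general meaning of the word", the
Java sketch below reads plain text lines and produces XML from them. It
is not one of Cocoon's generators: LineGeneratorSketch, the "lines" and
"line" element names, and the use of a Reader in place of a Source are
all assumptions made for the illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LineGeneratorSketch {

    /** Reads non-XML text and wraps each line in a <line> element. */
    public static String generate(Reader data) {
        StringBuilder xml = new StringBuilder("<lines>");
        try (BufferedReader reader = new BufferedReader(data)) {
            String line;
            while ((line = reader.readLine()) != null) {
                xml.append("<line>").append(escape(line)).append("</line>");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return xml.append("</lines>").toString();
    }

    /** Escapes the characters that are not allowed in XML text content. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(generate(new StringReader("first\nsecond & third")));
    }
}
```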

                                 -- o --

Now that things are clear, let's add some confusion ;)

Some sources provide access to native XML datasources, such as xmldb. 
For these sources, there is no need for a parser since the data can 
already be accessed in its XML form (i.e. DOM or SAX). Moreover, it 
would be a waste of time to ask the source to serialize XML data to 
textual form and then feed a parser. So these Sources implement the 
XMLizable interface to provide direct access to XML data.

Cocoon's SourceResolver takes care of that and provides a toSAX(Source) 
method that decides to use a parser or not depending on the XMLizable 
nature of the Source. Of course, if the source isn't natively XML, this 
assumes it provides XML in its textual form.
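
The decision made by toSAX(Source) can be pictured with the Java sketch
below. The interfaces Source and Xmlizable, and the helpers textSource and
NativeSource, are simplified stand-ins written for this example, not the
real Cocoon or Avalon types; only the SAX classes come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class ToSaxSketch {

    /** Stand-in for a Source: it only hands out bytes. */
    public interface Source { InputStream getInputStream(); }

    /** Stand-in for XMLizable: the object can emit SAX events itself. */
    public interface Xmlizable { void toSAX(ContentHandler handler) throws SAXException; }

    /** Streams the source to the handler; returns true if the parser was bypassed. */
    public static boolean toSAX(Source source, ContentHandler handler) {
        try {
            if (source instanceof Xmlizable) {
                ((Xmlizable) source).toSAX(handler);
                return true;
            }
            // Not natively XML: assume XML text and feed it to a parser.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(source.getInputStream()));
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** A source backed by XML text; it has to go through the parser. */
    public static Source textSource(String xml) {
        return () -> new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    /** A source that holds structured data and emits SAX events directly. */
    public static final class NativeSource implements Source, Xmlizable {
        public InputStream getInputStream() {
            throw new UnsupportedOperationException("no textual form needed");
        }
        public void toSAX(ContentHandler handler) throws SAXException {
            handler.startDocument();
            handler.startElement("", "doc", "doc", new AttributesImpl());
            handler.endElement("", "doc", "doc");
            handler.endDocument();
        }
    }

    public static void main(String[] args) {
        ContentHandler sink = new DefaultHandler();
        System.out.println(toSAX(textSource("<doc/>"), sink));
        System.out.println(toSAX(new NativeSource(), sink));
    }
}
```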

                                 -- o --

As a final note, Sources have a wider usage range than generators. Cocoon 
uses sources everywhere it needs to access data, since sources provide 
independence from the access means. So sources are used to read XSL 
stylesheets, the sitemap, XSP source files, images (in readers), etc. 
And don't forget the nice Cocoon source that gives access to data 
produced dynamically by Cocoon itself.

Hope this helps.

Seems like this explanation is a good candidate for an xdoc. Where 
should it be located: in user concepts, or developer concepts?

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org
