Posted to dev@cocoon.apache.org by Stefano Mazzocchi <st...@apache.org> on 2000/01/23 23:57:49 UTC

[Moving on] SAX vs. DOM part II

Hello people,

[note: I CC'ed all the people that should be involved in this discussion
which I find critical for the evolution of the Cocoon project. Sorry for
those of you that are also subscribed to the Cocoon-dev mail list, but I
want you to be named so that others know who you are and the role that
you have]

I'd like to introduce you to the people on this microforum:

- John Milan, is a software architect currently working for DataChannel.
He'll play the DOM expert and DOM advocate role since he helped create
and design the DataChannel virtual DOM implementation. John contacted
me before Xmas investigating possible integrations of their DOM
implementation into Cocoon as an official donation. I waited until 1.6
was released. Now it's time to talk about it a little more.

- Clark C. Evans, is a crazy and brilliant guy that works for the
Gardner Group (or had worked for.. sorry I don't know your status right
now :)... he fell in love with Cocoon when I showed him its power last
May during the first Exolab in France. Since then, Clark has been very
active on both this list and xml-dev and the xsl list, also proposing an
alternative to XML called YML. Clark was the very first to outline the
problems of the DOM model for big files, also advocating incremental
XSLT operation. Here, he'll play the SAX-DOM hybrid advocate
and something else, I'm sure :)

- Ted Leung, is one of the key software engineers at the XML team in
IBM, he's one of the makers of the Xerces parser and he'll play the role
of the man that worked with both SAX and DOM. I hope he'll bring
knowledge about integrating the two and how Cocoon integration with
Xerces can be smoother, faster and more useful for everybody.

- Scott Boag, is a software engineer at Lotus, author of the Xalan XSLT
processor, member of the XSLT WG at W3C. In this discussion, he'll play
the XSLT expert role as well as parser-liaison expert. Scott and I had
frequent and productive discussions about better APIs for LotusXSL and
now Xalan, but, as he recently expressed to me privately, we need more
integration between the xml.apache.org projects. This discussion aims
to clear up the problems and start a continuous dialog that lets
Cocoon benefit more and more from close collaboration with such
powerful and well implemented software.

but let me assign other roles for the people already on this list:

- Ricardo Rocha, he'll play the dynamic XML guru role.

- Pierpaolo Fumagalli, he'll play the static XML guru role.

- myself, I'll play the "all right, all right, but let's come up with a
working solution" role. :)

Ok, but what is this discussion about?

You all read the Cocoon2 proposal where I outlined the problems in
current Cocoon architecture. Some of you don't like that proposal, some
of you liked it before but changed your mind, myself, I changed my mind
so many times I don't know what to do.

While the DOM model does not pose many limitations on dynamic
operation (Cocoon is not generally used to generate MB-long web pages),
it does on static operation (I'm talking about Stylebook at this point,
but you should consider Cocoon2 = Cocoon1 + Stylebook), where MB-long PDF
reports are not far from being a realistic case.

On the other hand, key issues about web operation (like content-length,
expiration headers and such) or internal operation pose a great deal of
problems when the DOM model is abandoned.

This discussion should be focused on answering this question:

 "what is the best architecture for Cocoon2?"

in answering this question, we should consider both dynamic and static
operativity, performance, memory usage, scalability, availability of
implementations, degree of standardization, degree of usability, ease of
use, cost of operation and time to market of a possible solution.

Also, I would like you to focus on practical considerations rather than
theoretical approaches, so, to prevent "pindaric flights", I'll set some
rules:

1) the adoption of W3C standards is not under discussion. We should work
with what is standardized "today". Proposals that rely on
yet-to-be-finalized features or new ideas will be evaluated one by one,
but as a general rule, we should play with the rules we already have.

2) nothing in the Cocoon architecture is carved in stone. Even less, the
cocoon2 proposal. We are open to all kinds of suggestions and I'm
willing to undertake a major code rewriting if the benefits are evident
and long lasting.

3) this discussion is about internal architecture, and should not deal
with other issues such as XSP vs JSP, or XSP vs. XSLT-extensions, or
producer vs. processor, or Xalan vs. XT or anything like that. Let's
remain focused on the underlying architecture, everything else will be
dealt with when this is resolved.

4) this discussion will be orthogonal to the sitemap design, meaning
that the sitemap will not make assumptions on the underlying API
architecture used inside Cocoon.

Ok, I'll start with my personal and very brief comment:

"I like DOM because I'm lazy and I don't want to rewrite Cocoon, but I
also know that Pier needs SAX support for static operation and we need
better links between Cocoon and X*L components than DOM 1 provides. I'd
be glad to make everyone happy without rewriting the whole thing,
removing this debate once and forever" 

Now, tell me what you think and I warn you: if you don't speak up now,
I'll continue with what we have today since I'm happy about it, I don't
have the itch to scratch and I'm lazy :)

[NOTE: this discussion is open to _everyone_ of course, even those
not listed above. Also, you are not forced to play the roles that I
assigned you up above. I just wanted to kick you in.]

Your turn, now. Let the discussion begin.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Come to the first official Apache Software Foundation Conference!  
------------------------- http://ApacheCon.Com ---------------------

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:

> Ok... On this we agree (I believe). This is how the cache mechanism
> should be deployed:
> 
> The cocoon execution involves these steps:
>    a) producer
>    b) filter (one or more)
>    c) serializer
>    All together they build up a "cocoon chain"
> 
> When I receive a request, I call every one of those (a,b*,c) and ask
> "were you, your configurations, or the files you rely on modified
> since the last time I called you (to be precise, since the previous
> request date)?"
> If any of them says it changed, then I re-process the request; if not,
> I simply get the copy out of the cache and send it, or if the client
> (in HTTP) provided an "If-Modified-Since" header, and that's within our
> range, I don't serve anything.
> Is this right? Because if so, it's already carved in Cocoon 2.

yes, sounds good.

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

brian moseley wrote:
> 
> step 1: scheduled process extracts raw logicsheets, etc from
> say a cvs repository, compiles them into classes, and
> distributes the classes to the appropriate location on the
> host running a cocoon server.
> 
> step 2: the cocoon server handles a request and decides it
> needs one of these classes as a processor. it notices that
> the class's last modified time is newer than the last
> modified time that it remembers for the class. it reloads
> the class and then executes the code appropriately.
> 
> step 1 is performed offline, from the perspective of the
> cocoon server. the classes are not cached, from the
> perspective of the cocoon server. the relationship is much
> simpler, just 'reload if modified since x'.
> 
> this is a perfectly valid deployment, and is in fact much
> more highly scalable (performance-wise) than
> load-at-request-time strategies. at least in my experience.

Ok... On this we agree (I believe). This is how the cache mechanism
should be deployed:

The cocoon execution involves these steps:
   a) producer
   b) filter (one or more)
   c) serializer
   All together they build up a "cocoon chain"

When I receive a request, I call every one of those (a,b*,c) and ask
"were you, your configurations, or the files you rely on modified
since the last time I called you (to be precise, since the previous
request date)?"
If any of them says it changed, then I re-process the request; if not,
I simply get the copy out of the cache and send it, or if the client
(in HTTP) provided an "If-Modified-Since" header, and that's within our
range, I don't serve anything.
Is this right? Because if so, it's already carved in Cocoon 2.
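
A minimal sketch of the check described above, written in Python only for
brevity. This is not Cocoon code: the names Component, CacheEntry and handle
are hypothetical, and the timestamp of the cached copy stands in for "the
previous request date". Every stage of the chain is asked whether it changed;
if none did, the cached copy is served, or a 304 (Not Modified) is returned
when the client's If-Modified-Since date already covers it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Component:
    """One stage of the chain (producer, filter or serializer). Hypothetical type."""
    name: str
    # Newest modification time of the component, its configuration,
    # and the files it relies on (seconds, any consistent clock).
    last_modified: float

    def modified_since(self, when: float) -> bool:
        return self.last_modified > when


@dataclass
class CacheEntry:
    body: bytes
    created: float  # when this copy was produced


def handle(
    chain: List[Component],
    cache: Dict[str, CacheEntry],
    key: str,
    now: float,
    render: Callable[[], bytes],
    if_modified_since: Optional[float] = None,
) -> Tuple[int, bytes]:
    """Return (status, body) for one request against the chain."""
    entry = cache.get(key)
    # Re-process when nothing is cached or any stage reports a change.
    if entry is None or any(c.modified_since(entry.created) for c in chain):
        body = render()
        cache[key] = CacheEntry(body, now)
        return 200, body
    # Nothing changed: the client's copy is still good, so serve nothing.
    if if_modified_since is not None and if_modified_since >= entry.created:
        return 304, b""
    # Nothing changed, but the client has no usable copy: serve the cache.
    return 200, entry.body
```

The render callable is where the real producer/filter/serializer work would
happen; here it is only called when the cache cannot be used.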

	Pier

-- 
--------------------------------------------------------------------
-          P              I              E              R          -
stable structure erected over water to allow the docking of seacraft
<ma...@betaversion.org>    <http://www.betaversion.org/~pier/>
--------------------------------------------------------------------
- ApacheCON Y2K: Come to the official Apache developers conference -
-------------------- <http://www.apachecon.com> --------------------

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:

> brian moseley wrote:
> > 
> > On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:
> > 
> > > Are we talking about CACHING the XSP (the class file
> > > produced...) or we're talking about executing XSP
> > > gathering request (and response) information from an
> > > E-Mail?
> > 
> > i don't think of it as caching. in my deployment scenario
> > the application never adds or removes compiled classes. it
> > only loads them, and reloads them when it notices they have
> > changed on disk.
> > 
> > from the point of view of the offline compiler, you might
> > consider it a cache.
> > 
> > either way, we are certainly not talking about executing the
> > logic in response to an http request, or any other kind of
> > request.
> 
> So, when do you execute the logic? I'm starting to lose focus.

step 1: scheduled process extracts raw logicsheets, etc from
say a cvs repository, compiles them into classes, and
distributes the classes to the appropriate location on the
host running a cocoon server.

step 2: the cocoon server handles a request and decides it
needs one of these classes as a processor. it notices that
the class's last modified time is newer than the last
modified time that it remembers for the class. it reloads
the class and then executes the code appropriately.

step 1 is performed offline, from the perspective of the
cocoon server. the classes are not cached, from the
perspective of the cocoon server. the relationship is much
simpler, just 'reload if modified since x'.

this is a perfectly valid deployment, and is in fact much
more highly scalable (performance-wise) than
load-at-request-time strategies. at least in my experience.
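
a rough sketch of the 'reload if modified since x' relationship from step 2,
in python for brevity rather than java. the ReloadingLoader name and the load
callable are made up for illustration; a real server would plug its own class
loading in there. the loader only remembers the last modified time it saw for
each file and reloads when the file on disk is newer.

```python
import os
from typing import Any, Callable, Dict, Tuple


class ReloadingLoader:
    """Loads an artifact from disk and reloads it only when its mtime advances.

    Hypothetical helper: 'load' stands in for whatever turns a compiled file
    into something executable.
    """

    def __init__(self, load: Callable[[str], Any]) -> None:
        self._load = load
        # path -> (last modified time we remember, loaded object)
        self._seen: Dict[str, Tuple[float, Any]] = {}

    def get(self, path: str) -> Any:
        mtime = os.path.getmtime(path)
        remembered = self._seen.get(path)
        # First use, or the offline process has distributed a newer file.
        if remembered is None or mtime > remembered[0]:
            remembered = (mtime, self._load(path))
            self._seen[path] = remembered
        return remembered[1]
```

nothing here compiles anything at request time; the only per-request cost is
one stat call on the file.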

> But the SMTP and HTTP models ARE orthogonal.. How can
> you reconcile those? SMTP doesn't provide a
> request-response mechanism, while HTTP does.

:)

i am not talking about smtp. i have never been talking about
smtp. please understand this.

> > i didn't say they are. smtp and http are equivalent
> > transport mechanisms for mime data.
> 
> But their operation model IS different.

fine, who cares? im not talking about using cocoon as an
smtp handler.

> I like this guy... He reminds me of another guy I knew
> around october 1998 when he, with a friend, invented a
> thing called Mail Servlet... :)

augh. its like im beating myself in the head with a hammer.

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

brian moseley wrote:
> 
> On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:
> 
> > Are we talking about CACHING the XSP (the class file
> > produced...) or we're talking about executing XSP
> > gathering request (and response) information from an
> > E-Mail?
> 
> i don't think of it as caching. in my deployment scenario
> the application never adds or removes compiled classes. it
> only loads them, and reloads them when it notices they have
> changed on disk.
> 
> from the point of view of the offline compiler, you might
> consider it a cache.
> 
> either way, we are certainly not talking about executing the
> logic in response to an http request, or any other kind of
> request.

So, when do you execute the logic? I'm starting to lose focus.

> > That was the idea... Using cocoon to deal with Email...
> > That's the point that started the whole discussion...
> 
> the tool i described "deals with email" just as an smtp
> server "deals with email". their suitability for embedding
> cocoon is somewhat different.

????????????????? I don't REALLY follow...

> > That's cool... as long as cocoon receives an HTTP
> > request and response, I'm totally +1 on it... It's your
> > business, then, to correctly translate the EMail into
> > some HTTP stuff.
> 
> not what i was going for.

But the SMTP and HTTP models ARE orthogonal.. How can you reconcile
those? SMTP doesn't provide a request-response mechanism, while HTTP
does.

> > You can write a java class that takes an EMail, converts
> > it into a ServletRequest/Response, place it in cocoon
> > and let it go, that would do the trick, but, SERVLETS
> > are not EMAIL...
> 
> i didn't say they are. smtp and http are equivalent
> transport mechanisms for mime data.

But their operation model IS different.

> i agree that servlets are not the best way to implement an
> smtp server.

Cool :) at least on this we agree...

> i disagree that there can not be servlet request and
> response subclasses that provide methods for accessing
> standard rfc822 and mime email headers. these methods would
> probably be very similar to those on the javamail message
> class.

I like this guy... He reminds me of another guy I knew around october
1998 when he, with a friend, invented a thing called Mail Servlet... :)

> > YES... So, we're saying the same exact thing... Don't
> > reuse the GLUE, reuse the PIECES... Cocoon IS the http
> > interface (AKA servlet) AND the Sitemap... I thought
> > there were NO DOUBTS on that... Let's not confuse Cocoon
> > with a tool that applies stylesheets to XML, please,
> > Cocoon is more than that...
> 
> no, i don't think we're saying the same exact thing. if you
> define cocoon as the http interface, then you really should
> define the producer/processor/serializer engine as something
> else. that can receive and respond to more transports than
> simply http. otherwise you are limiting your user base.

The producer/processor/serializer interfaces are not (yet) defined. I
would make them as similar as possible to servlets (the data required
inside, IMVHO, is ServletContext, HTTPServletRequest and
HTTPServletResponse). But if you want to come up with a different
paradigm of request/response model that can be also used from James (or
whatever email server), I will be more than happy to discuss it.
Then Cocoon will be a bridge between the servlet world to this new
world... And GiveANameHere, could be the bridge between EMail and it...

	Pier (who already feels he's going to throw in the trashbin
              two weeks of work!)


Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:

> Are we talking about CACHING the XSP (the class file
> produced...) or we're talking about executing XSP
> gathering request (and response) information from an
> E-Mail?

i don't think of it as caching. in my deployment scenario
the application never adds or removes compiled classes. it
only loads them, and reloads them when it notices they have
changed on disk.

from the point of view of the offline compiler, you might
consider it a cache.

either way, we are certainly not talking about executing the
logic in response to an http request, or any other kind of
request.

> That was the idea... Using cocoon to deal with Email...
> That's the point that started the whole discussion...

the tool i described "deals with email" just as an smtp
server "deals with email". their suitability for embedding
cocoon is somewhat different.

> That's cool... as long as cocoon receives an HTTP
> request and response, I'm totally +1 on it... It's your
> business, then, to correctly translate the EMail into
> some HTTP stuff.

not what i was going for.

> You can write a java class that takes an EMail, converts
> it into a ServletRequest/Response, place it in cocoon
> and let it go, that would do the trick, but, SERVLETS
> are not EMAIL...

i didn't say they are. smtp and http are equivalent
transport mechanisms for mime data.

i agree that servlets are not the best way to implement an
smtp server.

i disagree that there can not be servlet request and
response subclasses that provide methods for accessing
standard rfc822 and mime email headers. these methods would
probably be very similar to those on the javamail message
class.

> YES... So, we're saying the same exact thing... Don't
> reuse the GLUE, reuse the PIECES... Cocoon IS the http
> interface (AKA servlet) AND the Sitemap... I thought
> there were NO DOUBTS on that... Let's not confuse Cocoon
> with a tool that applies stylesheets to XML, please,
> Cocoon is more than that...

no, i don't think we're saying the same exact thing. if you
define cocoon as the http interface, then you really should
define the producer/processor/serializer engine as something
else. that can receive and respond to more transports than
simply http. otherwise you are limiting your user base.

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

brian moseley wrote:
> 
> On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:
> 
> > I perfectly agree how XSLT can work off line, it's just
> > a transformation language... But, regarding XSP, I don't
> > think this is a possibility... It's like trying to run a
> > CGI from the command line...
> 
> i think we're talking about two slightly different things,
> and this may be due to my incomplete knowledge of xsp.
> 
> doesnt the xsp "processor" compile logicsheets into java
> classes? aren't those cached? or isn't this at least the
> plan? this effort is the one i'm referring to with regards
> to offline operation.

Are we talking about CACHING the XSP (the class file produced...) or
we're talking about executing XSP gathering request (and response)
information from an E-Mail?

> > Hey... Let's remember one thing... some components ARE
> > reusable (serializers: FOP, XML/HTML printers, NRG
> > engine... and filters like XSLT) some are tied to HTTP
> > (XSP, things that rely on POST)...
> 
> the execution of logic happens at request time. compilation
> should not be forced to happen at request time.

Totally... Compilation could  be even done off line, then kicked into a
nice JAR file and loaded by cocoon... That's not a problem... That's
CACHING, or preprocessing... I thought we were talking about letting XSP
do the EMail job.

> > I'm not saying that the whole universe NEEDS to be tied
> > to HTTP, but I say that in the context of Cocoon, it
> > would be overwhelming allowing it to deal with things
> > like EMails, and that it would be better if we restrict
> > it to Web Sites.
> 
> i have never heard any details about why people are so
> scared of using a servlet request that contains a mime
> structure. i don't think anybody is talking in the cocoon
> context at least about writing an smtp servlet.

That was the idea... Using cocoon to deal with Email... That's the point
that started the whole discussion...

> consider this use case: procmail pipes an email message
> to a command line tool. the tool creates a servlet request
> out of the message and hands the request to the cocoon
> engine. the engine creates a servlet response containing an
> html document. the tool places the html document on
> the file system and checks the file into cvs.

That's cool... as long as cocoon receives an HTTP request and response,
I'm totally +1 on it... It's your business, then, to correctly translate
the EMail into some HTTP stuff.

> i have a hand coded perl tool that does something exactly
> like this. if cocoon was a bit more flexible, i could use it
> instead.

You can write a java class that takes an EMail, converts it into a
ServletRequest/Response, place it in cocoon and let it go, that would
do the trick, but, SERVLETS are not EMAIL...

> > Other tools based on the same components (same pieces,
> > different glue) can be created to generate E-Mail, to do
> > gopher, to create XML->LDAP servers, or whatever we
> > want, but those SHOULD NOT USE the same glue used for
> > the Web. The WEB-GLUE is cocoon...
> 
> disagree. the sitemap and the http interface are web glue.
> the rest of cocoon is not. producers, processors,
> formatters, serializers, those do not have to be web
> specific in any way.

YES... So, we're saying the same exact thing... Don't reuse the GLUE,
reuse the PIECES... Cocoon IS the http interface (AKA servlet) AND the
Sitemap... I thought there were NO DOUBTS on that... Let's not confuse
Cocoon with a tool that applies stylesheets to XML, please, Cocoon is
more than that...

<joking>
Or not? I'm starting to lose my credo... I want a religion to follow...
Please someone summon me now :) (read it, Stefano, where are you?)
</joking>

	Pier (not following anymore the point of the discussion)


Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:

> Bingo... Also because I want to see how much time it
> will take to fire up a whole JVM from the command line
> every time you have to process an Email :) Either you

this is the first thing you've said in this thread i agree
with :)

> have a Java Mail Server (read it James, and GO AND LOOK
> AT IT, DAMMIT! :):):) or if you want to use sendmail, be
> ready to have a nice uptime on your machine :)

i will go look at it. i bet i will find i can't embed cocoon
in it :)

this was a specific example of a non-http application that
can use cocoon. there are many others im sure.

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Donald Ball wrote:
> 
> On Mon, 24 Jan 2000, brian moseley wrote:
> 
> > consider this use case: procmail pipes an email message
> > to a command line tool. the tool creates a servlet request
> > out of the message and hands the request to the cocoon
> > engine. the engine creates a servlet response containing an
> > html document. the tool places the html document on
> > the file system and checks the file into cvs.
> 
> Thought - why not just have the command line tool construct and fire off
> an HTTP request? That way you're not forcing your mail server and your
> cocoon server to coexist. Plus you keep all of the benefits of servlets
> (namely persistency).

Bingo... Also because I want to see how much time it will take to fire
up a whole JVM from the command line every time you have to process an
Email :) Either you have a Java Mail Server (read it James, and GO AND
LOOK AT IT, DAMMIT! :):):) or if you want to use sendmail, be ready to
have a nice uptime on your machine :)

> In general, you should be able to map most any service request to an HTTP
> request. It may not necessarily be the best design pattern - I don't
> honestly know, but it's at least worth considering.

Hmmm... Anything can pass over HTTP. I mean, an email can be passed to a
servlet as a POST with an appropriate content-type (what was it?
"message/rfc822").
But then, the output needs to go somewhere; it needs some kind of
redirection, because you're not always sending the response back to the
same client (basically, email processing follows the
Request->Request model, from a protocol point of view). Do I need to get
technical on that? I'd rather move this whole discussion to the James
mailing list.
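
For illustration only, here is a small Python sketch of what "an email passed
as a POST" could look like on the client side. It assumes the registered MIME
type for a mail message, message/rfc822; the function name build_mail_post is
hypothetical and the URL would be whatever the servlet is bound to. The
request is only constructed here, not sent.

```python
import urllib.request


def build_mail_post(raw_message: bytes, url: str) -> urllib.request.Request:
    """Wrap a raw RFC 822 message in an HTTP POST request object.

    Hypothetical helper: nothing is sent until the caller passes the
    returned object to urllib.request.urlopen().
    """
    return urllib.request.Request(
        url,
        data=raw_message,
        headers={"Content-Type": "message/rfc822"},
        method="POST",
    )
```

Note that this covers only the request half: the HTTP response goes back to
whoever posted the message, so delivering output to a different recipient
still needs the redirection mentioned above.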

	Pier


Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:

> Or probably you're not fond enough about HTTP, servlets
> and their similarities or differences from the SMTP and
> the email world to see that in most cases they cannot
> coexist.

unfortunately you don't know enough about my work to make
that claim. as a matter of fact i am very fond of http.

you persist in claiming that i am trying to rehash your
decision not to implement an smtp server with servlets. i'm
not. im claiming that the servlet request/response classes
can be the appropriate interface for the
producer/processor/serializer engine, that it doesn't need
to be limited to being http-specific.

to me there is a big difference between "servlets" and
everything that word implies, and the simple data structures
that are requests and responses.
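
a small sketch of what i mean by a request being a simple data structure,
in python for brevity. the Request type and both helper functions are made up
for illustration and are not the servlet api: the point is only that the same
headers-plus-body structure can be filled from an http request or from a mail
message, and the engine never needs to know which.

```python
from dataclasses import dataclass
from email import message_from_string
from typing import Dict


@dataclass
class Request:
    """Transport-neutral request: headers plus a body. Hypothetical type."""
    transport: str
    headers: Dict[str, str]
    body: str


def request_from_http(headers: Dict[str, str], body: str) -> Request:
    """Build a Request from already-parsed http headers and a body."""
    return Request("http", {k.lower(): v for k, v in headers.items()}, body)


def request_from_email(raw: str) -> Request:
    """Build a Request from a raw, non-multipart rfc822 message."""
    msg = message_from_string(raw)
    headers = {name.lower(): value for name, value in msg.items()}
    # For a non-multipart message get_payload() returns the body text.
    return Request("mail", headers, msg.get_payload())
```

a command line tool fed by procmail could call request_from_email on its
standard input and hand the result to the same engine a servlet would use.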

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

brian moseley wrote:
> 
> this is the problem - the group in general is so focused on
> http-based publishing that it isn't seeing that only small
> parts of cocoon are http-specific. there is a much wider
> world out there that can benefit from the sophistication
> that cocoon layers on top of simple xsl transforms.

Or probably you're not fond enough about HTTP, servlets and their
similarities or differences from the SMTP and the email world to see
that in most cases they cannot coexist.
While saying that, I'm not telling you that you're ignorant, nor do I want
to offend you in any way, but me, Stefano, Federico (the main developer
of James and Avalon), James (the author of the Servlet APIs) spent so
much time between October 1998 and October 1999 thinking about those
concepts, that nowadays I'm focused on our "global" vision of the
world.
Anyway, if you think that these two models can coexist, please don't
hesitate to try making me change my mind... because if so, maybe, one
day, we'll get out (once again) our servlet spec, and kick asses :)

	Pier


Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Donald Ball wrote:

> so just have an http/servlet server running locally,
> bound to 127.0.0.1, firewalled from everything else,
> that responds to requests from your command line tool?
> not only does it work out of the box right now, but you
> also get servlet reloading if you want it, you get nice
> logging, you get a little apache process to watch your
> buggy JVM and kick it as needed.

sure, im not saying i cant do it, but it definitely seems
like overkill. course there is the jvm startup overhead that
pier pointed out.

> again, i don't _know_ that it doesn't make sense to
> tunnel other service requests through HTTP, but i
> haven't seen an argument that convinces me otherwise
> neither.

having to administer yet another network process? when you
are in a small environment its not a big deal. in a large
environment things like this get lost.

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Donald Ball <ba...@webslingerZ.com>.

On Mon, 24 Jan 2000, brian moseley wrote:

> On Mon, 24 Jan 2000, Donald Ball wrote:
> 
> > Thought - why not just have the command line tool
> > construct and fire off an HTTP request? That way you're
> > not forcing your mail server and your cocoon server to
> > coexist. Plus you keep all of the benefits of servlets
> > (namely persistency).
> 
> there is no http server nor cocoon server. there is simply a
> filesystem with a cvs sandbox. the only network interaction
> with this box is mail delivery and cvs client access to the
> repository.

so just have an http/servlet server running locally, bound to 127.0.0.1,
firewalled from everything else, that responds to requests from your
command line tool? not only does it work out of the box right now, but you
also get servlet reloading if you want it, you get nice logging, you get a
little apache process to watch your buggy JVM and kick it as needed.

again, i don't _know_ that it doesn't make sense to tunnel other service
requests through HTTP, but i haven't seen an argument that convinces me
otherwise neither.

- donald

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Donald Ball wrote:

> Thought - why not just have the command line tool
> construct and fire off an HTTP request? That way you're
> not forcing your mail server and your cocoon server to
> coexist. Plus you keep all of the benefits of servlets
> (namely persistency).

there is no http server nor cocoon server. there is simply a
filesystem with a cvs sandbox. the only network interaction
with this box is mail delivery and cvs client access to the
repository.

this is the problem - the group in general is so focused on
http-based publishing that it isn't seeing that only small
parts of cocoon are http-specific. there is a much wider
world out there that can benefit from the sophistication
that cocoon layers on top of simple xsl transforms.

> In general, you should be able to map most any service
> request to an HTTP request. It may not necessarily be
> the best design pattern - I don't honestly know, but
> it's at least worth considering.

in some situations this is a great idea. in some situations
its overly complex.

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Mike Engelhart wrote:
> 
> I agree - maybe we could start a side project, (we can call it "Mothra" :-))
> that develops a java application that generates HTTP requests to Cocoon for
> creation of static pages.  It would probably be pretty easy.

The creation of static pages can easily follow the HTTP request-response
model, and the "Cocoon 2.0" proposal already includes that. Cocoon
2.0 was born from the experience the Cocoon team had with dynamic content
generation, and the StyleBook team (sorry, it's just me and a few guys at
IBM) had with static page generation.
"Mothra" is already there, not coded, but ready to be :)

	Pier

-- 
--------------------------------------------------------------------
-          P              I              E              R          -
stable structure erected over water to allow the docking of seacraft
<ma...@betaversion.org>    <http://www.betaversion.org/~pier/>
--------------------------------------------------------------------
- ApacheCON Y2K: Come to the official Apache developers conference -
-------------------- <http://www.apachecon.com> --------------------

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Mike Engelhart <me...@earthtrip.com>.

Donald Ball wrote:

> Thought - why not just have the command line tool construct and fire off
> an HTTP request? That way you're not forcing your mail server and your
> cocoon server to coexist. Plus you keep all of the benefits of servlets
> (namely persistency).
> 
> In general, you should be able to map most any service request to an HTTP
> request. It may not necessarily be the best design pattern - I don't
> honestly know, but it's at least worth considering.
> 
> - donald
I agree - maybe we could start a side project, (we can call it "Mothra" :-))
that develops a java application that generates HTTP requests to Cocoon for
creation of static pages.  It would probably be pretty easy.  We could even
use the Jakarta projects work to create a standalone version of Tomcat that
runs Cocoon right out of the box but is configured and run from within
Mothra so the user doesn't have to worry about that meddlesome HTTP stuff.
Also, make it have a plug-in architecture so that if you wanted it to process
incoming email you could write a plug-in that allowed it to receive that kind
of request and forward it to Cocoon.
??

Mike

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Donald Ball <ba...@webslingerZ.com>.

On Mon, 24 Jan 2000, brian moseley wrote:

> consider this use case: procmail pipes an email message
> to a command line tool. the tool creates a servlet request
> out of the message and hands the request to the cocoon
> engine. the engine creates a servlet response containing an
> html document. the tool places the html document on
> the file system and checks the file into cvs.

Thought - why not just have the command line tool construct and fire off
an HTTP request? That way you're not forcing your mail server and your
cocoon server to coexist. Plus you keep all of the benefits of servlets
(namely persistency).

In general, you should be able to map most any service request to an HTTP
request. It may not necessarily be the best design pattern - I don't
honestly know, but it's at least worth considering.

- donald

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:

> I perfectly agree how XSLT can work off line, it's just
> a transformation language... But, regarding XSP, I don't
> think this is a possibility... It's like trying to run a
> CGI from the command line...

i think we're talking about two slightly different things,
and this may be due to my incomplete knowledge of xsp.

doesn't the xsp "processor" compile logicsheets into java
classes? aren't those cached? or isn't this at least the
plan? this effort is the one i'm referring to with regards
to offline operation.

> Hey... Let's remember one thing... some components ARE
> reusable (serializers: FOP, XML/HTML printers, NRG
> engine... and filters like XSLT) some are tied to HTTP
> (XSP, things that rely on POST)...

the execution of logic happens at request time. compilation
should not be forced to happen at request time.

> I'm not saying that the whole universe NEEDS to be tied
> to HTTP, but I say that in the context of Cocoon, it
> would be overwhelming to allow it to deal with things
> like EMails, and that it would be better if we restrict
> it to Web Sites.

i have never heard any details about why people are so
scared of using a servlet request that contains a mime
structure. i don't think anybody is talking in the cocoon
context at least about writing an smtp servlet.

consider this use case: procmail pipes an email message
to a command line tool. the tool creates a servlet request
out of the message and hands the request to the cocoon
engine. the engine creates a servlet response containing an
html document. the tool places the html document on
the file system and checks the file into cvs.
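The use case above could be sketched roughly as follows. This is only an illustration under stated assumptions: `MailToHtml`, `MailRequest`, `parse` and `process` are hypothetical names, the "engine" is any function from a parsed message to an HTML string (standing in for Cocoon), header parsing is deliberately simplified, and the cvs check-in step is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical command-line glue: mail message in, HTML file out. */
final class MailToHtml {

    /** Parsed message; a stand-in for the "servlet request" in the use case. */
    static final class MailRequest {
        final Map<String, String> headers = new LinkedHashMap<>();
        String body = "";
    }

    /** Splits a raw message into headers (lower-cased names) and body. */
    static MailRequest parse(String raw) {
        MailRequest request = new MailRequest();
        String[] lines = raw.split("\r?\n", -1);
        String lastHeader = null;
        int i = 0;
        // Headers run until the first empty line.
        for (; i < lines.length && !lines[i].isEmpty(); i++) {
            String line = lines[i];
            boolean continuation = line.startsWith(" ") || line.startsWith("\t");
            int colon = line.indexOf(':');
            if (continuation && lastHeader != null) {
                // Folded header line: append to the previous header value.
                request.headers.merge(lastHeader, line.trim(), (a, b) -> a + " " + b);
            } else if (colon > 0) {
                lastHeader = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                request.headers.put(lastHeader, line.substring(colon + 1).trim());
            }
        }
        // Everything after the blank separator line is the body.
        int bodyStart = Math.min(i + 1, lines.length);
        request.body = String.join("\n", Arrays.copyOfRange(lines, bodyStart, lines.length));
        return request;
    }

    /** Hands the parsed message to the engine and stores the resulting HTML. */
    static void process(String raw, Function<MailRequest, String> engine, Path target) {
        String html = engine.apply(parse(raw));
        try {
            Files.write(target, html.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A procmail recipe would pipe the message to a small `main` that reads standard input and calls `process`; checking the written file into cvs would then be a separate step outside this sketch.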

i have a hand coded perl tool that does something exactly
like this. if cocoon was a bit more flexible, i could use it
instead.

> Other tools based on the same components (same pieces,
> different glue) can be created to generate E-Mail, to do
> gopher, to create XML->LDAP servers, or whatever we
> want, but those SHOULD NOT USE the same glue used for
> the Web. The WEB-GLUE is cocoon...

disagree. the sitemap and the http interface are web glue.
the rest of cocoon is not. producers, processors,
formatters, serializers, those do not have to be web
specific in any way.

Re: servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

brian moseley wrote:
> 
> you are unnecessarily limiting your universe. cocoon needs
> to be embeddable. the xsp and xsl compilers/cachers need to
> be able to work fully offline, not just the processors and
> serializers (or whatever the new names are). i have not seen
> an adequate argument to the contrary.

I perfectly agree how XSLT can work off line, it's just a transformation
language... But, regarding XSP, I don't think this is a possibility...
It's like trying to run a CGI from the command line...
Hey... Let's remember one thing... some components ARE reusable
(serializers: FOP, XML/HTML printers, NRG engine... and filters like
XSLT) some are tied to HTTP (XSP, things that rely on POST)...

I'm not saying that the whole universe NEEDS to be tied to HTTP, but I
say that in the context of Cocoon, it would be overwhelming to allow it
to deal with things like EMails, and that it would be better if we
restrict it to Web Sites.

Other tools based on the same components (same pieces, different glue)
can be created to generate E-Mail, to do gopher, to create XML->LDAP
servers, or whatever we want, but those SHOULD NOT USE the same glue
used for the Web. The WEB-GLUE is cocoon...

	Pier


servlet or no? (was Re: [Moving on] SAX vs. DOM part II)

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Pierpaolo Fumagalli wrote:

> brian moseley wrote:
> > 
> > On Mon, 24 Jan 2000, Niclas Hedhman wrote:
> > 
> > > And having Cocoon as a pure engine, it will make it more
> > > appealing to drop it into larger application frameworks
> > > as a standard component. That is a point to consider for
> > > wider acceptance.
> > 
> > stefano seems to hold strong to the belief that cocoon is
> > and forever shall be Just a Servlet :) at least i recall a
> > very passionate message in that regard just recently.
> 
> I would add, it's a servlet, and an application that
> generates servlet-like requests for creating off-line
> browseable web sites (a servlet with an integrated
> mirroring tool, per se!)

you are unnecessarily limiting your universe. cocoon needs
to be embeddable. the xsp and xsl compilers/cachers need to
be able to work fully offline, not just the processors and
serializers (or whatever the new names are). i have not seen
an adequate argument to the contrary.

Re: [Moving on] SAX vs. DOM part II

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

brian moseley wrote:
> 
> On Mon, 24 Jan 2000, Niclas Hedhman wrote:
> 
> > And having Cocoon as a pure engine, it will make it more
> > appealing to drop it into larger application frameworks
> > as a standard component. That is a point to consider for
> > wider acceptance.
> 
> stefano seems to hold strong to the belief that cocoon is
> and forever shall be Just a Servlet :) at least i recall a
> very passionate message in that regard just recently.

I would add, it's a servlet, and an application that generates
servlet-like requests for creating off-line browseable web sites (a
servlet with an integrated mirroring tool, per se!)

	Pier


Re: [Moving on] SAX vs. DOM part II

Posted by Stefano Mazzocchi <st...@apache.org>.

brian moseley wrote:
> 
> On Mon, 24 Jan 2000, Niclas Hedhman wrote:
> 
> > And having Cocoon as a pure engine, it will make it more
> > appealing to drop it into larger application frameworks
> > as a standard component. That is a point to consider for
> > wider acceptance.
> 
> stefano seems to hold strong to the belief that cocoon is
> and forever shall be Just a Servlet :) at least i recall a
> very passionate message in that regard just recently.

No, I think Cocoon is a framework with a servlet entry point. I see no
reason to remove the servlet entry point even in further evolutions of
the platform, but I do see evolutionary paths.

Anyway, the day that people will unsubscribe from this list because of
my presence, I'll do something else. I'm not Cocoon: all of us,
together, are.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Come to the first official Apache Software Foundation Conference!  
------------------------- http://ApacheCon.Com ---------------------

Re: [Moving on] SAX vs. DOM part II

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, brian moseley wrote:

> On Mon, 24 Jan 2000, Niclas Hedhman wrote:
> 
> > And having Cocoon as a pure engine, it will make it more
> > appealing to drop it into larger application frameworks
> > as a standard component. That is a point to consider for
> > wider acceptance.
> 
> stefano seems to hold strong to the belief that cocoon is
> and forever shall be Just a Servlet :) at least i recall a
> very passionate message in that regard just recently.

not that i agree :)

Re: [Moving on] SAX vs. DOM part II

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Niclas Hedhman wrote:

> And having Cocoon as a pure engine, it will make it more
> appealing to drop it into larger application frameworks
> as a standard component. That is a point to consider for
> wider acceptance.

stefano seems to hold strong to the belief that cocoon is
and forever shall be Just a Servlet :) at least i recall a
very passionate message in that regard just recently.

Re: [Moving on] SAX vs. DOM part II

Posted by Niclas Hedhman <ni...@localbar.com>.

Pierpaolo Fumagalli wrote:

> Ricardo Rocha wrote:
> >
> > I completely agree with Pier Paolo.
> > [...]
> > That said, there's a lot in Niclas' proposal that makes sense and is
> > consistent with preserving Cocoon's identity while providing stronger
> > support for application development.
>
> Uh.. Totally... I wanted to say that many of the components used in
> Cocoon, can be reused also in another tool dealing with XML translation
> triggered by EMail events...

Maybe I forgot to distill the essence of my concerns...

If Cocoon is a pure Engine (my main proposal), not a servlet per se, then all
Requestors and Responsors are "outside" the Cocoon core.

Ok, drop in the Servlet Request/Response pair as a standard wrapper, but it
will allow me to do my dedicated Custom stuff a lot easier. Now, the reason why
the issue is raised, is that laziness seems to proliferate in OpenSource
projects, and if the Servlet model is "hard-assumed" and the whole servlet
context is passed and must be present, it will limit all other uses, including
command-line (off-line) generation, or Emails in my case (Alarm events
(application) trigger the generation of a bunch of status pages to be sent by
email).

My main point here is the introduction of a RequestFragment, which is a
normalization of the Request to a standard that all can agree on. I propose
using an XML data structure, but may very well settle for a Java interface
instead, if performance suffers to a significant degree (which I argue it won't
because the XML RequestFragment should/can be cached.)
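If the Java-interface variant of the RequestFragment were chosen, it might look something like the sketch below. None of these names exist in Cocoon; `RequestFragment`, `SimpleRequestFragment`, the `origin`/`target` fields and the `key=value` command-line convention are all assumptions made up for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical normalized request, independent of where it came from. */
interface RequestFragment {
    String origin();                    // e.g. "servlet", "command-line", "email"
    String target();                    // the resource being asked for
    Map<String, String> headers();      // HTTP or mail headers, if any
    Map<String, String> parameters();   // query or command-line parameters
}

/** Immutable implementation plus one example adapter. */
final class SimpleRequestFragment implements RequestFragment {
    private final String origin;
    private final String target;
    private final Map<String, String> headers;
    private final Map<String, String> parameters;

    SimpleRequestFragment(String origin, String target,
                          Map<String, String> headers, Map<String, String> parameters) {
        this.origin = origin;
        this.target = target;
        // Defensive copies so the fragment can be cached safely.
        this.headers = Collections.unmodifiableMap(new LinkedHashMap<>(headers));
        this.parameters = Collections.unmodifiableMap(new LinkedHashMap<>(parameters));
    }

    public String origin() { return origin; }
    public String target() { return target; }
    public Map<String, String> headers() { return headers; }
    public Map<String, String> parameters() { return parameters; }

    /** Adapter for command-line use: a target followed by key=value pairs. */
    static RequestFragment fromCommandLine(String... args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("expected: <target> [key=value ...]");
        }
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 1; i < args.length; i++) {
            int eq = args[i].indexOf('=');
            if (eq > 0) {
                params.put(args[i].substring(0, eq), args[i].substring(eq + 1));
            } else {
                params.put(args[i], ""); // bare flag without a value
            }
        }
        Map<String, String> noHeaders = Collections.emptyMap();
        return new SimpleRequestFragment("command-line", args[0], noHeaders, params);
    }
}
```

A servlet or email requestor would each get a similar adapter, so the engine itself only ever sees a `RequestFragment`.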

And having Cocoon as a pure engine, it will make it more appealing to drop it
into larger application frameworks as a standard component. That is a point to
consider for wider acceptance.

Niclas

Re: [Moving on] SAX vs. DOM part II

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Ricardo Rocha wrote:
> 
> I completely agree with Pier Paolo.
> [...]
> That said, there's a lot in Niclas' proposal that makes sense and is
> consistent with preserving Cocoon's identity while providing stronger
> support for application development.

Uh.. Totally... I wanted to say that many of the components used in
Cocoon, can be reused also in another tool dealing with XML translation
triggered by EMail events...

> A point where I tend to disagree with Niclas (unless I have missed
> his point) is in the convenience of applying multiple successive XSLT
> transformations:
> 
> > I find it a lot easier to make small XSLT sheets that each do a single
> > simple thing, than trying to incorporate all transformations in a single
> > sheet. It also makes stylesheet re-use and establishment of XSL
> > libraries easier. Therefore I have the multiple stage XSL
> > transformations as a basic feature.

I tend to agree with this. Look, for example, at the XML.APACHE.ORG
website. We have multiple DTDs (FAQ, changes, documents...); all of those
are translated into one (the document one) and then styled into HTML.
When I made the XMAS stylesheet (the one for Christmas) I just had to
change the last style, while all the others (from FAQ to DOC, and from
CHANGES to DOC) didn't change.
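The two-stage arrangement described here (specific DTD to a common document format, then document to HTML) can be illustrated with the standard JAXP transformation API. This is a minimal sketch: the `faq`, `entry`, `document` and `section` element names are invented for the example and are not the actual xml.apache.org DTDs, and `TwoStageStyling` is a hypothetical class name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical two-stage pipeline: FAQ markup -> generic document -> HTML. */
final class TwoStageStyling {

    // Stage 1: map the (made-up) FAQ vocabulary onto a (made-up) document vocabulary.
    static final String FAQ_TO_DOC =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/faq'>"
      + "<document><xsl:apply-templates select='entry'/></document>"
      + "</xsl:template>"
      + "<xsl:template match='entry'>"
      + "<section title='{question}'><xsl:value-of select='answer'/></section>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Stage 2: style the document vocabulary as HTML. Only this sheet would
    // need to change for a seasonal look.
    static final String DOC_TO_HTML =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/document'>"
      + "<html><body><xsl:apply-templates select='section'/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='section'>"
      + "<h2><xsl:value-of select='@title'/></h2><p><xsl:value-of select='.'/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies one stylesheet to one input document, both given as strings. */
    static String apply(String stylesheet, String input) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(input)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Runs both stages in sequence. */
    static String faqToHtml(String faqXml) {
        return apply(DOC_TO_HTML, apply(FAQ_TO_DOC, faqXml));
    }
}
```

Adding a CHANGES-to-DOC sheet would reuse `DOC_TO_HTML` unchanged, which is the reuse argument made above; the cost is one extra parse and serialization per stage in this string-based version.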

Of course, anyway, with stylebook and xml.apache.org I don't have
performance problems :)

	Pier


Re: [Moving on] SAX vs. DOM part II

Posted by Stefano Mazzocchi <st...@apache.org>.

brian moseley wrote:

[skipped very good comments]

> in my experience, it's best to not perform this construction
> at request time because of the often large number of
> elements in each artifact that must be assembled into the
> whole. i find that an offline compilation process, combined
> with an application that can sense and reload cached
> stylesheets and compiled classes which have been modified,
> allows the best performance, and the least opportunity for
> different instances of an application to become out of sync.
> it also allows construction to happen at a central point on
> the network, as a scheduled job or triggered manually, after
> which the compiled assets can be distributed to production
> hosts. this type of control over the update of cobranding
> assets is essential for high paying and very picky
> customers.

This is an implementation detail. If the producer is hand written or
dynamically compiled, this doesn't change the design pattern used for
separation.

I agree, however, that Cocoon should provide a way to "compile" the
whole web app, much like Stylebook generates the web site.

This is far from impossible to do with the
technologies/implementations we already have (since both XSP and XSLT
compilation are available).

But I'd rather focus on finishing internal details before moving to
usability needs. 

Otherwise, the todo list will become scary and my laziness will grow
even worse :)


RE: [Moving on] SAX vs. DOM part II

Posted by brian moseley <ix...@maz.org>.

On Mon, 24 Jan 2000, Ricardo Rocha wrote:

> A related and very interesting direction Scott Boag has
> pointed out is that of stylesheet compilation: if we
> translate stylesheets into bytecodes the "prohibitive"
> cost I mention above would become less of a problem (I'd
> stick to stylesheet inclusion/importing, though!)

one of the most compelling uses of cocoon is in specifying
user interfaces that must scale to large numbers of
cobrands, locales, and output formats. consider an isp with
500 cobrands supporting html and wml output in 16 languages.
the number of combinations of these dimensions is
staggering.

in this scenario the sheer number of people involved in
producing an application is staggering. the various groups
involved include:

 * production staff concentrating on the stylistic aspect of
cobranding - making images, creating html and wml
headers/footers/menus/what have you, interfacing with each
customers' designers

 * a localization team providing linguistic qa, interfacing
with translation outsourcing partner, collaborating with ui
designers on issues such as word order flexibility, etc

 * ui engineers creating the xsl templates that will be used
to generate the html and wml for each cobrand

 * application engineers coding the logicsheets that provide
custom behavior for each cobrand that overrides and extends
the application's custom behavior

each of these groups generates a distinct set of artifacts
from which combinations must be selected at runtime
according to the dimensions of a particular request. i
expect that the source form of each of these artifacts will
be xsl stylesheets and xsp logicsheets, and that the
mechanism used to combine stylesheets will be
including/importing "stylesheet fragments" into a final
transient stylesheet which will then be cached. i may be
mistaken but xsp already compiles logicsheets into classes,
right?

in my experience, it's best to not perform this construction
at request time because of the often large number of
elements in each artifact that must be assembled into the
whole. i find that an offline compilation process, combined
with an application that can sense and reload cached
stylesheets and compiled classes which have been modified,
allows the best performance, and the least opportunity for
different instances of an application to become out of sync.
it also allows construction to happen at a central point on
the network, as a scheduled job or triggered manually, after
which the compiled assets can be distributed to production
hosts. this type of control over the update of cobranding
assets is essential for high paying and very picky
customers.
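The "compile once, reload when modified" idea can be sketched with the standard JAXP `Templates` object, which holds a compiled stylesheet that can be reused across requests. This is an illustration only: `CompiledStylesheetCache` and `StylesheetSource` are hypothetical names, and the modification stamp is assumed to come from wherever the stylesheets live (for example a file's last-modified time).

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical cache of compiled stylesheets, refreshed when the source changes. */
final class CompiledStylesheetCache {

    /** Assumed abstraction over the stylesheet's storage (file, cvs sandbox, ...). */
    interface StylesheetSource {
        String name();          // cache key
        long lastModified();    // changes whenever the stylesheet is edited
        String text();          // the stylesheet itself
    }

    private static final class Entry {
        final long stamp;
        final Templates templates;

        Entry(long stamp, Templates templates) {
            this.stamp = stamp;
            this.templates = templates;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final TransformerFactory factory = TransformerFactory.newInstance();

    /** Returns the compiled stylesheet, recompiling only if the stamp changed. */
    synchronized Templates get(StylesheetSource source) {
        long stamp = source.lastModified();
        Entry cached = entries.get(source.name());
        if (cached == null || cached.stamp != stamp) {
            cached = new Entry(stamp, compile(source.text()));
            entries.put(source.name(), cached);
        }
        return cached.templates;
    }

    private Templates compile(String text) {
        try {
            return factory.newTemplates(new StreamSource(new StringReader(text)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

At request time a caller would ask the returned `Templates` for a fresh `Transformer` via `newTransformer()`. An offline job could produce and distribute the stylesheets, with each host's cache picking up the change through the new stamp.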

RE: [Moving on] SAX vs. DOM part II

Posted by Ricardo Rocha <ri...@apache.org>.

Stefano Mazzocchi wrote:
> Ok, but what is this discussion about?

> Niclas Hedhman wrote:
> Exactly!!  The biggest 'problem' with Cocoon is that its purpose is not
> exactly specified.

Pier Paolo Fumagalli wrote:
> IMVHO the purpose of Cocoon is very well specified. Cocoon is an XML
> publishing tool, and by "publishing" I mean the act of creating
> web-sites (regardless of whether on-line, through HTTP, or off-line,
> shipped on a CD-ROM, and regardless of whether your browser is a cellphone
> or acrobat reader).
> What you're asking, to bring Cocoon into the EMail world, basically, is,
> IMVHO totally wrong. This idea wasn't born without deep research. We
> (Stefano and I) were the ones who proposed EMail Servlets, and understood
> how their model was wrong. We faced the same issue while designing
> JAMES, the Java Apache (E*) Mail Server.
> . . .
> Before going on, anyway, I would like to hear comments from all you
> others out there, because, if the majority of you agree that "this is
> the way to go", I will have to accept it.

I completely agree with Pier Paolo.

While Cocoon does have the potential to be used for a wide variety
of web-based applications, I think its web publishing nature may
become "polluted" by trying to make it fit too many, possibly
conflicting requirements.

That said, there's a lot in Niclas' proposal that makes sense and is
consistent with preserving Cocoon's identity while providing stronger
support for application development.

A point where I tend to disagree with Niclas (unless I have missed
his point) is in the convenience of applying multiple successive XSLT
transformations:

> I find it a lot easier to make small XSLT sheets that each do a single
> simple thing, than trying to incorporate all transformations in a single
> sheet. It also makes stylesheet re-use and establishment of XSL
> libraries easier. Therefore I have the multiple stage XSL
> transformations as a basic feature.

[Btw, this description fits the mechanism used for applying XSP libraries
and I find it acceptable in the context of (one-time) code generation]

When it comes to document processing, though, such an approach
can become prohibitively costly. We can always retain the reusability
and modularity of multiple, small stylesheets and yet achieve good
performance by means of stylesheet inclusion/importing.

A related and very interesting direction Scott Boag has pointed out
is that of stylesheet compilation: if we translate stylesheets into
bytecodes the "prohibitive" cost I mention above would become less
of a problem (I'd stick to stylesheet inclusion/importing, though!)

This last consideration also has important implications in the area
of XSLT-based dynamic XML generation: you always require at
least 2 passes for that. This is probably the only case where you'd
forcefully have 2 XSLT passes: one for dynamic XML generation
(using extended elements and functions) and one for presentation.

Re: [Moving on] SAX vs. DOM part II

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Niclas Hedhman wrote:
> 
> Stefano Mazzocchi wrote:
> 
> > Ok, but what is this discussion about?
> 
> Exactly!!  The biggest 'problem' with Cocoon is that its purpose is not
> exactly specified.
> [...]

IMVHO the purpose of Cocoon is very well specified. Cocoon is an XML
publishing tool, and by "publishing" I mean the act of creating
web-sites (regardless of whether on-line, through HTTP, or off-line,
shipped on a CD-ROM, and regardless of whether your browser is a cellphone
or acrobat reader).

What you're asking, to bring Cocoon into the EMail world, basically, is,
IMVHO totally wrong. This idea wasn't born without deep research. We
(Stefano and I) were the ones who proposed EMail Servlets, and understood
how their model was wrong. We faced the same issue while designing
JAMES, the Java Apache (E*) Mail Server.

I learnt with time that reconciling different models within one single
generic case not only complicates the model itself, but also leaves you
with less power and control in each specific case. And reconciling two
things as different as HTTP and SMTP is, again IMVHO, not a good
idea.

If you need to get something like Cocoon dealing w/ EMails, that's where
JAMES comes to play, because with its set of APIs you will be able to
design a cocoon-like tool, but dealing exactly with SMTP and EMAIL, and
allowing you to have more control on this process.

Doing that, anyway, doesn't prevent you from reusing components from the
Cocoon mainstream, like XSLT, the SQLProcessor, or all the others.

Before going on, anyway, I would like to hear comments from all you
others out there, because, if the majority of you agree that "this is
the way to go", I will have to accept it.

	Pier (the limited :) guy!) 


Cocoon Architecture [was Re: [Moving on] SAX vs. DOM part II]

Posted by Stefano Mazzocchi <st...@apache.org>.

[changed the subject and removed the listed people since this is another
discussion]

Niclas Hedhman wrote:
> 
> Stefano Mazzocchi wrote:
> 
> > Ok, but what is this discussion about?
> 
> Exactly!!  The biggest 'problem' with Cocoon is that its purpose is not
> exactly specified.

All right. No matter how hard you try to specify the limits of a particular
discussion, there will always be somebody who breaks them... :) oh
well...

The very first line of the documentation says:

 "Cocoon is a 100% pure Java publishing framework that relies 
  on new W3C technologies (such as DOM, XML, and XSL) to provide 
  web content."

I read:

1) it's java based
2) it's web focused
3) it's publishing oriented
4) it works with new technologies

I think there is no problem in specifying what the purpose of this
project is. Problems arise when it doesn't cover your purposes :) but
that's a totally different story, don't you think?

> When the purpose is defined, we need to list the pro's and con's of each
> task/subpurpose, and how this can be achieved. There will also need to be an
> "expansion plan" on how to extend the underlying architecture (not talking
> about more producers and all that) without compromising compatibility. Up
> until now, this has not been taken very seriously, but as more producers
> and formatters enter, the overall cost of re-writes becomes staggering.

Hmmm, considering this project is my college thesis and will probably
be my job for the next who knows how many years.... I personally don't
think I've failed to take the architecture of this software into serious
consideration.

Rather the contrary: it may not cover all possible aspects and may have
holes and things to improve... I've never said the opposite... but lack
of seriousness.... well, no, I don't buy that.

> Cocoon is said to be a Web Publishing Framework, but it COULD be a lot more
> than that. I strongly feel that it will ONLY be that, cause the major force
> of Cocoon (read Stefano) is so much into just that. 

Well, I proposed a project to do web publishing using the XML model and
I'm continuing in that direction since this is my current itch. It
strangely appears to be a good idea and a useful software, not only for
myself.

I'm currently focused on finishing what I started before covering new
grounds, but I honestly don't know how this will end up. One day any of
you could take my seat on this project, when I'll get tired of it.

> The requirements in
> other areas are side-stepped either because of performance concerns,
> implementation laziness or ignorance to the issues involved. (This is the
> reason why I have laid back in the Cocoon dev for the last 6 months.)

Uh, this tone is the perfect one to get flamed for in other mail
lists.... On one thing you're right: I'm lazy.

> I am just one representative of an "alternative market".  Our needs are not
> much of high volume, nor large document, static serving. We are in need of
> flexibility, dynamic creation, and most importantly;  requests and
> responses are not only HTTP, but also Email.  The SAD thing is that Cocoon
> is so close to what we need, yet too far away in many respects so that our
> alternative to hardwire our needs is more convenient than adopting Cocoon.

Niclas, we already had this discussion and the terms didn't change that
much: I'm all in favor of creating a wider and more useful architecture,
but I don't have that itch to scratch and you seem the only one who
does.

Your requestor/responsor proposal was a good one, but it would transform
Cocoon into a servlet engine and this is not the right thing for this
project at this moment.

If we make Cocoon a servlet engine, or even worse, a web server or email
server, we lose the juice of it. I do believe in code reuse and
frameworks because they allow us to focus on what we need.

Cocoon is currently a servlet because this is the best way to be
portable without writing tons of code. True, this is a compromise, but a
good one since we are allowed to propose changes to the Servlet API
(I'll be an invited expert for the next round of the Servlet API
specification)

So, covering new ground must not reduce the ground already covered. This
includes porting Cocoon in other languages (a PERL port of Cocoon was
recently proposed to me), or even going native inside Apache (something
we discussed with the PIA guys).

Of course, Cocoon's power lies in its design patterns, not the
implementation. On the other hand, those patterns are rather new and we
need better field knowledge to go forward, and this takes time and
skills. Skills I admittedly do not have in many fields.

On the other hand, I think Cocoon was very successful in providing
useful technologies for real needs. This will not stop today, that's for
sure.

> And I think many are in the same position.

I think you are right in seeing powerful design patterns in Cocoon that
could be applied to other realms. But one thing is to reuse patterns,
another thing is to reuse software.

I placed a significant amount of work into the Cocoon architecture and
in prototyping some design patterns for response production on a servlet
based architecture.

Like Pier noted, the two of us were the first to propose an extension to
the Servlet API that worked for other request/response protocols as
well. This work is currently being implemented in the JAMES project by
Federico Barbieri, who is going to be hired by Exoffice later next
months to work on Avalon and JAMES and integration with other Apache and
Exolab software.

Also note that Pier and Ricardo and Assaf and Keith are working for
Exoffice.

And given that Ismael (Exoffice CEO) uses nothing but Cocoon, you
can clearly see a strong requirement for integration with other technologies,
email up front.

> I still strongly believe in the
> Request -> Production -> Content Alignment -> Presentation Alignment
> -> Presentation Formatting ->  Response
> 
> Request:  The incoming request from any of many different kind of sources.
>     Servlet
>     Command line
>     Email
>     Custom
> 
> Production:  The generation of the RAW content, not necessarily suitable
> for presentation.
>     File
>     XSP generated
>     JDBC
>     Custom
> 
> Content Alignment:  The filtering, sorting and other rearrangements of
> content prior to delivery to the Presentation context.
>     XSLT (multiple stage, often none)
>     XSP generated XSLT (?)
>     Custom
> 
> Presentation Alignment: Rearrangement of Content to be more suitable for
> the Presentation at hand.
>     XSLT (multiple stage, often none)
>     XSP generated XSLT (?)
>     Custom
> 
> Presentation Formatting:  Application of Style to the Content.
>     XSL FO
>     XSLT to HTML
>     XSLT to CSS
>     Browser formatting with external XSL FO or XSLT
>     FOP
>     Text
>     WML
>     Custom
> 
> Responsors
>     HTTP
>     EMail
>     File / StdOut
>     Custom

What you outline is something like this

 requestor -> cocoon engine -> responsor

Now, if you look into Cocoon 1.6, this is _exactly_ how Cocoon is
implemented. But only three requestors are available:

1) servlet 
2) command line
3) custom (used for Turbine integration and inter-java call)

I'll be ready to add "4) email" when Federico comes up with JAMES 1.0
and full MailServlet support. I assume this will take another month or
two.

Note that this requires you to create your own HttpServletRequest/Response,
which is dirty and I don't like it myself, but a custom
Request/Response object was just too similar to what we need. But, as
always, I'm wide open to suggestions on how to improve this.

> The key to this model is that the interfaces between each stage must be
> abstracted, yet contain the HTTP headers and parameters, EMail headers and
> command-line parameters, and other future request type information. As well
> as context based information.

Yes, this is why I started abstracting the Request/Response objects
and... failed miserably. :( The Servlet API is not multi-protocol; it's
heavily HTTP-based, even in its general form. Creating a new, truly
multi-protocol servlet API is a scary task for me and I don't have the
time/energy/need to do that at this moment. (But I never stopped others
from doing it!)

> My only solution to this is an XML fragment generated by the Requestor, and
> passed on to each stage, and possibly both modified and expanded. For
> instance, I think it would be a good idea to let the components record
> their activities into such a fragment in debug mode, and a formatter and
> responsor is used to send the debugging result somewhere.

I don't like this, and not only for performance reasons: good OOP requires
abstractions in OO code to be designed as objects, not DTDs. You should
access your request data through methods, not XPath queries. You gain
nothing by using XML at this stage; plus, designing a DTD is not easier
than designing an API.

My performance concerns are still there, but I would throw them away if
you could prove the absolute need for such a thing against an API-like
design. On the other hand, speaking as a component writer, an API would
be much easier to learn and use. This is, in fact, the exact same
pattern used in the creation of the Servlet API for HTTP: instead of
parsing the HTTP headers and doing all that directly, provide some good
methods to hide all the protocol details.

Unfortunately, they failed to provide a true protocol-abstracted API and
Cocoon suffers from this. :( I'm totally aware of that.
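
To make the API-versus-XML argument concrete, here is a minimal sketch of
what a protocol-neutral request abstraction could look like. Everything in
it (the Request interface, SimpleRequest, the factory method names and the
conventions they assume) is hypothetical and is not part of Cocoon or the
Servlet API; it only illustrates components reading request data through
methods rather than XPath queries.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical protocol-neutral view of a request (not a Cocoon or Servlet API type).
interface Request {
    String protocol();              // e.g. "cli" or "email"
    String target();                // the resource being asked for
    String parameter(String name);  // null when absent
}

final class SimpleRequest implements Request {
    private final String protocol;
    private final String target;
    // Case-insensitive keys, since mail-style header names ignore case.
    private final Map<String, String> params = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    private SimpleRequest(String protocol, String target, Map<String, String> params) {
        this.protocol = protocol;
        this.target = target;
        this.params.putAll(params);
    }

    // Assumed command-line convention: the first argument is the target,
    // the remaining arguments are name=value pairs.
    static Request fromCommandLine(String... args) {
        Map<String, String> p = new TreeMap<>();
        for (int i = 1; i < args.length; i++) {
            int eq = args[i].indexOf('=');
            if (eq > 0) {
                p.put(args[i].substring(0, eq), args[i].substring(eq + 1));
            }
        }
        return new SimpleRequest("cli", args.length > 0 ? args[0] : "", p);
    }

    // Assumed email convention: the Subject header names the target.
    static Request fromEmail(Map<String, String> headers) {
        String subject = headers.getOrDefault("Subject", "");
        return new SimpleRequest("email", subject, headers);
    }

    @Override public String protocol() { return protocol; }
    @Override public String target() { return target; }
    @Override public String parameter(String name) { return params.get(name); }
}
```

A component written against Request never needs to know whether the data
came from a command line or a mail header, which is the property the Servlet
API lacks for anything that is not HTTP.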

> I would also like to point out, that when XSLT is used for generating
> well-formed HTML or creating FOs,  it is a formatter, belong to the
> Presentation layer and not the Content Layer. And this is important from an
> educational point of view. I.e  document transformation is BOTH a Content
> context, for filtering, sorting and other content based re-arrangements,
> and a Presentation context, for formatting. That is hardly ever
> distinguished in the discussion.

Good point. I'll be more than happy to accept documentation for the
Cocoon project that explains your points better or suggestions on how to
improve the docs based on your ideas. Niclas, I'm sorry you feel left
out of the fun around here, and I apologize in advance for anything that
I might have done to make you walk away from this project.

I'm open to suggestions and ready to reconsider my decisions, but
remaining silent is not the best way to interact...

> Another thing I have noticed....  I find a lot easier to make small XSLT
> sheets that each do a single simple thing, than trying to incorporate all
> transformations in a single sheet. It also makes stylesheet re-use and
> establishment of XSL libraries easier. Therefore I have the multiple stage
> XSL transformations as a basic feature.

The Cocoon architecture is totally orthogonal to the number of
transformations involved. I do not have enough field knowledge to say
which one is the best.

> What are different people working with?
> I can see 5 kinds of people working on large projects;
> 
> SiteManagement
>     Requestor
>     Responsor
>     GenerationPath (SiteMap)
> 
> Content Logic
>     Producers
>     Transformation
> 
> Presentation Logic
>     Transformation
>     Formatters
> 
> Content
>     Static XML
>     XSP taglibs
>     Custom
> 
> Presentation
>     Static XSL
>     XSP taglibs (?)
>     Custom

Good analysis.

> To me, this is a clean analysis of the needs/purpose (or potential purpose)
> of Cocoon, and I feel a satisfactory design can easily be achieved, that
> satisfies the needs of more people than the current proof of concept.

As far as I'm concerned, you'll always find me here listening for
suggestions.

And with the XSP implementation and many other fixes here and there, I
do not think Cocoon is a proof of concept anymore, but a real and useful
software. True, key issues like the sitemap need to be addressed, but
this is already planned and being worked on.

> As for the less important, but highly discussed, issue of DOM versus SAX.
> Wouldn't it be proper to have a DOMbuilder as a utility class, in such a
> way that whoever needs a DOM or DOM fragment, will listen to the SAX
> events, and when the start of the fragment is trapped, the SAXstream is
> "handed over" to the DOMbuilder, which will return with the DOM fragment at
> the closing tag, and the original SAX receiver continues listening. Call it
> DOMonDemand  :o)  I would think that satisfies just about everyone. It will
> also show better overall performance only to DOMize the part that a
> component is interested in, instead of the SAX2DOM streams/pipes that has
> been proposed earlier.

I'll reply to the real discussion part on another mail.

> Stefano, thanks for inviting to an open challenge of the current "proof of
> concept".

You're welcome. I'd love to do this once every three months or so, just
to see if new people coming in have better ideas or new features to
propose.

I do not think that walking away is the best way to make a project suit
your needs, don't you agree with me? :)

> The one who can't refrain from speaking...
> Niclas Hedhman

Never thought this was a problem. On the contrary: keep it up! :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Come to the first official Apache Software Foundation Conference!  
------------------------- http://ApacheCon.Com ---------------------

Re: [Moving on] SAX vs. DOM part II

Posted by Niclas Hedhman <ni...@localbar.com>.

Stefano Mazzocchi wrote:

> Ok, but what is this discussion about?

Exactly!!  The biggest 'problem' with Cocoon is that its purpose is not
exactly specified.
When the purpose is defined, we need to list the pros and cons of each
task/subpurpose, and how this can be achieved. We will also need an
"expansion plan" on how to extend the underlying architecture (not talking
about more producers and all that) without compromising compatibility. Up
until now, this has not been taken very seriously, but as more producers
and formatters enter, the overall cost of rewrites becomes staggering.

Cocoon is said to be a Web Publishing Framework, but it COULD be a lot more
than that. I strongly feel that it will ONLY be that, because the major force
behind Cocoon (read Stefano) is so much into just that. The requirements in
other areas are side-stepped either because of performance concerns,
implementation laziness or ignorance of the issues involved. (This is the
reason why I have laid back in the Cocoon dev for the last 6 months.)

I am just one representative of an "alternative market".  Our needs are not
so much high-volume, large-document, static serving. We are in need of
flexibility, dynamic creation, and most importantly, requests and
responses that are not only HTTP, but also Email.  The SAD thing is that Cocoon
is so close to what we need, yet too far away in many respects, so that our
alternative of hardwiring our needs is more convenient than adopting Cocoon.
And I think many are in the same position.

I still strongly believe in the
Request -> Production -> Content Alignment -> Presentation Alignment
-> Presentation Formatting ->  Response

Request:  The incoming request from any of many different kind of sources.
    Servlet
    Command line
    Email
    Custom

Production:  The generation of the RAW content, not necessarily suitable
for presentation.
    File
    XSP generated
    JDBC
    Custom

Content Alignment:  The filtering, sorting and other rearrangements of
content prior to delivery to the Presentation context.
    XSLT (multiple stage, often none)
    XSP generated XSLT (?)
    Custom

Presentation Alignment: Rearrangement of Content to be more suitable for
the Presentation at hand.
    XSLT (multiple stage, often none)
    XSP generated XSLT (?)
    Custom

Presentation Formatting:  Application of Style to the Content.
    XSL FO
    XSLT to HTML
    XSLT to CSS
    Browser formatting with external XSL FO or XSLT
    FOP
    Text
    WML
    Custom

Responsors
    HTTP
    EMail
    File / StdOut
    Custom


The key to this model is that the interfaces between each stage must be
abstracted, yet contain the HTTP headers and parameters, EMail headers and
command-line parameters, and other future request type information. As well
as context based information.
My only solution to this is an XML fragment generated by the Requestor and
passed on to each stage, where it may be both modified and expanded. For
instance, I think it would be a good idea to let the components record
their activities into such a fragment in debug mode, with a formatter and
responsor used to send the debugging result somewhere.

I would also like to point out that when XSLT is used for generating
well-formed HTML or creating FOs, it is a formatter, belonging to the
Presentation layer and not the Content layer. And this is important from an
educational point of view. I.e. document transformation is BOTH a Content
context, for filtering, sorting and other content-based rearrangements,
and a Presentation context, for formatting. That is hardly ever
distinguished in the discussion.

Another thing I have noticed....  I find it a lot easier to make small XSLT
sheets that each do a single simple thing than to try to incorporate all
transformations in a single sheet. It also makes stylesheet re-use and
establishment of XSL libraries easier. Therefore I have the multiple stage
XSL transformations as a basic feature.

What are different people working with?
I can see 5 kinds of people working on large projects;

SiteManagement
    Requestor
    Responsor
    GenerationPath (SiteMap)

Content Logic
    Producers
    Transformation

Presentation Logic
    Transformation
    Formatters

Content
    Static XML
    XSP taglibs
    Custom

Presentation
    Static XSL
    XSP taglibs (?)
    Custom


To me, this is a clean analysis of the needs/purpose (or potential purpose)
of Cocoon, and I feel a satisfactory design can easily be achieved, that
satisfies the needs of more people than the current proof of concept.

As for the less important, but highly discussed, issue of DOM versus SAX:
wouldn't it be proper to have a DOMbuilder as a utility class, in such a
way that whoever needs a DOM or DOM fragment will listen to the SAX
events, and when the start of the fragment is trapped, the SAX stream is
"handed over" to the DOMbuilder, which will return with the DOM fragment at
the closing tag, and the original SAX receiver continues listening? Call it
DOMonDemand  :o)  I would think that satisfies just about everyone. It will
also show better overall performance to DOMize only the part that a
component is interested in, instead of the SAX2DOM streams/pipes that have
been proposed earlier.


Stefano, thanks for the invitation to openly challenge the current "proof of
concept".


The one who can't refrain from speaking...
Niclas Hedhman

Re: [Moving on] SAX vs. DOM part II

Posted by Mike Williams <mi...@o3.co.uk>.

  >>> On Mon, 24 Jan 2000 10:38:14 -0800,
  >>> "Pier" == Pierpaolo Fumagalli <pi...@apache.org> wrote:

  Pier> And anyway, even if two stacked processors create a full DOM tree
  Pier> of their input and their output, they will end up building two
  Pier> instances of the two DOM documents. Probably it could take slightly
  Pier> more time to build the DOM from SAX events, but, even in the worst
  Pier> case, the memory is the same.

Isn't it slightly worse than that?  It would be like moving from

  Cocoon1:
    DOM1 -> (ProcA) -> DOM2 -> (ProcB) -> DOM3

to

  Cocoon2(SAX):
    SAX -> (DOM1-> ProcX ->DOM2) -> SAX -> (DOM3-> ProcY ->DOM4) -> SAX

in that the DOM output of ProcA would have to be effectively cloned via
SAX, rather than just passed across.

Still, I think this provides a better separation between the two
processing steps.  I can imagine ProcA wanting to somehow cache its
output, but ProcB would potentially be able to mess with DOM2.  Using SAX
sidesteps that issue, as the communication between processors is a
read-only event stream.
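
The isolation argument can be illustrated with a small sketch. It assumes
the JAXP transformation API (javax.xml.transform) that ships with current
JDKs, which postdates this thread and is not the SaxToDom/DomToSax code
being discussed; the SaxBoundary class and its method names are made up for
the example. The DOM produced by one processor is replayed as SAX events and
rebuilt on the other side, so the consumer gets its own tree and cannot
touch the producer's cached copy.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SaxBoundary {
    // Replays a DOM tree as SAX events and rebuilds an independent DOM from them.
    static Document copyViaSax(Document in) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            // Identity handler: a ContentHandler that writes what it receives to a Result.
            TransformerHandler saxToDom = factory.newTransformerHandler();
            DOMResult out = new DOMResult();
            saxToDom.setResult(out);
            // Identity transformer: walks the DOM and fires SAX events at the handler.
            factory.newTransformer().transform(new DOMSource(in), new SAXResult(saxToDom));
            return (Document) out.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("DOM -> SAX -> DOM copy failed", e);
        }
    }

    // Builds a one-element document standing in for ProcA's cached output.
    static Document sample() {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = d.createElement("page");
            root.setAttribute("cached", "true");
            d.appendChild(root);
            return d;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The cost is the extra tree described above: both documents exist in memory
at once, which is the price paid for the separation.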

-- 
Mike Williams

Re: [Moving on] SAX vs. DOM part II

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Mike Williams wrote:
> 
>   >>> On Sun, 23 Jan 2000 16:03:55 -0800,
>   >>> "Pier" == Pierpaolo Fumagalli <pi...@apache.org> wrote:
> 
>   Pier> I still think it's SAX as a transport, and hooks should be given to
>   Pier> COCOON components to translate a set of sax events in DOM trees
>   Pier> (because they're easier to manage).
> 
> This would certainly make it possible to create very lightweight processing
> layers, and reduce the latency in the processing stream (as well as memory
> usage, etc.)
> 
> As you say, complex processors might end up building a DOM from the SAX
> input, 'cos it's easier to deal with.  The worst case scenario is when two
> processors that use DOM internally are stacked together: the first ends up
> unwinding its output DOM into SAX events, which the second uses to
> re-create a DOM.  I wonder how much overhead there actually is in doing
> this ... it's not going to be any worse than "Node.cloneNode(true)", is it?


Exactly... I am already doing this for the NRG rasterizer (converting
XML to images) and it consumes less memory, and takes less time, than
having the whole DOM parsed and built.

And anyway, even if two stacked processors create a full DOM tree of
their input and their output, they will end up building two instances of
the two DOM documents. Probably it could take slightly more time to
build the DOM from SAX events, but, even in the worst case, the memory
is the same.

	Pier

-- 
--------------------------------------------------------------------
-          P              I              E              R          -
stable structure erected over water to allow the docking of seacraft
<ma...@betaversion.org>    <http://www.betaversion.org/~pier/>
--------------------------------------------------------------------
- ApacheCON Y2K: Come to the official Apache developers conference -
-------------------- <http://www.apachecon.com> --------------------

Re: [Moving on] SAX vs. DOM part II

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Paul Russell wrote:
> 
> > As you say, complex processors might end up building a DOM from the SAX
> > input, 'cos it's easier to deal with.  The worst case scenario is when two
> > processors that use DOM internally are stacked together: the first ends up
> > unwinding its output DOM into SAX events, which the second uses to
> > re-create a DOM.  I wonder how much overhead there actually is in doing
> > this ... it's not going to be any worse than "Node.cloneNode(true)", is it?
> 
> Wouldn't it make sense to create an 'adapter' that sits between each set of
> nodes and translates between the two models on an as needed basis - that
> way, only required translations would actually happen. Just a thought.

An AbstractDOMFilter... Yep... I am writing the SaxToDom and
DomToSax utilities right now, which can be used in those cases.

	Pier


Re: [Moving on] SAX vs. DOM part II

Posted by Paul Russell <Pa...@uea.ac.uk>.

> As you say, complex processors might end up building a DOM from the SAX
> input, 'cos it's easier to deal with.  The worst case scenario is when two
> processors that use DOM internally are stacked together: the first ends up
> unwinding its output DOM into SAX events, which the second uses to
> re-create a DOM.  I wonder how much overhead there actually is in doing
> this ... it's not going to be any worse than "Node.cloneNode(true)", is it?

Wouldn't it make sense to create an 'adapter' that sits between each set of
nodes and translates between the two models on an as needed basis - that
way, only required translations would actually happen. Just a thought.


Paul

Re: [Moving on] SAX vs. DOM part II

Posted by Mike Williams <mi...@o3.co.uk>.

  >>> On Sun, 23 Jan 2000 16:03:55 -0800,
  >>> "Pier" == Pierpaolo Fumagalli <pi...@apache.org> wrote:

  Pier> I still think it's SAX as a transport, and hooks should be given to
  Pier> COCOON components to translate a set of sax events in DOM trees
  Pier> (because they're easier to manage).

This would certainly make it possible to create very lightweight processing
layers, and reduce the latency in the processing stream (as well as memory
usage, etc.)  

As you say, complex processors might end up building a DOM from the SAX
input, 'cos it's easier to deal with.  The worst case scenario is when two
processors that use DOM internally are stacked together: the first ends up
unwinding its output DOM into SAX events, which the second uses to
re-create a DOM.  I wonder how much overhead there actually is in doing
this ... it's not going to be any worse than "Node.cloneNode(true)", is it?

-- 
Mike Williams

Re: [Moving on] SAX vs. DOM part II

Posted by Ben Laurie <be...@algroup.co.uk>.

Pierpaolo Fumagalli wrote:
> It's fairly trivial if you convert a whole pack of SAX events (from

That'll be a SAX-pack, then? (sorry, couldn't resist :-)

Cheers,

Ben.

--
SECURE HOSTING AT THE BUNKER! http://www.thebunker.net/hosting.htm

http://www.apache-ssl.org/ben.html

Y19100 no-prize winner!
http://www.ntk.net/index.cgi?back=2000/now0121.txt

Re: [Moving on] SAX vs. DOM part II

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Donald Ball wrote:
> 
> I concur. We need SAX for fast transport between layers and DOM access
> routines for use by layers if the layer prefers to work in DOM. How hard
> could this be? The algorithm seems to be pretty simple.

It's fairly trivial if you convert a whole pack of SAX events (from
startDocument() to endDocument()) into a FULL DOM tree. It gets a little
bit trickier (just a little bit), if you want to have only fragments of
DOM:

I have this document:

<image width="100" height="100" background="#000000">
  <layer method="multiply" opacity="100%">
    <text x="10" y="20" color="#ffffff">
      Print this!
    </text>
  </layer>
  <layer method="multiply" opacity="100%">
    <text x="11" y="21" color="#999999">
      Print this!
    </text>
  </layer>
</image>

I can imagine that the <image> and the <layer> stuff are handled
directly using SAX events, but I want to pass a DOM fragment (an Element
and all that is included into it) to the object handling the <text>
element...
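
A minimal sketch of that idea, using the SAX and DOM APIs bundled with
current JDKs. The FragmentHandler class, its fields and the parse helper are
hypothetical names for illustration and not the implementation headed for
CVS. Elements outside the chosen fragment are handled as plain SAX events
(here they are only recorded by name), while each <text> element and
everything inside it is collected into a DOM Element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FragmentHandler extends DefaultHandler {
    private final String fragmentName;
    private final Document owner;                         // factory for fragment nodes
    final List<Element> fragments = new ArrayList<>();    // completed DOM fragments
    final List<String> streamed = new ArrayList<>();      // elements seen as plain SAX
    private Node current;                                 // null while outside a fragment

    FragmentHandler(String fragmentName) {
        this.fragmentName = fragmentName;
        try {
            this.owner = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (current == null && !qName.equals(fragmentName)) {
            streamed.add(qName);                          // stay in streaming mode
            return;
        }
        Element e = owner.createElement(qName);           // start or extend a fragment
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        if (current != null) {
            current.appendChild(e);
        }
        current = e;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.appendChild(owner.createTextNode(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return;                                       // closing a streamed element
        }
        Node parent = current.getParentNode();
        if (parent == null) {
            fragments.add((Element) current);             // fragment root closed: hand it over
        }
        current = parent;
    }

    // Convenience helper: parse a string and collect every <text> fragment.
    static FragmentHandler parse(String xml) {
        FragmentHandler handler = new FragmentHandler("text");
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }
}
```

Only the two <text> subtrees are ever materialized; <image> and <layer>
pass through without any tree being built for them.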

I'm about to complete it, and the implementation will be in CVS
tomorrow...

> If a layer knows
> that it's going to be accessed by DOM, then it could just catch incoming
> SAX events and store them in a list. If and when DOM access occurred, you'd
> wait for all SAX events to come in and then either just scan through the
> list looking for nodes or turn the list into a tree. Events would be sent
> out either as incremental processing was done (for SAX processors), or the
> node tree would be serialized as SAX events once DOM processing was
> complete (unless you know the next layer wants DOM access, in which case
> you could maybe just pass on the DOM object).

Yep... The only thing is that if both processors (the idea was to call them
Filters, am I right?) are DOM-based, we'll end up building two DOM trees
for the same stuff. The biggest problem (right now) seems to be XSLT,
but I bet that Scott will come up with something nice...

> (sorry to be so pedantic, but I'm trying to convince myself as much as
> anyone else)

:)

> Although, you know, it's not DOM that my processors might want so much,
> anyway, but a _good_ tree-based API. Personally, I think DOM is a poorly
> designed API and that one could do much better (hell, we could even come
> up with the new de facto replacement for tree-based XML API).

I share your opinion on the DOM API; I wouldn't have done it that way, but
those are personal ideas. Anyway, I do understand that right now DOM is a
standard, and it's more or less supported everywhere. Designing another
would be overkill, and porting already built components
from DOM to "our nice dom" would be too big a task.
The best thing is "just stick with SAX" :)

	Pier


Re: [Moving on] SAX vs. DOM part II

Posted by Donald Ball <ba...@webslingerZ.com>.

On Sun, 23 Jan 2000, Pierpaolo Fumagalli wrote:

> > This discussion should be focused on answering this question:
> >  "what is the best architecture for Cocoon2?"
> 
> I still think it's SAX as a transport, and hooks should be given to
> COCOON components to translate a set of sax events in DOM trees (because
> they're easier to manage).
> 
> > in answering this question, we should consider both dynamic and static
> > operativity, performance, memory usage, scalability, availability of
> > implementations, degree of standardization, degree of usability, ease of
> > use, cost of operation and time to market of a possible solution.
> 
> As I said. Let's use SAX on the transport (between
> producer/filters/serializer), and each of these components is given hooks
> to translate a series of SAX events into DOM fragments.

I concur. We need SAX for fast transport between layers and DOM access
routines for use by layers if the layer prefers to work in DOM. How hard
could this be? The algorithm seems to be pretty simple. If a layer knows
that it's going to be accessed by DOM, then it could just catch incoming
SAX events and store them in a list. If and when DOM access occurred, you'd
wait for all SAX events to come in and then either just scan through the
list looking for nodes or turn the list into a tree. Events would be sent
out either as incremental processing was done (for SAX processors), or the
node tree would be serialized as SAX events once DOM processing was
complete (unless you know the next layer wants DOM access, in which case
you could maybe just pass on the DOM object).
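
The "store them in a list" step could be sketched roughly as follows. The
SaxRecorder class and its Event interface are hypothetical names, and only
element and character events are covered; a real layer would also have to
record namespaces, processing instructions and so on. Each incoming event is
captured as a small replayable object, so the list can later be scanned,
turned into a tree, or simply sent on to the next layer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class SaxRecorder extends DefaultHandler {
    // One recorded event that knows how to fire itself at another handler.
    interface Event {
        void replay(ContentHandler target) throws SAXException;
    }

    private final List<Event> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Copy the attributes: parsers may reuse the Attributes object between calls.
        Attributes copy = new AttributesImpl(atts);
        events.add(t -> t.startElement(uri, localName, qName, copy));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add(t -> t.endElement(uri, localName, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Copy the characters too: the parser's buffer is only valid during the call.
        char[] copy = Arrays.copyOfRange(ch, start, start + length);
        events.add(t -> t.characters(copy, 0, copy.length));
    }

    int size() {
        return events.size();
    }

    // Sends the stored events, wrapped in a document, to the next layer.
    void replay(ContentHandler target) throws SAXException {
        target.startDocument();
        for (Event e : events) {
            e.replay(target);
        }
        target.endDocument();
    }

    // Demo helper: replays the recording into a handler that prints tags and text.
    static String trace(SaxRecorder recorder) {
        StringBuilder sb = new StringBuilder();
        try {
            recorder.replay(new DefaultHandler() {
                @Override
                public void startElement(String u, String l, String q, Attributes a) {
                    sb.append('<').append(q).append('>');
                }
                @Override
                public void endElement(String u, String l, String q) {
                    sb.append("</").append(q).append('>');
                }
                @Override
                public void characters(char[] c, int s, int n) {
                    sb.append(c, s, n);
                }
            });
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }
}
```

A purely SAX-based layer would skip the recorder and forward events as they
arrive; the list only pays off when random access is actually requested.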

(sorry to be so pedantic, but I'm trying to convince myself as much as
anyone else)

Although, you know, it's not DOM that my processors might want so much,
anyway, but a _good_ tree-based API. Personally, I think DOM is a poorly
designed API and that one could do much better (hell, we could even come
up with the new de facto replacement for tree-based XML API).

- donald

Re: [Moving on] SAX vs. DOM part II

Posted by Pierpaolo Fumagalli <pi...@apache.org>.

Stefano Mazzocchi wrote:
> 
> [...]
> - Scott Boat

Scott "The Mississippi SteamBOAT" Boag :) hahahahaha :)
(Sorry Scott!)

> [...]
> - Pierpaolo Fumagalli, he'll play the static XML guru role.

"Guru wannabe" please :)

> [...]
> On the other hand, key issues about web operation (like content-length,
> expiration headers and such) or internal operation pose a great deal of
> problems when the DOM model is abandoned.

I don't see how using DOM solves the issue of the
content length.
The Content-Length must be written before the content is sent, and I
don't see how DOM can help in calculating it. I can see the memory
footprint of my in-memory DOM, but that's far from being the content
length.
The content length can be issued only when the processed document is
already in the cache (and so properly formatted), but not on the first
hit, in both cases: DOM or SAX.
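
A small sketch of why the length is only known after serialization,
assuming the JAXP serializer bundled with current JDKs (which postdates this
thread); ContentLengthDemo and its methods are made-up names. Whatever
produced the document, DOM or SAX, the bytes have to be buffered (or already
sit in the cache) before the header can be written.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ContentLengthDemo {
    // Serializes the whole document into memory; only then is its size known.
    static byte[] serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(buffer));
            return buffer.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The header value comes from the serialized bytes, not from the tree.
    static String contentLengthHeader(byte[] body) {
        return "Content-Length: " + body.length;
    }

    // Tiny stand-in for a processed document.
    static Document sample() {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element p = d.createElement("p");
            p.setTextContent("hello");
            d.appendChild(p);
            return d;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```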

> This discussion should be focused on answering this question:
>  "what is the best architecture for Cocoon2?"

I still think it's SAX as a transport, and hooks should be given to
COCOON components to translate a set of SAX events into DOM trees (because
they're easier to manage).

> in answering this question, we should consider both dynamic and static
> operativity, performance, memory usage, scalability, availability of
> implementations, degree of standardization, degree of usability, ease of
> use, cost of operation and time to market of a possible solution.

As I said. Let's use SAX on the transport (between
producer/filters/serializer), and each of these components is given hooks
to translate a series of SAX events into DOM fragments.

> [...]
> 1) the adoption of W3C standards is not under discussion. We should work
> with what it's standardized "today". Proposals that rely on
> yet-to-be-finalized features or new ideas will be evaluated one by one,
> but as a general rule, we should play with the rules we already have.

Like SAX2?

> 4) this discussion will be orthogonal to the sitemap design, meaning
> that the sitemap will not make assumptions on the underlying API
> architecture used inside Cocoon.

Agreed...

> Ok, I'll start with my personal and very brief comment:
> 
> "I like DOM because I'm lazy and I don't want to rewrite Cocoon, but I
> also know that Pier needs SAX support for static operation and we need
> better links between Cocoon and X*L components than DOM 1 provides. I'd
> be glad to make everyone happy without rewriting the whole thing,
> removing this debate once and forever"

I am willing to rewrite the whole thing :) I have to earn my salary
somehow :)

	Pier 
