Posted to dev@cocoon.apache.org by Stefano Mazzocchi <st...@apache.org> on 2001/09/30 17:26:51 UTC

[RT] Cocoon web applications

This is something I always wanted to bring up but never considered it to
be a priority high enough, but as soon as Cocoon2 reaches a status where
it's useable in production sites, people will start asking for something
more user friendly than a WAR file that can get as big as hell.

Let me start by saying that I have a love/hate relationship with WAR
files, and it has nothing to do with the acronym, which I *very strongly*
dislike for its unfortunate resonance with a terrible thing.

OK, pacifist disclaimer given, I'll begin by saying that I love the way
we install Cocoon as one big file we deploy on all servlet containers.
Many have already personally expressed their happiness to me compared to
the hassle that installing Cocoon1 required. This is mainly due to the
webapp deployment concept that the latest servlet API includes.

So, if you consider Cocoon and its samples a single web application,
this way is perfect and you will never need anything else, but
as soon as you start adding your own stuff, you'll find out that Cocoon
is not a single web application but a framework for applications.

In short, as Tomcat is a container for servlet-based web applications,
Cocoon is a container for cocoon-based web applications. The parallel is
evident to me and should *not* require (as Jeremy was asking, clearly
touched by the same feeling) separate cocoon instances just to *deploy*
different cocoon-based web applications.

There is also another reason: while there exist components that are
general enough to be distributed with Cocoon, in other circumstances
such components might well turn out to be too specific for a particular
need (specific actions tied to your own business logic, or specific
transformers that are triggered by some precise semantics, etc.).

In this respect, there is a big parallel between servlet-based web
applications and cocoon-based web applications: both require a
"deployment descriptor" that gives the container instructions on where
to "mount" it, where to find web-app specific components, libraries and
resources.

Clearly, the sitemap is the closest thing that matches this.

Let's take a concrete example: I started integrating my image gallery
thing, which now requires 12 (or so) new classes added to the Cocoon
distribution (some 6 new components). Most of them are general enough to
be usable in many other circumstances, but one component is simply too
specific to be of any use elsewhere.

Currently, the operations that we have to do to *install* a new
cocoon-based web application are: 

 1) prepare a directory with all the required files
 2) mount the new web-app sitemap in the sitemap that controls the
URI-space we want to mount our stuff on
 3) place our web-app specific components in the folder for new
components (defined in cocoon.conf, if my memory doesn't fail)
 4) have the servlet container restart the entire web-application
handled by Cocoon.

While, following the servlet parallel, we should do:

 1) have a CWA (Cocoon Web Application) file with a manifest file (or
equivalent thing) that specifies where the sitemap file is (I'm also
happy with forcing the sitemap file to be called sitemap.xmap and placed
in the root of the package, thus eliminating the need for such a
manifest file) and contains all the required things (resources,
stylesheets, additional components and libraries, entity catalogs,
etc..).

 2) open the cocoon manager (similar to Tomcat 4.0 manager webapp, just
*much* more user friendly) and authenticate (if more security is
required this could be mapped over an SSL-secured connection and
authentication guaranteed by client-side certificates, but this is none
of our concern since Cocoon doesn't handle, nor should it, that part of
the HTTP connection).

 3) upload the CWA file (unlike the Tomcat 4.0 manager, which simply
requires you to indicate where the WAR file is on the machine, with
upload we can deploy a CWA from another machine entirely, which is a
great feature).

 4) tell Cocoon to start the deployed CWA

and that's it, without even having to stop Cocoon or even tell the
servlet container about what we are doing.
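
To make the "sitemap.xmap in the root of the package" convention a bit
more tangible, here is a minimal sketch of the check a manager could run
on an uploaded CWA. It assumes a CWA is a plain ZIP archive, like a WAR;
the class CwaArchive and its methods are hypothetical names and not part
of Cocoon.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: none of these names exist in Cocoon.
final class CwaArchive {
    // Assumed convention from the proposal: the sitemap sits at the package root.
    static final String SITEMAP_ENTRY = "sitemap.xmap";

    // Returns true if the uploaded bytes form a ZIP with sitemap.xmap at its root.
    static boolean hasRootSitemap(byte[] upload) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(upload))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (!entry.isDirectory() && SITEMAP_ENTRY.equals(entry.getName())) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a tiny in-memory archive with the given entry names (demo data only).
    static byte[] sampleArchive(String... entryNames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<!-- placeholder -->".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(hasRootSitemap(sampleArchive("sitemap.xmap", "stylesheets/page.xsl")));
        System.out.println(hasRootSitemap(sampleArchive("docs/sitemap.xmap")));
    }
}
```

With a fixed name and location, the manager can reject a malformed
package at upload time, before anything is mounted.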

Of course, Cocoon's classloader should be rearchitected to allow several
"contexts" with different classloaders; this will automatically solve
the issue of having to run multiple Cocoon instances to separate the
resolution space of different cocoon-based webapps.
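
As a rough sketch of what "several contexts with different classloaders"
could look like, the following keeps one URLClassLoader per mount point,
all sharing Cocoon's own loader as parent. CwaContexts and its methods
are hypothetical names chosen for illustration, not an existing Cocoon
class, and the code uses a current JDK rather than what was available
at the time.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: one isolated classloader per deployed CWA.
final class CwaContexts {
    private final ClassLoader cocoonLoader; // shared parent: Cocoon core and general components
    private final Map<String, URLClassLoader> loaders = new ConcurrentHashMap<>();

    CwaContexts(ClassLoader cocoonLoader) {
        this.cocoonLoader = cocoonLoader;
    }

    // Creates the loader for the CWA mounted at mountPoint; classpath holds its own jars/classes.
    ClassLoader deploy(String mountPoint, URL... classpath) {
        URLClassLoader loader = new URLClassLoader(classpath, cocoonLoader);
        close(loaders.put(mountPoint, loader)); // redeploying drops the previous loader
        return loader;
    }

    // Returns the loader for a mount point, or null if nothing is deployed there.
    ClassLoader loaderFor(String mountPoint) {
        return loaders.get(mountPoint);
    }

    void undeploy(String mountPoint) {
        close(loaders.remove(mountPoint));
    }

    private static void close(URLClassLoader old) {
        if (old == null) {
            return;
        }
        try {
            old.close();
        } catch (IOException ignored) {
            // best effort: the context is gone either way
        }
    }

    public static void main(String[] args) {
        CwaContexts contexts = new CwaContexts(CwaContexts.class.getClassLoader());
        ClassLoader gallery = contexts.deploy("/gallery");
        ClassLoader webmail = contexts.deploy("/webmail");
        System.out.println(gallery != webmail);
    }
}
```

Classes packaged inside one CWA are then invisible to the others, while
everything shipped with Cocoon itself is still resolved through the
shared parent.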

But there are other things that might turn out incredibly useful: almost
everybody works with two copies, one for development and one for
production. The first is used when developing, the second is deployed
and used until changes are required.

Everybody who has real-life working knowledge knows that it is almost
impossible to force people to work on a centralized version, especially
if the easiest way to modify something is to work on what is currently
live.

Currently, the processing cycles are something like: 

 1) write your webapp under the /cocoon2/ folder

 2) use cocoon build file to generate the WAR file (which contains your
stuff as well)

 3) deploy that on the servlet container.

but then you note that your stylesheets have something wrong, so you
don't do this over and over (since the cocoon-war file is so big and
restarting the entire crap takes forever and a half) but simply modify
the stylesheets in-place while they are live.

You can bet your ass that you'll forget to copy back the changes to your
original location.

Result: next revision gets deployed, many things that previously worked
well (especially in sections you didn't touch because they were just
perfect as they were) don't work anymore. This is called: lost update.

One solution is to deploy Cocoon2 once, fresh out of the box, and then
install your stuff over the deployed version.

NOTE: the servlet API doesn't say *anything* about what happens to
deployed files that are subsequently modified after being unpacked from
the WAR. In some circumstances, the container might even erase the
unpacked version when the web-app is stopped or the container is shut
down in order to save space. The Servlet API assumes the WAR file and the
unpacked version are the *same* and unpacking occurs only for speed
reasons, not to allow you to modify things live.

So, you installed Cocoon2 fresh, it gets unpacked, you stop Tomcat, not
by shutting it down cleanly but by simply doing a kill -9 or hitting
CTRL-C in the shell, you add your stuff and everything works.

C'mon, this is crap, we must come up with something smarter.

Ok, the idea is: how do I make the files deployed unmodifiable?

My solution is: compile everything. Tranquillity by obscurity.

If you transform all XML files into CompiledXML files (using the code I
wrote a long time ago and which is now used in the cache system), not
only is parsing performance greatly improved on live sites, but people
will also be very unlikely to modify the unpacked files directly,
because they are, in fact, binary.
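
To illustrate the idea only (this is NOT the actual CompiledXML format,
just a sketch under simplifying assumptions), the following records SAX
events into a byte array once and replays them later without touching an
XML parser. The class name SaxBytes and the byte layout are made up for
this example; namespaces, comments and processing instructions are
ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, NOT Cocoon's actual CompiledXML format.
// Assumption: namespaces, comments and processing instructions are ignored.
final class SaxBytes extends DefaultHandler {
    private static final int START = 1, END = 2, TEXT = 3;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(buffer);

    // Parses the XML text once and returns the recorded SAX events as opaque bytes.
    static byte[] compile(String xml) {
        try {
            SaxBytes recorder = new SaxBytes();
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), recorder);
            return recorder.buffer.toByteArray();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot compile XML", e);
        }
    }

    // Replays the recorded events into any SAX handler; no XML text is parsed here.
    static void replay(byte[] compiled, ContentHandler target) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(compiled));
        try {
            target.startDocument();
            while (in.available() > 0) {
                int type = in.readByte();
                String[] f = new String[in.readInt()];
                for (int i = 0; i < f.length; i++) f[i] = in.readUTF();
                if (type == START) {
                    AttributesImpl atts = new AttributesImpl();
                    for (int i = 1; i < f.length; i += 2) {
                        atts.addAttribute("", f[i], f[i], "CDATA", f[i + 1]);
                    }
                    target.startElement("", f[0], f[0], atts);
                } else if (type == END) {
                    target.endElement("", f[0], f[0]);
                } else {
                    target.characters(f[0].toCharArray(), 0, f[0].length());
                }
            }
            target.endDocument();
        } catch (IOException | SAXException e) {
            throw new IllegalStateException("corrupt event stream", e);
        }
    }

    // Each event is stored as: type byte, field count, then the fields as UTF strings.
    private void event(int type, String... fields) throws SAXException {
        try {
            out.writeByte(type);
            out.writeInt(fields.length);
            for (String field : fields) out.writeUTF(field);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) throws SAXException {
        String[] fields = new String[1 + 2 * atts.getLength()];
        fields[0] = qName;
        for (int i = 0; i < atts.getLength(); i++) {
            fields[1 + 2 * i] = atts.getQName(i);
            fields[2 + 2 * i] = atts.getValue(i);
        }
        event(START, fields);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        event(END, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        event(TEXT, new String(ch, start, length));
    }
}
```

The deployed file is then a stream of events rather than text, which is
both cheaper to replay and unattractive to edit by hand.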

This also means precompiling sitemaps and XSPs and everything that needs
compilation.

Of course, this is not suitable for close-cycle development of cocoon
web apps: I would not want to have to recompile my entire CWA, deploy,
restart, etc., every time I have to modify my stylesheet. It would be
foolish to impose this, but in production this makes a real difference,
especially in those places where carefully scrutinized quality assurance
phases must be performed before something enters production.

In these situations, we must take all the actions to allow packages as
sealed as possible (possibly even crypto-sealed) to be deployed even
remotely on a live site, also making it possible to upgrade an existing
package with a new one while the other is running (which is not that
hard to do with carefully designed multi-threading management of
subcontexts).

Currently, development of cocoon webapps is rough and not engineered: it
is mostly left to the user's ability to manage the process.

In the future, I'd love to make it possible to design the system in such
a way that concerns are well kept separate even during the two stages,
development and production, for example, performing sitemap
interpretation during development (since no high load is required, but
faster responsiveness at structure changes) while performing sitemap
compilation on deployment. Same thing for compiled XML.

Ok, hope this is enough to start a discussion. If you have any
suggestion to shape the way you will develop, deploy, manage your future
webapps in cocoon, make yourself heard now at design stage so that we
can get down in coding with a clear indication on what the people want
or would like to see.

Ciao.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Cocoon web applications

Posted by Arno Illmann <ar...@gmx.de>.

I would not post to this list or to this thread, but I read it unfortunately and thought it could be useful to have a view from the eyes of a Cocoon beginner at this point. Ahead I beg your pardon for my germlish and want to express my thankfulness and admiration for inventing and developing Cocoon, working examples and documentation.

"...developers try to use Cocoon, and they look at the sitemap and freeze. ..." (from http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=99443970706638&w=2 ).

I think the main problem in developing with Cocoon is not an incomprehensibility and unusability of the sitemap concepts, it is especially at this point and in spite of better documentation in the release candidate a stumbling over lots of undocumented or difficult to find little things in syntax and usage. I would have a big problem telling my superiors about the progress of a project I am searching around and want to start a group in funny software puzzle solving. We need a reference work with a chapter listing sitemap tags, description, parameters, usage, little example and some hints. Other chapters could be xsp tags, built in logicsheets, actions and other cocoon specific things for developing applications.

Next steps in minimizing the problems with the sitemap and improvement of developing applications with cocoon could be
a) an additional java or web based GUI with a configuration administrator, an application workbench with a local and remote window and versioning and checkout features as well as a build, deployment, production stage and automatic backup manager.
b) a representative description of a way to work with an IDE which I could not put into effect without building and restarting tomcat after every little change and after that derive it to the production stage. This part is really a pain.
c) the highly configurable possibility to appoint files external to a perhaps fully compiled webapplication in consideration of the particular customizing process of an in its core only maintained web application version.

I of course do not think there could be no next steps in developing the sitemap concepts, especially splitting and extending the possibilities of its syntax, but I hope this will not be a complete rewrite of the syntax and will be in a way backward compatible, corresponding to a beta and not an alpha version.

Kind regards, Arno Illmann

Re: [RT] Cocoon web applications

Posted by Stefano Mazzocchi <st...@apache.org>.

-- Michael Hartle wrote:
> 
> Stefano Mazzocchi wrote:
> 
> >>large CWAs are going to happen (they will, right ?), the sheer length of
> >>the list of "parameters" for customizing them could well justify this
> >>approach.
> >>
> >Well, I don't think the number of parameters depend on the size of the
> >CWA being deployed and I don't think I get your point.
> >
> With "large" CWAs I meant possibly as complex systems as project
> management, groupware or financial/commercial components. The interest
> in developing off-the-shelf drop-in CWAs that can be composed and
> interconnected will more likely lead to more configuration options
> (parameters) than not. Maybe you have another estimation on that.

To be honest, I think it will depend, but generally, I'd tend to assume
that rather than passing tons of parameters, one might pass the name of
the directory server (or other type of repository) that contains them.

But I might well be wrong.

> If those configuration options/parameters are not being passed or
> maintained by defined means of Cocoon, each CWA will solve this problem
> on their own by web forms or separate configuration files, unnecessarily
> duplicating work of developers and maybe hinder the use of CWAs.

Here I completely agree.

We'll see: if a small number of configurations are required for each
CWA, then a 'lookup' on a configuration registry is, IMO, the way to go.
If, on the other hand, a massive flow of highly structured and possibly
namespaced configurations will be required (but I'd suggest against it
unless we want to deal with broken contracts between CWAs), then a IoC
mechanism will be better suited.

> >>This would even allow passing on an instance of configuration source to
> >>the CWA, this instance implementing some ConfigurationSource-interface -
> >>be aware that I am not sure whether such an interface or class with this
> >>name is already existing, and I am not knowingly referring to anything
> >>here as I just made a wild guess how this could be named. This might be
> >>an LDAPConfigurationSource or a FileConfigurationSource, being able to
> >>deliver arbitrary configuration information to the CWA as a stream of
> >>SAX events.
> >>
> >
> >Normally, configurations are strings or numbers. Do you really want to
> >receive a stream of SAX events that you have to write your own code to
> >interpret, instead of being able to simply *ask* a configuration
> >repository for the configuration you want?
> >
> I think we basically misunderstood each other in respect to the amount
> of information that may be required as configuration for a CWA; I so far
> am thinking that this might not be done with passing 2 or 3 name/value
> pairs to any app. I consider the term configuration to be a potentially
> hierarchical collection of one or more key/value combinations.
> 
> Ok, I agree with you on the part that many small and simple applications
> will be satisfied with just asking for one or two strings or numbers.
> Why not basically use SAX events, giving small applications some utility
> classes at hand that helps doing exactly and solely this while
> developers of more complex apps might decide for themselves what they need ?

I'm not against using SAX 'per se'. I'm more against passing a high
number of parameters and making it easier for the CWA developer to
receive a load of them... but this is useless talk without real-life
examples.

Anyway, I think of CWA as the web equivalent of avalon blocks and blocks
receive a configuration instance that encapsulates the conf structure
they need and the block queries the configuration it needs.

I fail to see why such a system cannot scale with the amount of
configurations required by CWA. In fact, directory servers were created
exactly to allow tree-like structures to scale (given the difficulty of
providing a fast relational view of a tree).

Also, as a developer, I would not find it easier to have to intercept a
bunch of SAX events, store them someplace, retrieve them later and
manage their caching/lookup-strategy, etc...

I'd rather receive a configuration repository with a solid hierarchical
structure and pruned for my own configurations, then ask for what I need
when I need it without caring about how to get it.
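
A minimal sketch of what I mean by a repository "pruned for my own
configurations", assuming slash-separated keys. ConfigRepository and its
methods are hypothetical names for illustration; this is not the Avalon
Configuration API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical read-only configuration view; not an Avalon or Cocoon class.
final class ConfigRepository {
    private final Map<String, String> entries; // full key -> value, keys are slash-separated
    private final String prefix;               // "" for the root, "webmail/" for a branch

    ConfigRepository(Map<String, String> entries) {
        this(new HashMap<>(entries), "");
    }

    private ConfigRepository(Map<String, String> entries, String prefix) {
        this.entries = entries;
        this.prefix = prefix;
    }

    // Returns a view pruned to one branch, e.g. the subtree owned by a single CWA.
    ConfigRepository branch(String path) {
        return new ConfigRepository(entries, prefix + trim(path) + "/");
    }

    // The CWA asks for a value only when it needs it.
    Optional<String> get(String path) {
        return Optional.ofNullable(entries.get(prefix + trim(path)));
    }

    // Numeric convenience lookup with a fallback for missing keys.
    int getInt(String path, int fallback) {
        return get(path).map(Integer::parseInt).orElse(fallback);
    }

    // Strips leading and trailing slashes so "/a/b/" and "a/b" name the same key.
    private static String trim(String path) {
        return path.replaceAll("^/+|/+$", "");
    }

    public static void main(String[] args) {
        Map<String, String> all = new HashMap<>();
        all.put("webmail/datasource/user", "scott");
        all.put("gallery/thumbs/width", "120");
        ConfigRepository webmail = new ConfigRepository(all).branch("webmail");
        System.out.println(webmail.get("datasource/user").orElse("(unset)"));
        System.out.println(webmail.get("gallery/thumbs/width").isPresent());
    }
}
```

The CWA receives only its own branch, so it cannot see (or depend on)
the configuration of its neighbours, and where the values actually live
(file, LDAP, whatever) stays behind the repository.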

Admittedly, this is not IoC since you are calling directly an API, but
there are situations where normal flow of control is preferable and
this is, IMO, one of such places.

> >>And those ConfigurationSource's could be defined in the
> >>sitemap and being configured with parameters when needed. Hey, why not
> >>simply consider different configuration sources as sources for SAX
> >>events directly ?
> >>
> >Because this is exactly what flexibility syndrome looks like: when you
> >have a hammer in your hand, everything looks like a nail.
> >
> I rather like to open-mindedly check everything in advance before
> stepping into nails; this does not necessarily mean that this
> flexibility needs to be exercised all the way through, but left as a
> possible way to go in case it is needed.

Absolutely.

> >>>each CWA indicates
> >>>   o  its role as a URI  (http://apache.org/cocoon/webapp)
> >>>
> What about putting the version in major.minor format into the URI, like
> it is being done with namespace URIs like
> "http://apache.org/cocoon/LDAP/1.0" or
> "http://apache.org/cocoon/include/1.0" ?

Yes, this is another alternative and might appear more coherent with our
use of namespace URIs if we all start using them correctly: means that
we *must* make sure that version numbers maintain their semantic meaning
that if major numbers are equal, namespaces are compatible, otherwise
they are not.
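
The rule is simple enough to sketch. Assuming role URIs end in
"/major.minor" as in the examples quoted above, a compatibility check
could look like this (RoleVersion is a hypothetical name, not an
existing class):

```java
// Hypothetical value type for a versioned CWA role URI such as
// "http://apache.org/cocoon/LDAP/1.0"; the class name is an assumption.
final class RoleVersion {
    final String role;
    final int major;
    final int minor;

    private RoleVersion(String role, int major, int minor) {
        this.role = role;
        this.major = major;
        this.minor = minor;
    }

    // Splits ".../name/major.minor" into the role part and its version numbers.
    static RoleVersion parse(String uri) {
        int slash = uri.lastIndexOf('/');
        String[] parts = uri.substring(slash + 1).split("\\.");
        if (slash < 0 || parts.length != 2) {
            throw new IllegalArgumentException("expected .../major.minor: " + uri);
        }
        return new RoleVersion(uri.substring(0, slash),
                Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    // Same role and same major number: the contract is assumed to be compatible.
    boolean isCompatibleWith(RoleVersion other) {
        return role.equals(other.role) && major == other.major;
    }

    public static void main(String[] args) {
        RoleVersion required = parse("http://apache.org/cocoon/LDAP/1.0");
        RoleVersion installed = parse("http://apache.org/cocoon/LDAP/1.3");
        System.out.println(required.isCompatibleWith(installed));
    }
}
```

This only works if everybody sticks to the semantics: a change that
breaks the contract must bump the major number.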

> >>I am always fearful of prompts that do something for me that I may not
> >>have understood completely.
> >>
> >This is why the CWA will also contain a description of the configuration
> >that you are being asked to provide.
> >
> I will correct my sentence: I am always fearful of automated processes
> and prompts that do something for me that I should at least once have
> done for myself manually in order to understand it completely.

OK, I get your feeling and I think you raise a good cognitive point:
whom should we address this mechanism to? People used to installing and
configuring software with point-and-click installers and GUI tools
(admittedly, I'm one of those), or people used to installing software
from the command line and configuring it through text editing?

I think we should target both, mostly because that will make it easier
for people coming on the server market from non-unix backgrounds
(admittedly, I'm one of those, again) to install, configure and create
their own complex website using our lego-like CWA components.

And, BTW, MacOSX shows that having two choices (GUIs and CLIs) gives you
such a great feeling of ease-of-use without sacrificing the power to
control the system at the very granular level of single configurations.

> >>Do we speak of a solely installer-based
> >>approach here or might this allow the admin to just drop-in the .cwa
> >>file and add an entry to the sitemap without anything able to directly
> >>prompt at all ?
> >>
> >As you guys wish, I don't see any reason to force one behavior or the
> >other.
> >
> I think the installer-based approach with prompting the user is
> something to be left for a separate deployment tool.

It might be, but it has to connect directly to the system, and people
will have less trust in a package that you have to install afterwards
and that has to connect to the Cocoon core to operate.

If we had such a tool, I'd rather ship it with cocoon while letting you
turn it off (for whatever reason).

> >>We should give the administrator a way to explicitly name a certain
> >>instance, leaving the administrator the choice between automatic
> >>GUI-driven and good-old vi-driven approaches, as the admin can partially
> >>or completely override the naming scheme and refer CWA dependencies to
> >>their targets manually. Whenever people would use the dynamic naming
> >>scheme to ensure multiple installations do not collide, they show a
> >>certain lack of interest to make sure they connect CWA dependencies
> >>correctly.
> >>
> >You got the wrong perception here, since I'm trying to help
> >administrators instead of doing stuff for them behind their backs. But
> >if you don't like that, fine, you'll always have the good old
> >configuration files to modify by hand.
> >
> Helping administrators is a good thing and will certainly be rewarded; I
> am by no means having a vi fetish, but I think that being able to modify
> the configurations by hand has its advantages such as speed.

Agreed.

Believe me: even if I was raised on GUIs and still know only a few basic
VI commands, I do appreciate the power of being able to reconfigure a
server by simply editing a few lines you know very well and without
going through a nightmare of clicks.

But on the other hand, newbies or people not really "deep" into that,
might just want a nice and simple visual interface to do their stuff and
start learning.

Don't worry, we won't sacrifice one for the other.

--
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------

Re: [RT] Cocoon web applications

Posted by Michael Hartle <mh...@hartle-klug.com>.

Stefano Mazzocchi wrote:

>>large CWAs are going to happen (they will, right ?), the sheer length of
>>the list of "parameters" for customizing them could well justify this
>>approach.
>>
>Well, I don't think the number of parameters depend on the size of the
>CWA being deployed and I don't think I get your point.
>
With "large" CWAs I meant possibly as complex systems as project 
management, groupware or financial/commercial components. The interest 
in developing off-the-shelf drop-in CWAs that can be composed and 
interconnected will more likely lead to more configuration options 
(parameters) than not. Maybe you have another estimation on that.

If those configuration options/parameters are not being passed or 
maintained by defined means of Cocoon, each CWA will solve this problem 
on their own by web forms or separate configuration files, unnecessarily
duplicating work of developers and maybe hinder the use of CWAs.

>>This would even allow passing on an instance of configuration source to
>>the CWA, this instance implementing some ConfigurationSource-interface -
>>be aware that I am not sure whether such an interface or class with this
>>name is already existing, and I am not knowingly referring to anything
>>here as I just made a wild guess how this could be named. This might be
>>an LDAPConfigurationSource or a FileConfigurationSource, being able to
>>deliver arbitrary configuration information to the CWA as a stream of
>>SAX events.
>>
>
>Normally, configurations are strings or numbers. Do you really want to
>receive a stream of SAX events that you have to write your own code to
>interpret, instead of being able to simply *ask* a configuration
>repository for the configuration you want?
>
I think we basically misunderstood each other in respect to the amount 
of information that may be required as configuration for a CWA; I so far 
am thinking that this might not be done with passing 2 or 3 name/value 
pairs to any app. I consider the term configuration to be a potentially 
hierarchical collection of one or more key/value combinations.

Ok, I agree with you on the part that many small and simple applications 
will be satisfied with just asking for one or two strings or numbers. 
Why not basically use SAX events, giving small applications some utility 
classes at hand that helps doing exactly and solely this while 
developers of more complex apps might decide for themselves what they need ?

>>And those ConfigurationSource's could be defined in the
>>sitemap and being configured with parameters when needed. Hey, why not
>>simply consider different configuration sources as sources for SAX
>>events directly ?
>>
>Because this is exactly what flexibility syndrome looks like: when you
>have a hammer in your hand, everything looks like a nail.
>
I rather like to open-mindedly check everything in advance before 
stepping into nails; this does not necessarily mean that this 
flexibility needs to be exercised all the way through, but left as a 
possible way to go in case it is needed.

>>>each CWA indicates
>>>   o  its role as a URI  (http://apache.org/cocoon/webapp)
>>>
What about putting the version in major.minor format into the URI, like 
it is being done with namespace URIs like 
"http://apache.org/cocoon/LDAP/1.0" or 
"http://apache.org/cocoon/include/1.0" ?

>>I am always fearful of prompts that do something for me that I may not
>>have understood completely. 
>>
>This is why the CWA will also contain a description of the configuration
>that you are being asked to provide.
>
I will correct my sentence: I am always fearful of automated processes 
and prompts that do something for me that I should at least once have 
done for myself manually in order to understand it completely.

>>Do we speak of a solely installer-based
>>approach here or might this allow the admin to just drop-in the .cwa
>>file and add an entry to the sitemap without anything able to directly
>>prompt at all ?
>>
>As you guys wish, I don't see any reason to force one behavior or the
>other.
>
I think the installer-based approach with prompting the user is 
something to be left for a separate deployment tool.

>>We should give the administrator a way to explicitly name a certain
>>instance, leaving the administrator the choice between automatic
>>GUI-driven and good-old vi-driven approaches, as the admin can partially
>>or completely override the naming scheme and refer CWA dependencies to
>>their targets manually. Whenever people would use the dynamic naming
>>scheme to ensure multiple installations do not collide, they show a
>>certain lack of interest to make sure they connect CWA dependencies
>>correctly.
>>
>You got the wrong perception here, since I'm trying to help
>administrators instead of doing stuff for them behind their backs. But
>if you don't like that, fine, you'll always have the good old
>configuration files to modify by hand.
>
Helping administrators is a good thing and will certainly be rewarded; I 
am by no means having a vi fetish, but I think that being able to modify 
the configurations by hand has its advantages such as speed.

Best regards,

Michael Hartle


Re: [RT] Cocoon web applications

Posted by Stefano Mazzocchi <st...@apache.org>.

Michael Hartle wrote:

> >>Some setup like mounting could happen in a sitemap.xmap, as currently
> >>this is the place controlling URI space; this even allows for an
> >>auto-mounting extension for the sitemap. Many packages will require
> >>setup beyond mounting, for example where to find the corporate identity
> >>stylesheets, which accounting database to use, or what's the business
> >>name to be put on the tax report that's being produced, so I need both
> >>the information about what I CAN configure and what I actually DO
> >>configure for this package.
> >>
> >Again, I was thinking about IoC: have the CWA ask Cocoon about things it
> >needs instead of having you to modify the things after setup.
> >
> Oh, I am not thinking of modifying the CWA for any purpose, I simply
> meant supplying Cocoon with configuration information it would pass on
> to the CWA when appropriate. I am interpreting the IoC principle the
> other way around; the CWA is "just" a web application among many others
> that are also being some sort of drop-in. So it is not the CWA
> actively/dynamically asking Cocoon about something, but Cocoon handing
> down configuration information to the CWA at a certain stage.

You're right, I was more 'subverting' the control, but in this case, a
passive operation would be very much bean-like. Well, the two
alternatives are equivalent under a certain light so, as long as all the
required features are met, I don't care what side the control takes.

> >The contract is the internal tree shape (sort of URI space for
> >configurations) and CWA might look for configurations in there. For
> >example in a sitemap
> >
> > <map:parameter name="database"
> >value="conf://datasource/relational/main"/>
> > <map:parameter name="user"
> >value="conf://datasource/relational/main/user"/>
> > <map:parameter name="password"
> >value="conf://datasource/relational/main/password"/>
> >
> >something like that.
> >
> >It is also pretty easy to scan the CWA for conf:// protocols and
> >understand if the registry contains already the information or needs to
> >prompt the installer for it.
> >
> >This way, security sensible information is stored by Cocoon in another
> >location, probably out of the addressing space, making it inherently
> >more secure (it might even attach directly to an LDAP server, for that
> >matter).
> >
> The concept of allowing different sources (LDAP, filesystem, etc) for
> configuration as described is interesting, although I do think that each
> application should have its own dedicated branch for setup information
> instead of forcing each parameter to be explicitly passed;

Well, both XML configurations and Java properties have tree-like shapes,
just like directory servers and netinfo-like registries. I was proposing
to simply make this more coherent internally and abstract it in the
repository (but completely transparent to the configurable entities
asking for their configurations).
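
As a sketch of the "scan the CWA for conf:// protocols" step from my
earlier message quoted above: collect every conf:// reference in a
sitemap and report the ones the registry cannot answer yet, so the
installer knows what to prompt for. The conf:// scheme is only a
proposal, and ConfReferences is a hypothetical name for this example.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scanner for the proposed conf:// references; not a Cocoon class.
final class ConfReferences {
    // Matches conf://path up to the next quote, whitespace or angle bracket.
    private static final Pattern CONF_URI = Pattern.compile("conf://([^\"'\\s<>]+)");

    // Returns every configuration path referenced in the sitemap text, sorted.
    static Set<String> find(String sitemapText) {
        Set<String> paths = new TreeSet<>();
        Matcher matcher = CONF_URI.matcher(sitemapText);
        while (matcher.find()) {
            paths.add(matcher.group(1));
        }
        return paths;
    }

    // Returns the referenced paths that the registry does not contain yet.
    static Set<String> missing(String sitemapText, Set<String> registryKeys) {
        Set<String> result = find(sitemapText);
        result.removeAll(registryKeys);
        return result;
    }

    public static void main(String[] args) {
        String sitemap =
                "<map:parameter name=\"database\" value=\"conf://datasource/relational/main\"/>\n"
              + "<map:parameter name=\"user\" value=\"conf://datasource/relational/main/user\"/>\n";
        Set<String> known = new TreeSet<>();
        known.add("datasource/relational/main");
        System.out.println(missing(sitemap, known));
    }
}
```

Anything returned by missing() is what the installer (GUI or command
line) would have to ask the deployer for.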

> as
> large CWAs are going to happen (they will, right ?), the sheer length of
> the list of "parameters" for customizing them could well justify this
> approach.

Well, I don't think the number of parameters depend on the size of the
CWA being deployed and I don't think I get your point.
 
> This would even allow passing on an instance of configuration source to
> the CWA, this instance implementing some ConfigurationSource-interface -
> be aware that I am not sure whether such an interface or class with this
> name is already existing, and I am not knowingly referring to anything
> here as I just made a wild guess how this could be named. This might be
> an LDAPConfigurationSource or a FileConfigurationSource, being able to
> deliver arbitrary configuration information to the CWA as a stream of
> SAX events.

Normally, configurations are strings or numbers. Do you really want to
receive a stream of SAX events that you have to write your own code to
interpret, instead of being able to simply *ask* a configuration
repository for the configuration you want?

> And those ConfigurationSource's could be defined in the
> sitemap and being configured with parameters when needed. Hey, why not
> simply consider different configuration sources as sources for SAX
> events directly ?

Because this is exactly what flexibility syndrome looks like: when you
have a hammer in your hand, everything looks like a nail.
 
> >>3.) As the .cwa package does not know in advance where it will be
> >>deployed, it cannot know about the URI space it will be accessible from
> >>via the web, yet most content needs to point to other content in this
> >>package, for example just simple links from HTML page A to HTML page B.
> >>
> >I'd assume that the URI structure of the CWA package can be considered a
> >contract. So, the only "soft" thing is the location where this "hard"
> >URI tree is mounted.
> >
> That's what I meant but didn't express. ;)
> 
> >>If  resource names of pipelines were added to the sitemap which are
> >>local to the package/sitemap, the .cwa designer could just use resource
> >>names in his package and have them resolved later via taglibs in a page
> >>or other means in the sitemap like a cocoon-protocol extension like it
> >>was posted for role-based access.
> >>
> >Exactly, we still have to define "how" those "soft"+"hard" links are
> >actually translated to real URL addresses, but we agree on the mechanism
> >and this is a good thing.
> >
> Ok, so the "how" basically has to allow the developer to interchangably
> use "soft" and "hard" links all over the place, even though "soft" links
> might need one or two steps in addition.

"Hard" links are expressed just as you would write them normally: I
mean, within the CWA, all pages know the location of the others and know
those locations will never change once deployed (this is why "hard"). So
you can reference them as you do today in the sitemap when reading stuff
from your context file space.

The "soft" part takes into account the fact that a CWA might be "mounted"
at a position chosen by the deployer, thus one CWA cannot base address
resolution on a hard position (since it might change) but has to look it
up using some sort of indirect addressing, which I proposed to base on
behavioral interfaces, just like we do for Avalon components.
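
A minimal sketch of that indirection, in Python as neutral pseudocode.
MountTable, mount and resolve are hypothetical names, and the mount
point is made up; only the role URI is taken from the examples in this
thread. The "hard" path is fixed by the CWA, the "soft" prefix is looked
up by role at resolution time.

```python
class MountTable:
    """Hypothetical lookup from a CWA role URI to its mount point."""

    def __init__(self):
        self._mounts = {}  # role URI -> prefix chosen by the deployer

    def mount(self, role, mount_point):
        prefix = mount_point.strip("/")
        # Mounting at the root yields an empty prefix.
        self._mounts[role] = "/" + prefix if prefix else ""

    def resolve(self, role, hard_path):
        if role not in self._mounts:
            raise LookupError("no CWA deployed for role " + role)
        # soft part (mount point) + hard part (path inside the CWA)
        return self._mounts[role] + "/" + hard_path.lstrip("/")


WEBMAIL_ROLE = "http://apache.org/cocoon/webapp"

table = MountTable()
table.mount(WEBMAIL_ROLE, "/apps/mail/")   # decided at deployment time
url = table.resolve(WEBMAIL_ROLE, "some/resource")
```

If the deployer later remounts the CWA elsewhere, only the table entry
changes; every reference by role keeps working.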
 
> >but one solution would be use (abuse?) the XML namespace mechanism
> >
> > <element xmlns:webmail="http://apache.org/cocoon/webapp"
> >          href="cocoon://webmail/some/resource"/>
> >
> >where we extend the default namespace behavior to do namespace
> >resolution even inside the attribute content. In fact, even XSLT does so
> >when doing
> >
> > <xsl:template match="ns:element" xmlns:ns="...">
> >
> >and ns: is matched not by the prefix, but by the expanded namespace URI.
> >
> I would consider this approach an abuse of the XML namespace mechanism,
> as in the way it is used in the XML definition, it is setting up a short
> namespace alias for the notation ns:element, not as a preprocessing
> instruction like #define in C. I think we should be able to come up with
> something more clear.

Ok, comment considered. Any suggestion will be welcome.
 
> >As far as uniqueness is concerned, the above mechanism works, but if we
> >want to allow more than one instance of the same role, we could indicate
> >so like this:
> >
> > <element xmlns:webmail="http://apache.org/cocoon/webapp"
> >          href="cocoon://webmail:mywebapp/some/resource"/>
> >
> >But this creates a composition problem: one CWA must know in advance the
> >instance-specific name of the other CWA. Since this is controlled by the
> >CWA deployer and cannot be hardcoded (unless we accept name collisions),
> >this is a weak contract and it's very likely to break everything very
> >soon. (with a very hard time figuring out what to do).
> >
> >So, here is my solution (that closely follows the strategy we designed
> >for Avalon blocks):
> >
> >  each CWA indicates
> >    o  its role as a URI  (http://apache.org/cocoon/webapp)
> >    o  its name as a human readable form  (My Fancy Webapp)
> >    o  its version as major.minor format (2.3)
> >    o  its dependancies on other CWA (role:version)
> >    o  its dependencies on external configurations
> >
> >when the CWA is deployed, the following things happen:
> >
> > 1) the CWA deployment descriptor is read
> >
> > 2) a machine specific name is given to the deployed instance. (if
> >another CWA of the same role:version pair is already in place, the
> >instance name must be unique, for example, adding a counter at the end
> >such as "http://apache.org/cocoon/webapp:2.3:2"
> >
> >3) for each CWA dependancy do:
> >  3.a) check if a CWA with that role is already in place.
> >  3.b) if so
> >    3.b.i) if only instance of that role, map the role to that instance.
> >    3.b.ii) otherwise, prompt the deployer and ask for which available
> >instance should be associated to that role.
> >
>
> I am always fearful of prompts that do something for me that I may not
> have understood completely. 

This is why the CWA will also contain a description of the configuration
that you are being asked to provide.

> Do we speak of a solely installer-based
> approach here or might this allow the admin to just drop-in the .cwa
> file and add an entry to the sitemap without anything able to directly
> prompt at all ?

As you guys wish, I don't see any reason to force one behavior or the
other.
 
> We should give the administrator a way to explicitly name a certain
> instance, leaving the administrator the choice between automatic
> GUI-driven and good-old vi-driven approaches, as the admin can partially
> or completely override the naming scheme and refer CWA dependencies to
> their targets manually. Whenever people would use the dynamic naming
> scheme to ensure multiple installations do not collide, they show a
> certain lack of interest to make sure they connect CWA dependencies
> correctly.

You got the wrong perception here, since I'm trying to help
administrators instead of doing stuff for them behind their backs. But
if you don't like that, fine, you'll always have the good old
configuration files to modify by hand.

> >3.c) otherwise, use the role URI to download the required CWA [we can
> >define how this is done later] and deploy it.
> >
> > 4) for each configuration dependancy do:
> >   4.a) check if the configuration key already exists in the conf
> >registry
> >    4.a.i) if so, prompt the user if the available value is ok
> >     4.a.i.1) if so, go on
> >     4.a.i.2) otherwise, change the value associated to that
> >configuration and relative to that instance only.
> >    4.a.ii) otherwise, prompt the deployer for the conf value
> >
> > 5) the deployer is finally asked for a URI location to mount the CWA
> >instance.
> >
> Basically this sounds good ;)

Good.
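
For reference, steps 1) to 5) could be sketched roughly like this. It is
Python as neutral pseudocode; deploy, make_ask, the descriptor keys and
the container layout are all assumptions made for illustration.
Dependencies are matched by role only here (the proposal says
role:version), and "fetch" stands in for the download step we have not
defined yet.

```python
def deploy(descriptor, container, ask, fetch):
    """Deploy one CWA described by an already-read descriptor (step 1)."""
    role, version = descriptor["role"], descriptor["version"]

    # Step 2: machine-specific name, with a counter if role:version exists.
    base = role + ":" + version
    existing = container["instances"].setdefault(role, [])
    same = [n for n in existing if n == base or n.startswith(base + ":")]
    name = base if not same else base + ":" + str(len(same) + 1)
    existing.append(name)  # registered first so mutual dependencies end

    # Step 3: bind every dependency role to one deployed instance.
    bindings = {}
    for dep_role in descriptor.get("dependencies", []):
        candidates = container["instances"].get(dep_role, [])
        if not candidates:
            # 3.c: nothing in place, obtain the CWA and deploy it.
            candidates = [deploy(fetch(dep_role), container, ask, fetch)]
        if len(candidates) == 1:
            bindings[dep_role] = candidates[0]          # 3.b.i
        else:
            bindings[dep_role] = ask("Which instance for " + dep_role + "?",
                                     candidates)        # 3.b.ii

    # Step 4: configuration dependencies.
    overrides = {}
    for key in descriptor.get("config", []):
        if key in container["conf"]:
            current = container["conf"][key]
            question = "Keep " + key + " = " + repr(current) + "?"
            if ask(question, ["yes", "no"]) == "no":
                # 4.a.i.2: new value applies to this instance only.
                overrides[key] = ask("New value for " + key)
        else:
            container["conf"][key] = ask("Value for " + key)  # 4.a.ii

    # Step 5: ask where to mount the instance.
    mount = ask("Mount point for " + name)
    container["deployed"][name] = {
        "bindings": bindings, "overrides": overrides, "mount": mount}
    return name


def make_ask(answers):
    """Scripted stand-in for the deployer: returns the answers in order."""
    remaining = iter(answers)

    def ask(question, options=None):
        return next(remaining)
    return ask


container = {"instances": {}, "conf": {"db/user": "cocoon"}, "deployed": {}}
descriptor = {"role": "http://apache.org/cocoon/webapp", "version": "2.3",
              "dependencies": [], "config": ["db/user", "db/password"]}
first = deploy(descriptor, container,
               make_ask(["no", "mailuser", "secret", "/mail"]), fetch=None)
```

Whether "ask" is a GUI, a web form or a value read from a hand-edited
file makes no difference to the procedure.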
 
> >
> >NOTE:
> >
> > 1) possible recursive dependancies might create a deadlock on the
> >deployment phase, expecially when the Cocoon container is initially
> >empty. This is unlikely to happen for well designed components, but we
> >can download and scan all required CWA for deadlocks before actually do
> >any real deployment so that problems can be stopped *before* entering
> >the system.
> >
> >
> > 2) if more than one instance of a single webapp is available, the conf
> >registry must be smart enough to lookup a configuration based not only
> >on the requested path but also on the webapp instance that has requested
> >it. This avoid collisions due to the fact that different instances of
> >the same role by definition share the same configuration needs.
> >
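
The scan mentioned in note 1) amounts to finding a cycle in the
dependency graph before anything is deployed. A minimal sketch, in
Python as neutral pseudocode; find_cycle and the role names are
hypothetical, and the graph is assumed to have been collected from the
downloaded descriptors already.

```python
def find_cycle(dependencies):
    """Return a list of roles forming a dependency cycle, or None.

    dependencies maps each role to the list of roles it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, in progress, finished
    state = {}
    stack = []

    def visit(role):
        state[role] = GREY
        stack.append(role)
        for dep in dependencies.get(role, []):
            if state.get(dep, WHITE) == GREY:
                # dep is still in progress, so we walked back onto our path.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[role] = BLACK
        return None

    for role in dependencies:
        if state.get(role, WHITE) == WHITE:
            cycle = visit(role)
            if cycle:
                return cycle
    return None


graph = {
    "urn:example:webmail": ["urn:example:addressbook"],
    "urn:example:addressbook": ["urn:example:auth"],
    "urn:example:auth": ["urn:example:webmail"],
}
cycle = find_cycle(graph)
```

A non-empty result is reported to the deployer and nothing is deployed.
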
> >>I guess there are plenty of opportunities to discuss what can be done
> >>better or easier differently, so let's hear them.
> >>
> >The only thing that is left to discuss is how (who does it and at what
> >level) the address translation between roled-based access and real URI
> >address is performed.
> >
> Are there already existing solutions being used in Cocoon ? What about
> the cocoon:// protocol ?

See my next mail to Berin for that.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Cocoon web applications

Posted by giacomo <gi...@apache.org>.

On Sun, 7 Oct 2001, Michael Hartle wrote:

> Stefano Mazzocchi wrote:
>
> >>Ok, asbestos underwear in position - here we go:
> >>
> >Sam, seems like people liked this expression very much :)
> >
> It was sort of an inspiring picture, I admit ;)
>
> >>Some setup like mounting could happen in a sitemap.xmap, as currently
> >>this is the place controlling URI space; this even allows for an
> >>auto-mounting extension for the sitemap. Many packages will require
> >>setup beyond mounting, for example where to find the corporate identity
> >>stylesheets, which accounting database to use, or what's the business
> >>name to be put on the tax report thats being produced, so I need both
> >>the information about what I CAN configure and what I actually DO
> >>configure for this package.
> >>
> >Again, I was thinking about IoC: have the CWA ask Cocoon about things it
> >needs instead of having you to modify the things after setup.
> >
> Oh, I am not thinking of modifying the CWA for any purpose, I simply
> meant supplying Cocoon with configuration information it would pass on
> to the CWA when appropriate. I am interpreting the IoC principle the
> other way around; the CWA is "just" a web application among many others
> that are also being some sort of drop-in. So it is not the CWA
> actively/dynamically asking Cocoon about something, but Cocoon handing
> down configuration information to the CWA at a certain stage.

Exactly my thought :)

>
> >The contract is the internal tree shape (sort of URI space for
> >configurations) and CWA might look for configurations in there. For
> >example in a sitemap
> >
> > <map:parameter name="database"
> >value="conf://datasource/relational/main"/>
> > <map:parameter name="user"
> >value="conf://datasource/relational/main/user"/>
> > <map:parameter name="password"
> >value="conf://datasource/relational/main/password"/>
> >
> >something like that.
> >
> >It is also pretty easy to scan the CWA for conf:// protocols and
> >understand if the registry contains already the information or needs to
> >prompt the installer for it.
> >
> >This way, security sensible information is stored by Cocoon in another
> >location, probably out of the addressing space, making it inherently
> >more secure (it might even attach directly to an LDAP server, for that
> >matter).
> >
> The concept of allowing different sources (LDAP, filesystem, etc) for
> configuration as described is interesting, although I do think that each
> application should have it's own dedicated branch for setup information
> instead of forcing each parameter to be explicitly being passed; as
> large CWAs are going to happen (they will, right ?), the sheer length of
> the list of "parameters" for customizing them could well justify this
> approach.
>
> This would even allow passing on an instance of configuration source to
> the CWA, this instance implementing some ConfigurationSource-interface -
> be aware that I am not sure whether such an interface or class with this
> name is already existing, and I am not knowingly referring to anything
> here as I just made a wild guess how this could be named. This might be
> an LDAPConfigurationSource or a FileConfigurationSource, being able to
> deliver arbitrary configuration information to the CWA as a stream of
> SAX events. And those ConfigurationSource's could be defined in the
> sitemap and being configured with parameters when needed. Hey, why not
> simply consider different configuration sources as sources for SAX
> events directly ?

I don't think we should go so far as to hand configuration over as SAX
events. Avalon's Configuration object is smart enough for that. Also, I
don't like to additionally blow up the sitemap with stuff like
ConfigurationSource. I'd rather see that go into cocoon.xconf, as a
component that is the ConfigurationSource used by Cocoon to hand over
the configuration to a CWA.
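
Roughly what I have in mind, sketched in Python as neutral pseudocode.
All class and method names here are hypothetical; the only real-world
parallel is that Avalon components receive their settings through a
configure() call made by the container instead of fetching them
themselves.

```python
class DictConfigurationSource:
    """Hypothetical source; an LDAP or file based one would look alike."""

    def __init__(self, tree):
        self._tree = tree  # instance name -> settings for that instance

    def configuration_for(self, instance_name):
        # Hand out a copy so a CWA cannot alter the shared store.
        return dict(self._tree.get(instance_name, {}))


class WebmailApp:
    """Stand-in for a CWA: it only receives configuration."""

    def __init__(self):
        self.settings = None

    def configure(self, configuration):
        self.settings = configuration


class Container:
    """Stand-in for Cocoon: owns the source and pushes configuration."""

    def __init__(self, source):
        self._source = source
        self._apps = {}

    def deploy(self, instance_name, app):
        # Inversion of control: the container calls the CWA, not vice versa.
        app.configure(self._source.configuration_for(instance_name))
        self._apps[instance_name] = app


source = DictConfigurationSource({"webmail-1": {"smtp-host": "localhost"}})
cocoon = Container(source)
app = WebmailApp()
cocoon.deploy("webmail-1", app)
```

Swapping the source for another implementation changes nothing in the
CWA, and nothing has to be added to the sitemap.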

>
> >>3.) As the .cwa package does not know in advance where it will be
> >>deployed, it cannot know about the URI space it will be accessable from
> >>via the web, yet most content needs to point to other content in this
> >>package, for example just simple links from HTML page A to HTML page B.
> >>
> >I'd assume that the URI structure of the CWA package can be considered a
> >contract. So, the only "soft" thing is the location where this "hard"
> >URI tree is mounted.
> >
> That's what I meant but didn't express. ;)
>
> >>If  resource names of pipelines were added to the sitemap which are
> >>local to the package/sitemap, the .cwa designer could just use resource
> >>names in his package and have them resolved later via taglibs in a page
> >>or other means in the sitemap like a cocoon-protocol extension like it
> >>was posted for role-based access.
> >>
> >Exactly, we still have to define "how" those "soft"+"hard" links are
> >actually translated to real URL addresses, but we agree on the mechanism
> >and this is a good thing.
> >
> Ok, so the "how" basically has to allow the developer to interchangably
> use "soft" and "hard" links all over the place, even though "soft" links
> might need one or two steps in addition.
>
> >but one solution would be use (abuse?) the XML namespace mechanism
> >
> > <element xmlns:webmail="http://apache.org/cocoon/webapp"
> >          href="cocoon://webmail/some/resource"/>
> >
> >where we extend the default namespace behavior to do namespace
> >resolution even inside the attribute content. In fact, even XSLT does so
> >when doing
> >
> > <xsl:template match="ns:element" xmlns:ns="...">
> >
> >and ns: is matched not by the prefix, but by the expanded namespace URI.
> >
> I would consider this approach an abuse of the XML namespace mechanism,
> as in the way it is used in the XML definition, it is setting up a short
> namespace alias for the notation ns:element, not as a preprocessing
> instruction like #define in C. I think we should be able to come up with
> something more clear.
>
> >As far as uniqueness is concerned, the above mechanism works, but if we
> >want to allow more than one instance of the same role, we could indicate
> >so like this:
> >
> > <element xmlns:webmail="http://apache.org/cocoon/webapp"
> >          href="cocoon://webmail:mywebapp/some/resource"/>
> >
> >But this creates a composition problem: one CWA must know in advance the
> >instance-specific name of the other CWA. Since this is controlled by the
> >CWA deployer and cannot be hardcoded (unless we accept name collisions),
> >this is a weak contract and it's very likely to break everything very
> >soon. (with a very hard time figuring out what to do).
> >
> >So, here is my solution (that closely follows the strategy we designed
> >for Avalon blocks):
> >
> >  each CWA indicates
> >    o  its role as a URI  (http://apache.org/cocoon/webapp)
> >    o  its name as a human readable form  (My Fancy Webapp)
> >    o  its version as major.minor format (2.3)
> >    o  its dependancies on other CWA (role:version)
> >    o  its dependencies on external configurations
> >
> >when the CWA is deployed, the following things happen:
> >
> > 1) the CWA deployment descriptor is read
> >
> > 2) a machine specific name is given to the deployed instance. (if
> >another CWA of the same role:version pair is already in place, the
> >instance name must be unique, for example, adding a counter at the end
> >such as "http://apache.org/cocoon/webapp:2.3:2"
> >
> >3) for each CWA dependancy do:
> >  3.a) check if a CWA with that role is already in place.
> >  3.b) if so
> >    3.b.i) if only instance of that role, map the role to that instance.
> >    3.b.ii) otherwise, prompt the deployer and ask for which available
> >instance should be associated to that role.
> >
> I am always fearful of prompts that do something for me that I may not
> have understood completely. Do we speak of a solely installer-based
> approach here or might this allow the admin to just drop-in the .cwa
> file and add an entry to the sitemap without anything able to directly
> prompt at all ?

I see the web-based (Cocoon-based) configuration editor being used here
for deployment configuration (yes, JMX comes to mind too).

Giacomo

>
> We should give the administrator a way to explicitly name a certain
> instance, leaving the administrator the choice between automatic
> GUI-driven and good-old vi-driven approaches, as the admin can partially
> or completely override the naming scheme and refer CWA dependencies to
> their targets manually. Whenever people would use the dynamic naming
> scheme to ensure multiple installations do not collide, they show a
> certain lack of interest to make sure they connect CWA dependencies
> correctly.
>
> >3.c) otherwise, use the role URI to download the required CWA [we can
> >define how this is done later] and deploy it.
> >
> > 4) for each configuration dependancy do:
> >   4.a) check if the configuration key already exists in the conf
> >registry
> >    4.a.i) if so, prompt the user if the available value is ok
> >     4.a.i.1) if so, go on
> >     4.a.i.2) otherwise, change the value associated to that
> >configuration and relative to that instance only.
> >    4.a.ii) otherwise, prompt the deployer for the conf value
> >
> > 5) the deployer is finally asked for a URI location to mount the CWA
> >instance.
> >
> Basically this sounds good ;)
>
> >
> >NOTE:
> >
> > 1) possible recursive dependancies might create a deadlock on the
> >deployment phase, expecially when the Cocoon container is initially
> >empty. This is unlikely to happen for well designed components, but we
> >can download and scan all required CWA for deadlocks before actually do
> >any real deployment so that problems can be stopped *before* entering
> >the system.
> >
> >
> > 2) if more than one instance of a single webapp is available, the conf
> >registry must be smart enough to lookup a configuration based not only
> >on the requested path but also on the webapp instance that has requested
> >it. This avoid collisions due to the fact that different instances of
> >the same role by definition share the same configuration needs.
> >
> >>I guess there are plenty of opportunities to discuss what can be done
> >>better or easier differently, so let's hear them.
> >>
> >The only thing that is left to discuss is how (who does it and at what
> >level) the address translation between roled-based access and real URI
> >address is performed.
> >
> Are there already existing solutions being used in Cocoon ? What about
> the cocoon:// protocol ?
>
> Best regards,
>
> Michael Hartle



Re: [RT] Cocoon web applications

Posted by Michael Hartle <mh...@hartle-klug.com>.

Stefano Mazzocchi wrote:

>>Ok, asbestos underwear in position - here we go:
>>
>Sam, seems like people liked this expression very much :)
>
It was sort of an inspiring picture, I admit ;)

>>Some setup like mounting could happen in a sitemap.xmap, as currently
>>this is the place controlling URI space; this even allows for an
>>auto-mounting extension for the sitemap. Many packages will require
>>setup beyond mounting, for example where to find the corporate identity
>>stylesheets, which accounting database to use, or what's the business
>>name to be put on the tax report thats being produced, so I need both
>>the information about what I CAN configure and what I actually DO
>>configure for this package.
>>
>Again, I was thinking about IoC: have the CWA ask Cocoon about things it
>needs instead of having you to modify the things after setup.
>
Oh, I am not thinking of modifying the CWA for any purpose, I simply 
meant supplying Cocoon with configuration information it would pass on 
to the CWA when appropriate. I am interpreting the IoC principle the 
other way around; the CWA is "just" a web application among many others 
that are also being some sort of drop-in. So it is not the CWA 
actively/dynamically asking Cocoon about something, but Cocoon handing 
down configuration information to the CWA at a certain stage.

>The contract is the internal tree shape (sort of URI space for
>configurations) and CWA might look for configurations in there. For
>example in a sitemap
>
> <map:parameter name="database"
>value="conf://datasource/relational/main"/>
> <map:parameter name="user"
>value="conf://datasource/relational/main/user"/>
> <map:parameter name="password"
>value="conf://datasource/relational/main/password"/>
>
>something like that. 
>
>It is also pretty easy to scan the CWA for conf:// protocols and
>understand if the registry contains already the information or needs to
>prompt the installer for it.
>
>This way, security sensible information is stored by Cocoon in another
>location, probably out of the addressing space, making it inherently
>more secure (it might even attach directly to an LDAP server, for that
>matter).
>
The concept of allowing different sources (LDAP, filesystem, etc.) for
configuration as described is interesting, although I do think that each
application should have its own dedicated branch for setup information
instead of forcing each parameter to be passed explicitly; as large CWAs
are going to happen (they will, right?), the sheer length of the list of
"parameters" for customizing them could well justify this approach.

This would even allow passing on an instance of configuration source to 
the CWA, this instance implementing some ConfigurationSource-interface - 
be aware that I am not sure whether such an interface or class with this 
name is already existing, and I am not knowingly referring to anything 
here as I just made a wild guess how this could be named. This might be 
an LDAPConfigurationSource or a FileConfigurationSource, being able to 
deliver arbitrary configuration information to the CWA as a stream of 
SAX events. And those ConfigurationSources could be defined in the
sitemap and configured with parameters when needed. Hey, why not
simply consider different configuration sources as sources for SAX
events directly?

>>3.) As the .cwa package does not know in advance where it will be
>>deployed, it cannot know about the URI space it will be accessable from
>>via the web, yet most content needs to point to other content in this
>>package, for example just simple links from HTML page A to HTML page B.
>>
>I'd assume that the URI structure of the CWA package can be considered a
>contract. So, the only "soft" thing is the location where this "hard"
>URI tree is mounted.
>
That's what I meant but didn't express. ;)

>>If  resource names of pipelines were added to the sitemap which are
>>local to the package/sitemap, the .cwa designer could just use resource
>>names in his package and have them resolved later via taglibs in a page
>>or other means in the sitemap like a cocoon-protocol extension like it
>>was posted for role-based access.
>>
>Exactly, we still have to define "how" those "soft"+"hard" links are
>actually translated to real URL addresses, but we agree on the mechanism
>and this is a good thing.
>
Ok, so the "how" basically has to allow the developer to use "soft" and
"hard" links interchangeably all over the place, even though "soft"
links might need one or two additional steps.

>but one solution would be use (abuse?) the XML namespace mechanism
>
> <element xmlns:webmail="http://apache.org/cocoon/webapp"
>          href="cocoon://webmail/some/resource"/>
>
>where we extend the default namespace behavior to do namespace
>resolution even inside the attribute content. In fact, even XSLT does so
>when doing
> 
> <xsl:template match="ns:element" xmlns:ns="...">
>
>and ns: is matched not by the prefix, but by the expanded namespace URI.
>
I would consider this approach an abuse of the XML namespace mechanism, 
as in the way it is used in the XML definition, it is setting up a short 
namespace alias for the notation ns:element, not as a preprocessing 
instruction like #define in C. I think we should be able to come up with 
something more clear.

>As far as uniqueness is concerned, the above mechanism works, but if we
>want to allow more than one instance of the same role, we could indicate
>so like this:
>
> <element xmlns:webmail="http://apache.org/cocoon/webapp"
>          href="cocoon://webmail:mywebapp/some/resource"/>
>
>But this creates a composition problem: one CWA must know in advance the
>instance-specific name of the other CWA. Since this is controlled by the
>CWA deployer and cannot be hardcoded (unless we accept name collisions),
>this is a weak contract and it's very likely to break everything very
>soon. (with a very hard time figuring out what to do).
>
>So, here is my solution (that closely follows the strategy we designed
>for Avalon blocks):
>
>  each CWA indicates
>    o  its role as a URI  (http://apache.org/cocoon/webapp)
>    o  its name as a human readable form  (My Fancy Webapp)
>    o  its version as major.minor format (2.3)
>    o  its dependancies on other CWA (role:version)
>    o  its dependencies on external configurations
>
>when the CWA is deployed, the following things happen:
>
> 1) the CWA deployment descriptor is read
>
> 2) a machine specific name is given to the deployed instance. (if
>another CWA of the same role:version pair is already in place, the
>instance name must be unique, for example, adding a counter at the end
>such as "http://apache.org/cocoon/webapp:2.3:2"
>
>3) for each CWA dependancy do:
>  3.a) check if a CWA with that role is already in place.
>  3.b) if so 
>    3.b.i) if only instance of that role, map the role to that instance.
>    3.b.ii) otherwise, prompt the deployer and ask for which available
>instance should be associated to that role.
>
I am always fearful of prompts that do something for me that I may not 
have understood completely. Do we speak of a solely installer-based 
approach here or might this allow the admin to just drop-in the .cwa 
file and add an entry to the sitemap without anything able to directly 
prompt at all ?

We should give the administrator a way to explicitly name a certain 
instance, leaving the administrator the choice between automatic 
GUI-driven and good-old vi-driven approaches, as the admin can partially 
or completely override the naming scheme and refer CWA dependencies to 
their targets manually. Whenever people would use the dynamic naming 
scheme to ensure multiple installations do not collide, they show a 
certain lack of interest to make sure they connect CWA dependencies 
correctly.

>3.c) otherwise, use the role URI to download the required CWA [we can
>define how this is done later] and deploy it.
>
> 4) for each configuration dependancy do:
>   4.a) check if the configuration key already exists in the conf
>registry
>    4.a.i) if so, prompt the user if the available value is ok
>     4.a.i.1) if so, go on
>     4.a.i.2) otherwise, change the value associated to that
>configuration and relative to that instance only.
>    4.a.ii) otherwise, prompt the deployer for the conf value
>
> 5) the deployer is finally asked for a URI location to mount the CWA
>instance.
>
Basically this sounds good ;)

>
>NOTE:
>
> 1) possible recursive dependancies might create a deadlock on the
>deployment phase, expecially when the Cocoon container is initially
>empty. This is unlikely to happen for well designed components, but we
>can download and scan all required CWA for deadlocks before actually do
>any real deployment so that problems can be stopped *before* entering
>the system.
>
>
> 2) if more than one instance of a single webapp is available, the conf
>registry must be smart enough to lookup a configuration based not only
>on the requested path but also on the webapp instance that has requested
>it. This avoid collisions due to the fact that different instances of
>the same role by definition share the same configuration needs.
>
>>I guess there are plenty of opportunities to discuss what can be done
>>better or easier differently, so let's hear them.
>>
>The only thing that is left to discuss is how (who does it and at what
>level) the address translation between roled-based access and real URI
>address is performed.
>
Are there already existing solutions being used in Cocoon ? What about 
the cocoon:// protocol ?

Best regards,

Michael Hartle




Re: [RT] Cocoon web applications

Posted by giacomo <gi...@apache.org>.

On Sat, 6 Oct 2001, Stefano Mazzocchi wrote:

> This thread is become really good.
>
> Michael Hartle wrote:
> >
> > Hi,
> >
> > I guess that adding practical examples and consequences to the
> > suggestions will help discussing the concepts at a broader scale. I just
> > love examples that can be torn in half and rebuild.
> >
> > Ok, asbestos underwear in position - here we go:
>
> Sam, seems like people liked this expression very much :)

:)

>
> > 1.) We want easily deployable packages, just like .war or .ear files in
> > other contexts. For Cocoon, this would be .cwa files; I guess it's just
> > a zip file with some information relating to the package, like a
> > MANIFEST in a .jar file. I would consider this extra information to be
> > at least a sitemap.xmap that controls the sub-URI-space of the package.
>
> I share your vision totally.
>
> > 2.) We want these easily deployable .cwa packages to be self-contained.
> > I consider modifying a package for setup issues impractical, so the
> > setup for a package should happen outside of the package. Inversion of
> > control, Avalon principle ;). At the same time this allows a single .cwa
> > package file to be setup and deployed at multiple places at the same time.
>
> Ditto.
>
> > Some setup like mounting could happen in a sitemap.xmap, as currently
> > this is the place controlling URI space; this even allows for an
> > auto-mounting extension for the sitemap. Many packages will require
> > setup beyond mounting, for example where to find the corporate identity
> > stylesheets, which accounting database to use, or what's the business
> > name to be put on the tax report thats being produced, so I need both
> > the information about what I CAN configure and what I actually DO
> > configure for this package.
>
> Again, I was thinking about IoC: have the CWA ask Cocoon about things it
> needs instead of having you to modify the things after setup.

I thought IoC works the other way: have Cocoon tell the CWA about
those things.

>
> > I think of the former as sort of a standardized setup-info.xml or the
> > like regulary contained in .cwa packages that have something to be
> > configured. The latter could be an configuration file positioned
> > somewhere near the sitemap.xmap and cocoon.xconf files in the
> > filesystem. One might even refer to the configuration file from the
> > sitemap, so the way the configuration files are being organized is left
> > as a choice to the system administrator.
>
> I recently came to know MacOSX NetInfo package and found it *very*
> elegant (much more elegant than the 30-years old /etc directory,
> anyway): it's a short of directory server that contains configuration. A
> sort of elegant and simple configuration registry (unlike windows') that
> is used to "serve" configurations to who it requires them.
>
> It provides a central point of configuration and an easy way to deploy
> something without having to modify it.

I've recently discussed this with other people. A central point of
configuration is essential especially if you have to manage a
distributed environment.

Giacomo



---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Cocoon web applications

Posted by Stefano Mazzocchi <st...@apache.org>.

This thread is becoming really good.

Michael Hartle wrote:
> 
> Hi,
> 
> I guess that adding practical examples and consequences to the
> suggestions will help discussing the concepts at a broader scale. I just
> love examples that can be torn in half and rebuilt.
> 
> Ok, asbestos underwear in position - here we go:

Sam, seems like people liked this expression very much :)

> 1.) We want easily deployable packages, just like .war or .ear files in
> other contexts. For Cocoon, this would be .cwa files; I guess it's just
> a zip file with some information relating to the package, like a
> MANIFEST in a .jar file. I would consider this extra information to be
> at least a sitemap.xmap that controls the sub-URI-space of the package.

I share your vision totally.

> 2.) We want these easily deployable .cwa packages to be self-contained.
> I consider modifying a package for setup issues impractical, so the
> setup for a package should happen outside of the package. Inversion of
> control, Avalon principle ;). At the same time this allows a single .cwa
> package file to be set up and deployed at multiple places at the same time.

Ditto.

> Some setup like mounting could happen in a sitemap.xmap, as currently
> this is the place controlling URI space; this even allows for an
> auto-mounting extension for the sitemap. Many packages will require
> setup beyond mounting, for example where to find the corporate identity
> stylesheets, which accounting database to use, or what's the business
> name to be put on the tax report that's being produced, so I need both
> the information about what I CAN configure and what I actually DO
> configure for this package.

Again, I was thinking about IoC: have the CWA ask Cocoon about things it
needs instead of having you modify things after setup.

> I think of the former as sort of a standardized setup-info.xml or the
> like regularly contained in .cwa packages that have something to be
> configured. The latter could be a configuration file positioned
> somewhere near the sitemap.xmap and cocoon.xconf files in the
> filesystem. One might even refer to the configuration file from the
> sitemap, so the way the configuration files are being organized is left
> as a choice to the system administrator.

I recently came to know the MacOSX NetInfo package and found it *very*
elegant (much more elegant than the 30-year-old /etc directory,
anyway): it's a sort of directory server that contains configuration. A
sort of elegant and simple configuration registry (unlike Windows') that
is used to "serve" configurations to whoever requires them.

It provides a central point of configuration and an easy way to deploy
something without having to modify it.

The contract is the internal tree shape (a sort of URI space for
configurations), and a CWA might look for configurations in there. For
example, in a sitemap:

 <map:parameter name="database"
                value="conf://datasource/relational/main"/>
 <map:parameter name="user"
                value="conf://datasource/relational/main/user"/>
 <map:parameter name="password"
                value="conf://datasource/relational/main/password"/>

something like that. 

It is also pretty easy to scan the CWA for conf:// references and
work out whether the registry already contains the information or needs
to prompt the installer for it.
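
A minimal sketch of such a scan, in Python for brevity. Nothing here is
an existing Cocoon API: the registry is modelled as a plain dict keyed
by the path after conf://, and find_conf_keys / missing_keys are
hypothetical names.

```python
import re

# Matches conf:// references such as conf://datasource/relational/main
CONF_REF = re.compile(r"""conf://([^\s"'<>]+)""")

def find_conf_keys(text):
    """Return the sorted, de-duplicated conf:// paths mentioned in a file."""
    return sorted(set(CONF_REF.findall(text)))

def missing_keys(text, registry):
    """Return the conf:// paths the installer would have to be asked for."""
    return [key for key in find_conf_keys(text) if key not in registry]

sitemap = '''
 <map:parameter name="database" value="conf://datasource/relational/main"/>
 <map:parameter name="user" value="conf://datasource/relational/main/user"/>
 <map:parameter name="password" value="conf://datasource/relational/main/password"/>
'''

# Hypothetical registry content: only the datasource itself is known so far.
registry = {"datasource/relational/main": "jdbc:hypothetical:main"}

for key in missing_keys(sitemap, registry):
    print("installer must supply: conf://" + key)
```

With the registry above, only the user and password entries would be
prompted for.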

This way, security-sensitive information is stored by Cocoon in another
location, probably outside the addressing space, making it inherently
more secure (it might even attach directly to an LDAP server, for that
matter).

> 3.) As the .cwa package does not know in advance where it will be
> deployed, it cannot know about the URI space it will be accessible from
> via the web, yet most content needs to point to other content in this
> package, for example just simple links from HTML page A to HTML page B.

I'd assume that the URI structure of the CWA package can be considered a
contract. So, the only "soft" thing is the location where this "hard"
URI tree is mounted.

> If  resource names of pipelines were added to the sitemap which are
> local to the package/sitemap, the .cwa designer could just use resource
> names in his package and have them resolved later via taglibs in a page
> or other means in the sitemap like a cocoon-protocol extension like it
> was posted for role-based access.

Exactly, we still have to define "how" those "soft"+"hard" links are
actually translated to real URL addresses, but we agree on the mechanism
and this is a good thing.

> 4.) .cwa packages will rarely be on their own, not interconnecting. So
> resource naming would need to work between .cwa packages. Giving each
> deployed .cwa package a global name, the local resource name for a
> pipeline could be referenced from another position.

You touch on another important point here: on the one hand, addressing
by role must not sacrifice the ability to have multiple instances of the
same role; on the other hand, it must be precise enough to avoid name
collisions.

This is the same problem faced by both Java dynamic loading and XML
namespaces: both use URIs as unique identification.

Avalon, for example, uses the inverse dot notation (in short, the
interface name, e.g. org.apache.cocoon.component.Parser) to create
unique behaviors identified by the interface that represents them.

The same goes for namespaces: the xmlns attribute is a way to reduce
verbosity, but it doesn't change the nature of the internal infoset,
which assumes that all elements are prefixed with the URI that uniquely
identifies them.

So, each CWA must indicate both:

 o its unique role
 o its instance identifier

For example, a webmail CWA could be identified by

 http://apache.org/cocoon/webapp/webmail
 My Fancy WebApp 2.3

Now, the problem is that we cannot impose the use of something like

 cocoon://[http://apache.org/cocoon/webapp]/some/resource

but one solution would be to use (abuse?) the XML namespace mechanism

 <element xmlns:webmail="http://apache.org/cocoon/webapp"
          href="cocoon://webmail/some/resource"/>

where we extend the default namespace behavior to do namespace
resolution even inside the attribute content. In fact, even XSLT does so
when doing

 <xsl:template match="ns:element" xmlns:ns="...">

and ns: is matched not by the prefix, but by the expanded namespace URI.
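
A rough sketch of that prefix expansion inside attribute content,
assuming the in-scope namespace declarations are already available as a
dict (expand_cocoon_address is a hypothetical helper, not part of
Cocoon):

```python
def expand_cocoon_address(address, namespaces):
    """Split cocoon://prefix/path into (role URI, path).

    `namespaces` maps the prefixes declared via xmlns:prefix="..." on the
    element (or its ancestors) to their namespace URIs.
    """
    scheme = "cocoon://"
    if not address.startswith(scheme):
        raise ValueError("not an indirect cocoon address: " + address)
    prefix, _, path = address[len(scheme):].partition("/")
    if prefix not in namespaces:
        raise KeyError("undeclared namespace prefix: " + prefix)
    return namespaces[prefix], "/" + path

# The prefix is only a local shorthand: two documents may pick different
# prefixes for the same role and still resolve to the same pair.
in_scope = {"webmail": "http://apache.org/cocoon/webapp"}
role, path = expand_cocoon_address("cocoon://webmail/some/resource", in_scope)
print(role, path)
```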

As far as uniqueness is concerned, the above mechanism works, but if we
want to allow more than one instance of the same role, we could indicate
so like this:

 <element xmlns:webmail="http://apache.org/cocoon/webapp"
          href="cocoon://webmail:mywebapp/some/resource"/>

But this creates a composition problem: one CWA must know in advance the
instance-specific name of the other CWA. Since this is controlled by the
CWA deployer and cannot be hardcoded (unless we accept name collisions),
this is a weak contract and it's very likely to break everything very
soon (with a very hard time figuring out what to do).

So, here is my solution (that closely follows the strategy we designed
for Avalon blocks):

  each CWA indicates
    o  its role as a URI  (http://apache.org/cocoon/webapp)
    o  its name in human readable form  (My Fancy WebApp)
    o  its version in major.minor format (2.3)
    o  its dependencies on other CWAs (role:version)
    o  its dependencies on external configurations

when the CWA is deployed, the following things happen:

 1) the CWA deployment descriptor is read

 2) a machine specific name is given to the deployed instance (if
another CWA of the same role:version pair is already in place, the
instance name must be unique, for example by adding a counter at the
end, such as "http://apache.org/cocoon/webapp:2.3:2").

 3) for each CWA dependency do:
  3.a) check if a CWA with that role is already in place.
  3.b) if so
    3.b.i) if there is only one instance of that role, map the role to
that instance.
    3.b.ii) otherwise, prompt the deployer and ask which available
instance should be associated with that role.
  3.c) otherwise, use the role URI to download the required CWA [we can
define how this is done later] and deploy it.

 4) for each configuration dependency do:
   4.a) check if the configuration key already exists in the conf
registry
    4.a.i) if so, ask the user whether the available value is ok
     4.a.i.1) if so, go on
     4.a.i.2) otherwise, change the value associated with that
configuration, relative to that instance only.
    4.a.ii) otherwise, prompt the deployer for the conf value

 5) the deployer is finally asked for a URI location to mount the CWA
instance.
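
Steps 2) and 3) could look roughly like the sketch below. It is only an
illustration: the set of deployed instances is a plain dict, the prompt
to the deployer is a callback, and all names (instance_name,
resolve_dependency, ask_deployer) are hypothetical.

```python
def instance_name(role, version, deployed):
    """Step 2: build a machine-specific name, adding a counter on collision."""
    base = f"{role}:{version}"
    if base not in deployed:
        return base
    counter = 2
    while f"{base}:{counter}" in deployed:
        counter += 1
    return f"{base}:{counter}"

def resolve_dependency(role, deployed, ask_deployer):
    """Step 3: map a required role to one deployed instance.

    `deployed` maps instance name -> role. Returns None when no instance
    of the role exists, i.e. the case where the CWA has to be fetched (3.c).
    """
    candidates = sorted(name for name, r in deployed.items() if r == role)
    if not candidates:
        return None                        # 3.c: download and deploy
    if len(candidates) == 1:
        return candidates[0]               # 3.b.i: only one choice
    return ask_deployer(role, candidates)  # 3.b.ii: let the deployer pick

role = "http://apache.org/cocoon/webapp"
deployed = {}
first = instance_name(role, "2.3", deployed)
deployed[first] = role
second = instance_name(role, "2.3", deployed)
deployed[second] = role
print(first)
print(second)
```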

NOTE:

 1) possible recursive dependencies might create a deadlock in the
deployment phase, especially when the Cocoon container is initially
empty. This is unlikely to happen for well designed components, but we
can download and scan all required CWAs for deadlocks before actually
doing any real deployment, so that problems can be stopped *before*
entering the system.

 2) if more than one instance of a single webapp is available, the conf
registry must be smart enough to look up a configuration based not only
on the requested path but also on the webapp instance that has requested
it. This avoids collisions due to the fact that different instances of
the same role by definition share the same configuration needs.
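
Both notes can be sketched in a few lines. This assumes the dependency
declarations have already been read into a dict of role -> required
roles, and that the conf registry is keyed by (instance, path) with
None standing for the shared value; find_cycle and lookup are
hypothetical names.

```python
def find_cycle(dependencies):
    """Note 1: return one dependency cycle as a list of roles, or None."""
    visiting, done = [], set()

    def visit(role):
        if role in done:
            return None
        if role in visiting:
            # We came back to a role that is still being expanded.
            return visiting[visiting.index(role):] + [role]
        visiting.append(role)
        for required in dependencies.get(role, []):
            cycle = visit(required)
            if cycle:
                return cycle
        visiting.pop()
        done.add(role)
        return None

    for role in dependencies:
        cycle = visit(role)
        if cycle:
            return cycle
    return None

def lookup(registry, instance, path):
    """Note 2: prefer the value set for this instance, else the shared one."""
    return registry.get((instance, path), registry.get((None, path)))

registry = {
    (None, "datasource/relational/main"): "shared-db",
    ("webapp:2.3:2", "datasource/relational/main"): "second-db",
}
print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))
print(lookup(registry, "webapp:2.3:2", "datasource/relational/main"))
```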

> I guess there are plenty of opportunities to discuss what can be done
> better or easier differently, so let's hear them.

The only thing that is left to discuss is how (who does it and at what
level) the address translation between role-based access and real URI
address is performed.

Everything else looks in pretty good design shape to me, but of course,
comments are more than welcome.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


Re: [RT] Cocoon web applications

Posted by Michael Hartle <mh...@hartle-klug.com>.

Hi,

I guess that adding practical examples and consequences to the
suggestions will help in discussing the concepts at a broader scale. I
just love examples that can be torn in half and rebuilt.

Ok, asbestos underwear in position - here we go:

1.) We want easily deployable packages, just like .war or .ear files in 
other contexts. For Cocoon, this would be .cwa files; I guess it's just 
a zip file with some information relating to the package, like a 
MANIFEST in a .jar file. I would consider this extra information to be 
at least a sitemap.xmap that controls the sub-URI-space of the package.

2.) We want these easily deployable .cwa packages to be self-contained.
I consider modifying a package for setup issues impractical, so the
setup for a package should happen outside of the package. Inversion of
control, Avalon principle ;). At the same time this allows a single .cwa
package file to be set up and deployed at multiple places at the same time.

Some setup like mounting could happen in a sitemap.xmap, as currently 
this is the place controlling URI space; this even allows for an 
auto-mounting extension for the sitemap. Many packages will require 
setup beyond mounting, for example where to find the corporate identity 
stylesheets, which accounting database to use, or what's the business 
name to be put on the tax report that's being produced, so I need both
the information about what I CAN configure and what I actually DO 
configure for this package.

I think of the former as sort of a standardized setup-info.xml or the 
like regularly contained in .cwa packages that have something to be
configured. The latter could be a configuration file positioned
somewhere near the sitemap.xmap and cocoon.xconf files in the 
filesystem. One might even refer to the configuration file from the 
sitemap, so the way the configuration files are being organized is left 
as a choice to the system administrator.

3.) As the .cwa package does not know in advance where it will be
deployed, it cannot know about the URI space it will be accessible from
via the web, yet most content needs to point to other content in this
package, for example just simple links from HTML page A to HTML page B.
If resource names of pipelines that are local to the package/sitemap
were added to the sitemap, the .cwa designer could just use resource
names in his package and have them resolved later via taglibs in a page
or by other means in the sitemap, such as a cocoon-protocol extension
like the one posted for role-based access.

4.) .cwa packages will rarely be on their own, not interconnecting. So 
resource naming would need to work between .cwa packages. Giving each 
deployed .cwa package a global name, the local resource name for a 
pipeline could be referenced from another position.


I guess there are plenty of opportunities to discuss what can be done
better, more easily or differently, so let's hear them.

Best regards,

Michael Hartle



Re: [RT] Cocoon web applications

Posted by Stefano Mazzocchi <st...@apache.org>.

Berin Loritsch wrote:

> > Behavioral-only pipeline composition might lead to unwanted behavior and
> > would be very hard to understand what's wrong, especially the more
> > namespaces and components are mixed and aggregated.
> 
> I am not advocating behavioral-only pipelines.  I am advocating declared
> pipelines with behavioral optimization.  There is a key difference there.

Granted and agreed.
 
Stefano.




Re: [RT] Cocoon web applications

Posted by Berin Loritsch <bl...@apache.org>.

Stefano Mazzocchi wrote:
> 
> As we are touching a *deep core* problem, we must be very formal and
> abstract and analyze all the implicit assumptions. So this is why I have
> to be picky on what you say above:
> 
> if you read again my previous statements and understand my terminology
> that indicates that a "schema" contains all the necessary information on
> how to mix different namespaces, you'll understand that you make an
> implicit assumption in your analysis: the topological space defined by
> input and output namespaces are translated but never rotated.

This is not always the case.  For instance, Stylebook schema does not
"allow" you to mix arbitrary schemas inside of it--however in my example
that is temporarily the case.  The transformers convert it into the
proper schema.  Many times, Schema also allows you to embed an *any*
element (any schema, any markup after that point).

> This is far from being trivial to understand so let me expand on your
> example updating the notation in a more readable (at least to me) form:
> 
>             g[doc][loc][schem] ->
>  [*][schem]t1[doc][loc] ->
>    [*][loc]t2[doc] ->
>       [doc]t3[xhtml] ->
> 
> which indicates that
> 
>  t1 transforms [schem] to [doc][loc] and everything else is copied over
>  t2 transforms [loc] to [doc] and everything else is copied over
>  t3 transforms [doc] to [xhtml]

The transformation process that I was attempting to point out was that
Cocoon knows that the Generator outputs [doc][loc][schem].  It knows
that t1 will take anything and transform the initial [schem] parts
to [doc] format.  Using simple logic, Cocoon knows now that the output
from t1 is now [doc][loc].  Again for t2, it knows that it will take
anything and convert the [loc] schema to [doc] and pass everything else.
It also knows by simple logic that the only schema left at this point
is the [doc] schema.  This allows t3 to go through with confidence and
the only result of the chain is [xhtml] schema.

In other words, the tracking mechanism knows how to narrow the pipeline.
In order to accomplish this, the schema declaration must be declarative
instead of reactive.  In other words, the sitemap must declare what the
pipeline components are expected to output and expected to accept.  That
way the pipeline optimization can happen without incurring a performance
cost at the beginning.
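
To make the narrowing concrete, here is a small sketch in Python rather
than anything Cocoon does today. Each transformer is reduced to a
hypothetical (consumes, produces) declaration, everything else is
assumed to pass through untouched, and track is an illustrative name.

```python
def track(generated, transformers):
    """Return the set of schemas present after each declared stage."""
    current = set(generated)
    history = [set(current)]
    for consumes, produces in transformers:
        if consumes in current:
            # The consumed schema disappears, its replacement appears.
            current = (current - {consumes}) | {produces}
        history.append(set(current))
    return history

# t1: [schem] -> [doc], t2: [loc] -> [doc], t3: [doc] -> [xhtml]
stages = [("schem", "doc"), ("loc", "doc"), ("doc", "xhtml")]
history = track({"doc", "loc", "schem"}, stages)
for step in history:
    print(sorted(step))
```

The last entry of history contains only xhtml, which is the confidence
t3 needs.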

> but this is far from being a general enough notation, because it
> presumes that stylesheets work on namespaces orthogonally and don't mix
> them.
> 
> While I agree this is a "good way" to design modular stylesheets, it's
> not general enough, in fact it fails to describe a stylesheet that does:
> 
>  <xsl:template match="loc:element/doc:element">
>   ...
>  </xsl:template>
> 
> therefore assumes that the [doc] and [loc] namespaces are intermixed and
> in a well defined order.

Internal to a Cocoon webapp, this can and should be enforced.  The external
interfaces need to be robust enough with well defined schemas.

> Yes, it's very similar to what I was thinking, but this should NOT, IMO,
> happen at sitemap level, but at pipeline assembly level, thus when
> authoring and creating the webapp thru component composition.

Maybe not, but the interfaces must be well documented to provide the sitemap
and tools necessary for the resource.

> Such an authoring tool might well be a GUI version of Cocoon itself or
> something else that connects to Cocoon for the deployment, I don't know,
> but all this looking-up and discoveries should not happen on a live
> site, IMO.

Exactly.  GUI tools require and encourage a good set of metadata to
allow both validation and information for the user.

> > IOW, Root registers [doc] and [loc] with Stocking.  Stocking configures
> > itself so that it does not transform the [loc] schema--assuming that the
> > parent knows how to handle it.  However, because the Root did not state
> > that it could handle [schem] schema, Stocking applies the transformation
> > for that.
> 
> I get the feeling that too much happens behind my back. I'd like to be
> able to compose my pipelines in such an easy way, but I'd also
> like to be able to modify my sitemaps by hand when required (WYSIWYG has
> drawbacks and we all know that very well, don't we?).
> 
> Also, how does this handle conflicts? what if there are two different
> instances of a transformer that performs the same behavior [for example,
> two stylesheets one fancy for nice graphics and one simple for text-like
> browsers]? how should I choose between them?

Again, let me give another example:

<map:match pattern="**.html">
  <map:generate source="xdocs/{1}.xml"/>
  <map:transform source="stylesheets/schem2doc.xsl"/>
  <map:transform source="stylesheets/loc2doc.xsl"/>
  <map:transform source="stylesheets/doc2xhtml.xsl"/>
  <map:serialize/>
</map:match>

The pipeline is declared for multiple resources (a common occurrence).  Not
every resource has [schem] or [loc] schemas.  For those resources, the
unneeded step is removed.  The transform processes are declared in the
sitemap as always.  Again, by augmenting the stages with declarative
schema information, we can show how other transformers can be substituted
(in the GUI tool).
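
A sketch of that step removal for the pipeline above, assuming each
stylesheet carries a declared input and output schema (these
declarations are hypothetical, the sitemap has no such attributes
today, and prune is an illustrative name):

```python
def prune(generated, stages):
    """Keep only the stylesheets whose input schema can actually occur."""
    current = set(generated)
    kept = []
    for source, consumes, produces in stages:
        if consumes in current:
            kept.append(source)
            current = (current - {consumes}) | {produces}
    return kept

stages = [
    ("stylesheets/schem2doc.xsl", "schem", "doc"),
    ("stylesheets/loc2doc.xsl", "loc", "doc"),
    ("stylesheets/doc2xhtml.xsl", "doc", "xhtml"),
]

# A resource that only uses the [doc] schema skips the first two steps.
print(prune({"doc"}, stages))
# A resource with [doc] and [loc] skips only the [schem] step.
print(prune({"doc", "loc"}, stages))
```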

> Behavioral-only pipeline composition might lead to unwanted behavior and
> would be very hard to understand what's wrong, especially the more
> namespaces and components are mixed and aggregated.

I am not advocating behavioral-only pipelines.  I am advocating declared
pipelines with behavioral optimization.  There is a key difference there.

> So, the best solution would be, IMO, to allow behavioral-driven
> discovery of components in a sitemap editor and behavioral validation at
> CWA deployment, while leaving the sitemap (or whatever comes next) as
> static and explicit as possible, even if its readability is reduced.
> 
> I believe the path for easier Cocoon use passes also by creating aid
> tools because, in fact, sometimes we are simply too powerful to simplify
> the semantics of our configurations without sacrificing some of that
> power.
> 
> This doesn't mean the sitemap is perfect and will never be touched, but,
> IMO,  it should not sacrifice explicitness for ease of use, at least at
> this level.

When we have a tool to do pipelines, we can make the sitemap information
painfully explicit.  Until then, there are going to be pipelines reacting
to an aggregation of URIs.


Re: [RT] Cocoon web applications

Posted by Stefano Mazzocchi <st...@apache.org>.

Berin Loritsch wrote:

> > > That is what I am referring to.  As of Servlet 2.3 and much debate, the
> > > official stance on where "/resource" maps you is to the web server root,
> > > not the context root.  Instead, the context root is much more difficult
> > > to reach.  Perhaps we can improve the HTML serializer to automagically
> > > correct context root resources.
> >
> > Yuck! I'd hate that. Serializers that mangle things behind your back are
> > the worst pain in the ass to find out especially because you never look
> > at them since you normally consider them brainless and pure adapters
> > from the XML world to the binary world.
> >
> > Let's find a more elegant way.
> 
> OK, any ideas?

As I wrote in my previous email, this is the only thing that is left to
discuss on the design of a component model for Cocoon webapps.

Let's analyze the needs for addressing:

1) it must allow strong contracts inside a component and between
different components
2) it must avoid name collisions or contract misunderstanding
3) it must be easy and immediate to use
4) it must be as terse and as little error-prone as possible
5) it must totally hide the hassle of looking up, discovering or
otherwise accessing component instances.

I believe that the namespaced cocoon: protocol that I proposed in my
previous mail covers all points more or less decently:

 <element xmlns:webapp="http://apache.org/cocoon/webapp/2.3"
          src="cocoon://webapp/some/resource"/>

in fact

 1) the contract is rock solid as long as the internal URI structure of
the component remains solid. Versioning can be used to understand if the
structure is backward compatible (minor is bigger than requested) or not
(major is bigger than requested).

 2) since instance lookup is hidden and done by the container with
information given by the deployer at deployment time (or subsequently at
container reconfiguration time), the CWA doesn't need to know anything
about this and concerns do not overlap.

 3) it's not extremely easy to use, but it's good enough for those who
understand namespace matching in XSLT.

 4) it reduces verbosity since the same namespace declaration can be
used throughout the entire document.

 5) as for point 2)
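
The version rule in point 1) could be checked along these lines. This
is a sketch with one assumption beyond what is stated above: an
available version with a *smaller* major or minor number than requested
is also treated as incompatible. The names parse_version and
is_compatible are hypothetical.

```python
def parse_version(text):
    """Split a major.minor string such as "2.3" into two integers."""
    major, minor = text.split(".")
    return int(major), int(minor)

def is_compatible(requested, available):
    """True when `available` can stand in for `requested`."""
    req_major, req_minor = parse_version(requested)
    have_major, have_minor = parse_version(available)
    # Same major: the structure is kept; a bigger minor only adds to it.
    return have_major == req_major and have_minor >= req_minor

print(is_compatible("2.3", "2.5"))
print(is_compatible("2.3", "3.0"))
```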

So much for addressing.

How is address resolution performed?

In this context address resolution means translating an indirect
address of the above form into an absolute address. But there are two
different behaviors depending on whether the information is consumed
internally or externally.

If consumed internally (for example, inside the sitemap for
aggregation), resolution means to transform the address from the
indirect form

 cocoon://webapp/some/resource

into an absolute internal one

 cocoon:/mount/point/of/webapp/some/resource

while, if referenced externally (for example, in an HTML page that is
sent to the browser), resolution means to transform into an absolute
external one

 protocol://host/path/to/container/mount/point/of/webapp/some/resource

or a relative external one relativized from the absolute external form
of the current resource that contains the link.
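
The three forms can be sketched as follows. This is a simplification:
the name after cocoon:// is assumed to be already mapped to a deployed
instance, the mount table is a plain dict, and the function names are
hypothetical.

```python
import posixpath

def split_address(address):
    """cocoon://webapp/some/resource -> ("webapp", "some/resource")"""
    name, _, path = address[len("cocoon://"):].partition("/")
    return name, path

def resolve_internal(address, mounts):
    """Absolute internal form, as consumed by the cocoon: protocol handler."""
    name, path = split_address(address)
    return "cocoon:/" + mounts[name].strip("/") + "/" + path

def resolve_external(address, mounts, base_url):
    """Absolute external form, as sent to the browser."""
    name, path = split_address(address)
    return base_url.rstrip("/") + "/" + mounts[name].strip("/") + "/" + path

def relativize(target_path, current_path):
    """Relative external form, computed from the resource holding the link."""
    return posixpath.relpath(target_path, posixpath.dirname(current_path))

mounts = {"webapp": "/mount/point/of/webapp"}
address = "cocoon://webapp/some/resource"
print(resolve_internal(address, mounts))
print(resolve_external(address, mounts, "http://host/path/to/container"))
print(relativize("/mount/point/of/webapp/some/resource",
                 "/mount/point/of/webapp/other/page.html"))
```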

It's easy to discriminate between the two because the first behavior
actually invokes the cocoon: protocol handler, while the second does not
(it's just considered content from the pipelines).

So, while the first behavior is actually implemented inside the cocoon:
protocol handler, the second is more tricky.

My preferred solution would be to associate this behavior with "extended
xlinks" (which are my invention: every element that carries an
xlink:href attribute OR a src or href attribute is considered an
extended xlink; look into org.apache.cocoon.xml.xlink for more details
on this) and perform a transparent "extended xlinks cocoon address
resolution" just before serialization (NOT by the serializer itself).

This has some major advantages:

 1) behavior is transparent to the user
 2) they don't have to specify it in the sitemap as a transformation
component (they might forget to add it, and finding that out is hard)
 3) it works on every serialized format (even PDF and such).

I can't think of anything more elegant than this that solves our
problems.

> > > Let me expound.  I like to use a directory structure like this:
> > >
> > > /xdocs
> > > /resources
> > >       /images
> > >       /scripts
> > >       /styles
> > > /stylesheets
> > >       /system
> > > /WEB-INF
> > >       /logicsheets
> > >       /cocoon.xconf
> > >       /logkit.xconf
> > > /sitemap.xmap
> > > /${sub.app}
> > >       /xdocs
> > >       /resources
> > >             /images
> > >             /scripts
> > >       /sitemap.xmap
> > >
> > > The problem is when I want a consistent look and feel in my ${sub.app}
> > > area.  I cannot access the /stylesheets that are accessible via the
> > > context--but not via the sitemap.  This requires me to copy the
> > > /stylesheets to the ${sub.app}.
> >
> > Ok, in this case, an absolute URI would work and would not require
> > access to your parent, but to an absolute location (which, in this case,
> > accidentally, happens to be your parent)
> >
> > This is a simple fix and we can schedule it for Cocoon 2.1 since it
> > might break back compatibility of sitemaps a little.
> 
> Sounds good.

Great.

> > > Because Cocoon is an XML framework, in order for this approach to work,
> > > you have to define the interfaces.  There are definite roles that I
> > > have already identified.  Some of the solutions come from concepts in
> > > SOAP, and some of the solutions come from concepts in JNDI, but here goes.
> > >
> > > For sub applications to work, you must have them work to a specific schema.
> > > (this concept is from SOAP).  For instance, your resource must return
> > > the results in DocBook format so that the parent knows how to apply views.
> > > This is the interface of your "component".
> >
> > I've already thought about this when I thought about a way to validate
> > sitemaps and it's a *LOT* more complex than this.
> >
> > Let's make an example: the "behavioral interfaces" of pipeline
> > components are the expected input namespaces and the resulting
> > namespaces. But listing them is not enough: you must know the exact
> > structure, thus the namespace-aware schemas.
> >
> > Even between components, schemas are the structure description that
> > identify the expected "shape" of the SAX pipe that connects two
> > components.
> >
> > Now, suppose you have a pipeline such as
> >
> >  <g] -> [t1] -> [t2] -> [s>
> >
> > and you have
> >
> >  g -> output schema of generator
> >  t1i -> input schema of first transformer
> >  t1o -> output schema of first transformer
> >  t2i -> input schema of second transformer
> >  t2o -> output schema of second transformer
> >  s -> input schema of serializer
> >
> > with all this information you can precisely estimate if the pipeline is
> > "valid", in a behavioral sense.
> >
> > This would allow you to perform some pretests on sitemaps (before
> > compilation and before uploading) that avoid those "impedance
> > mismatches" between connected components.
> 
> This is excellent--validation is vital!

Exactly. But even more: knowing the behavioral interfaces of pipeline
components allows for indirect creation of the pipeline.

For example, indirect sitemap pseudocode might be:

 1) my generator creates stuff using schema G1
 2) have webapp "style" create a PDF out of it.

So, the sitemap looks up the webapp "style" and asks for a pipeline
fragment (a transformer and a serializer in this case, but it might be
more complex than this) that implements the behavior "from G1 to PDF".

If all pipeline components indicate their in/out schemas, simple
inference rules might be used to come up with those pipeline fragments
in a semi-automatic way, so if we have one transformer that goes from G1
to FO and a serializer that goes from FO to PDF, asking for G1 to PDF
might semi-automatically provide ways to assemble the pipeline.
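
A rough sketch of such an inference rule, in Python. The component
registry and its names are made up for illustration, and schemas are
reduced to plain labels that must match exactly (the simplest, sufficient
condition); real schema comparison would be much harder.

```python
from collections import deque

# Hypothetical component registry: (name, input schema, output schema).
COMPONENTS = [
    ("g1-to-fo", "G1", "FO"),      # a transformer
    ("fo-to-pdf", "FO", "PDF"),    # a serializer
    ("g1-to-html", "G1", "HTML"),  # unrelated, must not be picked
]


def find_fragment(source, target, components=COMPONENTS):
    """Breadth-first search for the shortest chain of components whose
    schemas connect `source` to `target`; returns None if none exists."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        schema, path = queue.popleft()
        if schema == target:
            return path
        for name, inp, out in components:
            if inp == schema and out not in seen:
                seen.add(out)
                queue.append((out, path + [name]))
    return None
```

Asking for find_fragment("G1", "PDF") walks G1 -> FO -> PDF and returns
the two component names in pipeline order; an authoring tool could offer
that list as a suggestion rather than apply it silently.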

I'm thinking about a graphical sitemap authoring tool: it might query a
CWA for particular pipeline fragments depending on in/out schema
behaviors and not only perform passive validation at the end.

> I know my practices, and I tend
> to use existing schemas, only inventing if necessary.  When I do invent
> a schema, I always have it generated by a logicsheet and provide a
> transformation to the main document schema.  This works for me, because
> it is a known environment.

I follow the same pattern (when possible), but still the XML world might
soon become so "babel-like" that it's hard to know the behavioral
interface of a stylesheet by simply looking at it or, even worse, by
reading its filename.

> What you are talking about is validating that not only I am doing my
> job right, but other people in my team don't make simple mistakes.
> The only thing is that the validation shouldn't be done in live serving.

Yes, such pipeline validation/assembly operations should be performed at
sitemap authoring time and maybe at deployment time, but for sure this is
too heavy (and useless) to perform during real-time serving (just like
XML validation is useless on live sites but extremely useful when
debugging the development site).

> I think we do need to have schema validation on during development (esp.
> when designing new schemas) to ensure the app works, but have it off for
> deployment--something the deployment tool can ensure.

Agreed.

> > As more and more Cocoon components emerge and are made available even
> > outside the Cocoon distribution, the ability to estimate the "behavioral
> > match" between two components will very likely be vital, especially for
> > sitemap authoring tools.
> >
> > The algorithm that performs the validation is far from being trivial: a
> > sufficient condition (and the most simple one) requires the connecting
> > ends to be identified by the exact same schema.
> >
> > So, the above pipeline would be valid *if*
> >
> >  t1i == g
> >  t2i == t1o
> >  s == t2o
> >
> > but this is not a necessary condition since there exist cases where a
> > pipeline is behaviorally valid even if the two subsequent schemas don't
> > match completely, but only on parts.
> 
> Just to add a little more complexity to the system: now that we have
> namespaces, we have multiple schemas in one document.  Therefore, the
> transformation and serialization layers must be even more specific.

For "schemas", I intended XSchema (or equivalent) documents that
identify completely the structure of the class of documents they
represent. This automatically includes the namespace nesting rules, etc.

[... omitted real-life scenario ...]

> > But in this case, for the validation to be able to continue,
> > the output schema must state what is allowed to pass through.
> 
> Not necessarily.  If you use my example above, the namespaces used are all
> declared in the generator.  To show how the validator would work with all
> three schemas in use, check this out:
> 
> Schematic ns: [schem]
> Location ns:  [loc]
> XHTML ns:     [xhtml]
> Any ns:       [*]
> 
> g[doc][schem][loc] ->
> t1i[*][schem]      ->
> t1o[doc][loc]      ->
> t2i[*][loc]        ->
> t2o[doc]           ->
> t3i[doc]           ->
> t3o[xhtml]         ->
> s
> 
> As you can see, the validator tracks the namespaces used at each OUTPUT point,
> that is g, t1o, t2o, and t3o.  It is easy to track the document namespaces.  The
> big thing is that if a transformer or generator uses any intermediate namespaces
> during processing, it needs to clean up after itself.  For example, the esql
> logicsheet or SQLTransformer use a namespace to describe how to pull information
> from a database--however none of that information is transferred in the document
> markup.  Currently, the generator calls the start and end namespace for the
> logicsheet/transformer, but no elements are passed using the namespace.  This
> presents added complexity to the validator.  We might be able to use the
> SAXConnector approach to strip the unnecessary namespace arguments.  That
> would require caching the SAX calls until the namespace is closed or the first
> element using the namespace is found.

As we are touching a *deep core* problem, we must be very formal and
abstract and analyze all the implicit assumptions. So this is why I have
to be picky on what you say above:

If you read my previous statements again and follow my terminology, in
which a "schema" contains all the necessary information on how to mix
different namespaces, you'll see that you make an implicit assumption in
your analysis: the topological spaces defined by the input and output
namespaces are translated but never rotated.

This is far from being trivial to understand, so let me expand on your
example, updating the notation to a more readable (at least to me) form:

            g[doc][loc][schem] ->
 [*][schem]t1[doc][loc] ->
   [*][loc]t2[doc] ->
      [doc]t3[xhtml] ->

which indicates that

 t1 transforms [schem] to [doc][loc] and everything else is copied over
 t2 transforms [loc] to [doc] and everything else is copied over
 t3 transforms [doc] to [xhtml]

but this is far from being a general enough notation, because it
presumes that stylesheets work on namespaces orthogonally and don't mix
them.

While I agree this is a "good way" to design modular stylesheets, it's
not general enough, in fact it fails to describe a stylesheet that does:

 <xsl:template match="loc:element/doc:element">
  ...
 </xsl:template>

and therefore assumes that the [doc] and [loc] namespaces are intermixed
and in a well-defined order.

This is why I talked about general schemas and not lists of namespaces.
While DTDs are not namespace-aware, XSchemas are: you can define that a
document is valid if and only if it contains something of the form
loc:element/doc:element and no other combination of the two.

So, in general, the true behavior of a stylesheet can be indicated by
the schema of its input and the schema of its output. In the simplest
case this amounts to n different namespaces that are never mixed, with
one of the n adapted into the others, but that special case is not
general enough to allow behavior validation of pipelines.

> > I don't want to get deeper into these details, but I just wanted to show
> > you that establishing behavioral composition on pipeline components is a
> > lot more complex than you described.
> >
> > But, yes, it can and needs to be done.
> >
> > > StreamResources: Takes any source and goes completely through serialization.
> > >                  This is basically an alternate for Readers, although it
> > >                  can also be used for generated reports.
> > >
> > > FlowResources: A mounted flowmap that performs all the logic necessary for
> > >                a complex form.  It handles paging, etc.  It is a type of
> > >                compound resource in that it pools several simple resources
> > >                together, and returns the one we are concerned with at the
> > >                moment.
> > >
> > > URIMapResources: A compound resource that maps URIs to specific simple
> > >                  resources.
> > >
> > > SitemapResource: A compound resource that is a sub sitemap.  Sitemaps are
> > >                  completely self contained, so it is near impossible to
> > >                  override their results.
> >
> > I'm not sure about these, though. Could you give me some pseudo-example
> > of a pseudo-sitemap and how it would use the above?
> 
> My thinking on a StreamResource was that the sub cocoon app would completely
> handle that resource.  So whether that resource was a Reader or a full pipeline
> does not need to be known by the parent.
> 
> As to markup, I am not sure yet.  We need a conceptual model that works before
> we can express the markup.

I think I understand you and I think we are thinking, as usual, along
the same lines (influenced by Avalon design patterns, I presume)

> > > A sub application can specify resource adaptors for its native XML generators,
> > > for instance you might have a document schema and a schema for an inbox.
> > > If the parent has a View that recognizes the inbox schema, then it will
> > > directly use that schema.  If not, the sub application will specify a default
> > > mapping.
> > >
> > > Hopefully this is enough to get us started.
> >
> > I understand very well the concept of schema-based adaptation, but I
> > think I lost you on the other resources, I think a couple of dirty
> > examples will get me closer to your point.
> 
> Hopefully, I can model it in ASCII....
> 
> +--------------------+ get(stocking-section) +---------------------------+
> | Root Cocoon App    |---------------------->| Stocking Section App      |
> | schema: [doc][loc] |<----------------------| schema: [doc][loc][schem] |
> +--------------------+    rcv([doc][loc])    +---------------------------+
> 
> In the above "diagram", the root Cocoon app is designed to accept the
> [doc] and [loc] schemas (to carry on the previous examples), but has no
> knowledge of the [schem] schema.  The Stocking Section App is registered
> to output [doc], [loc], and [schem] schemas.  If the whole app is engineered
> to the [doc] schema (that being the target), Stocking Section App would
> provide adaptors for the [loc] and [schem] schemas to convert to the end
> [doc] schema.  If the parent app and the child app register the expected
> schemas with each other, the sitemap will return any schemas that can
> be handled natively.

Yes, it's very similar to what I was thinking, but this should NOT, IMO,
happen at sitemap level, but at pipeline assembly level, thus when
authoring and creating the webapp thru component composition.

Such an authoring tool might well be a GUI version of Cocoon itself or
something else that connects to Cocoon for the deployment, I don't know,
but all this looking-up and discovery should not happen on a live
site, IMO.

> IOW, Root registers [doc] and [loc] with Stocking.  Stocking configures
> itself so that it does not transform the [loc] schema--assuming that the
> parent knows how to handle it.  However, because the Root did not state
> that it could handle [schem] schema, Stocking applies the transformation
> for that.

I get the feeling that too much happens behind my back. I'd like to be
able to compose my pipelines in such an easy way, but I'd also like to
be able to modify my sitemaps by hand when required (WYSIWYG has
drawbacks and we all know that very well, don't we?).

Also, how does this handle conflicts? What if there are two different
instances of a transformer that perform the same behavior [for example,
two stylesheets, one fancy for nice graphics and one simple for text-like
browsers]? How should I choose between them?

Behavioral-only pipeline composition might lead to unwanted behavior,
and it would be very hard to understand what's wrong, especially as more
namespaces and components are mixed and aggregated.

So, the best solution would be, IMO, to allow behavioral-driven
discovery of components in a sitemap editor and behavioral validation at
CWA deployment, while leaving the sitemap (or whatever comes next) as
static and explicit as possible, even if its readability is reduced.

I believe the path to easier Cocoon use also passes through creating
helper tools because, in fact, sometimes we are simply too powerful to
simplify the semantics of our configurations without sacrificing some of
that power.

This doesn't mean the sitemap is perfect and will never be touched, but,
IMO,  it should not sacrifice explicitness for ease of use, at least at
this level.

> > > > In short, you are asking for more solid and wiser contracts between web
> > > > applications and I believe that an absolute URI space accessing is
> > > > already a solid contract, but the proposed role-based addressing is a
> > > > killer since it allows strong contracts and still complete flexibility
> > > > and scalability.
> > >
> > > Yep. Well defined contracts reduce cognitive dissonance.  Too many contracts
> > > increase cognitive dissonance.
> >
> > Careful about using that term: "cognitive dissonance" is a good thing on
> > many situations since modern learning theories give it the role of
> > difference maker between short term and long term learning.
> >
> > In fact, they suggest that something gets learned only when there is
> > cognitive dissonance and your brain must work to overcome it, normally
> > by creating the abstraction that make it possible to make the two
> > cognitive concepts resonate and overlap with your existing semantic
> > environment.
> 
> See what they pollute your minds with at school?  Keep in mind you are talking
> to someone with an Associates in the Recording Arts.  Psyche was not part of
> the lesson plan (however, psychoacoustics was...).

Hey, don't worry. :) I just scratched the surface myself on those issues
(even if I plan to dive into them deeply in the next months).

> I get your point though.

Good.

> > I'd love to continue research on this topic by letting practical things
> > like real-life user experience as well as more theoretical things like
> > cognitive science influence our decisions on how to make this project
> > evolve.
> 
> I have practical knowledge and real-life user experience.  I'll have to rely
> on your expertise for the cognitive sciences.  I know _some_ of the concepts
> because I have mentored others--but not nearly the detail you do.

Oh, believe me: there is so much to discover in this that I can't wait
to start diving in. In fact, now that my technical studies are complete
and I know what happens when I click on my mouse to retrieve a web page,
ranging from world-scale network architectures to the quantum behavior
of light and matter, I'm moving my attention to what's left: humans.

I've already done extensive research on psychoacoustics myself, but more
general things such as cognitive science, visual semantics, color theory,
etc. are very likely to become my (and my girlfriend's) next research
field.

And you guys will get sick of me talking to you about how to apply to
cocoon what I will learn :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


Re: [RT] Cocoon web applications

Posted by Berin Loritsch <bl...@apache.org>.

Stefano Mazzocchi wrote:
> 
> > Ok.  I can agree with that statement.  Keep in mind that for Cocoon
> > app installation you have to modify both the unarchived war and the
> > archived war file.  The reason is that SOME servlet containers ignore
> > the original war file once it is deployed.  SOME servlet containers
> > overwrite the contents of the directory with the contents of the war
> > file.  And still OTHERS act like the second scenario until the unarchived
> > directory is modified.
> >
> > Don't you just love it when there is no standard?
> 
> We have more than one Apache member listed on the JCP 053 which is
> responsible for the creation of that spec. If you think it's important
> to specify what behavior the servlet container should take when
> deploying a webapp, please, let's discuss a formal proposal that we
> might submit to the group and that might make it into Servlet 2.4

I believe that the "work" directory was proposed for normal access to
writable areas.  The down side is that the location of the directory
is not easy to predict.  You can work around the spec with direct file
access, but that is not easily portable between environments and you
have to place entries in the policy file granting you access to those
locations.  I think that a Servlet that manages its own internal context
might be worth exploring.  There are issues such as overwriting your
own files, etc.  I don't know if it is wise to open it to all servlet
developers.  Before I can come up with anything formal, I need to know
what environment I need.

> > That is what I am referring to.  As of Servlet 2.3 and much debate, the
> > official stance on where "/resource" maps you is to the web server root,
> > not the context root.  Instead, the context root is much more difficult
> > to reach.  Perhaps we can improve the HTML serializer to automagically
> > correct context root resources.
> 
> Yuck! I'd hate that. Serializers that mangle things behind your back are
> the worst pain in the ass to find out especially because you never look
> at them since you normally consider them brainless and pure adapters
> from the XML world to the binary world.
> 
> Let's find a more elegant way.

OK, any ideas?

> > Let me expound.  I like to use a directory structure like this:
> >
> > /xdocs
> > /resources
> >       /images
> >       /scripts
> >       /styles
> > /stylesheets
> >       /system
> > /WEB-INF
> >       /logicsheets
> >       /cocoon.xconf
> >       /logkit.xconf
> > /sitemap.xmap
> > /${sub.app}
> >       /xdocs
> >       /resources
> >             /images
> >             /scripts
> >       /sitemap.xmap
> >
> > The problem is when I want a consistent look and feel in my ${sub.app}
> > area.  I cannot access the /stylesheets that are accessible via the
> > context--but not via the sitemap.  This requires me to copy the
> > /stylesheets to the ${sub.app}.
> 
> Ok, in this case, an absolute URI would work and would not require
> access to your parent, but to an absolute location (which, in this case,
> accidentally, happens to be your parent)
> 
> This is a simple fix and we can schedule it for Cocoon 2.1 since it
> might break back compatibility of sitemaps a little.

Sounds good.

> > Because Cocoon is an XML framework, in order for this approach to work,
> > you have to define the interfaces.  There are definite roles that I
> > have already identified.  Some of the solutions come from concepts in
> > SOAP, and some of the solutions come from concepts in JNDI, but here goes.
> >
> > For sub applications to work, you must have them work to a specific schema.
> > (this concept is from SOAP).  For instance, your resource must return
> > the results in DocBook format so that the parent knows how to apply views.
> > This is the interface of your "component".
> 
> I've already thought about this when I thought about a way to validate
> sitemaps and it's a *LOT* more complex than this.
> 
> Let's make an example: the "behavioral interfaces" of pipeline
> components are the expected input namespaces and the resulting
> namespaces. But listing them is not enough: you must know the exact
> structure, thus the namespace-aware schemas.
> 
> Even between components, schemas are the structure description that
> identify the expected "shape" of the SAX pipe that connects two
> components.
> 
> Now, suppose you have a pipeline such as
> 
>  <g] -> [t1] -> [t2] -> [s>
> 
> and you have
> 
>  g -> output schema of generator
>  t1i -> input schema of first transformer
>  t1o -> output schema of first transformer
>  t2i -> input schema of second transformer
>  t2o -> output schema of second transformer
>  s -> input schema of serializer
> 
> with all this information you can precisely estimate if the pipeline is
> "valid", in a behavioral sense.
> 
> This would allow you to perform some pretests on sitemaps (before
> compilation and before uploading) that avoid those "impedance
> mismatches" between connected components.

This is excellent--validation is vital!  I know my practices, and I tend
to use existing schemas, only inventing if necessary.  When I do invent
a schema, I always have it generated by a logicsheet and provide a
transformation to the main document schema.  This works for me, because
it is a known environment.

What you are talking about is validating that not only I am doing my
job right, but other people in my team don't make simple mistakes.
The only thing is that the validation shouldn't be done in live serving.

I think we do need to have schema validation on during development (esp.
when designing new schemas) to ensure the app works, but have it off for
deployment--something the deployment tool can ensure.

> As more and more Cocoon components emerge and are made available even
> outside the Cocoon distribution, the ability to estimate the "behavioral
> match" between two components will very likely be vital, especially for
> sitemap authoring tools.
> 
> The algorithm that performs the validation is far from being trivial: a
> sufficient condition (and the most simple one) requires the connecting
> ends to be identified by the exact same schema.
> 
> So, the above pipeline would be valid *if*
> 
>  t1i == g
>  t2i == t1o
>  s == t2o
> 
> but this is not a necessary condition since there exist cases where a
> pipeline is behaviorally valid even if the two subsequent schemas don't
> match completely, but only on parts.

Just to add a little more complexity to the system: now that we have
namespaces, we have multiple schemas in one document.  Therefore, the
transformation and serialization layers must be even more specific.

As an example, let us use a recent real life scenario.  I created a Cocoon
app that manages schematics (maps of where products go on a retailer's shelf)
and the location of the schematics (which shelves in the store use the
schematic).  As a result, I had to create a schema for the schematics and
the location (sharing a namespace).  It was not uncommon for the generator
to produce a document with the document schema and the schematic schema.
Your validation code has to be further expanded to include namespace
resolution like this:

Document ns:  [doc]
Schematic ns: [schem]
XHTML ns:     [xhtml]
Any ns:       [*]

g[doc][schem] ->
t1i[*][schem] ->
t1o[doc]      ->
t2i[doc]      ->
t2o[xhtml]    ->
s

This is actually a simplified pipeline (the real one used aggregation for
the menu, etc).

Using this approach of specifying the expected schemas and the output schemas,
we can go beyond simple validation, and do automatic discovery and use of the
needed transformation layers.  That way, when a generator mixes several schemas
together (I had one instance where I had up to five in one document--different
project), I don't need every request to go through the whole transformation
chain.  Let's take the same example above, and add a separate "location" schema
to the mix:

Document ns:  [doc]
Schematic ns: [schem]
Location ns:  [loc]
XHTML ns:     [xhtml]
Any ns:       [*]

UnOptimized                              Optimized
-----------------                        -----------------
g[doc][schem] ->                         g[doc][schem] ->
t1i[*][schem] ->                         t1i[*][schem] ->
t1o[doc]      ->                         t1o[doc]      ->
t2i[*][loc]   ->                         t2i[doc]      ->
t2o[doc]      ->                         t2o[xhtml]    ->
t3i[doc]      ->                         s
t3o[xhtml]    ->
s

The end result of both pipelines is the same, so the different path does not
affect the cache validity.  If the original doc had all three source schemas,
then the full path would have been used.  This has one more added optimization:
if the t[*][loc]->[doc] transformer changes, it does not affect the cache
validity of the optimized path.
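
The unoptimized/optimized comparison above can be sketched roughly as
follows, in Python. The transformer table mirrors the example; the
(name, consumed namespace, produced namespaces) triple is a hypothetical
representation, and the sketch assumes each transformer handles its
namespace independently and copies everything else through.

```python
# Each entry: (name, namespace it consumes, namespaces it produces).
# Everything not consumed is assumed to be copied through unchanged.
TRANSFORMERS = [
    ("t[*][schem]->[doc]", "schem", {"doc"}),
    ("t[*][loc]->[doc]", "loc", {"doc"}),
    ("t[doc]->[xhtml]", "doc", {"xhtml"}),
]


def plan(generated, transformers=TRANSFORMERS):
    """Return (steps actually needed, namespaces reaching the serializer)."""
    current = set(generated)
    steps = []
    for name, consumed, produced in transformers:
        if consumed not in current:
            continue  # nothing for this step to do on this document
        current = (current - {consumed}) | produced
        steps.append(name)
    return steps, current
```

For a generator that emits only [doc][schem], the [loc] step drops out
of the plan, so a change to that transformer would not touch the steps
this document depends on.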

> In fact, the input schema might work only on part of the previous output
> schema, for example, working only on one namespace and letting the
> other elements pass through unchanged.

Exactly.

> But in this case, for the validation to be able to continue,
> the output schema must state what is allowed to pass through.

Not necessarily.  If you use my example above, the namespaces used are all
declared in the generator.  To show how the validator would work with all
three schemas in use, check this out:

Schematic ns: [schem]
Location ns:  [loc]
XHTML ns:     [xhtml]
Any ns:       [*]

g[doc][schem][loc] ->
t1i[*][schem]      ->
t1o[doc][loc]      ->
t2i[*][loc]        ->
t2o[doc]           ->
t3i[doc]           ->
t3o[xhtml]         ->
s

As you can see, the validator tracks the namespaces used at each OUTPUT point,
that is g, t1o, t2o, and t3o.  It is easy to track the document namespaces.  The
big thing is that if a transformer or generator uses any intermediate namespaces
during processing, it needs to clean up after itself.  For example, the esql
logicsheet or SQLTransformer use a namespace to describe how to pull information
from a database--however none of that information is transferred in the document
markup.  Currently, the generator calls the start and end namespace for the
logicsheet/transformer, but no elements are passed using the namespace.  This
presents added complexity to the validator.  We might be able to use the
SAXConnector approach to strip the unnecessary namespace arguments.  That
would require caching the SAX calls until the namespace is closed or the first
element using the namespace is found.

> I don't want to get deeper into these details, but I just wanted to show
> you that establishing behavioral composition on pipeline components is a
> lot more complex than you described.
> 
> But, yes, it can and needs to be done.
> 
> > StreamResources: Takes any source and goes completely through serialization.
> >                  This is basically an alternate for Readers, although it
> >                  can also be used for generated reports.
> >
> > FlowResources: A mounted flowmap that performs all the logic necessary for
> >                a complex form.  It handles paging, etc.  It is a type of
> >                compound resource in that it pools several simple resources
> >                together, and returns the one we are concerned with at the
> >                moment.
> >
> > URIMapResources: A compound resource that maps URIs to specific simple
> >                  resources.
> >
> > SitemapResource: A compound resource that is a sub sitemap.  Sitemaps are
> >                  completely self contained, so it is near impossible to
> >                  override their results.
> 
> I'm not sure about these, though. Could you give me some pseudo-example
> of a pseudo-sitemap and how it would use the above?

My thinking on a StreamResource was that the sub cocoon app would completely
handle that resource.  So whether that resource was a Reader or a full pipeline
does not need to be known by the parent.

As to markup, I am not sure yet.  We need a conceptual model that works before
we can express the markup.

> > A sub application can specify resource adaptors for its native XML generators,
> > for instance you might have a document schema and a schema for an inbox.
> > If the parent has a View that recognizes the inbox schema, then it will
> > directly use that schema.  If not, the sub application will specify a default
> > mapping.
> >
> > Hopefully this is enough to get us started.
> 
> I understand very well the concept of schema-based adaptation, but I
> think I lost you on the other resources, I think a couple of dirty
> examples will get me closer to your point.

Hopefully, I can model it in ASCII....

+--------------------+ get(stocking-section) +---------------------------+
| Root Cocoon App    |---------------------->| Stocking Section App      |
| schema: [doc][loc] |<----------------------| schema: [doc][loc][schem] |
+--------------------+    rcv([doc][loc])    +---------------------------+

In the above "diagram", the root Cocoon app is designed to accept the
[doc] and [loc] schemas (to carry on the previous examples), but has no
knowledge of the [schem] schema.  The Stocking Section App is registered
to output [doc], [loc], and [schem] schemas.  If the whole app is engineered
to the [doc] schema (that being the target), Stocking Section App would
provide adaptors for the [loc] and [schem] schemas to convert to the end
[doc] schema.  If the parent app and the child app register the expected
schemas with each other, the sitemap will return any schemas that can
be handled natively.

IOW, Root registers [doc] and [loc] with Stocking.  Stocking configures
itself so that it does not transform the [loc] schema--assuming that the
parent knows how to handle it.  However, because the Root did not state
that it could handle [schem] schema, Stocking applies the transformation
for that.
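
To make the negotiation above a bit more concrete, here is a minimal
sketch in Python (not Cocoon code). The function name plan_output, the
dictionary of adaptors and the "native"/"adapt to ..." labels are all
made up for illustration; the only thing taken from the discussion is
the rule that a schema the parent registered passes through untouched,
while any other schema goes through the child's adaptor.

```python
def plan_output(child_schemas, parent_accepts, adaptors):
    """Decide, per schema the child can emit, whether it is returned
    natively or converted by one of the child's adaptors.

    child_schemas  -- schemas the sub application can output
    parent_accepts -- schemas the parent registered with the child
    adaptors       -- hypothetical map: schema -> schema it is adapted to
    """
    plan = {}
    for schema in child_schemas:
        if schema in parent_accepts:
            # The parent said it can handle this schema: no transformation.
            plan[schema] = "native"
        elif schema in adaptors:
            # The parent did not register it: apply the child's adaptor.
            plan[schema] = "adapt to " + adaptors[schema]
        else:
            raise ValueError("no adaptor for schema " + schema)
    return plan


# Root registers [doc] and [loc] with Stocking.
root_accepts = {"doc", "loc"}
stocking_emits = ["doc", "loc", "schem"]
stocking_adaptors = {"loc": "doc", "schem": "doc"}

plan = plan_output(stocking_emits, root_accepts, stocking_adaptors)
print(plan)
```

With these inputs only [schem] is adapted; the [loc] adaptor exists but
is not used because Root declared that it handles [loc] itself.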

> > > In short, you are asking for more solid and wiser contracts between web
> > > applications and I believe that an absolute URI space accessing is
> > > already a solid contract, but the proposed role-based addressing is a
> > > killer since it allows strong contracts and still complete flexibility
> > > and scalability.
> >
> > Yep. Well defined contracts reduce cognitive dissonance.  Too many contracts
> > increase cognitive dissonance.
> 
> Careful about using that term: "cognitive dissonance" is a good thing in
> many situations since modern learning theories give it the role of
> difference maker between short term and long term learning.
> 
> In fact, they suggest that something gets learned only when there is
> cognitive dissonance and your brain must work to overcome it, normally
> by creating the abstraction that makes it possible to make the two
> cognitive concepts resonate and overlap with your existing semantic
> environment.

See what they pollute your minds with at school?  Keep in mind you are talking
to someone with an Associates in the Recording Arts.  Psychology was not part of
the lesson plan (however, psychoacoustics was...).

I get your point though.

> I'd love to continue research on this topic by letting practical things
> like real-life user experience as well as more theoretical things like
> cognitive science influence our decisions on how to make this project
> evolve.

I have practical knowledge and real-life user experience.  I'll have to rely
on your expertise for the cognitive sciences.  I know _some_ of the concepts
because I have mentored others--but not nearly the detail you do.


Re: [RT] Cocoon web applications

Posted by Stefano Mazzocchi <st...@apache.org>.

Berin Loritsch wrote:
> 
> There is a lot to discuss, and I will try to trim the fat as I go along.  First,
> let me say that having you back, Stefano, is great!

Well, thanks buddy :)

> With a little iron sharpening
> iron, we will have an excellent tool.

You can bet your ass on that :) [excuse my French and no offense to
French people out there :)]
 
> > > Automatic discovery and deployment of Cocoon webapps requires
> > > that we allow our URI space to be at the mercy of your deployment tool.
> >
> > This is where I disagree.
> >
> > I never used the word "automatic" because what I described (not even
> > proposed, just thrown out as a RT to bootstrap a discussion) is a system
> > that allows you to install web applications in a manner which is
> > (presumably) more friendly than the rough and undesigned way we do today
> > that follows the implicit assumption the Servlet API makes that one WAR
> > file = one web application.
> 
> Ok.  I can agree with that statement.  Keep in mind that for Cocoon
> app installation you have to modify both the unarchived war and the
> archived war file.  The reason is that SOME servlet containers ignore
> the original war file once it is deployed.  SOME servlet containers
> overwrite the contents of the directory with the contents of the war
> file.  And still OTHERS act like the second scenario until the unarchived
> directory is modified.
> 
> Don't you just love it when there is no standard?

We have more than one Apache member listed on the JSR 053 expert group, which
is responsible for the creation of that spec. If you think it's important
to specify what behavior the servlet container should take when
deploying a webapp, please, let's discuss a formal proposal that we
might submit to the group and that might make it into Servlet 2.4.
 
> > > It is the same flaw in WAR files.
> >
> > Again, I disagree: the only flaw I see in WAR files is the implicit
> > one2one assumption I stated above and the fact that deployment behavior
> > is mostly left to the "mercy of your servlet container" (to quote you).
> >
> > I'm not describing a solution that copies over these bad assumptions,
> > but one solution that takes the good out of the war file concept
> > (mostly, the ability to install a package, just as you install software
> > on your machine, instead of having to place a bunch of files here and
> > there and modify a ton of configurations just to make it start, not
> > talking about coherence between the different webapps installed in the
> > same system)
> 
> Ok.  I have some ideas on the requirements of such a system.  For one thing,
> the hierarchical CM approach will have to be used.

Ok
 
> > > Yet another issue arising from the URI space issue is where
> > > to locate all your images and resources.  I prefer to have one URI for each
> > > resource, whether it be image or stylesheet.  I can't reference an image at
> > > "/images/foo.jpg" because my webapp or cocoonapp may not be installed at
> > > the root context.  There really is no way of automatically rewriting the
> > > location to reference the context root.  What I end up doing is writing
> > > a rule in my sitemap that makes "images/foo.jpg" read the same resource
> > > no matter what directory it is called from--effectively polluting my URI
> > > space.
> >
> > I've always been an avid proponent of the complete remounting facility
> > of subsitemaps that lead to the assumptions that all URIs are relative
> > to the current sitemap and absolute locations were not possible.
> 
> Yes, but one issue is that stylesheets which are clearly in the webapp
> context cannot be accessed by a subsitemap if it requires parent directory
> access.  This is not optimal.  As to your proposal below, there is a lot
> to like.  Let's see if we can come up with a killer.
> 
> > While the ability to perform internal redirection makes it possible to
> > do what you describe above, I came to agree that this is *not* a good
> > thing and does not improve any concern separation nor makes the
> > architecture more elegant.
> >
> > As you imply, rather the opposite.
> >
> > Especially thinking about batch processing, such remapped resources
> > would end up being copied all over the place with a clear inelegance,
> > redundancy and memory consumption.
> 
> Just look at Avalon's documentation build process to prove that.
> 
> > I'm happy to reconsider the ability to mount absolute URIs in sitemaps.
> > (of course, absolute is used here in sense of global cocoon context, not
> > absolute compared to the entire web site space)
> 
> That is what I am referring to.  As of Servlet 2.3 and much debate, the
> official stance on where "/resource" maps you is to the web server root,
> not the context root.  Instead, the context root is much more difficult
> to reach.  Perhaps we can improve the HTML serializer to automagically
> correct context root resources.

Yuck! I'd hate that. Serializers that mangle things behind your back are
the worst pain in the ass to track down, especially because you never look
at them, since you normally consider them brainless, pure adapters
from the XML world to the binary world.

Let's find a more elegant way.
 
> > > Another side effect of the sitemap is that you cannot refer to resources in
> > > a parent sitemap.  The whole concept is centered around the resources being
> > > in the same directory as the sitemap, or below.  I find this frustrating,
> > > as I am forced to duplicate resources in my directory hierarchy just so the
> > > entire site can have the same look and feel.
> >
> > Careful, this is an entirely different story. While I agree that having
> > the ability to access absolute Cocoon URIs is a good thing, I still
> > believe that something like cocoon:../images/logo is *bad* because it
> > removes the ability to remount the subsitemap.
> 
> Let me expound.  I like to use a directory structure like this:
> 
> /xdocs
> /resources
>       /images
>       /scripts
>       /styles
> /stylesheets
>       /system
> /WEB-INF
>       /logicsheets
>       /cocoon.xconf
>       /logkit.xconf
> /sitemap.xmap
> /${sub.app}
>       /xdocs
>       /resources
>             /images
>             /scripts
>       /sitemap.xmap
> 
> The problem is when I want a consistent look and feel in my ${sub.app}
> area.  I cannot access the /stylesheets that are accessible via the
> context--but not via the sitemap.  This requires me to copy the
> /stylesheets to the ${sub.app}.

Ok, in this case, an absolute URI would work and would not require
access to your parent, only to an absolute location (which, in this case,
happens to be your parent).

This is a simple fix and we can schedule it for Cocoon 2.1 since it
might break backward compatibility of sitemaps a little.
 
> > So, while the three other URIs (absolute and the two equivalent local)
> > maintain the same state independent on their relative position, the
> > "parent referring" one does not.
> >
> > This is why it should *not* be available, unless we want to sacrifice
> > control over webapp remounting and give it to the user. Something I'd
> > rather *not* do, but not because I don't trust users' ability to manage
> > their URI space, but because they can become so complex that breaking
> > transparent remounting at the architectural level places more concerns
> > on the URI space maintainer and for sure he'll have enough to take care
> > of.
> 
> Hopefully I explained the scenario better further up.  Resource is a very
> generic term, and does not always refer to graphics.
> 
> > > Clearly, Cocoon needs to focus on some key resource resolution issues to
> > > make it work properly.  Part of the issues come from resolving the names.
> > > Perhaps views are our ticket to standard look and feel.  However the site
> > > maintainer needs the prerogative of adjusting the look and feel of the site
> > > without affecting the logic of the site.
> >
> > Of course.
> >
> > I just had an idea: could avalon-like component behavioral dependencies
> > for cocoon webapps make this possible?
> 
> Actually I was thinking along these lines.  (Great minds really do think
> alike).

great mind? nah, just obsessed by architectural elegance :)

> > I explain further: think of cocoon webapps as avalon components. Your
> > image gallery webapp and your webmail webapp being instances of
> > behavioral descriptions for webapps (sort of "interfaces" for web
> > applications) and both require the instance of another webapp which
> > includes the stylesheets they need to complete their job.
> >
> > So, the contract is that, in order to work, these webapps *require* the
> > existence of the stylesheet webapp and they require to know how to
> > access it.
> >
> > One solution is absolute positioning: the stylesheet-containing webapp,
> > must be mounted in a specific place. But, this is ugly and might create
> > collisions, especially if different versions of the webapp must be
> > installed (one webapp requires one version and another requires
> > another).
> >
> > The other solution is to use the same avalon component discovery
> > mechanisms: if the package instance of the behavioral interface
> > "Gallery" requires one package instance installed of the behavioral
> > interface "Style", it might use something like
> >
> >  cocoon://Style/stylesheet/images2html.xslt
> >
> > which allows three different behaviors:
> >
> >  cocoon:/images/logo -> absolute positioning
> >  cocoon:images/logo -> relative positioning
> >  cocoon://Look&Feel/images/logo -> role-based positioning
> >
> > that, IMO, satisfy all needs and make the entire thing *extremely*
> > elegant.
> 
> I know what you are saying, but I think we can make it a bit better.

Oh, that's for sure.
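
As a rough illustration of the three addressing forms quoted above,
here is a small Python sketch (not Cocoon code) of how such URIs could
be told apart and resolved into the global Cocoon URI space. The
function name resolve, the mount point "/gallery" and the role table
are hypothetical; the rejection of ".." segments follows the argument
that parent-referring URIs would break remounting.

```python
def resolve(uri, current_mount, roles):
    """Classify a cocoon: URI and resolve it to a global Cocoon path.

    current_mount -- where the current (sub)sitemap is mounted
    roles         -- hypothetical map: role name -> mount point of the
                     webapp instance that fulfils that role
    Returns a (kind, path) tuple.
    """
    scheme, sep, rest = uri.partition(":")
    if scheme != "cocoon" or not sep:
        raise ValueError("not a cocoon: URI: " + uri)
    if ".." in rest.split("/"):
        # Parent-referring URIs are refused so subsitemaps stay remountable.
        raise ValueError("parent-referring URIs are not allowed: " + uri)
    if rest.startswith("//"):
        # cocoon://Role/path -> role-based positioning
        role, _, path = rest[2:].partition("/")
        if role not in roles:
            raise KeyError("no webapp installed for role " + role)
        return "role", roles[role].rstrip("/") + "/" + path
    if rest.startswith("/"):
        # cocoon:/path -> absolute within the global Cocoon context
        return "absolute", rest
    # cocoon:path -> relative to the current sitemap
    return "relative", current_mount.rstrip("/") + "/" + rest


roles = {"Style": "/apps/style-1.0"}  # made-up mount point
print(resolve("cocoon:/images/logo", "/gallery", roles))
print(resolve("cocoon:images/logo", "/gallery", roles))
print(resolve("cocoon://Style/stylesheet/images2html.xslt", "/gallery", roles))
```

The point of the role table is that "Gallery" never needs to know where
"Style" is mounted; remounting either webapp only changes the table.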

> In the proposal I had back in July (link in another email), I was
> thinking about resource naming.  That way, no matter what sitemap
> or URI map a resource was, it could be found.  This was as a result
> of trying to implement the subsitemap approach with sub applications.

Ok, this is a similar but different solution to allow more indirect
addressing.

> I ran into the problem with sitemaps at that point where I could not
> access my stylesheets.  I figured if I could name a view resource,
> I could use that resource in every sub application.  The FlowMap would
> be used for form progressions, and itself be considered a resource.

Good point.
 
> Because Cocoon is an XML framework, in order for this approach to work,
> you have to define the interfaces.  There are definite roles that I
> have already identified.  Some of the solutions come from concepts in
> SOAP, and some of the solutions come from concepts in JNDI, but here goes.
> 
> For sub applications to work, you must have them work to a specific schema.
> (this concept is from SOAP).  For instance, your resource must return
> the results in DocBook format so that the parent knows how to apply views.
> This is the interface of your "component".  

I've already thought about this when I thought about a way to validate
sitemaps and it's a *LOT* more complex than this.

Let's make an example: the "behavioral interfaces" of pipeline
components are the expected input namespaces and the resulting
namespaces. But listing them is not enough: you must know the exact
structure, thus the namespace-aware schemas.

Even between components, schemas are the structure description that
identify the expected "shape" of the SAX pipe that connects two
components.

Now, suppose you have a pipeline such as

 <g] -> [t1] -> [t2] -> [s>

and you have
 
 g -> output schema of generator
 t1i -> input schema of first transformer
 t1o -> output schema of first transformer
 t2i -> input schema of second transformer
 t2o -> output schema of second transformer
 s -> input schema of serializer

with all this information you can precisely estimate if the pipeline is
"valid", in a behavioral sense.

This would allow you to perform some pretests on sitemaps (before
compilation and before uploading) that avoid those "impedance
mismatches" between connected components.

As more and more Cocoon components emerge and are made available even
outside the Cocoon distribution, the ability to estimate the "behavioral
match" between two components will very likely be vital, especially for
sitemap authoring tools.

The algorithm that performs the validation is far from being trivial: a
sufficient condition (and the most simple one) requires the connecting
ends to be identified by the exact same schema. 

So, the above pipeline would be valid *if* 

 t1i == g
 t2i == t1o
 s == t2o

but this is not a necessary condition since there exist cases where a
pipeline is behaviorally valid even if the two subsequent schemas don't
match completely, but only on parts.

In fact, the input schema might work only on part of the previous output
schema, for example, working only on one namespace and letting the
other elements pass through unchanged.

But in this case, for the validation to be able to continue,
the output schema must state what can be left to pass through.
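
A toy Python sketch of the two checks discussed above may help. It is
a deliberate simplification: each schema is reduced to a set of
namespace names, which (as said earlier) is not enough for real
validation. The Stage class, its field names and both function names
are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    consumes: frozenset           # namespaces the stage expects as input
    produces: frozenset           # namespaces the stage emits
    passes_through: bool = False  # declares that unconsumed input passes unchanged


def strictly_valid(stages):
    """Sufficient condition: every connecting end uses the exact same schema."""
    return all(up.produces == down.consumes
               for up, down in zip(stages, stages[1:]))


def valid_with_pass_through(stages):
    """Relaxed check: a stage may work on part of what flows through the
    pipe, provided it declares that the rest is passed through."""
    flowing = stages[0].produces
    for stage in stages[1:]:
        if not stage.consumes <= flowing:
            return False              # stage expects something nobody produced
        leftover = flowing - stage.consumes
        if leftover and not stage.passes_through:
            return False              # unhandled content, nothing declared
        flowing = stage.produces | leftover
    return True


ns = frozenset
pipeline = [
    Stage("g", ns(), ns({"doc", "loc"})),
    Stage("t1", ns({"loc"}), ns({"doc"}), passes_through=True),
    Stage("t2", ns({"doc"}), ns({"html"})),
    Stage("s", ns({"html"}), ns()),
]
print(strictly_valid(pipeline), valid_with_pass_through(pipeline))
```

Here t1 only touches the "loc" namespace, so the pipeline fails the
strict test but passes the relaxed one because t1 declares pass-through.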

I don't want to get deeper into these details, but I just wanted to show
you that establishing behavioral composition on pipeline components is a
lot more complex than you described.

But, yes, it can and needs to be done.

> Second, there are roles that
> are identified in the sitemap and the flow/uri map concepts:
> 
> Views: Take a known schema and transform it into a final product (HTML,
>        PDF, XML, etc.).  It is assumed that a View completes the pipeline
>        all the way to a Serializer.
> 
> Resources: Take any source and transform it into a known schema.  It is
>            assumed that the resource starts the pipeline from Generator.
>            There are specializations of this type.
> 
> AdaptableResource: A specialization of Resource where the output schema
>                    can be specified by the parent.  I.e. If the parent
>                    doesn't know the inbox schema, the Adaptable resource
>                    will convert it to the document schema.

I had the same feeling when first writing the sitemap when I ended up
writing all my stylesheets as xml2html.xsl or simple-docbook2fo.xsl and
such. I thought: couldn't the sitemap automagically assemble the steps
to adapt one component to the next without my having to know every single
transformation stylesheet?

In fact, stylesheets have nothing to do with style: they transform one schema
into another which is more suitable for your needs.

So, I perfectly see what you are aiming at, but at that time, I just
wanted something that worked and, as you very well know, the sitemap is
already complex enough :)

> StreamResources: Take any source completely through serialization.
>                  This is basically an alternate for Readers, although it
>                  can also be used for generated reports.
> 
> FlowResources: A mounted flowmap that performs all the logic necessary for
>                a complex form.  It handles paging, etc.  It is a type of
>                compound resource in that it pools several simple resources
>                together, and returns the one we are concerned with at the
>                moment.
> 
> URIMapResources: A compound resource that maps URIs to specific simple
>                  resources.
> 
> SitemapResource: A compound resource that is a sub sitemap.  Sitemaps are
>                  completely self contained, so it is near impossible to
>                  override their results.

I'm not sure about these, though. Could you give me some pseudo-example
of a pseudo-sitemap and how it would use the above?

> A sub application can specify resource adaptors for its native XML generators,
> for instance you might have a document schema and a schema for an inbox.
> If the parent has a View that recognizes the inbox schema, then it will
> directly use that schema.  If not, the sub application will specify a default
> mapping.
> 
> Hopefully this is enough to get us started.

I understand very well the concept of schema-based adaptation, but I
think I lost you on the other resources, I think a couple of dirty
examples will get me closer to your point.
 
> > Agreed, do you think that my proposed avalon-like component solution for
> > webapps might help in this regard?
> 
> definitely.

Good, does anybody disagree?
 
> > In short, you are asking for more solid and wiser contracts between web
> > applications and I believe that an absolute URI space accessing is
> > already a solid contract, but the proposed role-based addressing is a
> > killer since it allows strong contracts and still complete flexibility
> > and scalability.
> 
> Yep. Well defined contracts reduce cognitive dissonance.  Too many contracts
> increase cognitive dissonance.

Careful about using that term: "cognitive dissonance" is a good thing in
many situations since modern learning theories give it the role of
difference maker between short term and long term learning.

In fact, they suggest that something gets learned only when there is
cognitive dissonance and your brain must work to overcome it, normally
by creating the abstraction that makes it possible to make the two
cognitive concepts resonate and overlap with your existing semantic
environment.

> In other words, I want our new contracts to be built on existing contracts
> and known principles so that it doesn't take as long for someone to come
> up to speed.

I totally agree with the cognitive concept behind this.

Cocoon was engineered with usability in mind: in fact, Cocoon itself was
created to allow separation of concerns at the use level and make it
possible for people to do their job better and with less overlap and
frustration.

I'd love to continue research on this topic by letting practical things
like real-life user experience as well as more theoretical things like
cognitive science influence our decisions on how to make this project
evolve.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


Re: [RT] Cocoon web applications

Posted by Berin Loritsch <bl...@apache.org>.

There is a lot to discuss, and I will try to trim the fat as I go along.  First,
let me say that having you back, Stefano, is great!  With a little iron sharpening
iron, we will have an excellent tool.

Stefano Mazzocchi wrote:
> 
> Berin Loritsch wrote:
> >
> > I am working on a way to make the ExcaliburComponentManager more resource
> > friendly, and really help.  There are some issues to work out, but as that
> > is an Avalon Excalibur optimization, I would prefer to discuss those details
> > on the Avalon developer's list.
> 
> Ok, I'll resubscribe to that list (haven't done so yet).

So far it's in my head, and I haven't had the time to put it out there.  I have
a couple of proposals in my head right now.  Let me know when you subscribe to
Avalon's list, and I will put them out there (spoiler: it is resource management,
self-healing configuration files, and component tracking [for debug purposes]).

> > Automatic discovery and deployment of Cocoon webapps requires
> > that we allow our URI space to be at the mercy of your deployment tool.
> 
> This is where I disagree.
> 
> I never used the word "automatic" because what I described (not even
> proposed, just thrown out as a RT to bootstrap a discussion) is a system
> that allows you to install web applications in a manner which is
> (presumably) more friendly than the rough and undesigned way we do today
> that follows the implicit assumption the Servlet API makes that one WAR
> file = one web application.

Ok.  I can agree with that statement.  Keep in mind that for Cocoon
app installation you have to modify both the unarchived war and the
archived war file.  The reason is that SOME servlet containers ignore
the original war file once it is deployed.  SOME servlet containers
overwrite the contents of the directory with the contents of the war
file.  And still OTHERS act like the second scenario until the unarchived
directory is modified.

Don't you just love it when there is no standard?

> > It is the same flaw in WAR files.
> 
> Again, I disagree: the only flaw I see in WAR files is the implicit
> one2one assumption I stated above and the fact that deployment behavior
> is mostly left to the "mercy of your servlet container" (to quote you).
> 
> I'm not describing a solution that copies over these bad assumptions,
> but one solution that takes the good out of the war file concept
> (mostly, the ability to install a package, just as you install software
> on your machine, instead of having to place a bunch of files here and
> there and modify a ton of configurations just to make it start, not
> talking about coherence between the different webapps installed in the
> same system)

Ok.  I have some ideas on the requirements of such a system.  For one thing,
the hierarchical CM approach will have to be used.

> > Yet another issue arising from the URI space issue is where
> > to locate all your images and resources.  I prefer to have one URI for each
> > resource, whether it be image or stylesheet.  I can't reference an image at
> > "/images/foo.jpg" because my webapp or cocoonapp may not be installed at
> > the root context.  There really is no way of automatically rewriting the
> > location to reference the context root.  What I end up doing is writing
> > a rule in my sitemap that makes "images/foo.jpg" read the same resource
> > no matter what directory it is called from--effectively polluting my URI
> > space.
> 
> I've always been an avid proponent of the complete remounting facility
> of subsitemaps that lead to the assumptions that all URIs are relative
> to the current sitemap and absolute locations were not possible.

Yes, but one issue is that stylesheets which are clearly in the webapp
context cannot be accessed by a subsitemap if it requires parent directory
access.  This is not optimal.  As to your proposal below, there is a lot
to like.  Let's see if we can come up with a killer.

> While the ability to perform internal redirection makes it possible to
> do what you describe above, I came to agree that this is *not* a good
> thing and does not improve any concern separation nor makes the
> architecture more elegant.
> 
> As you imply, rather the opposite.
> 
> Especially thinking about batch processing, such remapped resources
> would end up being copied all over the place with a clear inelegance,
> redundancy and memory consumption.

Just look at Avalon's documentation build process to prove that.

> I'm happy to reconsider the ability to mount absolute URIs in sitemaps.
> (of course, absolute is used here in sense of global cocoon context, not
> absolute compared to the entire web site space)

That is what I am referring to.  As of Servlet 2.3 and much debate, the
official stance on where "/resource" maps you is to the web server root,
not the context root.  Instead, the context root is much more difficult
to reach.  Perhaps we can improve the HTML serializer to automagically
correct context root resources.

> > Another side effect of the sitemap is that you cannot refer to resources in
> > a parent sitemap.  The whole concept is centered around the resources being
> > in the same directory as the sitemap, or below.  I find this frustrating,
> > as I am forced to duplicate resources in my directory hierarchy just so the
> > entire site can have the same look and feel.
> 
> Careful, this is an entirely different story. While I agree that having
> the ability to access absolute Cocoon URIs is a good thing, I still
> believe that something like cocoon:../images/logo is *bad* because it
> removes the ability to remount the subsitemap.

Let me expound.  I like to use a directory structure like this:

/xdocs
/resources
      /images
      /scripts
      /styles
/stylesheets
      /system
/WEB-INF
      /logicsheets
      /cocoon.xconf
      /logkit.xconf
/sitemap.xmap
/${sub.app}
      /xdocs
      /resources
            /images
            /scripts
      /sitemap.xmap

The problem is when I want a consistent look and feel in my ${sub.app}
area.  I cannot access the /stylesheets that are accessible via the
context--but not via the sitemap.  This requires me to copy the
/stylesheets to the ${sub.app}.

> So, while the three other URIs (absolute and the two equivalent local)
> maintain the same state independent on their relative position, the
> "parent referring" one does not.
> 
> This is why it should *not* be available, unless we want to sacrifice
> control over webapp remounting and give it to the user. Something I'd
> rather *not* do, but not because I don't trust users' ability to manage
> their URI space, but because they can become so complex that breaking
> transparent remounting at the architectural level places more concerns
> on the URI space maintainer and for sure he'll have enough to take care
> of.

Hopefully I explained the scenario better further up.  Resource is a very
generic term, and does not always refer to graphics.

> > Clearly, Cocoon needs to focus on some key resource resolution issues to
> > make it work properly.  Part of the issues come from resolving the names.
> > Perhaps views are our ticket to standard look and feel.  However the site
> > maintainer needs the prerogative of adjusting the look and feel of the site
> > without affecting the logic of the site.
> 
> Of course.
> 
> I just had an idea: could avalon-like component behavioral dependencies
> for cocoon webapps make this possible?

Actually I was thinking along these lines.  (Great minds really do think
alike).

> I explain further: think of cocoon webapps as avalon components. Your
> image gallery webapp and your webmail webapp being instances of
> behavioral descriptions for webapps (sort of "interfaces" for web
> applications) and both require the instance of another webapp which
> includes the stylesheets they need to complete their job.
> 
> So, the contract is that, in order to work, these webapps *require* the
> existence of the stylesheet webapp and they require to know how to
> access it.
> 
> One solution is absolute positioning: the stylesheet-containing webapp,
> must be mounted in a specific place. But, this is ugly and might create
> collisions, especially if different versions of the webapp must be
> installed (one webapp requires one version and another requires
> another).
> 
> The other solution is to use the same avalon component discovery
> mechanisms: if the package instance of the behavioral interface
> "Gallery" requires one package instance installed of the behavioral
> interface "Style", it might use something like
> 
>  cocoon://Style/stylesheet/images2html.xslt
> 
> which allows three different behaviors:
> 
>  cocoon:/images/logo -> absolute positioning
>  cocoon:images/logo -> relative positioning
>  cocoon://Look&Feel/images/logo -> role-based positioning
> 
> that, IMO, satisfy all needs and make the entire thing *extremely*
> elegant.

I know what you are saying, but I think we can make it a bit better.
In the proposal I had back in July (link in another email), I was
thinking about resource naming.  That way, no matter what sitemap
or URI map a resource was, it could be found.  This was as a result
of trying to implement the subsitemap approach with sub applications.

I ran into the problem with sitemaps at that point where I could not
access my stylesheets.  I figured if I could name a view resource,
I could use that resource in every sub application.  The FlowMap would
be used for form progressions, and itself be considered a resource.

Because Cocoon is an XML framework, in order for this approach to work,
you have to define the interfaces.  There are definite roles that I
have already identified.  Some of the solutions come from concepts in
SOAP, and some of the solutions come from concepts in JNDI, but here goes.

For sub applications to work, you must have them work to a specific schema.
(this concept is from SOAP).  For instance, your resource must return
the results in DocBook format so that the parent knows how to apply views.
This is the interface of your "component".  Second, there are roles that
are identified in the sitemap and the flow/uri map concepts:

Views: Take a known schema and transform it into a final product (HTML,
       PDF, XML, etc.).  It is assumed that a View completes the pipeline
       all the way to a Serializer.

Resources: Take any source and transform it into a known schema.  It is
           assumed that the resource starts the pipeline from Generator.
           There are specializations of this type.

AdaptableResource: A specialization of Resource where the output schema
                   can be specified by the parent.  I.e. If the parent
                   doesn't know the inbox schema, the Adaptable resource
                   will convert it to the document schema.

StreamResources: Take any source completely through serialization.
                 This is basically an alternate for Readers, although it
                 can also be used for generated reports.

FlowResources: A mounted flowmap that performs all the logic necessary for
               a complex form.  It handles paging, etc.  It is a type of
               compound resource in that it pools several simple resources
               together, and returns the one we are concerned with at the
               moment.

URIMapResources: A compound resource that maps URIs to specific simple
                 resources.

SitemapResource: A compound resource that is a sub sitemap.  Sitemaps are
                 completely self contained, so it is near impossible to
                 override their results.

A sub application can specify resource adaptors for its native XML generators;
for instance, you might have a document schema and a schema for an inbox.
If the parent has a View that recognizes the inbox schema, then it will
directly use that schema.  If not, the sub application will specify a default
mapping.
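
As a rough illustration of that fallback rule, here is a minimal Python
sketch (Python only for brevity). The function name, the schema names and
the dictionary layout are all hypothetical; nothing here is an existing
Cocoon API.

```python
def pick_schema(parent_views, native_schema, default_mapping):
    """Return the schema a sub application should emit for its parent.

    parent_views    -- set of schema names the parent has a View for
    native_schema   -- schema the sub application's generator produces
    default_mapping -- dict: native schema -> schema to adapt to by default
    """
    if native_schema in parent_views:
        # The parent can render the native schema directly.
        return native_schema
    # Otherwise fall back to the mapping declared by the sub application.
    return default_mapping[native_schema]


defaults = {"inbox": "document"}
print(pick_schema({"document"}, "inbox", defaults))           # document
print(pick_schema({"document", "inbox"}, "inbox", defaults))  # inbox
```

The point of the sketch is only that the decision is made per schema and
that the sub application, not the parent, owns the default mapping.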

Hopefully this is enough to get us started.

> Agreed, do you think that my proposed avalon-like component solution for
> webapps might help in this regard?

definitely.

> In short, you are asking for more solid and wiser contracts between web
> applications and I believe that an absolute URI space accessing is
> already a solid contract, but the proposed role-based addressing is a
> killer since it allows strong contracts and still complete flexibility
> and scalability.

Yep. Well-defined contracts reduce cognitive dissonance.  Too many contracts
increase cognitive dissonance.

In other words, I want our new contracts to be built on existing contracts
and known principles so that it doesn't take as long for someone to come
up to speed.

> > > Currently, development of cocoon webapps is rough and not engineered: is
> > > mostly left to the user ability to manage the process.
> >
> > I concur with this statement.  Hopefully my tutorial sheds some light on
> > an approach that is workable for more than just my crew and I.
> 
> Which is great and will make Cocoon must more used, but still, I think
> that good documentation should not *patch* an architectural design
> limitation, but sheds some light on both the users (to make them work in
> the best scenario) and the developers (to make it possible for users to
> be guided by the software first *and* the documentation later, not the
> other way around).

Agreed.

> > > In the future, I'd love to make it possible to design the system in such
> > > a way that concerns are well kept separate even during the two stages,
> > > development and production, for example, performing sitemap
> > > interpretation during developement (since no high load is required, but
> > > faster responsiveness at structure changes) while performing sitemap
> > > compilation on deployment. Same thing for compiled XML.
> >
> > +1
> 
> This is, IMO, a big thing. Currently Cocoon is damn powerful but not
> very useable. My goal for the future is to keep the power (or even
> increase it, if new ideas emerge) and increase the usability at the same
> level of its power, without sacrificing nothing that we already
> achieved.

It's a tall order, but it can be achieved.

> > I made another proposal that would be I think a better approach than the
> > sitemap concept as a whole.  Basically, it is breaking the sitemap into
> > pieces that are required for the management of the resources.  It keeps
> > the graphic designers in charge of their resources, Developers in charge
> > of their resources, etc.  It was a few months ago (I think around March/April).

I was wrong--it was July.

> > I proposed it to Giacomo as I did not want to distract from the focus on
> > C2 release at the time.  I will see if I can dig up the proposal in my
> > archives.
> 
> Yes, I've seen that and I concur that the sitemap forces concerns to
> overlap.
> 
> I'll be very happy to continue this thread and extend it to what we need
> to do to make it easier to use cocoon from a user's perspective and to
> allow more complete separation of concerns.

Excellent!

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

RE: [RT] Cocoon web applications

Posted by Vadim Gritsenko <va...@verizon.net>.

Under Resin, no need even to recompile if your extension classes are under WEB-INF/classes.
It will compile automatically and reload Cocoon.

It could save you lots of time ;)

Vadim

> -----Original Message-----
> From: Peter Royal [mailto:proyal@managingpartners.com]
> Sent: Tuesday, October 02, 2001 8:45 AM
> To: cocoon-dev@xml.apache.org
> Subject: Re: [RT] Cocoon web applications
> 
> 
> At 02:01 PM 10/2/2001 +0200, you wrote:
> >No kidding. Developping the gallery, I have to work on 6 new different
> >components and have to restart everytime (it takes almost 2 minutes on
> >my laptop) just to see what I did. The try/fail cycle was so big that I
> >decided to write standalone components with stdin/stdout interfaces just
> >to try them out. :(
> 
> really? we have several custom components in our c2 webapp, and all I have 
> to do is recompile and c2 reloads the sitemap/system as needed. no tomcat 
> restarting necessary. (this is for avalon components, loaded in via the 
> cocoon.xconf. we do have to reload for changes to objects stored in 
> session/etc).
> 
> One tip i've been using is to compile my code to the WEB-INF/classes 
> directory. it makes development much faster.
> -pete
> 
> -- 
> peter royal -> proyal@managingpartners.com
> managing partners, inc. -> http://www.managingpartners.com
> 
> 

Re: [RT] Cocoon web applications

Posted by Peter Royal <pr...@managingpartners.com>.

At 02:01 PM 10/2/2001 +0200, you wrote:
>No kidding. Developping the gallery, I have to work on 6 new different
>components and have to restart everytime (it takes almost 2 minutes on
>my laptop) just to see what I did. The try/fail cycle was so big that I
>decided to write standalone components with stdin/stdout interfaces just
>to try them out. :(

really? we have several custom components in our c2 webapp, and all I have 
to do is recompile and c2 reloads the sitemap/system as needed. no tomcat 
restarting necessary. (this is for avalon components, loaded in via the 
cocoon.xconf. we do have to reload for changes to objects stored in 
session/etc).

One tip i've been using is to compile my code to the WEB-INF/classes 
directory. it makes development much faster.
-pete

-- 
peter royal -> proyal@managingpartners.com
managing partners, inc. -> http://www.managingpartners.com



Re: [RT] Cocoon web applications

Posted by Stefano Mazzocchi <st...@apache.org>.

Berin Loritsch wrote:
> 
> Stefano Mazzocchi wrote:
> >
> > This is something I always wanted to bring up but never considered it to
> > be a priority high enough, but as soon as Cocoon2 reaches a status where
> > it's useable in production sites, people will start asking for something
> > more user friendly than a WAR file that can get as big as hell.
> 
> I had a solution to this problem posted a while back.  It dealt with fixing
> a number of issues related to classloaders and installation.  Basically,
> you have the option of the standard war file, an installed cocoon base,
> or an EAR file with all the Cocoon libs in one place.
> 
> I am working on a way to make the ExcaliburComponentManager more resource
> friendly, and really help.  There are some issues to work out, but as that
> is an Avalon Excalibur optimization, I would prefer to discuss those details
> on the Avalon developer's list.

Ok, I'll resubscribe to that list (haven't done so yet).

> > ok, pacifist disclaimer given, I'll being saying that I love the way we
> > install cocoon in a big file we deploy on all servlet containers. Many
> > have already personally expressed their happyness to me compared to the
> > hassle that they required for installation for Cocoon1. This is mainly
> > given to the webapp deployment concept that latest servlet API include.
> 
> Yes.  This is one of the goals we for C2.  For the most part, it is pretty
> good.
> 
> > So, if you consider Cocoon and its samples a single web application,
> > this way is perfect and you will never going to need anything else, but
> > as soon as you start adding your own stuff, you'll find out that Cocoon
> > is not a single web application but a framework for applications.
> 
> Wasn't this meantioned on the web site?  Yes, Cocoon is a publishing/webapp
> framework, just as Avalon is a Server Side Framework.  However, I have some
> reservations about the solution you are proposing below, and hopefully I
> can voice them in an intelligent manner.
> 
> > In short, as Tomcat is a container for servlet-based web applications,
> > Cocoon is a container for cocoon-based web applications. The parallel is
> > evident to me and should *not* require (as Jeremy was asking, clearly
> > touched by the same feeling) separate cocoon instances just to *deploy*
> > different cocoon-based web applications.
> 
> As an avid Cocoon developer and user (I have sold my IT department on it),
> I prefer to grow my Cocoon Webapp in a controlled environment, without
> automatic deployment of new sections.  I will expound below.

I wholeheartedly agree with your statement, and I don't think I proposed
anything "automatic", just something easier.

> > In this respect, there is a big parallel between servlet-based web
> > applications and cocoon-based web applications: both require a
> > "deployment descriptor" that gives the container instructions on where
> > to "mount" it, where to find web-app specific components, libraries and
> > resources.
> >
> > Clearly, the sitemap is the closes thing that matches this.
> 
> It may be the closest thing, but it is not the best thing.  Currently, in
> order to mount a sub-sitemap (effectively what you are talking about), you
> must include an entry in the parent that shows where the child sitemap is.

Correct.

> This allows for a controlled URI space--something you have been an avid
> proponent of.

Absolutely Correct and I'm still very concerned about this.

> Automatic discovery and deployment of Cocoon webapps requires
> that we allow our URI space to be at the mercy of your deployment tool.

This is where I disagree.

I never used the word "automatic" because what I described (not even
proposed, just thrown out as an RT to bootstrap a discussion) is a system
that allows you to install web applications in a manner that is
(presumably) friendlier than the rough and undesigned way we do it today,
which follows the Servlet API's implicit assumption that one WAR
file = one web application.

> Clearly, you cannot see that as an optimal solution.

Yes, an optimal solution in my eyes is one that allows you to deploy the
cocoon webapp with the *least* possible effort, but without sacrificing
full control over what you're doing and without undermining the implicit
separation of concerns that the subsitemap concept allows.

> It is the same flaw in WAR files.  

Again, I disagree: the only flaws I see in WAR files are the implicit
one-to-one assumption I stated above and the fact that deployment behavior
is mostly left to the "mercy of your servlet container" (to quote you).

I'm not describing a solution that copies these bad assumptions, but
one that takes the good out of the WAR file concept: mostly, the ability
to install a package, just as you install software on your machine,
instead of having to place a bunch of files here and there and modify a
ton of configuration just to make it start, not to mention keeping the
different webapps installed in the same system coherent.

> Yet another issue arrizing from the URI space issue is where
> to locate all your images and resources.  I prefer to have one URI for each
> resource, wether it be image or stylesheet.  I can't reference an image at
> "/images/foo.jpg" because my webapp or cocoonapp may not be installed at
> the root context.  There really is no way of automatically rewritting the
> location to reference the context root.  What I end up doing is writting
> a rule in my sitemap that makes "images/foo.jpg" read the same resource
> no matter what directory it is called from--effectively poluting my URI
> space.

I've always been an avid proponent of the complete remounting facility
of subsitemaps, which led to the assumption that all URIs are relative
to the current sitemap and that absolute locations were not possible.

While the ability to perform internal redirection makes it possible to
do what you describe above, I have come to agree that this is *not* a good
thing: it neither improves concern separation nor makes the
architecture more elegant.

As you imply, rather the opposite.

Especially when thinking about batch processing, such remapped resources
would end up being copied all over the place, with clear inelegance,
redundancy and memory consumption.

I'm happy to reconsider the ability to mount absolute URIs in sitemaps
(of course, absolute is used here in the sense of the global cocoon
context, not absolute with respect to the entire web site space).
 
> Another side affect of the sitemap is that you cannot refer to resources in
> a parent sitemap.  The whole concept is centered arround the resources being
> in the same directory as the sitemap, or below.  I find this frustrating,
> as I am forced to duplicate resources in my directory hierarchy just so the
> entire site can have the same look and feel.

Careful, this is an entirely different story. While I agree that having
the ability to access absolute Cocoon URIs is a good thing, I still
believe that something like cocoon:../images/logo is *bad* because it
removes the ability to remount the subsitemap.

So, if I want to access the logo without having to duplicate the rule
everywhere, I simply have to refer to cocoon:/images/logo, while, rather
normally, cocoon:images/logo would refer to the current location.

So, given

  /cocoon/ mounts a webapp on "webapp"

  1) /cocoon/images/logo
  2) /cocoon/webapp/images/logo

the four possible ways to access the resource are

 cocoon:/images/logo   -> 1
 cocoon:../images/logo -> 1
 cocoon:images/logo    -> 2
 cocoon:./images/logo  -> 2

now, suppose we want to remount the webapp someplace else in another
site (for whatever reason).

 /cocoon/ mounts a webapp on "webapps/webapp"

 1) /cocoon/images/logo
 2) /cocoon/webapps/webapp/images/logo

now the same four URIs resolve differently

 cocoon:/images/logo   -> 1
 cocoon:../images/logo -> (not found)
 cocoon:images/logo    -> 2
 cocoon:./images/logo  -> 2

So, while the three other URIs (the absolute one and the two equivalent
local ones) resolve to the same resource regardless of where the webapp
is mounted, the "parent referring" one does not.

This is why it should *not* be available, unless we want to sacrifice
control over webapp remounting and give it to the user. That is something
I'd rather *not* do, not because I don't trust users' ability to manage
their URI space, but because URI spaces can become so complex that breaking
transparent remounting at the architectural level places more concerns
on the URI space maintainer, who will surely have enough to take care
of already.
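
The two mount scenarios above can be replayed with a small Python sketch
(Python only for brevity). It assumes the cocoon: URIs resolve like
ordinary POSIX paths against the mount point; the function and constant
names are hypothetical, and this is not how Cocoon itself is implemented.

```python
import posixpath

CONTEXT_ROOT = "/cocoon"  # assumed context root, as in the example above


def resolve(uri, mount_point):
    """Resolve a cocoon: URI as seen from a webapp mounted at mount_point."""
    if not uri.startswith("cocoon:"):
        raise ValueError("not a cocoon: URI: " + uri)
    path = uri[len("cocoon:"):]
    if path.startswith("/"):
        # Absolute: relative to the global Cocoon context.
        return posixpath.normpath(CONTEXT_ROOT + path)
    # Relative (including ./ and ../): relative to the mounted webapp.
    return posixpath.normpath(posixpath.join(CONTEXT_ROOT, mount_point, path))


URIS = ["cocoon:/images/logo", "cocoon:../images/logo",
        "cocoon:images/logo", "cocoon:./images/logo"]

for mount in ("webapp", "webapps/webapp"):
    # The only two resources that exist in each scenario.
    existing = {CONTEXT_ROOT + "/images/logo",
                posixpath.join(CONTEXT_ROOT, mount, "images/logo")}
    for uri in URIS:
        target = resolve(uri, mount)
        print(mount, uri, "->", target if target in existing else "(not found)")
```

Only the "../" form changes its answer between the two mounts: it
resolves to /cocoon/images/logo in the first case and to the nonexistent
/cocoon/webapps/images/logo in the second.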

> Clearly, Cocoon needs to focus on some key resource resolution issues to
> make it work properly.  Part of the issues come from resolving the names.
> Perhaps views are our ticket to standard look and feel.  However the site
> maintainer needs the perogative of adjusting the look and feel of the site
> without affecting the logic of the site.

Of course.

I just had an idea: could avalon-like behavioral dependencies between
components make this possible for cocoon webapps?

Let me explain further: think of cocoon webapps as avalon components. Your
image gallery webapp and your webmail webapp are instances of
behavioral descriptions for webapps (sort of "interfaces" for web
applications), and both require an instance of another webapp that
includes the stylesheets they need to complete their job.

So, the contract is that, in order to work, these webapps *require* the
existence of the stylesheet webapp and need to know how to
access it.

One solution is absolute positioning: the stylesheet-containing webapp
must be mounted in a specific place. But this is ugly and might create
collisions, especially if different versions of the webapp must be
installed (one webapp requires one version and another requires
another).

The other solution is to use the same avalon component discovery
mechanisms: if the package instance of the behavioral interface
"Gallery" requires an installed package instance of the behavioral
interface "Style", it might use something like

 cocoon://Style/stylesheet/images2html.xslt

which allows three different behaviors:

 cocoon:/images/logo -> absolute positioning
 cocoon:images/logo -> relative positioning
 cocoon://Look&Feel/images/logo -> role-based positioning

that, IMO, satisfy all needs and make the entire thing *extremely*
elegant.
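
To make the three addressing modes concrete, here is a minimal Python
sketch (Python only for brevity). The role registry, the mount points
and the function names are assumptions made up for illustration; the
thread does not define how roles would actually be registered.

```python
import posixpath


def classify(uri):
    """Split a cocoon: URI into (mode, rest)."""
    path = uri[len("cocoon:"):]
    if path.startswith("//"):
        return "role", path[2:]
    if path.startswith("/"):
        return "absolute", path[1:]
    return "relative", path


def resolve(uri, mount_point, roles, context_root="/cocoon"):
    """Resolve uri for a webapp mounted at mount_point.

    roles maps a behavioral role name (e.g. "Style") to the mount point
    of whichever installed webapp currently fulfils that role.
    """
    mode, rest = classify(uri)
    if mode == "role":
        role, _, rest = rest.partition("/")
        base = roles[role]  # KeyError means the required webapp is missing
    elif mode == "absolute":
        base = ""
    else:
        base = mount_point
    return posixpath.normpath(posixpath.join(context_root, base, rest))


roles = {"Style": "shared/style-1.0"}  # hypothetical installation
print(resolve("cocoon:/images/logo", "gallery", roles))
print(resolve("cocoon:images/logo", "gallery", roles))
print(resolve("cocoon://Style/stylesheet/images2html.xslt", "gallery", roles))
```

With this layout the gallery never hard-codes where the stylesheet webapp
lives; moving or upgrading it only means changing the entry in the
registry.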

> > Let's make a solid example: I started integrating my image gallery thing
> > which now requires 12 (or so) new classes added to the Cocoon
> > distribution (some 6 new components), but they are general enough to be
> > useable on many other circumstances, but one component which is simply
> > too specific to be of any use in other circumstances.
> >
> > Currently, the operations that we have to do to *install* a new
> > cocoon-based web application are:
> >
> >  1) prepare a directory with all the required files
> >  2) mount the new web-app sitemap in the sitemap that controls the
> > URI-space we want to mount our stuff on
> >  3) place our web-app specific components in the folder for new
> > components (defined in cocoon.conf, if my memory doesn't fail)
> 
> Cocoon.xconf actually.

Sorry :)
 
> >  4) have the servlet container restart the entire web-application
> > handled by Cocoon.
> 
> This part is a pain.

No kidding. Developing the gallery, I have to work on 6 new
components and have to restart every time (it takes almost 2 minutes on
my laptop) just to see what I did. The try/fail cycle was so long that I
decided to write standalone components with stdin/stdout interfaces just
to try them out. :(

> I have a solution to remove the need to restart
> the webapp when we change Cocoon.xconf.  Avalon Excalibur has a new
> resource monitoring system.  With the ActiveMonitor, the servlet can
> know when to reload Cocoon.xconf and apply the changes.

Sounds good.
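
For readers unfamiliar with the idea, the general technique (watch a
file, reload when it changes) can be sketched in a few lines of Python.
This is NOT the Avalon Excalibur ActiveMonitor API; the class name and
the use of a temporary file in place of Cocoon.xconf are assumptions
for illustration only.

```python
import os
import tempfile


class FileWatcher:
    """Remember a file's modification time and report when it changes."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = os.stat(path).st_mtime_ns

    def changed(self):
        """Return True once for every observed modification."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self.last_mtime:
            self.last_mtime = mtime
            return True
        return False


# Demo with a throwaway file standing in for the configuration file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
watcher = FileWatcher(path)
print(watcher.changed())  # False: nothing has touched the file yet
st = os.stat(path)
# Simulate an edit by moving the modification time forward ten seconds.
os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 10_000_000_000))
print(watcher.changed())  # True: the servlet would reload its config here
os.remove(path)
```

A real monitor would run the check on a background thread at a fixed
interval and notify listeners, rather than being polled by hand.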

Tomcat 4.0 allows you to reload a webapp just by hitting a URI
 
 http://localhost:8080/manager/reload?path=/cocoon

(you must give yourself the "manager" role in the user file in order to
authenticate) but it doesn't always work.

> > While, following the servlet parallel, we should do:
> >
> >  1) have a CWA (Cocoon Web Application) file with a manifest file (or
> > equivalent thing) that specifies where is the sitemap file (I'm also
> > happy with forcing the sitemap file to be called sitemap.xmap and places
> > in the root of the package, thus eliminating the need for such a
> > manifest file) and contains all the required things (resources,
> > stylesheets, additional components and libraries, entity catalogs,
> > etc..).
> >
> >  2) open the cocoon manager (similar to Tomcat 4.0 manager webapp, just
> > *much* more user friendly) and authenticate (if more security is
> > required this could be mapped over an SLL-secured connection and
> > authentication guaranteed by client-side certification, but this is none
> > of our concern since Cocoon doesn't handle nor should that part of the
> > HTTP connection).
> >
> >  3) upload the CWA file (unlike Tomcat 4.0 manager which simply requires
> > you to indicate where the CWA file is on the machine, with upload we can
> > deploy a CWA from another machine entirely which is a great feature).
> >
> >  4) tell Cocoon to start the deployed CWA
> >
> > and that's it, without even having to stop Cocoon or even tell the
> > servlet container about what we are doing.
> 
> It sounds promissing, but I am not sold on it yet.  Here are the issues
> I have--and added complexity is the least of them.  First, the sitemap
> is too self contained to allow for meaningful deployment of new functionality
> while maintaining consistent look and feel.

I totally agree.

> The CWA would be self-contained so that it would use its own 
> resources and not be allowed to be overridden
> by the parent.  Effectively, the new functionality would have the same
> look and feel regardless of where it is used.
> 
> That part negates one of Cocoon's strengths.  The ability to have a consistent
> look and feel in a scalable manner.  The only way I see around that is if the
> user is wise, and implements the sitemap so that the CWA only acts on XML.
> For instance, one of the methods I adopted for the tutorial app in CVS HEAD
> is to have a standard resource mapping for all HTML so that the look and feel
> was consistent.  The mapping would then call the XML version (that returned
> XML in the Stylebook format).  This allows the parent sitemap to theme any
> new functionality.  Because the sitemap is too flexible, this approach can't
> be enforced for the CWA--leading to unpredictable results later.

Agreed, do you think that my proposed avalon-like component solution for
webapps might help in this regard? 

In short, you are asking for more solid and wiser contracts between web
applications and I believe that an absolute URI space accessing is
already a solid contract, but the proposed role-based addressing is a
killer since it allows strong contracts and still complete flexibility
and scalability.
 
> >
> > Of course, Cocoon's classloader should be rearchitected to allow several
> > "contexts" which different classloaders, this will automatically solve
> > the issues of having to run multiple cocoon instances to separate the
> > resolution space of different cocoon-based webapps.
> >
> > But there are other things that might turn out incredibly useful: almost
> > everybody works with two copies, one for development and one for
> > production. The first is used when developing, the second is deployed
> > and used until changes are required.
> >
> > Everybody that has real-life working knowledge knows that is almost
> > impossible to force people to work on a centralized version, expecially
> > if the easiest way to modify something is to work on what is currently
> > live.
> >
> > Currently, the processing cycles are something like:
> >
> >  1) write your webapp under the /cocoon2/ folder
> 
> 1) create a new context with Cocoon installed.
> 
> >  2) use cocoon build file to generate the WAR file (which contains your
> > stuff as well)
> 
> 2) use your own build process (usually derived from Cocoon's build file).
> 
> > but then you note that your stylesheets have something wrong, so you
> > don't do this over and over (since the cocoon-war file is so big and
> > restarting the entire crap takes forever and a half) but simply modify
> > the stylesheets in-place while they are live.
> >
> > You can bet your ass that you'll forget to copy back the changes to your
> > original location.
> >
> > Result: next revision gets deployed, many things that previously worked
> > well (expecially in sections you didn't touch because they were just
> > perfect as they were) don't work anymore. This is called: lost update.
> 
> Actually, when I am developing a webapp using Cocoon I end up doing it
> "live" in the servlet engine.  When I have it working correctly, I copy
> it back to the build process.  This leaves the CVS logs not as incremental
> as they should be--but it works.
> 
> Another alternative is to have CVS check out the webapp directly into the
> Tomcat's webapps directory and work on it from there.  The rest of the
> stuff is how to assemble the webapp--so there is very little extra work.
> 
> > One solution is to do the deploying once cocoon2 fresh out of the box,
> > then install your stuff over on the deployed version.
> 
> -1  The process of overwriting existing stuff is very complex.
> 
> > NOTE: the servlet API doesn't say *anything* about what happens to
> > deployed files that are subsequently modified after being unpacked from
> > the WAR. In some circumstances, the container might even erase the
> > unpacked version when the web-app is stopped or the container is shut
> > down in order to save space. The Servlet API assume the WAR file and the
> > unpacked version are the *same* and unpacking occurs only for speed
> > reasons, not to allow you to modify things live.
> >
> > So, you installed Cocoon2 fresh, it gets unpacked, you stop tomcat
> > without shutting down but simply kill -9 it or CTRL-C on the shell, you
> > add your stuff and work well.
> >
> > C'mon, this is crap, we must come up with something smarter.
> 
> Exactly.
> 
> > Ok, the idea is: how do I make the files deployed unmodifiable?
> >
> > My solution is: compile everything. Tranquillity by obscurity.
> >
> > If you transform all XML files into CompiledXML files (using the code I
> > wrote for a long time ago and which is now used in the cache system),
> > not only parsing performance is greatly improved on live sites, but also
> > we obtain that people will very unlikely modify directly the unpacked
> > files because they are, in fact, binary.
> 
> I like this approach.  Especially for the XSP pages.  I would even go so
> far as to write a Sitemap mangler that compiled your XSP pages and included
> the classes in the proper location and rewrote your sitemap.xmap file so that
> they are treated like hand written generators.

Good idea.
 
> > This also means precompiling sitemaps and XSPs and everything that needs
> > compilation.
> >
> > Of course, this is not suitable for close-cycle development of cocoon
> > web apps: I could not want to have to recompile my entire CWA, deploy,
> > restart, etc, everytime I have to modify my stylesheet, it would be
> > foolish to impose this, but on production this makes a real difference,
> > expecially in those places where carefully scrutinized quality assurance
> > phases must be performed before something enters production.
> 
> Exactly.  Providing a tool that performs this step is a boon.  It takes
> all the guesswork out of compiling everything and makes the entire app
> self contained.  I went through a nightmare install using ColdFusion (I
> know I keep rehashing this one, but it is burned in my memmory), because
> there was no way to simply drop an archive and have everything work.

Yep.
 
> > In these situations, we must take all the actions to allow packages as
> > sealed as possible (possibly even crypto-sealed) to be deployed even
> > remotely on a live site, making also possible to upgrade an existing
> > package with a new one while the other is running (which is not that
> > hard to do with carefully designed multi-threading management of
> > subcontexts).
> 
> :)  I have learned a few things about threading and race conditions.
> Seriously, I get your point.  Everything would be run from the context
> that is currently installed until the new version is installed and fully
> set up.  Then we switch to serving the new content while we dispose of
> the old content.

Yup.
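
The switch-over described above can be sketched in Python as follows
(Python only for brevity). The WebappSlot and App classes are
hypothetical stand-ins, not Cocoon or Avalon types, and the sketch
deliberately ignores requests still running inside the old version when
it is disposed; a real implementation would have to wait for those.

```python
import threading


class App:
    """Stand-in for a deployed webapp version."""

    def __init__(self, version):
        self.version = version
        self.disposed = False

    def handle(self, request):
        return f"{self.version}: {request}"

    def dispose(self):
        self.disposed = True


class WebappSlot:
    """Holds the live version of a webapp and allows replacing it."""

    def __init__(self, app):
        self._lock = threading.Lock()
        self._app = app

    def handle(self, request):
        with self._lock:
            app = self._app  # grab whichever version is live right now
        return app.handle(request)

    def upgrade(self, build_new_app):
        new_app = build_new_app()  # may be slow; the old version keeps serving
        with self._lock:
            old_app, self._app = self._app, new_app  # the actual switch
        old_app.dispose()


slot = WebappSlot(App("v1"))
print(slot.handle("/gallery"))  # v1: /gallery
slot.upgrade(lambda: App("v2"))
print(slot.handle("/gallery"))  # v2: /gallery
```

The lock is held only for the reference swap, so the setup of the new
version never blocks requests to the old one.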
 
> > Currently, development of cocoon webapps is rough and not engineered: is
> > mostly left to the user ability to manage the process.
> 
> I concur with this statement.  Hopefully my tutorial sheds some light on
> an approach that is workable for more than just my crew and I.

Which is great and will make Cocoon much more widely used. Still, I think
that good documentation should not *patch* an architectural design
limitation; it should shed light for both the users (to make them work in
the best scenario) and the developers (to make it possible for users to
be guided by the software first *and* the documentation later, not the
other way around).
 
> > In the future, I'd love to make it possible to design the system in such
> > a way that concerns are well kept separate even during the two stages,
> > development and production, for example, performing sitemap
> > interpretation during developement (since no high load is required, but
> > faster responsiveness at structure changes) while performing sitemap
> > compilation on deployment. Same thing for compiled XML.
> 
> +1

This is, IMO, a big thing. Currently Cocoon is damn powerful but not
very usable. My goal for the future is to keep the power (or even
increase it, if new ideas emerge) and bring the usability up to the same
level as its power, without sacrificing anything that we have already
achieved.
 
> > Ok, hope this is enough to start a discussion. If you have any
> > suggestion to shape the way you will develop, deploy, manage your future
> > webapps in cocoon, make yourself heard now at design stage so that we
> > can get down in coding with a clear indication on what the people want
> > or would like to see.
> 
> I made another proposal that would be I think a better approach than the
> sitemap concept as a whole.  Basically, it is breaking the sitemap into
> pieces that are required for the management of the resources.  It keeps
> the graphic designers in charge of their resources, Developers in charge
> of their resources, etc.  It was a few months ago (I think around March/April).
> I proposed it to Giacomo as I did not want to distract from the focus on
> C2 release at the time.  I will see if I can dig up the proposal in my
> archives.

Yes, I've seen that and I concur that the sitemap forces concerns to
overlap.

I'll be very happy to continue this thread and extend it to what we need
to do to make it easier to use cocoon from a user's perspective and to
allow more complete separation of concerns.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------




Re: [RT] Cocoon web applications

Posted by Berin Loritsch <bl...@apache.org>.

Berin Loritsch wrote:
> 
> I made another proposal that would be I think a better approach than the
> sitemap concept as a whole.  Basically, it is breaking the sitemap into
> pieces that are required for the management of the resources.  It keeps
> the graphic designers in charge of their resources, Developers in charge
> of their resources, etc.  It was a few months ago (I think around March/April).
> I proposed it to Giacomo as I did not want to distract from the focus on
> C2 release at the time.  I will see if I can dig up the proposal in my
> archives.


The URL for the first posting is here:

http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=99443970706638&w=2

There are a number of responses that can be followed from that link.


Re: [RT] Cocoon web applications

Posted by Berin Loritsch <bl...@apache.org>.

Stefano Mazzocchi wrote:
> 
> This is something I always wanted to bring up but never considered it to
> be a priority high enough, but as soon as Cocoon2 reaches a status where
> it's useable in production sites, people will start asking for something
> more user friendly than a WAR file that can get as big as hell.

I had a solution to this problem posted a while back.  It dealt with fixing
a number of issues related to classloaders and installation.  Basically,
you have the option of the standard war file, an installed cocoon base,
or an EAR file with all the Cocoon libs in one place.

I am working on a way to make the ExcaliburComponentManager more resource
friendly, which should really help.  There are some issues to work out, but as that
is an Avalon Excalibur optimization, I would prefer to discuss those details
on the Avalon developer's list.

> ok, pacifist disclaimer given, I'll begin by saying that I love the way we
> install cocoon in a big file we deploy on all servlet containers. Many
> have already personally expressed their happiness to me compared to the
> hassle that installation required for Cocoon1. This is mainly
> due to the webapp deployment concept that the latest servlet API includes.

Yes.  This is one of the goals we had for C2.  For the most part, it is pretty
good.

> So, if you consider Cocoon and its samples a single web application,
> this way is perfect and you will never need anything else, but
> as soon as you start adding your own stuff, you'll find out that Cocoon
> is not a single web application but a framework for applications.

Wasn't this mentioned on the web site?  Yes, Cocoon is a publishing/webapp
framework, just as Avalon is a Server Side Framework.  However, I have some
reservations about the solution you are proposing below, and hopefully I
can voice them in an intelligent manner.

> In short, as Tomcat is a container for servlet-based web applications,
> Cocoon is a container for cocoon-based web applications. The parallel is
> evident to me and should *not* require (as Jeremy was asking, clearly
> touched by the same feeling) separate cocoon instances just to *deploy*
> different cocoon-based web applications.

As an avid Cocoon developer and user (I have sold my IT department on it),
I prefer to grow my Cocoon Webapp in a controlled environment, without
automatic deployment of new sections.  I will expound below.

> In this respect, there is a big parallel between servlet-based web
> applications and cocoon-based web applications: both require a
> "deployment descriptor" that gives the container instructions on where
> to "mount" it, where to find web-app specific components, libraries and
> resources.
> 
> Clearly, the sitemap is the closest thing that matches this.

It may be the closest thing, but it is not the best thing.  Currently, in
order to mount a sub-sitemap (effectively what you are talking about), you
must include an entry in the parent that shows where the child sitemap is.

This allows for a controlled URI space--something you have been an avid
proponent of.  Automatic discovery and deployment of Cocoon webapps requires
that we allow our URI space to be at the mercy of your deployment tool.
Clearly, you cannot see that as an optimal solution.  It is the same flaw
in WAR files.  Yet another issue arising from the URI space issue is where
to locate all your images and resources.  I prefer to have one URI for each
resource, whether it be image or stylesheet.  I can't reference an image at
"/images/foo.jpg" because my webapp or cocoonapp may not be installed at
the root context.  There really is no way of automatically rewriting the
location to reference the context root.  What I end up doing is writing
a rule in my sitemap that makes "images/foo.jpg" read the same resource
no matter what directory it is called from--effectively polluting my URI
space.
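
To make the relocation problem concrete, here is a rough Java sketch of what
an automatic rewrite would have to compute: the "../" prefix that leads from
the requested document back to the root of the mounted application.  The
class and method names (RelativePrefix, toRoot, resolve) are made up for this
illustration; nothing like this is claimed to exist in Cocoon.

```java
// Hypothetical helper, not part of Cocoon: computes the "../" prefix that
// leads from a request path back to the root of the mounted application.
final class RelativePrefix {
    private RelativePrefix() {}

    /** requestPath is relative to the mount point, e.g. "docs/guide/index.html". */
    static String toRoot(String requestPath) {
        // Ignore a leading slash so "/docs/a.html" and "docs/a.html" agree.
        String path = requestPath.startsWith("/") ? requestPath.substring(1) : requestPath;
        StringBuilder prefix = new StringBuilder();
        // One "../" for every directory level above the requested document.
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') {
                prefix.append("../");
            }
        }
        return prefix.toString();
    }

    /** Builds a link to a root-relative resource that works from requestPath. */
    static String resolve(String requestPath, String resource) {
        return toRoot(requestPath) + resource;
    }
}
```

With such a helper, a page served at "docs/guide/index.html" would link to
"../../images/foo.jpg" and keep working wherever the application is mounted,
without an extra sitemap rule per directory.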

Another side effect of the sitemap is that you cannot refer to resources in
a parent sitemap.  The whole concept is centered around the resources being
in the same directory as the sitemap, or below.  I find this frustrating,
as I am forced to duplicate resources in my directory hierarchy just so the
entire site can have the same look and feel.

Clearly, Cocoon needs to focus on some key resource resolution issues to
make it work properly.  Part of the issues come from resolving the names.
Perhaps views are our ticket to standard look and feel.  However, the site
maintainer needs the prerogative of adjusting the look and feel of the site
without affecting the logic of the site.

> Let's make a solid example: I started integrating my image gallery thing
> which now requires 12 (or so) new classes added to the Cocoon
> distribution (some 6 new components), but they are general enough to be
> useable on many other circumstances, but one component which is simply
> too specific to be of any use in other circumstances.
> 
> Currently, the operations that we have to do to *install* a new
> cocoon-based web application are:
> 
>  1) prepare a directory with all the required files
>  2) mount the new web-app sitemap in the sitemap that controls the
> URI-space we want to mount our stuff on
>  3) place our web-app specific components in the folder for new
> components (defined in cocoon.conf, if my memory doesn't fail)

Cocoon.xconf actually.

>  4) have the servlet container restart the entire web-application
> handled by Cocoon.

This part is a pain.  I have a solution to remove the need to restart
the webapp when we change Cocoon.xconf.  Avalon Excalibur has a new
resource monitoring system.  With the ActiveMonitor, the servlet can
know when to reload Cocoon.xconf and apply the changes.
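
The general idea can be sketched in a few lines of Java.  This is only a
stand-in for the concept and is NOT the Avalon Excalibur ActiveMonitor API:
the ConfigWatcher class, its poll() method and the reload callback are all
assumptions made for the example.  The timestamp source is passed in as a
function so the sketch does not depend on a real file.

```java
import java.util.function.LongSupplier;

// Stand-in for the idea only; this is NOT the Avalon Excalibur ActiveMonitor API.
final class ConfigWatcher {
    private final LongSupplier lastModified; // e.g. configFile::lastModified
    private final Runnable onChange;         // e.g. re-read the configuration
    private long lastSeen;

    ConfigWatcher(LongSupplier lastModified, Runnable onChange) {
        this.lastModified = lastModified;
        this.onChange = onChange;
        this.lastSeen = lastModified.getAsLong();
    }

    /** Checks the timestamp once; returns true if a reload was triggered. */
    synchronized boolean poll() {
        long now = lastModified.getAsLong();
        if (now != lastSeen) {
            lastSeen = now;
            onChange.run();
            return true;
        }
        return false;
    }
}
```

A background thread would call poll() every few seconds, with the timestamp
supplied by something like java.io.File's lastModified() on the configuration
file; the servlet itself never needs to be restarted.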

> While, following the servlet parallel, we should do:
> 
>  1) have a CWA (Cocoon Web Application) file with a manifest file (or
> equivalent thing) that specifies where is the sitemap file (I'm also
> happy with forcing the sitemap file to be called sitemap.xmap and places
> in the root of the package, thus eliminating the need for such a
> manifest file) and contains all the required things (resources,
> stylesheets, additional components and libraries, entity catalogs,
> etc..).
> 
>  2) open the cocoon manager (similar to Tomcat 4.0 manager webapp, just
> *much* more user friendly) and authenticate (if more security is
> required this could be mapped over an SSL-secured connection and
> authentication guaranteed by client-side certification, but this is none
> of our concern since Cocoon doesn't handle nor should that part of the
> HTTP connection).
> 
>  3) upload the CWA file (unlike Tomcat 4.0 manager which simply requires
> you to indicate where the CWA file is on the machine, with upload we can
> deploy a CWA from another machine entirely which is a great feature).
> 
>  4) tell Cocoon to start the deployed CWA
> 
> and that's it, without even having to stop Cocoon or even tell the
> servlet container about what we are doing.

It sounds promising, but I am not sold on it yet.  Here are the issues
I have--and added complexity is the least of them.  First, the sitemap
is too self contained to allow for meaningful deployment of new functionality
while maintaining consistent look and feel.  The CWA would be self-contained
so that it would use its own resources and not be allowed to be overridden
by the parent.  Effectively, the new functionality would have the same
look and feel regardless of where it is used.

That part negates one of Cocoon's strengths: the ability to have a consistent
look and feel in a scalable manner.  The only way I see around that is if the
user is wise, and implements the sitemap so that the CWA only acts on XML.
For instance, one of the methods I adopted for the tutorial app in CVS HEAD
is to have a standard resource mapping for all HTML so that the look and feel
was consistent.  The mapping would then call the XML version (that returned
XML in the Stylebook format).  This allows the parent sitemap to theme any
new functionality.  Because the sitemap is too flexible, this approach can't
be enforced for the CWA--leading to unpredictable results later.

> 
> Of course, Cocoon's classloader should be rearchitected to allow several
> "contexts" which different classloaders, this will automatically solve
> the issues of having to run multiple cocoon instances to separate the
> resolution space of different cocoon-based webapps.
> 
> But there are other things that might turn out incredibly useful: almost
> everybody works with two copies, one for development and one for
> production. The first is used when developing, the second is deployed
> and used until changes are required.
> 
> Everybody that has real-life working knowledge knows that is almost
> impossible to force people to work on a centralized version, especially
> if the easiest way to modify something is to work on what is currently
> live.
> 
> Currently, the processing cycles are something like:
> 
>  1) write your webapp under the /cocoon2/ folder

1) create a new context with Cocoon installed.

>  2) use cocoon build file to generate the WAR file (which contains your
> stuff as well)

2) use your own build process (usually derived from Cocoon's build file).

> but then you note that your stylesheets have something wrong, so you
> don't do this over and over (since the cocoon-war file is so big and
> restarting the entire crap takes forever and a half) but simply modify
> the stylesheets in-place while they are live.
> 
> You can bet your ass that you'll forget to copy back the changes to your
> original location.
> 
> Result: next revision gets deployed, many things that previously worked
> well (especially in sections you didn't touch because they were just
> perfect as they were) don't work anymore. This is called: lost update.

Actually, when I am developing a webapp using Cocoon I end up doing it
"live" in the servlet engine.  When I have it working correctly, I copy
it back to the build process.  This leaves the CVS logs not as incremental
as they should be--but it works.

Another alternative is to have CVS check out the webapp directly into
Tomcat's webapps directory and work on it from there.  The rest of the
stuff is how to assemble the webapp--so there is very little extra work.

> One solution is to do the deploying once cocoon2 fresh out of the box,
> then install your stuff over on the deployed version.

-1  The process of overwriting existing stuff is very complex.

> NOTE: the servlet API doesn't say *anything* about what happens to
> deployed files that are subsequently modified after being unpacked from
> the WAR. In some circumstances, the container might even erase the
> unpacked version when the web-app is stopped or the container is shut
> down in order to save space. The Servlet API assume the WAR file and the
> unpacked version are the *same* and unpacking occurs only for speed
> reasons, not to allow you to modify things live.
> 
> So, you installed Cocoon2 fresh, it gets unpacked, you stop tomcat
> without shutting down but simply kill -9 it or CTRL-C on the shell, you
> add your stuff and work well.
> 
> C'mon, this is crap, we must come up with something smarter.

Exactly.

> Ok, the idea is: how do I make the files deployed unmodifiable?
> 
> My solution is: compile everything. Tranquillity by obscurity.
> 
> If you transform all XML files into CompiledXML files (using the code I
> wrote for a long time ago and which is now used in the cache system),
> not only parsing performance is greatly improved on live sites, but also
> we obtain that people will very unlikely modify directly the unpacked
> files because they are, in fact, binary.

I like this approach.  Especially for the XSP pages.  I would even go so
far as to write a Sitemap mangler that compiled your XSP pages and included
the classes in the proper location and rewrote your sitemap.xmap file so that
they are treated like hand written generators.

> This also means precompiling sitemaps and XSPs and everything that needs
> compilation.
> 
> Of course, this is not suitable for close-cycle development of cocoon
> web apps: I could not want to have to recompile my entire CWA, deploy,
> restart, etc, everytime I have to modify my stylesheet, it would be
> foolish to impose this, but on production this makes a real difference,
> especially in those places where carefully scrutinized quality assurance
> phases must be performed before something enters production.

Exactly.  Providing a tool that performs this step is a boon.  It takes
all the guesswork out of compiling everything and makes the entire app
self contained.  I went through a nightmare install using ColdFusion (I
know I keep rehashing this one, but it is burned in my memory), because
there was no way to simply drop an archive and have everything work.

> In these situations, we must take all the actions to allow packages as
> sealed as possible (possibly even crypto-sealed) to be deployed even
> remotely on a live site, making also possible to upgrade an existing
> package with a new one while the other is running (which is not that
> hard to do with carefully designed multi-threading management of
> subcontexts).

:)  I have learned a few things about threading and race conditions.
Seriously, I get your point.  Everything would be run from the context
that is currently installed until the new version is installed and fully
set up.  Then we switch to serving the new content while we dispose of
the old content.
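
A minimal Java sketch of that "set up first, then switch" step is below.
All names here (AppSlot, App, upgrade) are hypothetical and only illustrate
the swap; this is not how Cocoon or any servlet container is known to
implement it.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical names throughout; a sketch of "set up first, then switch".
final class AppSlot {
    /** Minimal stand-in for a deployed application context. */
    interface App {
        String handle(String uri);
        void dispose();
    }

    private final AtomicReference<App> current;

    AppSlot(App initial) {
        this.current = new AtomicReference<>(initial);
    }

    /** Requests always go to whichever version is installed right now. */
    String handle(String uri) {
        return current.get().handle(uri);
    }

    /** The caller passes a replacement that is already fully set up. */
    void upgrade(App replacement) {
        // Atomic switch: new requests see the replacement from here on.
        App old = current.getAndSet(replacement);
        // A real implementation would first wait for requests still running
        // on the old version to finish; that bookkeeping is omitted here.
        old.dispose();
    }
}
```

The atomic reference covers only the switch itself.  Waiting for in-flight
requests before disposing of the old version is where the race conditions
live, and the sketch deliberately leaves that part out.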

> Currently, development of cocoon webapps is rough and not engineered: is
> mostly left to the user ability to manage the process.

I concur with this statement.  Hopefully my tutorial sheds some light on
an approach that is workable for more than just my crew and me.

> In the future, I'd love to make it possible to design the system in such
> a way that concerns are well kept separate even during the two stages,
> development and production, for example, performing sitemap
> interpretation during development (since no high load is required, but
> faster responsiveness at structure changes) while performing sitemap
> compilation on deployment. Same thing for compiled XML.

+1

> Ok, hope this is enough to start a discussion. If you have any
> suggestion to shape the way you will develop, deploy, manage your future
> webapps in cocoon, make yourself heard now at design stage so that we
> can get down in coding with a clear indication on what the people want
> or would like to see.

I made another proposal that would be, I think, a better approach than the
sitemap concept as a whole.  Basically, it is breaking the sitemap into
pieces that are required for the management of the resources.  It keeps
the graphic designers in charge of their resources, Developers in charge
of their resources, etc.  It was a few months ago (I think around March/April).
I proposed it to Giacomo as I did not want to distract from the focus on
C2 release at the time.  I will see if I can dig up the proposal in my
archives.
