Posted to dev@avalon.apache.org by Stephen McConnell <mc...@apache.org> on 2002/11/03 15:41:12 UTC

[Fwd: Re: [design] Cocoon Blocks 1.1]


-------- Original Message --------
Subject: Re: [design] Cocoon Blocks 1.1
Date: Sun, 03 Nov 2002 15:32:18 +0100
From: Stephen McConnell <mc...@apache.org>
Reply-To: cocoon-dev@xml.apache.org
To: cocoon-dev@xml.apache.org
References: <3D...@apache.org>



Stefano Mazzocchi wrote:

Stefano:

I have read the Blocks 1.1 description with interest. First of all, 
thanks to everyone who contributed to this.  I have a number of notes 
in-line, some of which I am sure will reflect my ignorance concerning 
the Cocoon world/terminology.  My thoughts are strongly related to my 
background with Avalon, experience with Merlin and Fortress, and usage 
of the excalibur/meta package.

>
>  +---------------------------+
>  | Part 2: technical details |
>  +---------------------------+
>
> Ok. Now that we have described where we want to go, let's describe how.
>
> Cocoon Blocks
> -------------
>
> A Cocoon block is a zipped archive, just like JARs and WARs.
>
> The suggested extension of a cocoon block is ".cob" (for COcoon Block).


Another suggestion ... BAR - Block ARchive.

The reason for suggesting this is that the concept of a JAR/WAR style 
deployment unit is something I've been looking at within the Merlin 
framework.  It seems to me that the notion of a block is usable at a 
level that spans many different applications and, based on the 
requirements and descriptions here, I don't see any immediate Cocoon 
specifics except for the inclusion of the sitemap and default sitemap 
semantics (more notes on that later).

>
> The suggested MIME type is "application/x-cocoon-block".


And following the BAR notion .. "application/x-block"?

>
> A Cocoon Block (COB from now on) includes a directory called
>
>  /BLOCK-INF
>
> which contains all the block metadata and the resources that must not be
> directly referentiable from other blocks (for example, jars, classes or
> file resources made available thru the classloader). The directories
>
>  /BLOCK-INF/classes
>  /BLOCK-INF/jar
>
> are used for classes and jar files. [This follows the WAR paradigm]


For consistency with the Servlet spec (Web Applications/SRV.9.5 Directory 
Structure) - I suggest /BLOCK-INF/jar be changed to /BLOCK-INF/lib.

>
> The main COB descriptor file is found at
>
>  /BLOCK-INF/block.xml
>
> This file contains markup with a cob-specific namespace and will 
> include the following information:
>
>  1) block implementation metadata:
>      - unique URI identifier [this identifier will also be used as an 
> address on where to locate the block and how to download it from the 
> web!] (example: http://mystuff.org/dist/myblock-1.5.34.cob)
>      - version (1.5.34)
>      - short name (My Block)
>      - description
>      - author
>      - URI of license (http://mystuff.org/dist/license)
>      - URI of the distribution location 
> (http://mystuff/dist/latest/myblock.cob)
>      - ???
>
>  2) role(s):
>      the URI(s) of the behavioral role(s) this block implements
>      and exposes [optional]


When you are using the word "role", is it safe to assume that this is a 
URI that resolves to a description of the set of computational 
"service"(s) that a block is capable of providing? If this is correct - 
then I would suggest renaming this to "service(s)".  My rationale here 
is that a "role" (to me) is more correctly aligned with the consumer of 
a service - Block B1 is dependent on service X for the role of 
"authorization".  If I understand correctly, the notion you are 
describing is a collection of an interface plus the version range 
supported by a block that would enable it to be supplied to Block B1 in 
order to fulfill the service dependencies that B1 has with respect to 
its "authorization" concerns.

Keep in mind that I'm biased relative to the Merlin/Phoenix conventions 
here of using the word "service" to describe the functionality exported 
by a component. I'm extending that notion on the assumption that a block 
exposes a set (or sub-set) of the services provided by the components it 
is aggregating.

Just as a side note, you may want to think about separating "block" URIs 
from "service" URIs.  This is something I've been working on recently in 
Merlin - and the separation of component provider from service has proved 
valuable.  It ensures that the concept of a service is not tied to a 
particular implementation unit (block or component).  Separation of 
component implementation meta data from the service meta data is already 
in place under the excalibur/meta package for the same reasons.
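
To make the separation a little more concrete, here is a minimal sketch 
in Python.  The class names, fields and example URIs are all 
hypothetical; this is not the Merlin or excalibur/meta API, just a 
picture of a service being identified independently of the blocks that 
happen to provide it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceRef:
    """A service identified by its own URI and version (hypothetical model)."""
    uri: str
    version: str


@dataclass
class Block:
    """An implementation unit identified by a separate block URI."""
    uri: str
    provides: list = field(default_factory=list)  # ServiceRef instances exported
    depends: dict = field(default_factory=dict)   # role name -> ServiceRef needed


def providers_of(service, blocks):
    """Return the URIs of the blocks that export the given service."""
    return [b.uri for b in blocks if service in b.provides]


# Example data; every URI below is made up for illustration.
auth = ServiceRef("http://example.org/services/authorization", "1.0")
b1 = Block("http://example.org/blocks/portal-1.0.bar",
           depends={"authorization": auth})
b2 = Block("http://example.org/blocks/ldap-auth-2.1.bar", provides=[auth])
b3 = Block("http://example.org/blocks/file-auth-1.3.bar", provides=[auth])

# B1 names a role ("authorization") and the service it needs for that role;
# any block exporting that service is a candidate provider.
candidates = providers_of(b1.depends["authorization"], [b1, b2, b3])
print(candidates)
```

Here B1 never refers to another block's URI; it only declares the 
service it needs for the "authorization" role.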

>
>  3) dependencies:
>      the URI(s) of the behavioral roles this block expects,
>      along with the prefixes used by the block as shortcuts in protocol
>      resolving (see below for the meaning of this) [optional]


I'm guessing that you're referring to the inclusion of a roles file - is 
that correct? How does this compare to something like the dependencies 
declaration used in the excalibur/meta package?
http://jakarta.apache.org/avalon/excalibur/meta/dependencies.html

>
>  4) inheritance:
>      the URI of the block extended. [optional]


It seems to me that there are two distinct inheritance concerns: (a) 
block inheritance and (b) component inheritance (assuming that a block 
aggregates components).  Block inheritance would handle the case of 
resources and the ability to redefine resources in derived blocks.  
Component inheritance should be handled at the component type level and 
should not be linked to block inheritance.

>
>  5) sitemap:
>      the location inside the block file space of the sitemap
>      [optional, if not found defaults to '/sitemap.xmap']


This one - I'm not sure about - does it make sense for this to be part 
of a generic block specification, or is it part of a block that provides 
functionality derived from a sitemap? Perhaps this is the point where a 
COB extends a BAR?

>
>  6) configurations:
>      the configurations required for this block to function [optional]


Some clarification is needed here - I'm assuming that a block is a 
collection of components.  Each component would have its own meta info 
(explicit or derived).  At the level of a block I can imagine 
information that describes profiles of component usage, and instructions 
concerning the assembly of profiles that will result in the 
establishment of a computational system (I'm talking about the internal 
assembly of a block here - not block assembly).  This internal 
"assembly" level information can be considered as the block 
configuration but should not be confused with component configuration 
data.

On the subject of component configuration, there are three different 
levels of component configuration that are handled within the Merlin 
container.  The first type is static configuration defaults (established 
by a developer and bundled with the class), the second type is 
configuration data associated with a named deployment profile (i.e. 
component X deployed using profile P1 is different to component X 
deployed using profile P2).  The third category of configuration data is 
data defined by an administrator that typically supplements a profile, 
which in turn supplements default configuration data.
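
The layering can be pictured with a small Python sketch.  It assumes 
flat key/value configuration, which is a simplification (real 
configuration data is hierarchical), and the function and key names are 
hypothetical rather than anything taken from Merlin.

```python
from collections import ChainMap


def effective_configuration(defaults, profile, admin_overrides):
    """Merge three configuration layers into one dictionary.

    ChainMap looks keys up in the first mapping that defines them, so the
    administrator's values win over the profile, which wins over defaults.
    """
    return dict(ChainMap(admin_overrides, profile, defaults))


# Layer 1: static defaults bundled with the component class.
defaults = {"port": 8080, "threads": 4, "cache": "off"}
# Layer 2: a named deployment profile (P1) that tunes the defaults.
profile_p1 = {"threads": 16}
# Layer 3: values supplied by an administrator at deployment time.
admin = {"cache": "on"}

config = effective_configuration(defaults, profile_p1, admin)
print(config)
```

Deploying the same component under a different profile only changes the 
middle layer; the bundled defaults and the administrator's data stay 
where they are.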

>
>
> Also, the /BLOCK-INF/ directory contains the 'roles' file for Avalon
> components:
>
>  /BLOCK-INF/roles.xml


I've been thinking about how to handle roles versus the more formal meta 
data approach used in Merlin. One of the first things that is needed at 
the component level is the declaration of the mechanism used to bring 
external data into the meta-data model.  Markus has already started 
working on content in this subject and I'll be shifting some of the 
meta-data content out of Merlin to the excalibur/meta package in the 
near future as part of supporting this work.  In effect there should not 
be a need to include a /BLOCK-INF/roles.xml at the spec level - instead 
one should be declaring a meta management strategy at the component 
level, and possibly a default strategy at a block level.  This would 
enable the deployment of ECM style components without change, together 
with non-ECM components.  Specification of the inclusion of a roles file 
would be part of an ECM meta strategy spec.

>
>
> Possible use-case scenario
> --------------------------
>
> Suppose you have your naked cocoon running in your favorite servlet
> container, and you want to deploy myblock.cob. Here is a possible
> sequence of actions on a hypothetical web interface on top of Cocoon
> (a-la Tomcat Manager)
>
>  1) upload the myblock.cob to Cocoon
>
>  2) Cocoon scans /BLOCK-INF/, reads block.xml and finds out the 
> behaviors this block depends on as well as the block that it extends.
>
>  3) the block manager connects to the uber "Cocoon Block Librarian" 
> web service (hosted probably on cocoon.apache.org) and asks for the 
> list of blocks that exhibit that required behavior.
>
>  4) the librarian returns a list of those blocks, so the user chooses,
> or the manager allows the user to deploy its own block that implements
> the required behavior or to reuse those already deployed blocks that 
> implement the required behaviors.
>
>  5) Cocoon checks that all dependencies are met, then unpacks and 
> installs the blocks
>
>  6) For each block that exposes a sitemap, the deployment manager asks
> the deploying user where he/she wants to *mount* that block in the
> managed URI space or if he/she wants to keep them internal only (thus 
> only available to the other blocks, but not mounted on the public URI 
> space) 


The above comment is probably the point where a COB comes into focus 
as a specification that extends a more generic BAR specification (i.e. 
COcoon Block could be viewed as an extension of a generic component 
Block ARchive).

>
>
>  7) for each block that requires installation-time configurations, the 
> block manager will present the user information on how to configure 
> the block.
>
>  8) If no collisions in the URI spaces are found, the blocks are made
> available for servicing.
>
>
> Resource dereferencing
> ----------------------
>
> Security concerns aside, the above scenario shows one major issue:
> blocks are managed, deployed and mounted by the container. There is no
> (and there should not be any) way for a block to directly access
> another block because this would ruin IoC.


If you follow the separation of "block" from "service" you can avoid 
this issue.  In effect, "service" is what is exposed by the assembly 
system - the block never needs to be exposed.  However, this does not 
address the complete picture.  The block concept includes resources as 
well as services.  To complete the picture, the block would need to 
declare accessible resources (something not addressed in the 
excalibur/meta or Merlin system).

The idea of separating "block" and "service" has significant 
implications - firstly, the structural units of deployment are separate, 
which means that a service interface, related meta and resources can be 
loaded independently of a block.  You need to be able to do this as soon 
as you get into classloader hierarchies in which service definitions 
appear higher in the classloader hierarchy than the implementations 
(i.e. the service definitions are shared whereas the block 
implementation is protected).

>
> So, one block doesn't know where the blocks it depends on are located,
> both on disk *and* on the URI space as well.
>
> The proposed solution is to use block-specific protocols to identify the
> dereferenced resources.
>
> For example, the myblock.cob/sitemap.xmap file could contain a global
> matcher which works like this:
>
>    
>     
>     
>     
>    
>
> please note the
>
>  block:skin:/stylesheets/document2html.xslt
>
> which indicates
>
>  block -> use the block protocol
>
>  skin -> use the 'skin' prefix to lookup the block behavior URI and thus
> the block which implements it for this block (the block manager knows
> this)
>
>  /stylesheets/document2html.xslt -> it will ask the sitemap of the 
> skin block to produce that resource.
>
>
> Dereferencing navigation
> ------------------------
>
> Not only a sitemap needs to connect to the resources contained in the
> blocks on which the block depends on, but the resulting pages as well.
>
> In fact, suppose you have a block that exposes a web service and another
> one that exposes a web application that wraps that web service. For
> sure, the generated web page will have to have a URI to connect to that
> service, since it's the client's browser that makes the call (unless we
> want to virtualize everything thru the sitemaps, but I wouldn't suggest
> it).
>
> So, a possible solution is to use the "block:" protocol in the pages as
> well and have a URI-mapping transformer right before the serialization
> stage.
>
> For example, things like
>
> 
...


>
> is transformed into
>
> 
...




Not following the above too well (probably some prior Cocoon knowledge 
that I'm missing).  One thing that I feel uncomfortable about is the 
direct reference in the action statement to the "block" as distinct from 
a reference to a "service".

E.g. The following would make me feel better:

  
...



:-)

>
>
> Some design decision taken
> --------------------------
>
> o) NO BEHAVIOR VALIDATION:
>
> I thought a lot about it but I think that having 'behavior description
> languages' (such as the WSDL-equivalent for blocks) is going to be
> terribly complicated, expensive to implement and hard to use and
> enforce, even for simple blocks which don't expose a sitemap and are
> just repositories for information.
>
> For this reason, there is no validation taking place: if a block
> implements a particular behavior and exposes it thru its descriptor
> file, Cocoon automatically assumes it implements the behavior correctly.
>
> In the future, we might think of adding a behavior description layer to
> enforce a little more validation, but I fear the complexity (for
> example) of validating stylesheets against a particular required
> behavior.
>
> IMO, only human try/fail and patching will allow interoperability. 


Given sufficient meta-info (type-level) plus meta-data (profile-level) 
it is possible to do validation on components prior to the assembly of 
blocks/components into a running system.  The validation phase does 
things like ensuring that meta-data is consistent with the 
implementation, that references to resources actually refer to existing 
resources, etc.  This type of validation does not need any 
supplementary language because it's simply ensuring the consistency of a 
logical system before system deployment.  Validation could be applied at 
block creation time, and during multi-block assembly.
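
A minimal sketch of this kind of pre-deployment consistency check, in 
Python.  The dictionary layout, the service names and the resource paths 
are assumptions made up for the example; the point is only that the 
check compares declarations against what is actually available and 
needs no behavior description language.

```python
def validate(block, available_services, archive_entries):
    """Return a list of problems; an empty list means the block is consistent.

    block              -- dict with "dependencies" (role -> service id)
                          and "resources" (list of declared paths)
    available_services -- set of service ids offered by other blocks
    archive_entries    -- set of paths actually present in the archive
    """
    problems = []
    for role, service in block["dependencies"].items():
        if service not in available_services:
            problems.append(
                f"dependency '{role}' needs unavailable service {service}")
    for path in block["resources"]:
        if path not in archive_entries:
            problems.append(
                f"declared resource {path} is missing from the archive")
    return problems


# Hypothetical block meta-data with one unmet dependency and one
# declared resource that is not in the archive.
block = {
    "dependencies": {"authorization": "auth/1.0", "skin": "skin/2.0"},
    "resources": ["/stylesheets/document2html.xslt", "/images/logo.png"],
}
problems = validate(block,
                    available_services={"auth/1.0"},
                    archive_entries={"/stylesheets/document2html.xslt"})
for problem in problems:
    print(problem)
```

The same function could be run once when a block is created (against 
its own archive) and again during multi-block assembly (against the 
services the other blocks export).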

>
>
> o) VERSIONING AS PART OF THE BEHAVIOR URI
>
> The behavior URI *MUST* terminate with a /x.y that indicates the
> major.minor version of the behavior that a block implements. 


Can you explain the *must*?  The conventions used in the excalibur/meta 
package assume a default value of 1.0 if no version information is 
supplied.  My experience is that this is good for the developer but bad 
for the user. Users typically prefer the most recent stable release as 
a default value.

I also have some reservations about "/" as the appropriate version 
delimiter - because it would break what is already running in Merlin :-)
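
As an illustration of an optional version with a default, here is a 
small Python sketch.  The function name and the ":" delimiter used as 
its default are assumptions chosen for the example, not a statement of 
the syntax Merlin or excalibur/meta actually uses; the delimiter is a 
parameter precisely because that choice is what is under discussion.

```python
import re


def split_version(reference, delimiter=":", default="1.0"):
    """Split 'name<delimiter>x.y' into (name, version).

    A trailing version is recognised only if it is purely numeric
    (digits separated by dots); otherwise the default version applies.
    """
    pattern = (r"(?P<name>.+?)(?:" + re.escape(delimiter)
               + r"(?P<version>\d+(?:\.\d+)*))?")
    match = re.fullmatch(pattern, reference)
    if match is None:
        raise ValueError(f"not a valid reference: {reference!r}")
    return match.group("name"), match.group("version") or default


# No version supplied: the default of 1.0 is assumed.
print(split_version("org.example.Skin"))
# Explicit version with the (assumed) ":" delimiter.
print(split_version("org.example.Skin:2.1"))
# The "/x.y" termination from the proposal, using "/" as the delimiter.
print(split_version("http://example.org/behaviors/skin/1.0", delimiter="/"))
```

Swapping the default from a fixed "1.0" to "most recent stable release" 
would need a lookup against the known versions rather than a constant, 
which is outside the scope of this sketch.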

>
>
> On dependencies, each block must be able to specify the 'ranges' of
> versioning that it is known to work with. For example
>
>    prefix="skin"/>
>
> But I haven't really thought about the patterns that could be used for
> this.
>
> Please, help on this.


Some useful documentation concerning "component" level meta info for the 
type level is available on the excalibur/meta package.  This meta info 
*only* deals with the component type level (equivalent to information 
supplementing the component implementation classes and service interface 
classes).  

  http://jakarta.apache.org/avalon/excalibur/meta/index.html

Meta information concerning the description of "profiles" (the 
configuration data, context directives, etc.) is defined under the 
Merlin 2 API.  The Profile Javadoc is a good starting point.

  http://jakarta.apache.org/avalon/excalibur/merlin/api/

>
> o) CROSS-BLOCK SECURITY
>
> Even if I don't think anybody is stupid enough to use a single Cocoon
> instance to run a full ISP and ask for sandboxing of the single blocks,
> cross-block security is a big concern, especially since you might be
> deploying components on the fly in a binary format.
>
> So, first thing is to protect the /BLOCK-INF/ directory.
>
> The second thing is to wrap each block with its own classloader,
> connected to the block dependency map, so that each class discovery is
> done only on the class space of the dependent blocks.
>
> [NOTE: this doesn't prevent people from using blocks as trojans, but we
> won't host blocks which don't come with the source code so we solve that
> problem].
>
> o) COCOON MANAGER SECURITY
>
> The cocoon manager might be a block itself that connects to specific
> cocoon internals and provides a web interface for it. So, it can be
> removed or disabled when put on production.
>
> Also, the feature of automatic discovery of blocks thru the 'cocoon
> block library' can be turned off or substituted with its own (even the
> 'cocoon block library' could be a block, so you could have your own
> block library on your system instead of connecting to the apache one).


The points above are examples of the requirement for the population of 
a container with a set of internal facilities (as opposed to components 
that a container is managing) - it's an important distinction.  

>
> o) OPTIONAL COP
>
> The block.xml file makes it *optional* to expose behaviors or to depend
> on them. This allows the COP model to nicely downgrade to the good old
> single-archive WAR paradigm for those who don't care about block
> polymorphism.
>
>                                  - o -
>
> Conclusions
> -----------
>
> I think I have exposed a detailed plan on how to implement blocks and
> solve a number of issues we are having:
>
>  1) allow users to 'compose' Cocoon only with those modules they need
>  2) allow users to easily deploy their stuff on cocoon
>  3) allow users to easily reuse web applications components without
> sacrificing coherence, interoperability and easy extensibility
>  4) allow users to be helped by Cocoon to 'fill the gaps' and be
> suggested on what components is best required and feed it automatically
> (apt-get like)
>  5) allow the Cocoon communities to clearly separate concerns between
> the core and the application-level stuff, thus allowing the cocoon 
> community to really scale by massive development parallelization
>  6) allows, for the first time in the history of the web, to use
> polymorphism, inheritance and COP at a web application level.
>
> THANKS
> ------
>
> I would like to thank Giacomo Pati and Carsten Ziegler for their great 
> contribution and precious feedback.
>
>
> Changes from version 1.0
> ------------------------
>
> - added the concept of block inheritance
> - wrote a scenario for introduction of the COB model as an evolution 
> of the WAR model.
> - added configurations to blocks
> - changed block.info and blocks.roles into block.xml and roles.xml
> - removed issues already identified by the first round of design
>
>
> TODO
> ----
>
>  1) blocks should allow to depend on 'ranges' of behavior versions. 
> Let's try to come up with a way to describe those ranges effectively.
>
>  2) the block manager should present the user with a form on how to 
> configure the block, thus the block should contain enough 
> configuration metadata (default values, valid entries, etc.) to tell 
> the block manager how to create the form to present. Should we use RDF 
> for this or schemas are good enough?
>
>  3) Which avalon container should we use since the one we currently 
> use (ECM) is not powerful enough? is there already a container which 
> is powerful enough to handle our needs as described here? if not, what 
> do we do? we implement our own or work with the avalon people to fix 
> theirs to meet our needs?


I would *very* much like to see this as a joint Cocoon/Avalon initiative. 
On the Avalon front there are two containers that play into the 
requirements stated above - Merlin and Fortress.  However, neither of 
these containers completely addresses the requirements.  But let's look 
a little deeper and figure out where Avalon is today relative to the 
target and what potential is offered by a combination of Merlin, 
Fortress and other Avalon related initiatives.

Definition of a block as a structural package
---------------------------------------------

I would like to see an Excalibur package dealing with a BAR (Block 
ARchive) that serves as the basic structure for a COB.  This should 
include tools and utilities for BAR creation, structural validation, 
signing, etc.  There is existing content in the Phoenix app-server 
related to the SAR file format which is close to the notion of a block 
in terms of structure but is too coarse grained for the block concept. 
In addition, the work in Merlin dealing with container definition seems 
to me to be very close to the component/service management side of a 
block, but lacks the formal management of resources (i.e. a Merlin 
container only exposes services - not resources).
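
As a rough idea of what a structural validation utility might check, 
here is a Python sketch that inspects a zipped archive.  It assumes the 
/BLOCK-INF/block.xml descriptor from the proposal and the 
/BLOCK-INF/lib location suggested earlier in this reply; the function 
names are hypothetical and no such tool exists in Excalibur today.

```python
import io
import zipfile

DESCRIPTOR = "BLOCK-INF/block.xml"   # zip entry names carry no leading slash
LIB_DIR = "BLOCK-INF/lib/"           # assumed jar location (suggested above)


def check_structure(archive_bytes):
    """Return a list of structural problems found in a block archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        names = archive.namelist()
    problems = []
    if DESCRIPTOR not in names:
        problems.append(f"missing {DESCRIPTOR}")
    for name in names:
        if name.endswith(".jar") and not name.startswith(LIB_DIR):
            problems.append(f"jar outside {LIB_DIR}: {name}")
    return problems


def make_archive(entries):
    """Build an in-memory zip from a dict of entry name -> content."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in entries.items():
            archive.writestr(name, data)
    return buffer.getvalue()


good = make_archive({
    "BLOCK-INF/block.xml": "<block/>",
    "BLOCK-INF/lib/support.jar": b"",
    "sitemap.xmap": "<sitemap/>",
})
bad = make_archive({"support.jar": b"", "sitemap.xmap": "<sitemap/>"})

print(check_structure(good))
print(check_structure(bad))
```

Signing and descriptor schema validation would sit alongside a check 
like this; they are left out to keep the sketch short.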

Component meta info and meta data
---------------------------------

As mentioned above - the component type level meta info in 
excalibur/meta combined with profile level meta data in 
excalibur/assembly (model package) is a working starting point for the 
component level deployment concepts. There is some more separation work 
to be done on the Merlin side - after which much of the Merlin meta data 
model will move over to the excalibur/meta package.  This will provide a 
light-weight meta model that is container independent.  The model does 
not currently support inheritance - this would require some minor 
additions to the existing structure and some significant additions to 
the verification functions.

Assembly solutions
------------------

Merlin includes an assembly engine that automates the process of wiring 
together components based on dependencies and services. This is working 
well today but could do with some refactoring.  Notions of default 
configurations combined with packaged deployment profiles are proving to 
be excellent solutions for simplifying the overall service management 
problem.
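
The general idea of dependency-driven wiring can be sketched in a few 
lines of Python.  This is not Merlin's algorithm or API; the data 
layout, the component names and the "first provider wins" rule are 
assumptions made for the example.  It requires Python 3.9 or later for 
the standard graphlib module.

```python
from graphlib import TopologicalSorter


def assemble(components):
    """Wire each dependency to a provider and compute a start-up order.

    components -- name -> {"provides": set of services,
                           "depends": set of services}
    Returns (wiring, order): wiring maps each component to the provider
    chosen for every service it depends on; order lists components so
    that providers come before their consumers.
    """
    providers = {}
    for name, meta in components.items():
        for service in meta["provides"]:
            providers.setdefault(service, name)  # first provider wins here

    wiring = {}
    for name, meta in components.items():
        wiring[name] = {}
        for service in sorted(meta["depends"]):
            if service not in providers:
                raise LookupError(f"{name}: no provider for {service}")
            wiring[name][service] = providers[service]

    # Each component must start after the components it is wired to.
    graph = {name: set(links.values()) for name, links in wiring.items()}
    order = list(TopologicalSorter(graph).static_order())
    return wiring, order


components = {
    "portal": {"provides": set(),
               "depends": {"authorization", "persistence"}},
    "ldap": {"provides": {"authorization"}, "depends": {"persistence"}},
    "db": {"provides": {"persistence"}, "depends": set()},
}
wiring, order = assemble(components)
print(order)
print(wiring["portal"])
```

A circular dependency makes TopologicalSorter raise 
graphlib.CycleError, which is a reasonable place to report an assembly 
failure.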

Lifecycle and Lifestyle management
----------------------------------

Both Merlin and Fortress support the classic Avalon lifecycle stages 
(configuration, contextualization, composition/servicing, etc.) together 
with a common model for the introduction of lifecycle extensions. 
The respective implementations differ in that Merlin allows extension 
implementations to be components that may have their own dependencies, 
whereas Fortress does not have support for component assembly. 
Lifestyle management is equivalent in that both provide support for 
singleton, thread, pool and transient policies.  Again, implementation 
approaches differ - Fortress is very much derived from the ECM model and 
respects lifestyle marker interfaces, whereas Merlin requires lifestyle 
policy to be declared within the meta-info of a component type.  Looking 
forward, the Merlin strategy will be to declare the lifestyle processing 
strategy, allowing for definition of a plug-in handler for lifestyle 
resolution - allowing a mix of pure meta-based components together with 
ECM style and Avalon 4.1.2 marker interface recognition.
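
For readers less familiar with the lifestyle policies, here is a small 
Python sketch of three of them (singleton, thread and transient; pooling 
is omitted for brevity).  The class and method names are hypothetical 
and the code does not mirror either container's implementation; it only 
shows the policy being supplied as declared data rather than through a 
marker interface.

```python
import threading


class LifestyleManager:
    """Hand out component instances according to a declared lifestyle."""

    def __init__(self):
        self._singletons = {}
        self._lock = threading.Lock()
        self._per_thread = threading.local()

    def resolve(self, name, factory, lifestyle):
        if lifestyle == "transient":
            # A new instance for every request.
            return factory()
        if lifestyle == "singleton":
            # One shared instance for the whole container.
            with self._lock:
                if name not in self._singletons:
                    self._singletons[name] = factory()
                return self._singletons[name]
        if lifestyle == "thread":
            # One instance per requesting thread.
            cache = getattr(self._per_thread, "instances", None)
            if cache is None:
                cache = self._per_thread.instances = {}
            if name not in cache:
                cache[name] = factory()
            return cache[name]
        raise ValueError(f"unknown lifestyle: {lifestyle}")


manager = LifestyleManager()
first = manager.resolve("logger", object, "singleton")
second = manager.resolve("logger", object, "singleton")
print(first is second)
```

A plug-in handler for lifestyle resolution would amount to making the 
branches above replaceable per component or per container.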

Mixed lookup semantics
----------------------

Fortress provides complete support for the extended semantics implied 
within a lookup argument.  The Merlin 2 implementation does not support 
this.  The main issue (from my own point of view) is that the Avalon 
framework Composable and Serviceable interface semantics are 
insufficiently specified and the real requirement here is to resolve 
this at the framework level first, then apply these solutions within the 
respective containers.  In the meantime, the strategy for Merlin will be 
to plug in an ECM style manager when required at a component or 
container level (with an implementation based on existing Fortress 
code).  This will enable zero modification of existing ECM style 
components.

>
>  4) how do we implement the block manager? should it be a command line 
> interface or a web interface, or both? 


I don't think I agree with the question ;-)
Management of blocks should be independent of the means through which 
information is presented. Assume that you have a container that is 
capable of managing a set of components, resources and subsidiary 
containers ... you could imagine a management interface to the 
container, and that management interface could be accessible via the 
web, command line, JMX etc.

> what about security?


Work I've done in this area is perhaps excessive relative to what you 
have in mind.  I have a micro PKI which handles the generation of keys 
and certificates which are used for both admin and runtime 
authorization. The main difference between the work I'm doing and what 
you're describing here is that I'm dealing with distributed containers 
and I need to propagate identity with every invocation, and a single 
invocation may result in service invocation across multiple containers 
deployed at different sites, each with different security policies.

>
>  5) the 'uber library of cocoon blocks'. Where do we host it? how do 
> we manage it? How do we provide the block discovery web service? which 
> technology do we use: SOAP or REST?


My experience here is somewhat experimental at this stage.  I'm not 
using a web protocol - instead I'm passing meta model structures over 
the wire (i.e. remote invocations, but the scenario is a little 
different because I'm more concerned with service access where a service 
can be relocated locally or accessed remotely).

>
>  6) should we "digitally sign" our blocks?


Yes.

> if so, how?
>

Has anyone thought about a Cocoon Certification Authority?

Cheers, Steve.

-- 

Stephen J. McConnell

OSM SARL
digital products for a global economy
mailto:mcconnell@osm.net
http://www.osm.net



