Posted to dev@cocoon.apache.org by Jason Foster <ja...@uwaterloo.ca> on 2001/10/14 19:20:02 UTC

Incomplete draft of a high-level description of Cocoon

I thought that it might be a neat idea to go back over the Cocoon
documentation and synthesize our purpose, goals, and architecture.  This is
all based on the current Cocoon2 web site.

What is different is that I have partitioned the resources and introduced
the concepts of a Controller and a Manager.  The controller is related
to the pipeline definitions, actions, and selectors within the sitemap.  The
manager is related to the matchers.

With respect to resources I've tried to add my interpretation of the notion
of internal and external resources that was described earlier.

I don't know if this will add anything to the discussion, but I found that
the existing documentation was too sitemap-centric and didn't tell me what I
wanted to know in one place.

Please take a look at it if you're interested.  The part where I try to
describe the pyramid model is definitely weak.

It looks best with a fixed-width font.

Jason Foster

======================================
Apache Cocoon XML Publishing Framework
======================================

Overall Purpose
---------------

Cocoon provides the tools that allow you to:

  - publish a resource,
  - in the format of your choice (HTML, PDF, SVG, XForms, etc.),
  - based on the input of your choice (XML, RTF, SQL, LDAP, etc.),
  - using XML as an intermediate form.

Overall Goals
-------------

- Separation of Concerns (SoC)

Cocoon should ensure that interactions between distinct aspects of resource
publishing are both minimized and rigidly prescribed.

- Standards

Cocoon should incorporate, and where necessary implement, existing and
emerging standards for both XML publishing and software interoperability.
Such standards include XML, SAX, XSLT, XSL-FO, XHTML, XForms, LDAP, and
Avalon.

- Scalability

Cocoon should be able to process, simultaneously, multiple sizeable
resources (e.g. 100MB) using a minimum of heap and runtime overhead, at
better-than-acceptable speeds.  It should also be usable as a standalone
application, as a servlet, and as an embedded component within a larger
system.

Conceptual Models
-----------------

- Resources

Cocoon explicitly differentiates between two types of resources.  Both types
are identified by a URI.

"External" resources are context-free and have URIs that can be accessed
directly (e.g. http://kbase.info.apple.com).  Web pages and PDF documents
are examples of external resources.

"Internal" resources depend on context and have URIs that should not be
accessed directly (e.g.
http://kbase.info.apple.com/cgi-bin/WebObjects/kbase.woa/wo/2.0.6.20.4.3.0.0.1.4).

Web applications generally consist of one external resource, usually the
"Home" or "Login" page, and a set of internal resources.

- Resource Publishing

Cocoon uses a four-part conceptual model of resource publishing.  This
publishing model augments the "traditional" components of Content, Style,
and Logic, with a Management component.

  * "Content" refers to the different sources that are combined to form the
style-neutral aspects of the resource.  Sources of content include files,
databases, and other resources.

  * "Style" refers to the presentation aspects of the resource.  Such
aspects include colours, fonts, and images.

  * "Logic" refers to the dynamic aspects of generating the resource.  The
logic component controls how the content is combined and altered, as well
as which styles are used and how they are applied.

The definition of "Management" depends on the type of resource being
requested.

  * With respect to external resources, Management refers to individual
URIs. These URIs must be defined and associated with particular pipelines.

  * With respect to internal resources, Management refers to the
interconnection of resources without explicit reference to the URIs that
identify those resources.

- Pipelines

Cocoon uses the pipeline model to create resources.  XML documents, in the
form of a stream of SAX events, propagate through the pipelines. The
pipeline approach to resource creation is analogous to the filter-chaining
concept (Filter and FilterChain) introduced in the Servlet 2.3 specification.
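
The pipeline idea can be illustrated with nothing but the SAX and JAXP
classes that ship with the JDK.  The following is only a minimal sketch, not
Cocoon code: the class name SaxPipelineSketch, the RenameFilter stage, and
the <para>/<p> element names are assumptions made up for the example.  A
parser feeds events in, one filter alters them, and an identity
TransformerHandler writes them out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example class; not part of Cocoon.
class SaxPipelineSketch {

    /** Middle stage: renames <para> elements to <p>, forwards all other events unchanged. */
    static class RenameFilter extends XMLFilterImpl {
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if ("para".equals(local)) {
                super.startElement(uri, "p", "p", atts);
            } else {
                super.startElement(uri, local, qName, atts);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if ("para".equals(local)) {
                super.endElement(uri, "p", "p");
            } else {
                super.endElement(uri, local, qName);
            }
        }
    }

    /** Runs parser -> filter -> serializer and returns the serialized result. */
    static String run(String xml) throws Exception {
        // First stage: a namespace-aware SAX parser produces the events.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader parser = parserFactory.newSAXParser().getXMLReader();

        // Middle stage: the filter pulls events from the parser.
        RenameFilter filter = new RenameFilter();
        filter.setParent(parser);

        // Last stage: an identity TransformerHandler turns events back into text.
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = tf.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        filter.setContentHandler(serializer);
        filter.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }
}
```

Calling SaxPipelineSketch.run("<doc><para>hi</para></doc>") pushes the
document through all three stages without ever building a tree in memory.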

Design Decisions
----------------

- SAX

Cocoon uses the SAX approach to working with XML data wherever possible.
Relative to the other common approach of constructing a DOM representation,
SAX has the following advantages:

  * Lowered memory consumption
  * More optimizable code model
  * Reduced garbage collection
  * Support for incremental operation

For convenience Cocoon provides mechanisms to map between SAX and DOM
representations.
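
As a rough illustration of mapping between the two representations, the
sketch below uses the standard JAXP identity transformer rather than
Cocoon's own helpers, which are not shown in this draft.  The class and
method names (SaxDomBridgeSketch, saxToDom, countElements) are assumptions
for the example only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of Cocoon.
class SaxDomBridgeSketch {

    /** SAX -> DOM: collect the events of a SAX parse into a DOM document. */
    static Document saxToDom(String xml) throws Exception {
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();

        // An identity TransformerHandler is a ContentHandler that builds its Result.
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler builder = tf.newTransformerHandler();
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        builder.setResult(new DOMResult(doc));

        reader.setContentHandler(builder);
        reader.parse(new InputSource(new StringReader(xml)));
        return doc;
    }

    /** DOM -> SAX: replay a DOM tree as SAX events; here the events are only counted. */
    static int countElements(Document doc) throws Exception {
        final int[] count = {0};
        DefaultHandler counter = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                count[0]++;
            }
        };
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new SAXResult(counter));
        return count[0];
    }
}
```

The DOM side costs memory proportional to the document, which is why such
mapping is a convenience for individual components and not the default
mode of operation.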

- XML Namespaces

The use of XML Namespaces allows a single SAX stream to include both content
and style data.
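
A small sketch of how a component could tell the two kinds of data apart in
one stream.  The namespace URIs urn:example:content and urn:example:style
are invented for the example, as is the class name; the draft does not
prescribe any particular URIs.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of Cocoon.
class NamespaceSplitSketch {

    /** Counts the elements of a single SAX stream, grouped by namespace URI. */
    static Map<String, Integer> countByNamespace(String xml) throws Exception {
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true); // required to receive namespace URIs

        Map<String, Integer> counts = new TreeMap<>();
        parserFactory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local, String qName,
                                             Attributes atts) {
                        // The namespace URI, not the prefix, identifies the concern.
                        counts.merge(uri, 1, Integer::sum);
                    }
                });
        return counts;
    }
}
```

A transformer or serializer built along these lines could act on the
elements of one namespace and pass the others through untouched.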

- Avalon

Cocoon leverages the Apache Avalon framework and components.

- Environments

Cocoon uses an abstract environment, as opposed to assuming a fixed
environment such as a servlet engine.  This abstraction allows Cocoon to
function in multiple environments without requiring systemic changes.
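
To make the idea concrete, here is a minimal sketch of what such an
abstraction might look like.  The Environment interface below and its two
methods are hypothetical and are not Cocoon's actual interface; the point is
only that the publishing code never refers to a servlet API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of Cocoon.
class EnvironmentSketch {

    /** Hypothetical abstraction: everything the publishing code needs from its host. */
    interface Environment {
        String requestedUri();
        OutputStream output() throws IOException;
    }

    /** An in-memory host, e.g. for a command-line run; an HTTP host would wrap a request. */
    static class MemoryEnvironment implements Environment {
        private final String uri;
        final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        MemoryEnvironment(String uri) {
            this.uri = uri;
        }

        @Override
        public String requestedUri() {
            return uri;
        }

        @Override
        public OutputStream output() {
            return buffer;
        }
    }

    /** Publishing code depends only on the interface, so it runs in any environment. */
    static void publish(Environment env) throws IOException {
        String body = "<p>resource for " + env.requestedUri() + "</p>";
        env.output().write(body.getBytes(StandardCharsets.UTF_8));
    }
}
```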

- Component Types

Cocoon defines five types of components.  The components, and their roles,
are summarized in the following table:

Component    |  Role
-------------+-------------------------------------
Generator    |  feed events into a SAX stream
Transformer  |  accept events from, and emit them to, a SAX stream
Serializer   |  accept events from a SAX stream and emit a resource in
             |    a given presentation format
Controller   |  choose which generator, transformer(s), and serializer
             |    are required to publish a given pipeline
Manager      |  choose which pipeline is required to respond to a
             |    given URI

Every pipeline begins with a generator, continues with zero or more
transformers, and ends with a serializer.

The publishing concerns addressed by each component are summarized in the
following table:

                       Publishing Concerns
Component    |  Content   Style  Logic   Management
-------------+-------------------------------------
Generator    |     A
Transformer  |     A        A
Serializer   |              A
Controller   |                     A
Manager      |                               A

A - all resources
I - internal resources
E - external resources

Generators are responsible for feeding the initial content into the SAX
stream. Content can also be added to an existing SAX stream using
transformers.

Style data is added to the SAX stream by transformers.  To avoid merging
concerns, style and content data should exist within separate namespaces.
Serializers are responsible for consuming the SAX stream and interpreting
the content and style data into a final presentation format.

The controller is responsible for generating a resource using a pipeline.
It chooses a generator, zero or more transformers, and a serializer, and
connects them together using a SAX stream.  There are no restrictions on
when these choices must be made.
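
The controller's job can be sketched as follows.  The Generator and
Transformer interfaces, the TextSerializer class, and the publish method are
all hypothetical names chosen for this example and are not Cocoon's real
interfaces; only the org.xml.sax types are real.  The controller links the
stages back to front and then lets the generator push its events through
them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example class; not part of Cocoon.
class ControllerSketch {

    /** Hypothetical generator role: feeds events into a SAX stream. */
    interface Generator {
        void generate(ContentHandler out) throws SAXException;
    }

    /** Hypothetical transformer role: returns a handler that forwards events to 'next'. */
    interface Transformer {
        ContentHandler connect(ContentHandler next);
    }

    /** Serializer stand-in: renders element and character events as plain tags. */
    static class TextSerializer extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            out.append('<').append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            out.append("</").append(qName).append('>');
        }
    }

    /** The controller: connects generator, transformers, and serializer, then runs them. */
    static String publish(Generator generator, List<Transformer> transformers,
                          TextSerializer serializer) throws SAXException {
        ContentHandler head = serializer;
        // Link back to front so each transformer forwards to the stage after it.
        for (int i = transformers.size() - 1; i >= 0; i--) {
            head = transformers.get(i).connect(head);
        }
        generator.generate(head);
        return serializer.out.toString();
    }

    /** Example wiring: one generator, one upper-casing transformer, one serializer. */
    static String demo() throws SAXException {
        Generator hello = out -> {
            char[] text = "hello".toCharArray();
            out.startDocument();
            out.startElement("", "greeting", "greeting", new AttributesImpl());
            out.characters(text, 0, text.length);
            out.endElement("", "greeting", "greeting");
            out.endDocument();
        };
        Transformer upperCase = next -> {
            XMLFilterImpl filter = new XMLFilterImpl() {
                @Override
                public void characters(char[] ch, int start, int length) throws SAXException {
                    char[] up = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
                    super.characters(up, 0, up.length);
                }
            };
            filter.setContentHandler(next);
            return filter;
        };
        return publish(hello, Arrays.asList(upperCase), new TextSerializer());
    }
}
```

Because the list of transformers is an ordinary argument, nothing in this
sketch fixes when the choice of stages is made; it could be read from
configuration or decided per request.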

For external resources, the manager is responsible for associating a URI
with a particular controller.

For internal resources, the manager is responsible for constructing and
validating the context required to generate the resource, as well as for
choosing a controller to actually generate the resource.  For these
resources the separation between the manager and the controller is
permeable.
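
For the external-resource case, the manager can be pictured as an ordered
table of URI patterns.  The sketch below is an assumption-laden
illustration: the class name, the use of regular expressions, the
first-match-wins rule, and the controller names are all invented here, and
the internal-resource case (building and validating context) is not
covered.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical example class; not part of Cocoon.
class ManagerSketch {

    // Insertion order is preserved so that the first registered match wins.
    private final Map<Pattern, String> routes = new LinkedHashMap<>();

    /** Associates a URI pattern (a regular expression) with a controller name. */
    void register(String uriRegex, String controllerName) {
        routes.put(Pattern.compile(uriRegex), controllerName);
    }

    /** Returns the controller responsible for the URI, or empty if none matches. */
    Optional<String> resolve(String uri) {
        for (Map.Entry<Pattern, String> route : routes.entrySet()) {
            if (route.getKey().matcher(uri).matches()) {
                return Optional.of(route.getValue());
            }
        }
        return Optional.empty();
    }
}
```

For example, after register("/docs/.*\\.pdf", "pdf-controller"), a request
for /docs/intro.pdf resolves to "pdf-controller", while a URI that matches
no pattern resolves to an empty result.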

Architecture
------------

Cocoon consists of three nested layers.  Note that these layers do not
correspond to directories in the Cocoon source code repository.

- Framework

The Framework layer defines the basic interfaces and classes required to
implement pipelines and to interface with environments.  Significant aspects
of the framework include:

* Logging, configuration, threading, context, etc. (from Avalon)
* Interfaces and base classes for environments
* Runtime code generation, compilation, loading and execution
* Interfaces and base classes for generation
* Interfaces and base classes for transformation
* Interfaces and base classes for serialization
* Interfaces and base classes for control
* Interfaces and base classes for management
* SAX event caching

- Utilities

The utilities layer includes convenience and utility classes that are shared
among the components.

- Components

The Components layer contains implementations of the framework interfaces
and extensions to the framework classes that are suited to different
purposes. 


Implementations
---------------

- Environments

FileSaving   - allows Cocoon to ???
LinkSampling - allows Cocoon to ???
HTTP         - allows Cocoon to interact with HTTP requests
Wrapper      - allows Cocoon to ???

- Generators

Directory         - Streams a directory listing in XML format
File              - Streams XML read in from a source file
FragmentExtractor - ???
HTML              - ???
ImageDirectory    - Enhancement to "Directory" that adds image information
JSP               - Streams a Java Server Page
PHP               - Streams the result of a PHP request
Request           - Streams the contents of the request
Script            - Streams the result of a BSF script
ServerPages       - Streams an eXtensible Server Page
Status            - Streams the current status of the Cocoon environment
Stream            - Streams the InputStream associated with the request
Velocity          - Streams the result of a Velocity request

- Transformers

CInclude          - ???
Filter            - ???
FragmentExtractor - ???
I18n              - Transforms I18n markup into text based on locale
LDAP              - Transforms LDAP markup into the results of an LDAP query
Log               - Prints the SAX events that make up the stream
SQL               - Transforms SQL markup into the results of a SQL query
Trax              - Allows the use of a TRAX-compliant transformer
XInclude          - Includes new content in the SAX stream
XT                - Allows the use of the XT transformer

- Serializers

FOP  - uses the FOP toolkit to serialize XSL-FO to various output types
HTML - ???
Link - ???
SVG  - uses the Batik toolkit to serialize SVG to various image formats
Text - ???
XML  - ???

- Controllers

???

- Managers

???


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org


Re: Incomplete draft of a high-level description of Cocoon

Posted by Stefano Mazzocchi <st...@apache.org>.
Jason Foster wrote:

<skipped/>

Sounds awesome. Thanks much, I'll try to fill it up a little...

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


