Posted to dev@cocoon.apache.org by Berin Loritsch <bl...@apache.org> on 2001/07/06 18:10:05 UTC

[RT] Alternatives to sitemap (was [Fwd: Re: turbine vs. struts])

I have been thinking about the whole sitemap approach in Cocoon 2.
It is the part with the steepest learning curve, and it is a
single point of failure.  The following email on the Turbine mail
list prompted me to voice this opinion now instead of later.  After
the email, I want to propose an alternative solution.

-------- Original Message --------
Subject: Re: turbine vs. struts
Date: Thu, 05 Jul 2001 18:30:49 -0700
From: Jon Stevens <jo...@latchkey.com>
Reply-To: turbine-user@jakarta.apache.org
To: Turbine-user <tu...@jakarta.apache.org>

on 7/5/01 6:26 PM, "John McNally" <jm...@collab.net> wrote:

> So maybe I am wrong and there
> aren't that many people who prefer "mapping spec" approach.
> 
> john mcnally

Personally, I don't see the point and I think it is a bad design. It is a
single point of failure...

If you screw up your mapping .xml file on your live site, your entire site
is potentially broken. If you have 100 developers working on a site, then
making each developer edit a single file in order to define things like
Actions and URI mapping is a terrible idea.

I think that part of Struts is a terrible design idea.

-jon

--------- End of Message ---------


What is right with the Sitemap
------------------------------

Stefano envisioned a way to manage the URL space orthogonally to the filesystem.
Before Cocoon, people simply expected the URL space to match the filesystem.  Cocoon
assembles a generated result from several pieces, and therefore this simple approach really
won't work.  The Sitemap enforces the contract of URLs by allowing a filesystem to
be reorganized independently of the URL space.  This is a good thing.

What is wrong with the Sitemap
------------------------------

I have already voiced the opinion that the sitemap mixes too many concerns (component
type declarations, etc.).  There is also a problem with it being a single point of
failure.  Just look at all the messages on Cocoon Users with the "The sitemap handler's
sitemap is not available" errors.  If the sitemap does not compile correctly, the whole
site is dead.  Also, little security is enforced, and the sitemap is able to extend the
file mapping outside the Servlet's Context.  These are bad things.  It should _never_
be Cocoon's responsibility to mimic all the things that Apache HTTPD can do.  Allowing
any kind of access outside of the Servlet spec approved areas (the context directory
and the repository) is a violation of security constraints and portability requirements
in the servlet spec.

The sitemap is the most complex piece in the entire Cocoon system, and as a result, it
is difficult for new users to comprehend it.  I have had three developers try to use
Cocoon, and they look at the sitemap and freeze.  They spend too much time trying to
understand the Sitemap, and not enough time trying to solve problems.  In a development
environment, this is not acceptable.  It is very frustrating because any time I tell them
"I have set up the Sitemap for you, ignore it for now", I find that they are still obsessing
over it.  The Sitemap as it stands is *too* powerful, and my developers are tempted to
try to use it to solve their problems.

Lastly, in practice, there are few actual pipelines (generator/transformer/serializer)
for each site.  In fact, I have one pipeline for _all_ my html code in my webapps.  The
things that differ are the Actions used in conjunction with it, or the type of generator
I use.  Another side effect of the Sitemap is the existence of Readers.  It is my belief
that anything simply read from a filesystem should be handled by HTTP daemons like Apache,
TUX, or whatever you use.  They are better optimized for it, and it reduces the load on
the JVM.  We still need Readers for resources that cannot be reached via the filesystem
(e.g. the DatabaseReader).

What should we do?
------------------

We should pursue Stefano's FlowMap idea, as well as use more formalized definitions of
a pipeline.  For the sake of our discussion, a pipeline will be considered a generator,
a list of transformers, and a serializer.  We will ignore resources, views, and readers
for the time being.  In practice, there are fewer pipelines than URLs, much like there
are fewer stylesheets than XML sources.  We need to define what they are, and how to
map URLs to the pipeline.  I already hear the chorus of people saying, "Isn't that the
sitemap?".  Hear me out: there is a much simpler way of declaring these things.  It
also leverages some approaches that Avalon's Component Manager allows but that aren't
used in Cocoon.  Check out the following syntax:

<pipelines default="file2html">
  <pipeline id="file2html">
    <generator type="file" source="${source}.xml"/>
    <transformer type="xslt" source="document2site.xsl"/>
    <transformer type="xslt" source="site-${theme}.xsl"/>
    <serializer type="html"/>
  </pipeline>
  <pipeline id="xsp2html" extends="file2html">
    <generator type="serverpages" source="${source}.xml"/>
  </pipeline>
  <!-- ... continued ... -->
</pipelines>

What is so special about this?  Aside from now having a list of pipelines that we can
use for flow maps and url maps, we have the ${variable} construct.  So far this is not
revolutionary.  What is new is the introduction of a FilteredContext that extends Avalon's
Context object.  This FilteredContext will have the following methods:

interface FilteredContext extends Context {
    /**
     * Add a filter to the Context object
     */
    void addFilter(PipelineFilter filter);

    /**
     * Sets the Object Model that the filters can use.
     */
    void setObjectModel(Map objectModel);
}
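
To make the interface a little more concrete, here is a rough sketch of one way
it might be implemented.  Everything beyond the two methods above is an assumption
made for illustration: the PipelineFilter methods (defines/resolve), the
ObjectModelFilter class, the rule that get() expands ${name} placeholders, and the
small Context stand-in that replaces Avalon's real Context so the sketch compiles
on its own.  Generics are used for readability only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in for Avalon's Context so the sketch compiles without the Avalon jar.
interface Context {
    Object get(Object key);
}

// Hypothetical shape: the proposal names PipelineFilter but never defines it.
interface PipelineFilter {
    String defines();                                 // variable name, e.g. "source"
    String resolve(Map<String, Object> objectModel);  // value, or null if unavailable
}

interface FilteredContext extends Context {
    void addFilter(PipelineFilter filter);
    void setObjectModel(Map<String, Object> objectModel);
}

// Hypothetical filter that copies one object model entry into a variable.
class ObjectModelFilter implements PipelineFilter {
    private final String variable;
    private final String key;

    ObjectModelFilter(String variable, String key) {
        this.variable = variable;
        this.key = key;
    }

    @Override public String defines() { return variable; }

    @Override public String resolve(Map<String, Object> objectModel) {
        Object value = objectModel.get(key);
        return value == null ? null : value.toString();
    }
}

class SimpleFilteredContext implements FilteredContext {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");
    private final List<PipelineFilter> filters = new ArrayList<>();
    private Map<String, Object> objectModel = new HashMap<>();

    @Override public void addFilter(PipelineFilter filter) { filters.add(filter); }

    @Override public void setObjectModel(Map<String, Object> objectModel) {
        this.objectModel = objectModel;
    }

    // Expands each ${name} in the key; names no filter can resolve stay as written.
    @Override public Object get(Object key) {
        Matcher m = VARIABLE.matcher(String.valueOf(key));
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = lookup(m.group(1));
            m.appendReplacement(out,
                Matcher.quoteReplacement(value != null ? value : m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // The first filter that defines the variable wins.
    private String lookup(String name) {
        for (PipelineFilter filter : filters) {
            if (filter.defines().equals(name)) {
                return filter.resolve(objectModel);
            }
        }
        return null;
    }
}
```

Leaving unresolved placeholders untouched is just one possible choice; failing
loudly on an unknown variable may be the safer behaviour on a live site.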

The Pipelines will implement the Recontextualizable interface, and for each request, they are
fed a Context object that corresponds to a Flow map or URL map.  When the pipeline is
executed, the "source" parameter of the SitemapComponents is populated from the FilteredContext.
The code would look like this:

generator.setup(resolver, objectModel, context.get("${source}.xml"), parameters);

The FilteredContext uses the internal filters to translate the source parameter into the
actual filename.  Filters would be defined in this manner:

<filters default="url-match">
  <filter id="url-match" defines="source"/>
  <filter id="parameter" defines="theme"/>
</filters>

<url-map>
  <mount prefix="process/" flowmap="context://process/flowmap.xmap"/>
  <alias suffix=".html" pipeline="file2html">
    <apply-filter name="url-match">
      <parameter name="doc-root" value="context://docs"/>
    </apply-filter>
    <apply-filter name="parameter">
      <parameter name="source" value="session"/>
      <parameter name="type" value="attribute"/>
      <parameter name="name" value="theme"/>
    </apply-filter>
  </alias>
</url-map>

<flow-map protected="true" permit-roles="admin,user,manager">
  <resource-pipeline suffix=".html" pipeline="xsp2html">
    <apply-filter name="url-match">
      <parameter name="doc-root" value="context://process"/>
    </apply-filter>
    <apply-filter name="parameter">
      <parameter name="source" value="constant"/>
      <parameter name="name" value="theme"/>
      <parameter name="value" value="default"/>
    </apply-filter>
  </resource-pipeline>

  <resource id="header" handler="process-header"/>
  <resource id="line-item" handler="process-lineitem"/>
  <resource id="confirmation" handler="process-confirm"/>
  <resource id="no-permission" handler="forward"/>

  <flow start="header" access-denied="no-permission">
    <entry resource="header" next="line-item"/>
    <entry resource="line-item">
      <choice parameter="destination" default="end">
        <option value="home" next="header"/>
        <option value="next" next="line-item"/>
        <option value="end" next="confirmation"/>
      </choice>
    </entry>
    <entry resource="confirmation" exit="../index.html"/>
    <entry resource="no-permission" exit="../index.html"/>
  </flow>
</flow-map>

Now, let's talk about how this all works together.  We have a default URL-MAP that
handles URI mapping, and takes care of mounting the flow maps.  The filters and
pipelines are simply resources that are used in the map files--they can and should
be contained in separate files.  The pipeline is chosen by simple suffix matching,
and pipelines can extend other pipelines.  The important thing to notice is that
the Filters take care of the magic of pulling information from the objectModel, and
populating the variables in the source parameters.  This is easily comprehended, and
very powerful.  This means that with the proper planning, you can get away with very
few pipelines.
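
As an illustration of "pipelines can extend other pipelines", here is a minimal
sketch of how an extending pipeline might be resolved.  The class names
(PipelineDef, PipelineRegistry, PipelineDemo) and the "type:source" strings are
made up for the example, and the inheritance rule is my reading of the
file2html/xsp2html declarations: a child replaces only the parts it declares and
inherits the rest from its parent.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One <pipeline> declaration. A null or empty part means "not declared here".
class PipelineDef {
    final String id;
    final String parentId;            // value of the extends attribute, or null
    final String generator;           // e.g. "serverpages:${source}.xml", or null
    final List<String> transformers;  // empty list means "inherit from parent"
    final String serializer;          // e.g. "html", or null

    PipelineDef(String id, String parentId, String generator,
                List<String> transformers, String serializer) {
        this.id = id;
        this.parentId = parentId;
        this.generator = generator;
        this.transformers = transformers;
        this.serializer = serializer;
    }
}

class PipelineRegistry {
    private final Map<String, PipelineDef> defs = new HashMap<>();

    void register(PipelineDef def) { defs.put(def.id, def); }

    // Fills in whatever the named pipeline leaves out from its ancestors.
    // No cycle detection: a pipeline that extends itself would recurse forever.
    PipelineDef resolve(String id) {
        PipelineDef def = defs.get(id);
        if (def == null) {
            throw new IllegalArgumentException("Unknown pipeline: " + id);
        }
        if (def.parentId == null) {
            return def;
        }
        PipelineDef parent = resolve(def.parentId);
        return new PipelineDef(
            id,
            null,
            def.generator != null ? def.generator : parent.generator,
            !def.transformers.isEmpty() ? def.transformers : parent.transformers,
            def.serializer != null ? def.serializer : parent.serializer);
    }
}

class PipelineDemo {
    public static void main(String[] args) {
        PipelineRegistry registry = new PipelineRegistry();
        registry.register(new PipelineDef("file2html", null, "file:${source}.xml",
            Arrays.asList("xslt:document2site.xsl", "xslt:site-${theme}.xsl"), "html"));
        registry.register(new PipelineDef("xsp2html", "file2html",
            "serverpages:${source}.xml", Collections.<String>emptyList(), null));

        PipelineDef resolved = registry.resolve("xsp2html");
        System.out.println(resolved.generator + " -> " + resolved.transformers
            + " -> " + resolved.serializer);
    }
}
```

With this rule, xsp2html only has to state what differs from file2html, which is
how a site can get by with very few pipeline declarations.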

The URL Map first checks to see if the URL matches the mounted flowmap.  If not, it
falls through to the alias for ".html" URLs.  Notice the name "alias", as it properly
reflects what is going on here.  Also note that there is a default pipeline.  In the
absence of any URL-Maps or Flow-Maps, the request will follow that pipeline's
instructions.  What about the variables?  That is something that requires some thought.
We could declare the filters in the pipelines, as they would now work automatically.
We also could provide reasonable defaults.
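
The lookup order just described could be sketched roughly as follows.  The UrlMap
class, its method names, and the "flowmap:"/"pipeline:" result strings are
invented for the illustration; the only behaviour taken from the proposal is the
order: mounted flow maps by prefix first, then aliases by suffix, then the
default pipeline.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UrlMap {
    // LinkedHashMap keeps declaration order, so the first matching entry wins.
    private final Map<String, String> mounts = new LinkedHashMap<>();   // prefix -> flowmap
    private final Map<String, String> aliases = new LinkedHashMap<>();  // suffix -> pipeline
    private final String defaultPipeline;

    UrlMap(String defaultPipeline) {
        this.defaultPipeline = defaultPipeline;
    }

    UrlMap mount(String prefix, String flowmap) {
        mounts.put(prefix, flowmap);
        return this;
    }

    UrlMap alias(String suffix, String pipeline) {
        aliases.put(suffix, pipeline);
        return this;
    }

    // Returns "flowmap:<location>" or "pipeline:<id>" for the given URI.
    String resolve(String uri) {
        for (Map.Entry<String, String> mount : mounts.entrySet()) {
            if (uri.startsWith(mount.getKey())) {
                return "flowmap:" + mount.getValue();
            }
        }
        for (Map.Entry<String, String> alias : aliases.entrySet()) {
            if (uri.endsWith(alias.getKey())) {
                return "pipeline:" + alias.getValue();
            }
        }
        // Nothing matched: fall back to the pipeline named by default="...".
        return "pipeline:" + defaultPipeline;
    }
}
```

Because the mounts are tried first, a request for "process/header.html" goes to
the flow map even though it also ends in ".html".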

The Flow Map is a bit different.  It configures the pipeline and filters for all the
resources--this is a development speed savings.  After all, all the resources in a
form are going to remain the same.  You will also notice the attributes of the Flowmap.
Many forms are only allowed to people with the proper roles.  That is why the "protected"
attribute and the "permit-roles" are present.  When a flow map is protected, we check
the Request "isUserInRole" method to find out if a user can access the resource.  After
the resources are defined, we see the <flow/> entry.  The "start" attribute determines
where normal flow starts, and the "access-denied" attribute determines the resource
used when a user is not in the proper role.  Lastly, we have the entries that
determine where flow moves.  There are three ways of determining the next action in a
flow:

* The "next" attribute
* The "exit" attribute
* The "choice" element

The "next" and "exit" attributes function similarly, as they specify a static destination.
They differ in that the "next" attribute specifies a resource and the "exit" attribute
specifies a URL.

The "choice" element allows you to specify a Request parameter to inspect for a selection
of destinations.  You must provide a default value so that the flow is never broken.  The
default is chosen if the parameter specified does not exist or does not contain any of the
specified values.
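
The three ways of moving through a flow might be evaluated along these lines.
This is only a sketch: FlowEntry, Flow, advance() and the "resource:"/"url:"
result strings are hypothetical names, and I am assuming that the default of a
"choice" names one of its listed values (as default="end" does above) rather
than a resource.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// One <entry> in a <flow>: it uses exactly one of next, exit, or choice.
class FlowEntry {
    final String nextResource;     // "next" attribute, or null
    final String exitUrl;          // "exit" attribute, or null
    final String choiceParameter;  // request parameter inspected by "choice", or null
    final String choiceDefault;    // value assumed when the parameter is missing or unknown
    final Map<String, String> choices = new LinkedHashMap<>();  // value -> next resource

    private FlowEntry(String nextResource, String exitUrl,
                      String choiceParameter, String choiceDefault) {
        this.nextResource = nextResource;
        this.exitUrl = exitUrl;
        this.choiceParameter = choiceParameter;
        this.choiceDefault = choiceDefault;
    }

    static FlowEntry next(String resource) { return new FlowEntry(resource, null, null, null); }

    static FlowEntry exit(String url) { return new FlowEntry(null, url, null, null); }

    static FlowEntry choice(String parameter, String defaultValue) {
        return new FlowEntry(null, null, parameter, defaultValue);
    }

    FlowEntry when(String value, String resource) {
        choices.put(value, resource);
        return this;
    }
}

class Flow {
    private final Map<String, FlowEntry> entries = new HashMap<>();

    void add(String resource, FlowEntry entry) { entries.put(resource, entry); }

    // Returns "resource:<id>" to stay in the flow or "url:<location>" to leave it.
    String advance(String current, Map<String, String> requestParameters) {
        FlowEntry entry = entries.get(current);
        if (entry == null) {
            throw new IllegalArgumentException("No flow entry for: " + current);
        }
        if (entry.exitUrl != null) {
            return "url:" + entry.exitUrl;
        }
        if (entry.nextResource != null) {
            return "resource:" + entry.nextResource;
        }
        // Choice: use the request parameter, falling back to the default value.
        String target = entry.choices.get(requestParameters.get(entry.choiceParameter));
        if (target == null) {
            target = entry.choices.get(entry.choiceDefault);
        }
        if (target == null) {
            throw new IllegalStateException("Choice default is not a listed value: " + current);
        }
        return "resource:" + target;
    }
}
```

Rejecting a choice whose default does not resolve is what keeps the flow from
ever being broken; ideally that check would happen when the flow map is loaded
rather than per request.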

KEEP IN MIND
------------

This whole solution can be put together without compiling the resources.  In fact, I would
much rather it be done that way.  By depending on a dynamically compiled resource we increase
the risk that our site will fail.