You are viewing a plain text version of this content. The canonical link for it is here.
Posted to svn@forrest.apache.org by cr...@apache.org on 2005/03/31 09:05:34 UTC

svn commit: r159564 - in forrest/trunk/docs-author: content/xdocs/howto/howto-custom-html-source.xml content/xdocs/howto/howto-understand-pipelines.xml content/xdocs/howto/index.xml content/xdocs/site.xml status.xml

Author: crossley
Date: Wed Mar 30 23:05:33 2005
New Revision: 159564

URL: http://svn.apache.org/viewcvs?view=rev&rev=159564
Log:
Merge the content of howto-understand-pipelines.xml back into howto-custom-html-source.xml
Issue: FOR-446

Removed:
    forrest/trunk/docs-author/content/xdocs/howto/howto-understand-pipelines.xml
Modified:
    forrest/trunk/docs-author/content/xdocs/howto/howto-custom-html-source.xml
    forrest/trunk/docs-author/content/xdocs/howto/index.xml
    forrest/trunk/docs-author/content/xdocs/site.xml
    forrest/trunk/docs-author/status.xml

Modified: forrest/trunk/docs-author/content/xdocs/howto/howto-custom-html-source.xml
URL: http://svn.apache.org/viewcvs/forrest/trunk/docs-author/content/xdocs/howto/howto-custom-html-source.xml?view=diff&r1=159563&r2=159564
==============================================================================
--- forrest/trunk/docs-author/content/xdocs/howto/howto-custom-html-source.xml (original)
+++ forrest/trunk/docs-author/content/xdocs/howto/howto-custom-html-source.xml Wed Mar 30 23:05:33 2005
@@ -35,7 +35,7 @@
 
   <purpose title="Purpose">
     <p>
-      Integrating legacy HTML-pages is a common task when migrating
+      Integrating legacy HTML pages is a common task when migrating
       existing websites to Forrest. This document explains how to implement
       custom processing which is required when Forrest's standard pipeline
       for html does not suffice.
@@ -46,10 +46,6 @@
     <p>To follow these instructions you will need:</p>
     <ol>
       <li>
-        Read the <a href="site:howto/understand-pipelines">How to
-        understand html processing and sitemap pipelines</a>.
-      </li>
-      <li>
         Know how to use a
         <a href="site:project-sitemap">project sitemap</a>.
       </li>
@@ -71,16 +67,500 @@
         If you plan on creating your own custom processing for
         html-pages, you'll also need write access to your
         project directory.
-        <note>
-          We strongly recommend creating a backup of your
-          Forrest project directory before you proceed with your own adjustments!
-        </note>
       </li>
     </ol>
   </prerequisites>
 
-  <steps title="Customizing the html-Pipeline">
-    <section id="when">
+  <steps title="Understanding the HTML-Pipeline">
+    <p>
+      The first part of this howto explains the html pipeline, so as to
+      provide the background to enable you to add additional processing
+      for legacy html documents. If you already know how pipelines work,
+      then skip to the section about
+      <a href="#custom">Customizing the html pipeline</a>.
+    </p>
+
+    <section id="example">
+      <title>Driven by Example</title>
+      <p>
+        The best way to learn about Forrest pipelines is follow
+        the processing of an imaginary request through the forrest
+        machinery.
+      </p>
+      <p>
+        So let's see what happens, when a client asks Forrest to
+        serve the document
+        <br />
+        'http://some.domain.org/mytest/mybad.html'.
+      </p>
+    </section>
+
+    <section id="sitemap">
+      <title>Finding the Sitemap</title>
+
+      <p>
+        Like all applications based on Apache Cocoon, each request for
+        a given document is processed by searching a sitemap for a
+        matching processing pipeline. With Forrest, this core sitemap
+        is found in the file
+        'main\webapp\sitemap.xmap' in Forrest's program directory.
+      </p>
+
+      <p class="instruction">
+        Open the file 'main\webapp\sitemap.xmap' in Forrest's
+        program directory with a text editor of your choice.
+      </p>
+      <note>
+        Any simple text editor will suffice, since the XML in
+        this file is quite simple in structure and easy to read.
+      </note>
+
+      <p>
+        To help you to easily follow the next steps, we have added
+        comments and anchors to 'sitemap.xmap',
+        so that you can quickly jump to all relevant sections
+        and read them more easily.
+      </p>
+
+      <p class="instruction">
+        Follow this link to the
+        <a href="sitemap.xmap.html#Start+of+Sitemap">
+          start of the Sitemap.
+        </a>
+      </p>
+
+      <p>
+        As the comment explains, this sitemap is the starting point
+        for all requests. So even if there are other sitemaps
+        (which we will see later on), we always start looking for a
+        matching pattern right here.
+      </p>
+    </section>
+
+    <section id="pipelines">
+      <title>Find the Beginning of the Pipelines Section</title>
+
+      <p>
+        Modular as everything else in Cocoon, Forrest's sitemap
+        starts with a long list of declarations for all the
+        components used later on. We can safely ignore these at
+        the moment.
+      </p>
+      <p class="instruction">
+        So let's skip right to the start of the
+        Pipelines-Section. Search for &lt;map:pipelines&gt; or
+        follow this link to the
+        <a href="sitemap.xmap.html#Start+of+Pipelines">
+          beginning of the pipelines-element
+        </a>
+      </p>
+
+      <p>
+        Within the pipelines-element you will find a long list
+        of pipeline-Elements (no trailing 's'), each one of them defining a
+        processing pipeline within Forrest.
+      </p>
+
+      <p>
+        When handling a request, Forrest will check the
+        Pipelines from top to bottom until it encounters a
+        Pipeline that will take care of our request.
+      </p>
+    </section>
+
+    <section id="matches">
+      <title>Looking for a Match</title>
+
+      <p>
+        Like all Cocoon applications, Forrest knows which
+        pipeline to use for processing a certain request by
+        looking at the entry criteria for each pipeline it comes
+        across. These can be a match against a given pattern,
+        the test if a certain files exists or one of many other
+        possible tests that Cocoon supports.
+      </p>
+
+      <p class="instruction">
+        To better know what we are talking about, let's follow
+        Forrest down the list to the
+        <a href="sitemap.xmap.html#Test+for+First+Pipeline">
+          Test for the First Pipeline
+        </a>.
+      </p>
+
+      <p>
+        Here you can see that very specialized matches need to occur
+        early in the sitemap. The
+        requested file (and pathname) is compared to a pattern
+        '*.xlex' that would match if our request ended with
+        '.xlex' and had no pathname. Since it doesn't, we don't
+        have a match and need to keep looking.
+      </p>
+
+      <p class="instruction">
+        Skip forward until we find the
+        <a href="sitemap.xmap.html#First+Match+for+%22**%2F*.html%22">
+          First Match for "**/*.html"
+        </a>
+        (&lt;map:match pattern="**/*.html"&gt;).
+      </p>
+    </section>
+
+    <section id="html-pipeline">
+      <title>Processing in the '**/*.html' Pipeline</title>
+      <p>  
+        Let's take a quick look at this pipeline to understand
+        what's happening here:
+      </p>
+<source><![CDATA[
+<map:match pattern="**/*.html">
+    <map:aggregate element="site">
+      <map:part src="cocoon:/skinconf.xml"/>
+      <map:part src="cocoon:/build-info"/>
+      <map:part src="cocoon:/{1}/tab-{2}.html"/>
+      <map:part src="cocoon:/{1}/menu-{2}.html"/>
+      <map:part src="cocoon:/{1}/body-{2}.html"/>
+    </map:aggregate>
+    <map:call resource="skinit">
+      <map:parameter name="type" value="site2xhtml"/>
+      <map:parameter name="path" value="{0}"/>
+    </map:call>
+</map:match>]]></source>
+      <p>
+        In the first part of this pipeline, the
+        aggregate-element assembles information required to build
+        a Forrest page with menu and tabs from different sources.
+        Then the call to the skinit-resource picks up the
+        aggregated info and generates a page in the
+        style of the current skin. That's easy, isn't it?
+      </p>
+      <p>
+        Well, the complex part begins, when we take a closer look at
+        the sources of the aggregation.
+      </p>
+      <source><![CDATA[<map:part src="cocoon:/{1}/body-{2}.html"/>]]></source>
+      <p>
+        This mysterious element is most easily explained as a
+        secondary request to the Forrest system.
+      </p>
+      <p>
+        The 'cocoon:'-part is called a pseudo-protocol and tells the
+        processor to ask Forrest for the resource named behind
+        the colon, process that request and feed the output as input
+        back into our pipeline.
+        (The 'pseudo' goes back to the fact that unlike
+        'http' or 'ftp', which are real protocols, you can use cocoon:
+        only within the cocoon environments as only they will know what to
+        do with it.)
+      </p>
+      <p>
+        So even though we have already seen the end of our pipeline
+        (the skinning), we still don't know, what goes into the skinning and
+        where it comes from. To find out, we have to look at the sources
+        of the aggregation.
+      </p>
+    </section>
+
+    <section id="protocols">
+      <title>Following the Pseudo-Protocols</title>
+      <p>
+        To find out what goes into our aggregation, we'll need to look
+        at the pipeline that is called by
+      </p>
+      <source><![CDATA[<map:part src="cocoon:/{1}/body-{2}.html"/>]]></source>
+      <p>
+        To do that, it's always a good idea to write down what this
+        call actually looks like when all the variables are replaced by real
+        values.
+        A safe way to do that is to look at the matcher to start with,
+        build a list of the numbered variables and their meaning in the
+        current context and then assemble the actual expression(s) from it.
+      </p>
+      
+      <p>In our example the matcher pattern
+       <code>**/*.html</code> is applied to the request-name
+       <code>mytest/mybad.html</code>, so we have three variables
+       altogether:
+      </p>
+      <table>
+        <tr>
+          <td>%0</td>
+          <td>mytest/mybad.html</td>
+          <td>the whole pathname</td>
+        </tr>
+        <tr>
+          <td>%1</td>
+          <td>mytest/</td>
+          <td>the first match</td>
+        </tr>
+        <tr>
+          <td>%2</td>
+          <td>mybad</td>
+          <td>the second match</td>
+        </tr>
+      </table>
+      <p>If we insert that into </p>
+      <source><![CDATA[<map:part src="cocoon:/{1}/body-{2}.html"/>]]></source>
+      <p>we get</p>
+      <source><![CDATA[<map:part src="cocoon:/mytest/body-mybad.html"/>]]></source>
+      <p>
+        As you can easily tell, we are suddenly calling for a whole
+        new document. Let's see where that takes us:
+      </p>
+    </section>
+    <section id="call">
+      <title>Second Call for Content</title>
+      <p>
+        Processing of cocoon-calls is not much different from
+        normal requests by a client. When you launch a call like
+      </p>
+      <source><![CDATA[<map:part src="cocoon:/mytest/body-mybad.html"/>]]></source>  
+      <p>
+        Forrest will once again start searching its main sitemap
+        from the beginning and look for a pipeline to match that call.
+      </p>
+
+      <p class="instruction">
+         Search for '**body-*.html' from the beginning of the
+         sitemap or jump to the
+         <a href="sitemap.xmap.html#First+Match+for+%27**body-*.html%27">First Match for '**body-*.html'</a>
+         to see where we find our next match.
+      </p>
+    </section>
+    <section id="match-1">
+      <title>First Match for '**body-*.html'</title>
+       <p>
+         Our first match is different to the previous ones because
+         there is a second condition placed inside the matcher.
+         Doing the replacements
+       </p>
+<source><![CDATA[
+<map:select type="exists">
+   <map:when test="{project:content.xdocs}mytests/mybad.ehtml">]]></source>
+       <p>
+         we quickly discover that there can't be a file of
+         that name in the project-directory.
+         <br />
+         (The variable '{project:content.xdocs}' is always replaced with
+         the name of your project directory that you can change
+         in the 'forrest.properties'-file.)
+       </p>
+       <p>
+         So we have a pipeline, but it doesn't do anything.
+         In this case Forrest will simply keep looking for
+         the next match further down.
+       </p>
+    </section>
+    <section id="match-2">
+      <title>Second Match for '**body-*.html'</title>
+      <p class="instruction">
+         Continue searching downwards for '**body-*.html' in the
+         sitemap-file or jump directly to the
+         <a href="sitemap.xmap.html#Second+Match+for+%27**body-*.html%27">Second Match for '**body-*.html'</a>.
+      </p>
+      <p>
+        Looking at the pipeline that handles the request, we see that
+        the cocoon-protocol is once again invoked
+      </p>
+      <source><![CDATA[<map:generate src="cocoon:/{1}{2}.xml"/>]]></source>  
+      <p>
+        this time as a direct generator of input for our pipeline.
+      </p>
+      <p>
+        So once again we ask Forrest to process a request for content.
+        To know what matcher to look for, let's first expand the variables:
+      </p>
+      <p>
+        In our example the matcher pattern
+        <code>**body-*.html</code> is applied to the request-name
+        <code>mytest/body-mybad.html</code>.
+        Which means that we have three variables altogether:
+      </p>
+      <table>
+        <tr>
+          <td>%0</td>
+          <td>mytest/body-mybad.html</td>
+          <td>the whole pathname</td>
+        </tr>
+        <tr>
+          <td>%1</td>
+          <td>mytests/</td>
+          <td>the first match</td>
+        </tr>
+        <tr>
+          <td>%2</td>
+          <td>mybad</td>
+          <td>the second match</td>
+        </tr>
+      </table>
+      <p>
+        If we insert that into
+      </p>
+      <source><![CDATA[<map:generate src="cocoon:/{1}{2}.xml"/>]]></source>
+      <p>
+        we get
+      </p>
+      <source><![CDATA[<map:generate src="cocoon:/mytests/mybad.xml"/>]]></source>
+    </section>
+    <section id="match-3">
+      <title>Third Call for Content</title>
+      <p class="instruction">
+        So let's scan the main sitemap looking for a match for
+        '/mytests/mybad.xml'.
+      </p>
+      <p>
+        We find it in the pattern
+        <a href="sitemap.xmap.html#First+Match+for+%27**.xml%27">'**.xml'></a>,
+        which is the standard handling for all xml-requests.
+      </p>
+      <p>
+        Since our request fulfils none of the secondary criteria in
+        this pipeline, it falls right through to the map:mount-element
+        at the end:
+        </p>
+      <source><![CDATA[<map:mount uri-prefix="" src="forrest.xmap" check-reload="yes" />]]></source>
+      <p>
+        which makes Forrest load and process a secondary sitemap,
+        the file 'forrest.xmap' in the same directory.
+      </p>
+      <p class="instruction">
+        Open the file 'forrest.xmap' and continue the search for a
+        matching pattern.
+      </p>
+      <p>
+        Our search leads us to the 
+        <a href="forrest.xmap.html#Second+Match+for+%27**.xml%27">
+          Second Match for '**.xml'
+        </a>,
+        a pipeline designed to handle internationalisation if that
+        feature is configured. Since it is not, all it does is
+        call the file-resolver-resource with the full pathname of
+        our file but no extension.
+      </p>
+<source><![CDATA[
+<map:call resource="file-resolver">
+  <map:parameter name="uri" value="mytests/mybad"/>
+</map:call>]]></source>
+    </section>
+    <section id="file">
+      <title>Introducing the File-Resolver</title>
+      <p class="instruction">
+        To find out more about the working of the file-resolver,
+        search for the definition of the
+        <a href="forrest.xmap.html#Definition+of+File-Resolver-Resource">
+          &lt;map:resource name="file-resolver"&gt;
+        </a>
+        higher up in the file.
+      </p>
+      <p>
+        Here you will find a pipeline that tests for the existence of
+        the file with different extensions. '.html' is second in this
+        list and leads to the processing steps shown below:
+      </p>
+    </section>
+
+    <section id="process-html">
+      <title>html-Default Processing</title>
+      <p>
+        The default processing of html-files consists of four
+        processing steps:
+      </p>
+      <ol>
+        <li>
+          <code>&lt;map:generate src="{project:content.xdocs}{uri}.html" type="html"/&gt;</code><br/>
+            Using the html-generator, Forrest reads the html-document
+            from file and uses JTidy to clean up and convert it to xml
+            (which is required for all processing in cocoon pipelines).
+            At the output of this transformer we will have a valid
+            XHTML-document althought it might still contain some unwanted
+            elements. We'll deal with those later (see
+            <a href="site:howto/custom-html-source">When to customize</a>).
+        </li>
+        <li>
+            <code>&lt;map:transform src="{forrest:stylesheets}/html2document.xsl"/&gt;</code><br/>
+          Using the standard stylesheet 'html2document.xsl', this XHTML is 
+          transformed into Forrest standard document format.
+        </li>
+        <li>
+          <code>&lt;map:transform type="idgen"/&gt;</code><br/>
+          This step generates IDs required for navigation within the page.
+        </li>
+        <li>
+          <code>&lt;map:serialize type="xml-document"/&gt;</code><br/>
+          Finally the document is serialized as XML and returned to the
+          calling pipeline.
+        </li>
+      </ol>
+      <p>
+        As a result, we now hand back the content of the html-document
+        in Forrest standard document format to the calling pipeline
+      </p>
+      <note>
+        To look at the output of this pipeline you can simply
+        point you browser to 'http://localhost:8888/mytest/mybad.xml'
+        (assuming that you are currently running Forrest on your
+        machine and there is an html-page of that name).
+      </note>
+    </section>
+    <section id="body">
+      <title>Returning to the '**body-*.html'-Pipeline</title>
+      <p>
+        On returning into the
+        <a href="sitemap.xmap.html#Returning+to+the+%27**body-*.html%27+Pipeline">'**body-*.html' pipeline</a>,
+        procesing continues with the next components in this pipeline:
+      </p>
+      <ul>
+        <li>
+          <strong>idgen</strong> will generate unique IDs for all elements
+          that need to be referenced within a page (mainly headlines).
+        </li>
+        <li>
+          <strong>xinclude</strong> would process any xinclude statements
+          in the source.
+          Since HTML does not support this mechanism, nothing happens
+          in our example.
+        </li>
+        <li>
+          <strong>linkrewriter</strong> adjusts links between pages
+          so that they will still work in the final Forrest output
+          directory structure. It also resolves any special Forrest links.
+        </li>
+        <li>
+          The final transformation, as the name suggests, will take
+          care of reporting broken site-links.
+        </li>
+        <li>
+          The call to 'skinit' prepares the page body for presentation
+          within the selected skin.
+        </li>
+      </ul>
+      <note>
+        To look at the output of this pipeline you can simply
+        point you browser to 'http://localhost:8888/mytest/body-mybad.html'
+        (assuming that you are currently running Forrest on your machine
+        and there is an html-page of that name).
+      </note>
+    </section>
+
+    <section id="aggregate">
+      <title>Returning to the '**/*.html'-Pipeline</title>
+      <p>
+        At the end of this pipeline, processing returns the results into
+        the aggregation section of the
+        <a href="#html-pipeline">'**/*.html' Pipeline</a>,
+        merges it with other data, skins and serializes for presentation
+        in the requesting client.
+      </p>
+    </section>
+
+    <section id="custom">
+      <title>Customizing the html pipeline</title>
+      <p>
+        In this last part of this document, we will show how to customize the
+        HTML-pipeline to add your additional steps to the default processing.
+      </p>
+
+      <section id="when">
         <title>When to customize?</title>
         
         <p>
@@ -206,12 +686,6 @@
           (defined in project.stylesheets-dir of forrest.properties),
           which is central storage for all stylesheets in a project.
         </p>
-        <fixme author="open">
-          I'm unclear about the translation from the property
-          project.stylesheets-dir to the variable {forrest:stylesheets}.
-          How and where does it take place and why are the names different?
-          That seems very confusing!          
-        </fixme>
         <p class="instruction">
           Add the new transformation as a new line, straight after the
           generator, and save the changes. 
@@ -240,5 +714,6 @@
         </note>
 
       </section>
+    </section>
   </steps>
 </howto>

Modified: forrest/trunk/docs-author/content/xdocs/howto/index.xml
URL: http://svn.apache.org/viewcvs/forrest/trunk/docs-author/content/xdocs/howto/index.xml?view=diff&r1=159563&r2=159564
==============================================================================
--- forrest/trunk/docs-author/content/xdocs/howto/index.xml (original)
+++ forrest/trunk/docs-author/content/xdocs/howto/index.xml Wed Mar 30 23:05:33 2005
@@ -42,10 +42,6 @@
     <section>
       <title>Using Forrest</title>
       <ul>
-        <li><link href="site:understand-pipelines">How to understand html processing and sitemap pipelines</link>
-        </li>
-        <li><link href="site:custom-html-source">How to customize processing of html source</link>
-        </li>
         <li><link href="site:howto/pdf-tab">How to create a PDF document for each tab</link>
           - Describes the generation of a PDF document for each
           group of documents that is defined by a tab.
@@ -62,6 +58,8 @@
         <li><link href="site:howto/maven">How to run Forrest from within Maven</link>
         - For Maven users who want to generate their project's documentation and/or website
           using Forrest in lieu of Maven's site plugin.
+        </li>
+        <li><link href="site:custom-html-source">How to customize processing of html source</link>
         </li>
       </ul>
     </section>

Modified: forrest/trunk/docs-author/content/xdocs/site.xml
URL: http://svn.apache.org/viewcvs/forrest/trunk/docs-author/content/xdocs/site.xml?view=diff&r1=159563&r2=159564
==============================================================================
--- forrest/trunk/docs-author/content/xdocs/site.xml (original)
+++ forrest/trunk/docs-author/content/xdocs/site.xml Wed Mar 30 23:05:33 2005
@@ -106,14 +106,13 @@
   <howto label="How-To" href="howto/" tab="howto">
     <overview label="Overview" href="index.html"/>
     <write-howto label="Write a How-to" href="howto-howto.html"/>
-    <understand-pipelines label="Understand pipelines" href="howto-understand-pipelines.html"/>
-    <custom-html-source label="Custom html source" href="howto-custom-html-source.html"/>
     <asf-mirror label="Download mirror" href="howto-asf-mirror.html"/>
     <pdf-tab label="Create tab PDF" href="howto-pdf-tab.html"/>
     <editcss label="Edit CSS (WYSIWYG)" href="howto-editcss.html"/>
     <corner-css label="CSS corner SVG" href="howto-corner-images.html"/>
     <maven label="Maven Integration" href="howto-forrest-from-maven.html"/>
     <buildPlugin label="Build a Plugin" href="howto-buildPlugin.html"/>
+    <custom-html-source label="Custom html source" href="howto-custom-html-source.html"/>
   </howto>
 
   <!-- Uncomment this if we want aggregate HTML/PDFs for this site 

Modified: forrest/trunk/docs-author/status.xml
URL: http://svn.apache.org/viewcvs/forrest/trunk/docs-author/status.xml?view=diff&r1=159563&r2=159564
==============================================================================
--- forrest/trunk/docs-author/status.xml (original)
+++ forrest/trunk/docs-author/status.xml Wed Mar 30 23:05:33 2005
@@ -62,9 +62,8 @@
       </action>
       <action dev="DC" type="add" context="docs"
         fixes-bug="FOR-446" due-to="Ferdinand Soethe">
-      Added <link href="site:howto/understand-pipelines">How to understand
-      html processing and sitemap pipelines</link>
-      and <link href="site:howto/custom-html-source">How to customize
+      Added 
+      <link href="site:howto/custom-html-source">How to customize
       processing of html source</link>.
       </action>
       <action dev="DC" type="add" context="core">