Posted to dev@forrest.apache.org by Unico Hommes <Un...@hippo.nl> on 2003/10/17 12:50:03 UTC

LocationMapModule

Hi all,

I've been able to spend enough time working on this thing to have a first iteration ready and running :-D I've attached the code here for you to review and comment on so we can decide what to do with it next.


                                          -- o --


A locationmap defines a mapping from requests to location strings.

It was conceived to:

a) Provide a more powerful means for semantic linking.
b) Provide Forrest with a standard configuration override mechanism.

The syntax of a locationmap resembles that of the sitemap in that it also makes use of Matchers and Selectors to traverse a tree of nodes towards a leaf. In the case of the locationmap however the leaf does not identify a pipeline but instead identifies a location string.

                                          -- o --

An example:
-----------

<locationmap xmlns="http://apache.org/cocoon/locationmap/1.0">

<!--
  + Components section: define Matchers and Selectors here.
  + The current implementation only supports a subset of 
  + the Avalon lifecycles. Only those that are actually
  + implemented by the currently available Selectors and Matchers.
  + Moreover, it expects them to be ThreadSafe (which all of them are IIRC)
  + -->
  <components>
    <matchers default="parameter">
      <matcher 
        name="parameter" 
        src="org.apache.cocoon.matching.WildcardParameterMatcher">
        <!-- 
          + #lm:name is a special parameter that is passed into 
          + the Matcher by the LocationMap implementation.
          + It identifies the name string the module was called
          + with. For example when calling the module as follows:
          + {lm:/my/virtual/path} the parameter's value will
          + be /my/virtual/path .
          + -->
        <parameter>#lm:name</parameter>
      </matcher>
      <matcher 
        name="request"
        src="org.apache.cocoon.matching.WildcardURIMatcher"
      />
    </matchers>
    <selectors default="exists">
      <selector 
        name="exists" 
        src="org.apache.cocoon.selecting.ResourceExistsSelector" 
      />
    </selectors>
  </components>
  
<!--
  + Locator section. 
  + A locator groups matchers and selectors and provides a base location
  + locations are resolved against.
  + The locator below exemplifies how a locator could be used
  + to enable clients to override resource locations.
  + -->
  <locator base="."> <!-- resolves to the current sitemap context -->
    <!-- 
      + locate skin stylesheets by searching fs locations
      + file://unico/styles/{1}.xsl and file://hippo/styles/{1}.xsl
      + respectively. If neither exists, default to ./styles/{1}.xsl
      + -->
    <match pattern="**.html" type="request">
      <select type="exists">
        <location src="file://unico/styles/{1}.xsl" />
        <location src="file://hippo/styles/{1}.xsl" />
      </select>
      <location src="styles/{1}.xsl" />
    </match>
  </locator>
  
  <!-- 
    + This locator exemplifies how to use a locationmap for
    + link resolving. Note the use of the parameter
    + type matcher. This one has been defined above to match
    + the special '#lm:name' parameter that holds a reference
    + to the argument the input module was called with.
    + -->
  <locator base=".">
    <match pattern="xdocs/**.xml" type="parameter">
      <match pattern="xdocs/news/*/*/*/today.xml">
        <location src="news/archive/{3}{2}{1}.html" />
      </match>
      <location src="{1}.html" />
    </match>
  </locator>
</locationmap>


                                          -- o --


In the first example a sitemap would be able to do something like:

sitemap.xmap:
<map:match pattern="**.html">
  <map:generate src="xdocs/{1}.xml" />
  <map:transform src="{lm:}" type="xslt" />
  <map:serialize type="html" />
</map:match>

Note the usage of the locationmap module: {lm:}. The input module argument is empty because it is not used in the locationmap anyway (the matcher used is of type request).
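
To make the lookup concrete, here is a sketch of the values that pipeline would effectively use for one request. It assumes a request for index.html (an illustrative name) and assumes that neither override stylesheet exists, so the fallback location wins. How the result is resolved against the locator base is left out.

```xml
<!-- Request: index.html. The "request" matcher matches **.html, so {1} = "index". -->
<!-- Locations tried by the "exists" selector, in order:
       file://unico/styles/index.xsl   (assumed missing)
       file://hippo/styles/index.xsl   (assumed missing)
     Fallback location: styles/index.xsl -->
<map:match pattern="**.html">
  <map:generate src="xdocs/index.xml" />
  <!-- what {lm:} is expected to resolve to under these assumptions -->
  <map:transform src="styles/index.xsl" type="xslt" />
  <map:serialize type="html" />
</map:match>
```

If file://unico/styles/index.xsl did exist, the same pipeline would pick it up with no change to the sitemap.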

In the second usage scenario, the idea is that xdocs in a repository are only aware of their relative locations. Which URL they are mapped to on a website is not their concern, so links between documents within an XML repository need to be translated to their locations on the website. This is very similar to semantic linking, where editors define a virtual link and that link gets mapped to an actual location by a module that knows where the page is currently located (such as one reading a Forrest site.xml file). The difference is that the source links are not virtual semantic identifiers at all but physical system identifiers, only relative to a different system from the target one. The LocationMap accommodates a complete mapping from any source location to any target location.

In this scenario the module is used by a LinkRewriterTransformer:

sitemap.xmap:
<map:match pattern="**.html">
  <map:generate src="repo/{1}.xml" />
  <map:transform src="content2page.xsl" type="xslt" />
  <map:transform type="linkrewriter" />
  <map:serialize type="html" />
</map:match>

repo/{1}.xml:
<document>
  <title>How to use locationmaps</title>
  <related href="repo:xdocs/news/10/10/2002/today.xml" />
  <related href="repo:xdocs/news/02/02/2001/today.xml" />
  <body>blah</body>
</document>

{1}.html:
<html>
  <title>How to use locationmaps</title>
  <body>
   <p>blah</p>
   See also:
   <p><a href="news/archive/20021010.html">related news</a></p>
   <p><a href="news/archive/20010202.html">related news</a></p>
  </body>
</html>
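
The rewriting above can be traced through the second locator of the locationmap. The sketch below annotates it for the first related link. It assumes that the link rewriter strips the repo: prefix and passes the remainder to the module, and that, as in the sitemap, {n} refers to the wildcards of the innermost enclosing match. The faq.xml name in the last comment is purely illustrative.

```xml
<!-- Module argument (assumed): xdocs/news/10/10/2002/today.xml -->
<locator base=".">
  <match pattern="xdocs/**.xml" type="parameter">
    <!-- inner match: {1} = "10", {2} = "10", {3} = "2002" -->
    <match pattern="xdocs/news/*/*/*/today.xml">
      <!-- {3}{2}{1} = "20021010", giving news/archive/20021010.html -->
      <location src="news/archive/{3}{2}{1}.html" />
    </match>
    <!-- reached only when the inner match fails; here {1} is the outer
         ** wildcard, so xdocs/faq.xml would give faq.html -->
    <location src="{1}.html" />
  </match>
</locator>
```

Both sample dates have identical first and second path segments (10/10 and 02/02), so the output alone does not show which of {1} and {2} ends up first; the {3}{2}{1} order in the location string decides that.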

                                          -- o --

Issues and Notes.
-----------------

The 'anchored' syntax of the special variables #lm:name and #lm:base exists because the treeprocessor code for variable resolution, which I reused, provides a way to declare global variables that way.

The reuse of Matchers and Selectors makes the locationmap very powerful. However, the Selector concept in particular is not an exact fit in the context of a locationmap. This is because the test string that is passed to the selector in a locationmap has a more specific meaning than the test strings that are passed to it in the sitemap. In the locationmap's case it is the resolved location src attribute, whereas in a sitemap it can be anything because it is explicitly passed in as <map:when test="{something}"/>. Therefore some selectors (e.g. a HostSelector) do not make sense in the context of a locationmap.
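
The contrast can be sketched as follows. In the sitemap fragment the test string is written out explicitly, while in the locationmap fragment the selector is handed the resolved src of each location. The sitemap selector name resource-exists and the stylesheet file names are assumptions made for illustration.

```xml
<!-- Sitemap: the test string is passed explicitly and could be anything -->
<map:select type="resource-exists">
  <map:when test="styles/custom.xsl">
    <map:transform src="styles/custom.xsl" type="xslt" />
  </map:when>
  <map:otherwise>
    <map:transform src="styles/default.xsl" type="xslt" />
  </map:otherwise>
</map:select>

<!-- Locationmap: the resolved src doubles as the test string -->
<select type="exists">
  <location src="styles/custom.xsl" />
</select>
<location src="styles/default.xsl" />
```

A selector that tests something unrelated to a location, such as a host name, has nothing useful to do with that src value.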

Nothing in the code indicates that <location src="something" /> is interpreted as a location string per se. It can in fact be anything. Its only semantic constraint is that it is a string literal.

Things to do:
-------------

- provide sample for linkrewriter block
- provide sample for modules block
- handle SwitchSelector optimisations
- handle Disposable lifecycle?
- handle dynamic configuration?
- reconsider 'locationmap' naming (see above)
- ...

Comments are welcome :)

-- Unico

Re: LocationMapModule

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Unico Hommes wrote:

> Hi all,
> 
> I've been able to spend enough time working on this thing to have a
> first iteration ready and running :-D I've attached the code here for
> you to review and comment on so we can decide what to do with it next.

Excellent! :-D

>                                           -- o --
> 
> 
> A locationmap defines a mapping from requests to location strings.
> 
> It was conceived to:
> 
> a) Provide a more powerful means for semantic linking.
> b) Provide Forrest with a standard configuration override mechanism.
   c) decouple the conceptual source space used by Cocoon from
      the concrete source space, so that a change in the concrete sources
      does not impact on the sitemap

> The syntax of a locationmap resembles that of the sitemap in that it
> also makes use of Matchers and Selectors to traverse a tree of nodes
> towards a leaf. In the case of the locationmap however the leaf does not
> identify a pipeline but instead identifies a location string.

Ok.

> An example:
> -----------
> 
> <locationmap xmlns="http://apache.org/cocoon/locationmap/1.0">
   <locationmap xmlns="http://apache.org/forrest/locationmap/1.0">
   ;-)

> <!--
>   + Components section: define Matchers and Selectors here.
>   + The current implementation only supports a subset of 
>   + the Avalon lifecycles. Only those that are actually
>   + implemented by the currently available Selectors and Matchers.
>   + Moreover, it expects them to be ThreadSafe (which all of them are IIRC)
>   + -->
>   <components>
>     <matchers default="parameter">
>       <matcher 
>         name="parameter" 
>         src="org.apache.cocoon.matching.WildcardParameterMatcher">
>         <!-- 
>           + #lm:name is a special parameter that is passed into 
>           + the Matcher by the LocationMap implementation.
>           + It identifies the name string the module was called
>           + with. For example when calling the module as follows:
>           + {lm:/my/virtual/path} the parameter's value will
>           + be /my/virtual/path .
>           + -->

Can't we simply pass what is after the inputmodule colon?

>         <parameter>#lm:name</parameter>
>       </matcher>
>       <matcher 
>         name="request"
>         src="org.apache.cocoon.matching.WildcardURIMatcher"
>       />
>     </matchers>
>     <selectors default="exists">
>       <selector 
>         name="exists" 
>         src="org.apache.cocoon.selecting.ResourceExistsSelector" 
>       />
>     </selectors>
>   </components>

Ok, so you have already added the <components> section from the start; 
this is good in any case, as it means we are not tied to the components 
being declared in the sitemap.

> <!--
>   + Locator section. 
>   + A locator groups matchers and selectors and provides a base location
>   + locations are resolved against.
>   + The locator below exemplifies how a locator could be used
>   + to enable clients to override resource locations.
>   + -->
>   <locator base="."> <!-- resolves to the current sitemap context -->
>     <!-- 
>       + locate skin stylesheets by searching fs locations
>       + file://unico/styles/{1}.xsl and file://hippo/styles/{1}.xsl
>       + respectively. If neither exists, default to ./styles/{1}.xsl
>       + -->
>     <match pattern="**.html" type="request">
>       <select type="exists">
>         <location src="file://unico/styles/{1}.xsl" />
>         <location src="file://hippo/styles/{1}.xsl" />
>       </select>
>       <location src="styles/{1}.xsl" />
>     </match>
>   </locator>

Ok.

>   <!-- 
>     + This locator exemplifies how to use a locationmap for
>     + link resolving. Note the use of the parameter
>     + type matcher. This one has been defined above to match
>     + the special '#lm:name' parameter that holds a reference
>     + to the argument the input module was called with.
>     + -->
>   <locator base=".">
>     <match pattern="xdocs/**.xml" type="parameter">
>       <match pattern="xdocs/news/*/*/*/today.xml">
>         <location src="news/archive/{3}{2}{1}.html" />
>       </match>
>       <location src="{1}.html" />
>     </match>
>   </locator>
> </locationmap>

Cool, matchers are nestable.

Hmmm, but what does the parameter matcher in this case do? I'm not keen 
on having matchers be passed a special parameter.

Can't we simply concatenate the two strings?

For example, if I have:

    path/to/page.html

and I called:

    {lm:gogogo}

then I can give the locator:

   "lm:gogogo|path/to/page.html"

In this way I can use standard matchers that can decide to match on the 
part before |, on the part after, or both.

>                                           -- o --
> 
> 
> In the first example a sitemap would be able to do something like:
> 
> sitemap.xmap:
> <map:match pattern="**.html">
>   <map:generate src="xdocs/{1}.xml" />
>   <map:transform src="{lm:}" type="xslt" />
>   <map:serialize type="html" />
> </map:match>
> 
> Note the usage of the locationmap module: {lm:}. The input module 
> argument is empty because it is not used in the locationmap anyway
> (the matcher used is of type request).

If you do the above concatenation, I would do:

    <map:transform src="{lm:style}" type="xslt" />

Which tells the locationmap to get me the url of the stylesheet for the 
current url.

I could then easily do:

    <map:generate src="{lm:source}" />

and have a nice differentiation.
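
A rough sketch of what the locationmap side of this could look like, assuming the module argument and the request URI are joined with | as proposed above and matched by a single standard wildcard matcher. None of this is in the posted code; the patterns and the styles/ and xdocs/ locations are only illustrative.

```xml
<locator base=".">
  <!-- {lm:style} on request path/to/page.html gives "lm:style|path/to/page.html" -->
  <match pattern="lm:style|**.html">
    <!-- {1} = "path/to/page" -->
    <location src="styles/{1}.xsl" />
  </match>
  <!-- {lm:source} on the same request gives "lm:source|path/to/page.html" -->
  <match pattern="lm:source|**.html">
    <location src="xdocs/{1}.xml" />
  </match>
</locator>
```

With something like this, neither the special #lm:name parameter nor a separate parameter matcher would be needed.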

> In the second usage scenario the idea is that xdocs in a repository 
> are only aware of their relative locations. What URL they are mapped
> to in a website is not their concern. Therefore there needs to be a 
> translation between links between documents within an xml repository
> and their locations on a website. This is very similar to semantic
> linking where editors define a virtual link and that link gets mapped
> to an actual location by a module that knows where the page is
> currently located (such as one reading a forrest site.xml file). The
> difference is that the source links are actually not virtual semantic
> identifiers at all but physical system identifiers, only relative to
> a separate system than the target one. The LocationMap accommodates a
> complete mapping from any source location to any target location.
>
> In this scenario the module is used by a LinkRewriterTransformer:
> 
> sitemap.xmap:
> <map:match pattern="**.html">
>   <map:generate src="repo/{1}.xml" />
>   <map:transform src="content2page.xsl" type="xslt" />
>   <map:transform type="linkrewriter" />
>   <map:serialize type="html" />
> </map:match>
> 
> repo/{1}.xml:
> <document>
>   <title>How to use locationmaps</title>
>   <related href="repo:xdocs/news/10/10/2002/today.xml" />
>   <related href="repo:xdocs/news/02/02/2001/today.xml" />
>   <body>blah</body>
> </document>
> 
> {1}.html:
> <html>
>   <title>How to use locationmaps</title>
>   <body>
>    <p>blah</p>
>    See also:
>    <p><a href="news/archive/20021010.html">related news</a></p>
>    <p><a href="news/archive/20010202.html">related news</a></p>
>   </body>
> </html>

Cool, so you are using the locationmap to drive link rewriting. 
Interesting :-)

>                                           -- o --
> 
> Issues and Notes.
> -----------------
> 
> The reason for the 'anchored' syntax of special variables #lm:name
> and #lm:base is due to the fact that the treeprocessor code for
> variable resolution which I reused provides a way to declare global
> variables that way.

If we can do without them altogether...

> The reuse of Matcher and Selectors makes the locationmap very
> powerful. However, especially the Selector concept is not an exact fit
> when used in the context of a locationmap. This is because the test
> string that is passed to the selector in a locationmap has a more
> specific meaning than the test strings that are passed to it in the
> sitemap. In the locationmap's case it is the resolved location src
> attribute, whereas in a sitemap it can be anything because it is
> explicitly passed in as <map:when test="{something}"/>. Therefore some
> selectors do not make sense in the context of a locationmap (e.g. a
> HostSelector).

Hmmm... in any case, if it will be needed, we can still pass it in by 
adding a test attribute to each location.

        <select type="xxx">
          <location test="blah" src="file://unico/styles/{1}.xsl" />
        ...

> Nothing in the code indicates that <location src="something" /> is 
> interpreted as a location string per se. It can in fact be anything.
> Its only semantic constraint is that it is a string literal.

Yup.

> Things to do:
> -------------

 > - reconsider 'locationmap' naming (see above)

In what sense? "locationmap" seems fine to me in any case.

> - provide sample for linkrewriter block
> - provide sample for modules block
> - handle SwitchSelector optimisations
> - handle Disposable lifecycle?
> - handle dynamic configuration?

Can I rewrite this list?

My POV is that of Forrest, so since the above items are not related to 
Forrest, they come second.

Furthermore, this is a novel concept, and should and will be refined by 
a concrete usage.

Since I started banging my head against forrest.properties and such 
yesterday, I'd like to start testing this on Forrest and see how it 
works as things go on.

There is a problem though: AFAIK you are not a committer, and I cannot 
really accept such a donation... (see next mail)

> Comments are welcome :)

Great job! :-)

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------