Posted to dev@forrest.apache.org by Nicola Ken Barozzi <ni...@apache.org> on 2002/11/15 09:12:13 UTC

Problems with unified source directory

I've converted the Avalon Phoenix site to Forrest, and I'm trying a 
unified dir structure, with all xdocs in src/documentation/content and 
images in src/documentation/content/images.

I changed the two properties accordingly, and while the xdocs render 
fine in both CLI and, the images are crawled by the CLI as html files 
(header.gif.html) and not rendered.

This is caused by the rule I committed to let Forrest serve any content 
as-is when it is present verbatim:

    <map:match pattern="**">

     <!-- if the file is present verbatim as content,
          give that to the user... -->
     <map:act type="resource-exists">
      <map:parameter name="url" value="content/{1}"/>
      <map:read src="content/{../1}"/>
     </map:act>


IIUC the problem is that the MIME type is not specified, and Cocoon 
doesn't seem to default to the right one.

What should we do, add rules to this containing mimetypes for common 
formats?

Example:

    <map:match pattern="**">

     <map:match pattern="**.gif">
       <map:act type="resource-exists">
        <map:parameter name="url" value="content/{1}.gif" />
        <map:read src="content/{../1}.gif" mime-type="image/gif"/>
       </map:act>
     </map:match>

     <map:match pattern="**.jpg">
       <map:act type="resource-exists">
        <map:parameter name="url" value="content/{1}.jpg"/>
        <map:read src="content/{../1}.jpg" mime-type="image/jpeg"/>
       </map:act>
     </map:match>

     ...

     <!-- if the file is present verbatim as content,
          give that to the user... -->
     <map:act type="resource-exists">
      <map:parameter name="url" value="content/{1}"/>
      <map:read src="content/{../1}"/>
     </map:act>

    ...

Separating directories does help with this, but I don't want technical 
difficulties that can be overcome to steer design decisions.

Thoughts?

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


Re: Problems with unified source directory

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Steven Noels wrote:
> Jeff Turner wrote:
> 
>> As you can see I'm making stuff up as I go.. a more coherent version of
>> this was intended to be the topic of a "Proposed Forrest Roadmap" email.
>> The only thing worth salvaging from this is that the current system _can_
>> be evolved to support use-cases like mixed content, without turning the
>> sitemap into a try-handle-anything mess.
> 
> 
> While I'm not sure whether your approach is the one to go for, I'm +100 
> that we should try to avoid the try-catch resource-exists-action 
> approach which, while supporting users in mixing clean and 
> well-structured xdocs with static html (et al.) files, also encourages 
> them to stick with these bad habits. It's a tummy-feeling decision, but 
> resource-exists seems like a clear breach of SoC to me, especially if we 
> need to set up a matcher for each and every mimetype.

Hmmm, break of SOC?
I'm a doc writer, I write in xdocs and sometimes have to include an html 
file as-is that I've gotten somewhere else for a reference.

I assume, like any other normal webserver user, that I can put them in 
the same dir.

Where's the messy part?

> A very stupid solution could be to isolate this messy part of the URI 
> space to a clearly isolated part of that space, i.e. 
> {protocol}:/{host}/{mount}/static/putyourfoobarwhereitbelongs, 

I want this to be a possibility, but not necessarily mandated; as long 
as both solutions are there, it will be ok.

As for the default, once we have a proposal we should probably launch a 
poll for our users and an RFC to cocoon-dev to see what they think about it.

My feeling is that discussion on the "URI space - source space" is 
really important and could impact on the future design of Cocoon.

> and have 
> some SetMimeTypeAction which sets mimetype according to file extension 
> (or even better: /etc/mime.types for OSes that care about mimetypes).

Big +1 on this, IMHO it's really needed.

> Just my € 0.02,
> 
> </Steven>



Re: Problems with unified source directory

Posted by Steven Noels <st...@outerthought.org>.
Jeff Turner wrote:

> As you can see I'm making stuff up as I go.. a more coherent version of
> this was intended to be the topic of a "Proposed Forrest Roadmap" email.
> The only thing worth salvaging from this is that the current system _can_
> be evolved to support use-cases like mixed content, without turning the
> sitemap into a try-handle-anything mess.

While I'm not sure whether your approach is the one to go for, I'm +100 
that we should try to avoid the try-catch resource-exists-action 
approach which, while supporting users in mixing clean and 
well-structured xdocs with static html (et al.) files, also encourages 
them to stick with these bad habits. It's a tummy-feeling decision, but 
resource-exists seems like a clear breach of SoC to me, especially if we 
need to set up a matcher for each and every mimetype.

A very stupid solution could be to isolate this messy part of the URI 
space to a clearly isolated part of that space, i.e. 
{protocol}:/{host}/{mount}/static/putyourfoobarwhereitbelongs, and have 
some SetMimeTypeAction which sets mimetype according to file extension 
(or even better: /etc/mime.types for OSes that care about mimetypes).
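
For illustration only, the extension-based lookup such an action would
perform can be sketched in Python rather than as a Cocoon action. The
names build_resolver and resolve are made up for this sketch; passing a
file such as /etc/mime.types is optional, and missing files are skipped.

```python
import mimetypes
import os


def build_resolver(mime_files=()):
    """Return a function that maps a file path to a MIME type.

    mime_files: optional mime.types-style files (e.g. "/etc/mime.types")
    loaded on top of Python's built-in table; missing files are skipped.
    """
    table = mimetypes.MimeTypes()  # starts from the built-in defaults
    for name in mime_files:
        if os.path.isfile(name):
            table.read(name)

    def resolve(path, default="application/octet-stream"):
        # guess_type looks only at the extension, never at the content
        mime_type, _encoding = table.guess_type(path)
        return mime_type or default

    return resolve


resolve = build_resolver()
print(resolve("images/header.gif"))
print(resolve("images/photo.jpg"))
```

With the built-in table a .gif file resolves to image/gif and a .jpg
file to image/jpeg, so no per-extension matcher would be needed.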

Just my € 0.02,

</Steven>
-- 
Steven Noels                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at              http://radio.weblogs.com/0103539/
stevenn at outerthought.org                stevenn at apache.org


Re: Problems with unified source directory

Posted by Jeff Turner <je...@apache.org>.
On Fri, Nov 15, 2002 at 11:14:44AM +0100, Nicola Ken Barozzi wrote:
...
> Just as a reminder (release-release-release first :-), this is the 
> scenario I would like, so you can already comment on it when you post 
> your thoughts:
> 
>  1) we categorize the sources (images, xdocs, etc)
>  2) each category location(s) can be defined with includes and excludes
>  3) the default behaviour is a mixed content dir, for the 
> least-surprise principle, plus a separate layout with all categories 
> split out for maximum content-type separation; the unified dir takes 
> precedence over the separated ones.

Sounds good to me.. as long as both possibilities exist, I don't really
mind which is the default.


--Jeff


Re: Problems with unified source directory

Posted by Nicola Ken Barozzi <ni...@apache.org>.

Jeff Turner wrote:
> On Fri, Nov 15, 2002 at 09:12:13AM +0100, Nicola Ken Barozzi wrote:
> 
>>I've converted the Avalon Phoenix site to Forrest, and I'm trying a 
>>unified dir structure, with all xdocs in src/documentation/content and 
>>images in src/documentation/content/images.
>>
>>I changed the two properties accordingly, and while the xdocs render 
>>fine in both CLI and, the images are crawled by the CLI as html files 
>>(header.gif.html) and not rendered.
> 
> ...
> 
> Approaching the problem from the user's perspective..
> 
> For validation, we have a system of associating an includes and excludes
> pattern with each directory:
> 
> forrest.validate.xdocs.includes=*.x*
> forrest.validate.xdocs.excludes=
> 
> So how about using the same system for limiting what the xdocs-dir
> contains:
> 
> forrest.xdocs-dir.includes = **/*.xml
> forrest.xdocs-dir.excludes = images/**
> 
> 
> So forrest.properties for phoenix would be:
> 
> forrest.xdocs-dir = src/documentation/content
> forrest.xdocs-dir.excludes = images/**
> forrest.images-dir = src/documentation/content/images
> 
> Then we have a parametrized sitemap entry which uses the includes and
> excludes, so that matchers only exist for valid transformations.
> 
> <map:match type="AntMatcher"
>            includes="{layout:xdocs-dir/includes}"
>            excludes="{layout:xdocs-dir/excludes}">
>    <map:generate src="{layout:xdocs-dir}/{0}"/>
>    ...
> </map:match>
> 
> That could be implemented with an XMLModule (hence {layout:..} syntax)
> and a forrest.properties replacement looking like this:
> 
> <forrest-layout dir="src/documentation/content">
> 
>   <images-dir dir="images">
>     <include name="*.*"/>
>     <exclude name=""/>
>   </images-dir>
> 
>   <xdocs-dir dir="">
>     <include name="**/*.xml"/>
>     <exclude name="${images-dir}"/>
>   </xdocs-dir>
> 
> </forrest-layout>
> 
> 
> As you can see I'm making stuff up as I go.. a more coherent version of
> this was intended to be the topic of a "Proposed Forrest Roadmap" email.
> The only thing worth salvaging from this is that the current system _can_
> be evolved to support use-cases like mixed content, without turning the
> sitemap into a try-handle-anything mess.

Ok, very clear.
I'll wait for the more complete proposal mail then. I wasn't really 
aware of all the ramifications of coming up with a quick fix, which 
would do more harm than good.

Just as a reminder (release-release-release first :-), this is the 
scenario I would like, so you can already comment on it when you post 
your thoughts:

  1) we categorize the sources (images, xdocs, etc)
  2) each category location(s) can be defined with includes and excludes
  3) the default behaviour is a mixed content dir, for the 
least-surprise principle, plus a separate layout with all categories 
split out for maximum content-type separation; the unified dir takes 
precedence over the separated ones.
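
One possible reading of point 3, sketched in Python for illustration
only: look in the unified dir first and fall back to the per-category
dirs. The function locate and its parameters are hypothetical names, not
existing Forrest code.

```python
import os


def locate(rel_path, unified_dir, category_dirs):
    """Return the first existing file for rel_path, or None.

    The unified (mixed content) dir is checked before the separated
    per-category dirs, so it wins when both contain the file.
    """
    for base in [unified_dir, *category_dirs]:
        candidate = os.path.join(base, rel_path)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With this ordering a header.gif dropped next to the xdocs would shadow
one of the same name in a separate images dir.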

No need to reply now, get on with the release, we'll discuss this later :-)



Re: Problems with unified source directory

Posted by Jeff Turner <je...@apache.org>.
On Fri, Nov 15, 2002 at 09:12:13AM +0100, Nicola Ken Barozzi wrote:
> I've converted the Avalon Phoenix site to Forrest, and I'm trying a 
> unified dir structure, with all xdocs in src/documentation/content and 
> images in src/documentation/content/images.
> 
> I changed the two properties accordingly, and while the xdocs render 
> fine in both CLI and, the images are crawled by the CLI as html files 
> (header.gif.html) and not rendered.
...

Approaching the problem from the user's perspective..

For validation, we have a system of associating an includes and excludes
pattern with each directory:

forrest.validate.xdocs.includes=*.x*
forrest.validate.xdocs.excludes=

So how about using the same system for limiting what the xdocs-dir
contains:

forrest.xdocs-dir.includes = **/*.xml
forrest.xdocs-dir.excludes = images/**


So forrest.properties for phoenix would be:

forrest.xdocs-dir = src/documentation/content
forrest.xdocs-dir.excludes = images/**
forrest.images-dir = src/documentation/content/images

Then we have a parametrized sitemap entry which uses the includes and
excludes, so that matchers only exist for valid transformations.

<map:match type="AntMatcher"
           includes="{layout:xdocs-dir/includes}"
           excludes="{layout:xdocs-dir/excludes}">
   <map:generate src="{layout:xdocs-dir}/{0}"/>
   ...
</map:match>

That could be implemented with an XMLModule (hence {layout:..} syntax)
and a forrest.properties replacement looking like this:

<forrest-layout dir="src/documentation/content">

  <images-dir dir="images">
    <include name="*.*"/>
    <exclude name=""/>
  </images-dir>

  <xdocs-dir dir="">
    <include name="**/*.xml"/>
    <exclude name="${images-dir}"/>
  </xdocs-dir>

</forrest-layout>
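
A minimal sketch of the include/exclude selection described above,
written in Python purely for illustration (it is not AntMatcher or
XMLModule code). The names ant_match and selected are hypothetical. The
sketch assumes Ant-like semantics, where "**" matches zero or more
directories and "*" stays within one path segment, and it assumes that
an empty includes list means "include everything".

```python
from fnmatch import fnmatchcase


def ant_match(pattern, path):
    """Match a '/'-separated path against an Ant-style pattern."""
    return _match(pattern.split("/"), path.split("/"))


def _match(pats, parts):
    if not pats:
        return not parts  # pattern used up: path must be used up too
    head, rest = pats[0], pats[1:]
    if head == "**":
        # "**" may absorb zero or more whole path segments
        return any(_match(rest, parts[i:]) for i in range(len(parts) + 1))
    # "*" and "?" are handled per segment by fnmatchcase (which also
    # accepts "[...]" sets, something Ant patterns do not have)
    return bool(parts) and fnmatchcase(parts[0], head) and _match(rest, parts[1:])


def selected(path, includes, excludes):
    """True if path matches an include pattern and no exclude pattern."""
    included = not includes or any(ant_match(p, path) for p in includes)
    excluded = any(ant_match(p, path) for p in excludes)
    return included and not excluded


includes = ["**/*.xml"]
excludes = ["images/**"]
for path in ("index.xml", "guide/install.xml", "images/header.gif"):
    print(path, selected(path, includes, excludes))
```

With these patterns the xdocs are selected and anything under images/ is
left for another matcher to handle.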


As you can see I'm making stuff up as I go.. a more coherent version of
this was intended to be the topic of a "Proposed Forrest Roadmap" email.
The only thing worth salvaging from this is that the current system _can_
be evolved to support use-cases like mixed content, without turning the
sitemap into a try-handle-anything mess.


--Jeff