Posted to user@forrest.apache.org by Maurice Lanselle <la...@evc.net> on 2005/06/07 09:33:11 UTC

Re: Sitemap dot xmap sourcetyping

David Crossley said the following on 07/06/2005 03:46:

>Maurice Lanselle wrote:
>  
>
>>Ross Gardler said the following:
>>    
>>
>>>However, you should avoid limiting the URL space of your application 
>>>by requiring a given file type to have a given filename or path. This 
>>>can result in false matches. You should use the SourceTypeResolver; see 
>>>http://svn.apache.org/viewcvs.cgi/forrest/trunk/plugins/org.apache.forrest.plugin.input.simplified-docbook/input.xmap?view=markup 
>>>for an example of how to do this.
>>>      
>>>
>>I totally buy your point about not requiring a document type to have a 
>>given file name or path.  Since I'm trying to use or produce xml files 
>>which have the ".xml" extension rather than something more distinctive, 
>>I would like Forrest to choose the stylesheet on the basis of the 
>>doctype, perhaps using a catalog (like for resolving DTDs...why not 
>>catalogs for xsl?).  ...
>>    
>>
>
>No, the "catalog entity resolver" addresses a separate part of the issue.
>It sounds like we need to enhance the documentation. Source Type Resolver,
>actually called "SourceTypeAction (content aware pipelines)" [2] is one of
>the key features of Forrest and so we need to explain it better.
>
>Let's first correct your comment about "catalog". Its use is to create
>an efficient system for xml documents that declare a DTD so that the
>xml parser gets a local copy rather than going across the network.
>
>  
>
Sorry if I wasn't clear.  When I referred to using *a* catalog (not 
*the* catalogs used for resolving DTDs during validation) I meant just 
the concept of a look-up table that the pipeline could use to identify 
the document-type-specific handling to apply.  That seems to be what is 
being done in the sitemap.  If I understand your SourceTypeAction doc 
and what is defined in the input.xmap (URL above), one can:
a) define classification rules for xml documents based on some 
types of document header information,
b) apply these rules to determine a "sourcetype" (=classification),
c) use the "sourcetype" (classification) to select the processing to apply.

a) is done in the map:actions section, as you explain in the doc.

b) is done by "map:act" (=a function call) when the document is 
encountered and a processing decision is to be taken:

<map:act type="sourcetype" src="{1}">
or 
<map:act type="sourcetype" src="{src}">

c) is done by a "select:parameter-selector-test:when" construct, 
resembling a select-case, in a "resource" (=subroutine) named 
"transform-to-document":

<map:resource name="transform-to-document">
      <map:act type="sourcetype" src="{src}">
        <map:select type="parameter">
          <map:parameter name="parameter-selector-test" value="{sourcetype}" />

          <map:when test="docbook-v4.1.2">
            <map:generate src="{project:content.xdocs}{../../1}.xml" />
            <map:transform src="{forrest:plugins}/org.apache.forrest.plugin.input.simplified-docbook/resources/stylesheets/sdocbook2document.xsl" />
            <map:serialize type="xml-document"/>
          </map:when>
...
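
A minimal sketch of what step a) might look like, for completeness. The class name is an assumption inferred from the source path main/java/org/apache/forrest/sourcetype, and the `public-id` attribute and the DocBook public identifier are illustrative rather than copied from the input.xmap:

```xml
<map:components>
  <map:actions>
    <!-- Class name assumed from main/java/org/apache/forrest/sourcetype -->
    <map:action name="sourcetype"
                src="org.apache.forrest.sourcetype.SourceTypeAction">
      <!-- Illustrative rule: classify by the DOCTYPE public identifier -->
      <sourcetype name="docbook-v4.1.2">
        <document-declaration
            public-id="-//OASIS//DTD DocBook XML V4.1.2//EN"/>
      </sourcetype>
    </map:action>
  </map:actions>
</map:components>
```

The `name` on each sourcetype is the value that later comes back as {sourcetype} and is compared in the map:when tests.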

>Now back to Source Type Action ... It is a Cocoon sitemap component that
>peeks at the top-part of a document to look for hints about the type
>of the document. 
>
>[1] http://forrest.apache.org/docs/your-project.html#sitemap.xmap 
>[2] http://forrest.apache.org/docs/cap.html
>
>These are the available methods:
>document-declaration
>document-element and namespace
>processing-instruction
>w3c-xml-schema
>
>  
>
While reading the SourceTypeAction doc, a couple of questions came to 
mind.  I think it would be helpful to find their answers in that doc:

1) What is the appropriate way to construct "OR" classification rules?  
For instance, the document-element may return a local-name, a namespace, 
or both. Should one define two (or more) rules with the same sourcetype 
name, such as...

<sourcetype name="foo">
    <document-element local-name="foo"/>
</sourcetype>
<sourcetype name="foo">
    <document-element namespace="bar"/>
</sourcetype>

or a single rule with a list of alternative conditions, such as...

<sourcetype name="foo">
    <document-element local-name="foo"/>
    <document-element namespace="bar"/>
</sourcetype>

or is there some other syntax?

2) How does one construct "AND" classification rules?

<sourcetype name="foo">
    <document-element local-name="foo"> && <document-element namespace="bar">
</sourcetype>


These are not urgent (for me), but I expect they will be wanted sooner 
or later.

Regards and thanks for the communication,
Maurice

>If you use the first technique, then the parser needs to go retrieve
>the DTD from across the network. Hence the need for Catalog Entity Resolver.
>
>I don't use "w3c-xml-schema" so i am not sure if the parser is forced
>to locate the actual schema. I gather that it doesn't. Therefore you
>don't need to mess about with catalogs.
>
>Now if there are Java people out there listening, then perhaps you would
>like to enhance the Source Type Action to enable other methods. It is in
>the Forrest source at main/java/org/apache/forrest/sourcetype
>
>--David
>
>  
>
>>...  It looks like that is what happens in the 
>><map:resources> group in the example you pointed me to (below).  My 
>>first attempt to *bend* it to my purpose failed, however: "Type 
>>'sourcetype' does not exist for 'map:pipeline' at..." when I replaced
>>
>> <map:pipeline>
>>  <map:match pattern="**Resume.xml">
>>by
>> <map:pipeline type="sourcetype" src="{src}">
>>       <map:select type="parameter">
>>         <map:parameter name="parameter-selector-test" 
>>value="{sourcetype}" />
>>         <map:when test="Resume">
>>
>>But one thing at a time...xslt first, then plugin/resolving.
>>
>>Many thanks,
>>Maurice
>>    
>>
>
>  
>
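
The error quoted above is consistent with how Cocoon treats the `type` attribute on map:pipeline: it names a pipeline implementation, not an action. A sketch of the same idea with the action invoked through map:act inside the match follows. It assumes a "Resume" sourcetype has been declared in map:actions; the widened pattern and the stylesheet name resume2document.xsl are hypothetical.

```xml
<map:pipeline>
  <!-- Pattern widened from **Resume.xml so the file name no longer
       decides the processing (assumption) -->
  <map:match pattern="**.xml">
    <!-- The action is invoked with map:act, not with map:pipeline's
         type attribute -->
    <map:act type="sourcetype" src="{project:content.xdocs}{1}.xml">
      <map:select type="parameter">
        <map:parameter name="parameter-selector-test"
                       value="{sourcetype}"/>
        <map:when test="Resume">
          <!-- Inside map:act the match's {1} is one level up: {../1} -->
          <map:generate src="{project:content.xdocs}{../1}.xml"/>
          <!-- resume2document.xsl is a hypothetical stylesheet name -->
          <map:transform
              src="{project:resources.stylesheets}/resume2document.xsl"/>
          <map:serialize type="xml-document"/>
        </map:when>
      </map:select>
    </map:act>
  </map:match>
</map:pipeline>
```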


Re: Sitemap dot xmap sourcetyping

Posted by Tim Williams <wi...@gmail.com>.
On 6/9/05, Ross Gardler <rg...@apache.org> wrote:
> David Crossley wrote:
> > Ross Gardler wrote:
> >
> >>David Crossley wrote:
> >>
> >>>Maurice Lanselle wrote:
> >>
> >>>>2) How does one construct "AND" classification rules?
> >>>>
> >>>><sourcetype name="foo">
> >>>> <document-element local-name="foo"> && <document-element
> >>>> namespace="bar">
> >>>></sourcetype>
> >>>
> >>>I don't know if that is possible or needed. Perhaps there is another way.
> >>>It would be better if you provided an actual use-case.
> >>
> >>Using your example from earlier in this thread, AND would be:
> >>
> >>
> > >>>  <sourcetype name="foo">
> > >>>    <document-element local-name="foo"/>
> > >>>  </sourcetype>
> > >>>  <sourcetype name="bar">
> > >>>    <document-element namespace="bar"/>
> > >>>  </sourcetype>
> >>>...
> >>>...
> >>>      <map:when test="foo">
> >>>       <map:transform
> >>>          src="{project:resources.stylesheets}/foobar2document.xsl" />
> >>>      </map:when>
> >>>      <map:when test="bar">
> >>>       <map:transform
> >>>          src="{project:resources.stylesheets}/foobar2document.xsl" />
> >>>      </map:when>
> >>>...
> >
> >
> > Er, i am confused. That code snippet is exactly the same as the OR
> > situation. Did you forget to edit the code?
> 
> Nope, I was being stupid. Seems my logic circuits blew, I thought an OR
> was something completely different. I even added a note to the effect:
> 
> >>(note the above example was directly copied for the suggestion for OR,
> >>obviously in an OR you would have different actions within the test)
> 
> Just to be clear, the above is obviously an OR, I hope Maurice saw
> through my idiocy and didn't waste much time on that.
> 
> I've had almost a whole night's sleep now, so maybe my logic circuits
> have repaired themselves. However, they are not telling me how to do an
> AND I'm afraid.

Nesting should get the AND effect.  It does seem there's no direct way
about it though.
--tim

For example,

<map:when test="foo">
  <map:when test="bar">
    <map:transform
        src="{project:resources.stylesheets}/foobar2document.xsl" />
  </map:when>
</map:when>
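
Two caveats on the nesting idea, offered tentatively. A map:when normally sits inside its own map:select, and the sourcetype action hands back a single {sourcetype} value, so testing that one value for "foo" and then for "bar" cannot succeed both times. If the AND is between the root element's local name and its namespace, one alternative is to put both conditions into a single rule. The sketch below assumes that document-element accepts `local-name` and `namespace` together, which is how the SourceTypeAction doc appears to describe it but is not confirmed in this thread. The sourcetype name "foobar" is hypothetical.

```xml
<sourcetype name="foobar">
  <!-- Assumed semantics: matches only when the root element is named
       "foo" AND is in namespace "bar" -->
  <document-element local-name="foo" namespace="bar"/>
</sourcetype>

<!-- later, inside the map:select on {sourcetype} -->
<map:when test="foobar">
  <map:transform
      src="{project:resources.stylesheets}/foobar2document.xsl" />
</map:when>
```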

Re: Sitemap dot xmap sourcetyping

Posted by Ross Gardler <rg...@apache.org>.
David Crossley wrote:
> Ross Gardler wrote:
> 
>>David Crossley wrote:
>>
>>>Maurice Lanselle wrote:
>>
>>>>2) How does one construct "AND" classification rules?
>>>>
>>>><sourcetype name="foo">
>>>> <document-element local-name="foo"> && <document-element 
>>>> namespace="bar">
>>>></sourcetype>
>>>
>>>I don't know if that is possible or needed. Perhaps there is another way.
>>>It would be better if you provided an actual use-case.
>>
>>Using your example from earlier in this thread, AND would be:
>>
>>
>>>  <sourcetype name="foo">
>>>    <document-element local-name="foo"/>
>>>  </sourcetype>
>>>  <sourcetype name="bar">
>>>    <document-element namespace="bar"/>
>>>  </sourcetype>
>>>...
>>>...
>>>      <map:when test="foo">
>>>       <map:transform
>>>          src="{project:resources.stylesheets}/foobar2document.xsl" />
>>>      </map:when>
>>>      <map:when test="bar">
>>>       <map:transform
>>>          src="{project:resources.stylesheets}/foobar2document.xsl" />
>>>      </map:when>
>>>...
> 
> 
> Er, i am confused. That code snippet is exactly the same as the OR
> situation. Did you forget to edit the code?

Nope, I was being stupid. Seems my logic circuits blew, I thought an OR 
was something completely different. I even added a note to the effect:

>>(note the above example was directly copied for the suggestion for OR, 
>>obviously in an OR you would have different actions within the test)

Just to be clear, the above is obviously an OR, I hope Maurice saw 
through my idiocy and didn't waste much time on that.

I've had almost a whole night's sleep now, so maybe my logic circuits 
have repaired themselves. However, they are not telling me how to do an 
AND I'm afraid.

Ross

Re: Sitemap dot xmap sourcetyping

Posted by David Crossley <cr...@apache.org>.
Ross Gardler wrote:
> David Crossley wrote:
> >Maurice Lanselle wrote:
> 
> >>2) How does one construct "AND" classification rules?
> >>
> >><sourcetype name="foo">
> >>  <document-element local-name="foo"> && <document-element 
> >>  namespace="bar">
> >></sourcetype>
> >
> >I don't know if that is possible or needed. Perhaps there is another way.
> >It would be better if you provided an actual use-case.
> 
> Using your example from earlier in this thread, AND would be:
> 
> >   <sourcetype name="foo">
> >     <document-element local-name="foo"/>
> >   </sourcetype>
> >   <sourcetype name="bar">
> >     <document-element namespace="bar"/>
> >   </sourcetype>
> > ...
> > ...
> >       <map:when test="foo">
> >        <map:transform
> >           src="{project:resources.stylesheets}/foobar2document.xsl" />
> >       </map:when>
> >       <map:when test="bar">
> >        <map:transform
> >           src="{project:resources.stylesheets}/foobar2document.xsl" />
> >       </map:when>
> > ...

Er, i am confused. That code snippet is exactly the same as the OR
situation. Did you forget to edit the code?

--David

> i.e. test for each individual action and duplicate the transform code. 
> What, duplicate code? That is bad (I hear you shout). You are of course 
> correct, so the sitemap provides a thing called resources to allow the 
> duplicate code to be refactored out and replaced with a 
> <map:call-resource...
>
> (note the above example was directly copied for the suggestion for OR, 
> obviously in an OR you would have different actions within the test)
> 
> See the simplified-docbook plugin's input.xmap file for an example of the 
> use of resources like this.
> 
> Ross

Re: Sitemap dot xmap sourcetyping

Posted by Ross Gardler <rg...@apache.org>.
David Crossley wrote:
> Maurice Lanselle wrote:

..

>>2) How does one construct "AND" classification rules?
>>
>><sourcetype name="foo">
>>   <document-element local-name="foo"> && <document-element namespace="bar">
>></sourcetype>
> 
> 
> I don't know if that is possible or needed. Perhaps there is another way.
> It would be better if you provided an actual use-case.
> 

Using your example from earlier in this thread, AND would be:

 >   <sourcetype name="foo">
 >     <document-element local-name="foo"/>
 >   </sourcetype>
 >   <sourcetype name="bar">
 >     <document-element namespace="bar"/>
 >   </sourcetype>
 > ...
 > ...
 >       <map:when test="foo">
 >        <map:transform
 >           src="{project:resources.stylesheets}/foobar2document.xsl" />
 >       </map:when>
 >       <map:when test="bar">
 >        <map:transform
 >           src="{project:resources.stylesheets}/foobar2document.xsl" />
 >       </map:when>
 > ...

i.e. test for each individual action and duplicate the transform code. 
What, duplicate code? That is bad (I hear you shout). You are of course 
correct, so the sitemap provides a thing called resources to allow the 
duplicate code to be refactored out and replaced with a 
<map:call-resource...

(note the above example was directly copied for the suggestion for OR, 
obviously in an OR you would have different actions within the test)

See the simplified-docbook plugin's input.xmap file for an example of the 
use of resources like this.
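
A rough sketch of that refactoring, not taken from the plugin itself. The resource name "foobar-to-document" is hypothetical. As far as I know, Cocoon 2.1 sitemaps spell the call as map:call with a `resource` attribute, and the sketch assumes a resource may hold just the shared transform step.

```xml
<map:resources>
  <!-- Hypothetical resource holding the shared processing -->
  <map:resource name="foobar-to-document">
    <map:transform
        src="{project:resources.stylesheets}/foobar2document.xsl" />
  </map:resource>
</map:resources>

<!-- later, inside the map:select on {sourcetype} -->
<map:when test="foo">
  <map:call resource="foobar-to-document"/>
</map:when>
<map:when test="bar">
  <map:call resource="foobar-to-document"/>
</map:when>
```

Both branches now point at one definition, so a change to the stylesheet path is made in one place.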

Ross

Re: Sitemap dot xmap sourcetyping

Posted by David Crossley <cr...@apache.org>.
Maurice Lanselle wrote:
> David Crossley said the following on 07/06/2005 03:46:
> >Maurice Lanselle wrote:
> >>Ross Gardler said the following:
> >>
> >>>However, you should avoid limiting the URL space of your application 
> >>>by requiring a given file type to have a given filename or path. This 
> > >>>can result in false matches. You should use the SourceTypeResolver; see 
> > >>>http://svn.apache.org/viewcvs.cgi/forrest/trunk/plugins/org.apache.forrest.plugin.input.simplified-docbook/input.xmap?view=markup 
> > >>>for an example of how to do this.
> >>>
> >>I totally buy your point about not requiring a document type to have a 
> >>given file name or path.  Since I'm trying to use or produce xml files 
> >>which have the ".xml" extension rather than something more distinctive, 
> >>I would like Forrest to choose the stylesheet on the basis of the 
> >>doctype, perhaps using a catalog (like for resolving DTDs...why not 
> >>catalogs for xsl?).  ...
> >
> >No, the "catalog entity resolver" addresses a separate part of the issue.
> >It sounds like we need to enhance the documentation. Source Type Resolver,
> >actually called "SourceTypeAction (content aware pipelines)" [2] is one of
> >the key features of Forrest and so we need to explain it better.
> >
> > >Let's first correct your comment about "catalog". Its use is to create
> >an efficient system for xml documents that declare a DTD so that the
> >xml parser gets a local copy rather than going across the network.
> >
> Sorry if I wasn't clear.  When I referred to using *a* catalog (not 
> *the* catalogs used for resolving DTDs during validation) I meant just 
> the concept of a look-up table that the pipeline could use to identify 
> the document-type-specific handling to apply.

Ah i see. I still had to correct your comments, otherwise other users
would be seriously confused. I was. Anyway, it is good because it
provides a chance to better explain these facilities.

> >Now back to Source Type Action ... It is a Cocoon sitemap component that
> >peeks at the top-part of a document to look for hints about the type
> >of the document. 
> >
> >[1] http://forrest.apache.org/docs/your-project.html#sitemap.xmap 
> >[2] http://forrest.apache.org/docs/cap.html
> >
> >These are the available methods:
> >document-declaration
> >document-element and namespace
> >processing-instruction
> >w3c-xml-schema
>
> While reading the SourceTypeAction doc, a couple of questions came to 
> mind.  I think it would be helpful to find their answers in that doc:

I hope that you started at reference #1 and followed the background
reading. What this is showing is that we need much better documents
about our SourceTypeAction and that Cocoon needs a very high-level
document to explain its sitemap. Perhaps the Cocoon documents that
we refer to are too technical.

Sorry, i don't have much time today, so only quick comments.

The Cocoon sitemap is not a logic-based programming language.
There are no constructs like boolean AND/OR between the
sitemap elements. Its language is more like XSL.
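
To make the XSL comparison concrete: the nearest sitemap counterpart to xsl:choose is map:select, where the first map:when whose test matches is taken and map:otherwise catches everything else. A small sketch, with hypothetical stylesheet names:

```xml
<map:select type="parameter">
  <map:parameter name="parameter-selector-test" value="{sourcetype}"/>
  <!-- Branches are tried in order; the first matching test wins -->
  <map:when test="foo">
    <!-- foo2document.xsl is a hypothetical stylesheet name -->
    <map:transform
        src="{project:resources.stylesheets}/foo2document.xsl"/>
  </map:when>
  <map:otherwise>
    <!-- Hypothetical fallback for any other sourcetype -->
    <map:transform
        src="{project:resources.stylesheets}/generic2document.xsl"/>
  </map:otherwise>
</map:select>
```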

> 1) What is the appropriate way to construct "OR" classification rules?  
> For instance, the document-element may return a local-name, a namespace, 
> or both.
>
> Should one define two (or more) rules with the same sourcetype 
> name, such as...
> 
> <sourcetype name="foo">
>    <document-element local-name="foo"/>
> </sourcetype>
> <sourcetype name="foo">
>    <document-element namespace="bar"/>
> </sourcetype>
> 
> or a single rule with a list of alternative conditions, such as...
> 
> <sourcetype name="foo">
>    <document-element local-name="foo"/>
>    <document-element namespace="bar"/>
> </sourcetype>
> 
> or is there some other syntax?

One way would be to assign different sourcetype@name attributes
in the <map:action> and then in the <pipelines> part of the sitemap
you would have separate matches which just repeat the same processing.
e.g.

  <sourcetype name="foo">
    <document-element local-name="foo"/>
  </sourcetype>
  <sourcetype name="bar">
    <document-element namespace="bar"/>
  </sourcetype>
...
...
      <map:when test="foo">
       <map:transform
          src="{project:resources.stylesheets}/foobar2document.xsl" />
      </map:when>
      <map:when test="bar">
       <map:transform
          src="{project:resources.stylesheets}/foobar2document.xsl" />
      </map:when>
...

You could also search the Cocoon documentation and wiki.
You might find other examples of <map:select> that show
better uses of <map:when test="...">; perhaps that test
can be made more expressive.

> 2) How does one construct "AND" classification rules?
> 
> <sourcetype name="foo">
>    <document-element local-name="foo"> && <document-element namespace="bar">
> </sourcetype>

I don't know if that is possible or needed. Perhaps there is another way.
It would be better if you provided an actual use-case.

--David

> These are not urgent (for me), but I expect they will be wanted sooner 
> or later.
> 
> Regards and thanks for the communication,
> Maurice
> 
> >If you use the first technique, then the parser needs to go retrieve
> >the DTD from across the network. Hence the need for Catalog Entity 
> >Resolver.
> >
> >I don't use "w3c-xml-schema" so i am not sure if the parser is forced
> >to locate the actual schema. I gather that it doesn't. Therefore you
> >don't need to mess about with catalogs.
> >
> >Now if there are Java people out there listening, then perhaps you would
> >like to enhance the Source Type Action to enable other methods. It is in
> >the Forrest source at main/java/org/apache/forrest/sourcetype
> >
> >--David
> >
> > 
> >
> >>...  It looks like that is what happens in the 
> >><map:resources> group in the example you pointed me to (below).  My 
> >>first attempt to *bend* it to my purpose failed, however: "Type 
> >>'sourcetype' does not exist for 'map:pipeline' at..." when I replaced
> >>
> >><map:pipeline>
> >> <map:match pattern="**Resume.xml">
> >>by
> >><map:pipeline type="sourcetype" src="{src}">
> >>      <map:select type="parameter">
> >>        <map:parameter name="parameter-selector-test" 
> >>value="{sourcetype}" />
> >>        <map:when test="Resume">
> >>
> >>But one thing at a time...xslt first, then plugin/resolving.
> >>
> >>Many thanks,
> >>Maurice
> >>   
> >>
> >
> > 
> >