Posted to dev@forrest.apache.org by Ferdinand Soethe <fe...@apache.org> on 2005/10/01 11:56:14 UTC

Fixing a Howto

In trying to track down the processing of html-files (to fix the
problems with attributes disappearing and other unwanted ones
reappearing) I have gone through my own documentation and found
that it is broken as soon as we move from sitemap.xmap to
forrest.xmap.

> Open the file 'forrest.xmap' and continue the search for a matching pattern.

Could somebody please check and correct the following analysis of what
currently happens in forrest.xmap?

1. We go through forrest.xmap looking for a match for **.xml
2. First match is

     <map:match type="wildcard" pattern="**.xml">
       <map:select type="exists">
         <map:when test="{project:temp-dir}/input.xmap">
             <map:mount uri-prefix=""
                        src="{project:temp-dir}/input.xmap"
                        check-reload="yes"
                        pass-through="true"/>
         </map:when> 
       </map:select>

   This will load input.xmap (normally found in build\tmp) and
   process it if it exists.
   I guess this is to allow the input plugins to intercept processing
   by defining their own matchers in there?

3. If input.xmap has no matches for us the next match could be

       <map:match type="i18n" pattern="{project:content.xdocs}{1}.*.html">
         <map:generate src="{source}" type="html" />
         <map:transform src="{forrest:stylesheets}/html2document.xsl" />
         <map:transform type="idgen" />
         <map:serialize type="xml-document"/>
       </map:match>

    but I'm not sure whether this is actually executed.

    If so, what does {source} stand for in

    <map:generate src="{source}" type="html" />

    I understand it is a variable but I don't understand where it is
    defined.

Thanks for your input.

--
Ferdinand Soethe

Re: Fixing a Howto

Posted by Ross Gardler <rg...@apache.org>.

Ross Gardler wrote:
> Ferdinand Soethe wrote:
> 
>> In trying to track down the processing of html-files (to fix the
>> problems with attributes disappearing and other unwanted ones
>> reappearing) I have gone through my own documentation and found
>> that it is broken as soon as we move from sitemap.xmap to
>> forrest.xmap.
>>
>>> Open the file 'forrest.xmap' and continue the search for a matching
>>> pattern.
>>
>> Could somebody please check and correct the following analysis of
>> what currently happens in forrest.xmap?
>>
>> 1. We go through forrest.xmap looking for a match for **.xml
>> 2. First match is
>>
>>      <map:match type="wildcard" pattern="**.xml">
>>        <map:select type="exists">
>>          <map:when test="{project:temp-dir}/input.xmap">
>>              <map:mount uri-prefix=""
>>                         src="{project:temp-dir}/input.xmap"
>>                         check-reload="yes"
>>                         pass-through="true"/>
>>          </map:when>
>>        </map:select>
>>
>>    This will load input.xmap (normally found in build\tmp) and
>>    process it if it exists.
>>    I guess this is to allow the input plugins to intercept processing
>>    by defining their own matchers in there?
> 
> 
> Yes, the 'pass-through="true"' attribute indicates that if no match is 
> found within the mounted sitemap then we should continue checking the 
> mounting sitemap from this point.
> 
> In other words, we will only continue processing in this sitemap if 
> there is a map:generate that is processed in input.xmap.

That last sentence should be:

In other words, we will only continue processing in this sitemap if
there is *not* a map:generate that is processed in input.xmap.
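
To make the fall-through concrete, here is a toy model of the matching
order in Python. It is not Cocoon code: the Mount class, the process
function and the two sitemap lists are invented for illustration, and
fnmatch only approximates Cocoon's wildcard matcher.

```python
from fnmatch import fnmatch


class Mount:
    """A mounted sub-sitemap, loosely modelled on <map:mount> (hypothetical)."""

    def __init__(self, entries, pass_through):
        self.entries = entries
        self.pass_through = pass_through


def process(entries, uri):
    """Return the label of the first pipeline that handles uri, or None."""
    for pattern, action in entries:
        if not fnmatch(uri, pattern):
            continue
        if isinstance(action, Mount):
            handled = process(action.entries, uri)
            if handled is not None:
                return handled      # the mounted sitemap produced output
            if not action.pass_through:
                return None         # no fall-through to the mounting sitemap
            continue                # pass-through: keep searching the parent
        return action               # an ordinary match that generates output
    return None


# Hypothetical stand-ins for input.xmap and forrest.xmap.
input_xmap = [("**.foo.xml", "input-plugin pipeline")]
forrest_xmap = [
    ("**.xml", Mount(input_xmap, pass_through=True)),
    ("**.xml", "forrest.xmap pipeline"),
]
```

In this model a request for docs/a.foo.xml is answered by the mounted
sitemap, while index.xml finds nothing there and falls through to the
next matcher in the mounting sitemap.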

Sorry,
Ross

documenting sitemaps (was Re: Fixing a Howto)

Posted by Ross Gardler <rg...@apache.org>.

Ferdinand Soethe wrote:
> Ross Gardler wrote:

...

[concerning the refdoc project in Cocoon]

>>>And once that documentation is added, that would be easy enough to
>>>read w/o having to extract it into html-documents. So I'd consider
>>>that stuff nice to have but nothing to put much effort into.
> 
> 
>>Fair enough. Thanks.
> 
> 
> That said I'm very much in favour of adding documentation there (the
> documentation of the html-pipeline should really be the only extensive
> document of that kind) and will do so where I come across useful
> pieces of info.

As you know I'm visiting all sitemaps at the moment, and will do so 
again in the near future as I clean them up (actually, I'm thinking of 
starting from the very beginning again, rebuilding as defined in our TR 
- but that's a whole different story).

Anyway, the point is: if you can point me at a document describing how
to write this embedded documentation, or provide some examples of the
kind of thing you want to see, or write a how-to on it (whichever is
appropriate), then I will do my best to add comments as I go through.

Ross

Re: Facilitating profile use (Re: Fixing a Howto)

Posted by Ross Gardler <rg...@apache.org>.

David Crossley wrote:
> Ron Blaschke wrote:
> 
>>Ross Gardler wrote:
>>
>>>David, I am afraid to say I have not checked this out yet. Despite
>>>noting its existence and the effort you put into this (and the docs).
>>
>>>Can I make a suggestion? (I'll do this the first time I find the need to
>>>use this if it hasn't been done before me):
>>
>>>If we make an internal plugin of the profiler we can replace the 
>>>necessary pipelines in that plugin. Then, to enable profiling all we 
>>>need to do is add the plugin to forrest.properties.
>>
>>I am about to start working on a plugin.  Actually, I thought I
>>could start last weekend, but didn't quite make it.
>>
>>Most things are straightforward.  I'm only puzzled by the mix of input
>>plugin (ie, take data from the profiler data store and render it) and
>>internal plugin (ie, profile stuff and store results in profiler data
>>store).  Can a plugin be both, input and internal?
>>Second, the profiler result page must be rendered last, or it wouldn't
>>contain all profile results.  Don't know how to do this, yet.
> 
> 
> I find that the Cocoon Profiler is useful to look at the
> processing for a single page, i.e. forrest clean, forrest run,
> request index.html a few times, then look at cprofile.html
> and follow the results. Doing it for the whole of 'forrest site'
> might be useful in a different way - don't know yet.

Ron and I talked about this on IRC (see logs). I suggested making the
profiler block a dual-purpose block: profiling and testing. Ron
pointed me to your past discussions regarding benchmarking.

After discussion it became apparent that making it a test block as
well was not a good idea (tests change often, whereas profiling and
benchmarking need stable content to work on).

To profile Forrest effectively we need to profile all paths through the 
pipelines. To benchmark we need to do it a number of times.

I think that having benchmark sites with pages designed to test
various paths (perhaps plugins can provide additional benchmark pages in
the future) is a very good idea. We could run Forrest on these sites,
say 1000 times. This would give us a reasonable indication (given a
consistent server spec) of whether a new release is performing well or not.
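
A rough sketch of such a benchmark loop, in Python. The function names
benchmark and summarise are made up for this example, and it assumes
the command is on the PATH and is run from the benchmark site's
directory.

```python
import statistics
import subprocess
import time


def benchmark(command, runs, cwd=None):
    """Run command `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True stops the benchmark as soon as one build fails.
        subprocess.run(
            command,
            cwd=cwd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


def summarise(timings):
    """Reduce a list of timings to a few figures that can be compared across releases."""
    return {
        "runs": len(timings),
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }
```

Something like summarise(benchmark(["forrest", "site"], runs=1000))
would then give one set of figures per release, to be compared only
between runs on the same machine.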

Of course, profiling individual pages is useful for testing individual 
edits. This would still be useful in the way you describe.

STATUS
------

Ron has the skeleton together, but has hit a problem. The internal.xmap
needs to redefine the <map:pipe> elements, which can't be done, or so
we thought on IRC. But I never thought to ask Ron if he had "just tried
it". Ron, if you didn't try it, please do. I remember reading that blocks
can now define their own component specifications; perhaps it will work
for mounted sitemaps too.

Ron is exploring this with Cocoon folk.

Ross

Re: Facilitating profile use (Re: Fixing a Howto)

Posted by David Crossley <cr...@apache.org>.

Ron Blaschke wrote:
> Ross Gardler wrote:
> > David, I am afraid to say I have not checked this out yet. Despite
> > noting its existence and the effort you put into this (and the docs).
> 
> > Can I make a suggestion? (I'll do this the first time I find the need to
> > use this if it hasn't been done before me):
> 
> > If we make an internal plugin of the profiler we can replace the 
> > necessary pipelines in that plugin. Then, to enable profiling all we 
> > need to do is add the plugin to forrest.properties.
> 
> I am about to start working on a plugin.  Actually, I thought I
> could start last weekend, but didn't quite make it.
> 
> Most things are straightforward.  I'm only puzzled by the mix of input
> plugin (ie, take data from the profiler data store and render it) and
> internal plugin (ie, profile stuff and store results in profiler data
> store).  Can a plugin be both, input and internal?
> Second, the profiler result page must be rendered last, or it wouldn't
> contain all profile results.  Don't know how to do this, yet.

I find that the Cocoon Profiler is useful to look at the
processing for a single page, i.e. forrest clean, forrest run,
request index.html a few times, then look at cprofile.html
and follow the results. Doing it for the whole of 'forrest site'
might be useful in a different way - don't know yet.

-David

> > We could later enhance this by adding a command line switch to run with
> > the profiler enabled.
> 
> > In addition to making it easier to run the profiler, it will also keep
> > the sitemaps a little more tidy.
> 
> And forrest-core.xconf and lib/core, too, but only slightly.
> 
> Ron
>

Re: Facilitating profile use (Re: Fixing a Howto)

Posted by Ross Gardler <rg...@apache.org>.

Ron Blaschke wrote:
> Ross Gardler wrote:
> 
>>Ron Blaschke wrote:
>>
>>>Ross Gardler wrote:
>>>Most things are straightforward.  I'm only puzzled by the mix of input
>>>plugin (ie, take data from the profiler data store and render it) and
>>>internal plugin (ie, profile stuff and store results in profiler data
>>>store).  Can a plugin be both, input and internal?
> 
> 
>>An internal plugin sitemap gets mounted before anything else is done. So
>>you can indeed make an internal plugin be both input and output as well
>>as internal. It is easy to break everything with internal plugins, much
>>harder with input and output plugins.
> 
> 
> Good, so this shouldn't be too hard, either.
> 
> 
>>Today is Forrest-Tuesday [1] you can find us on irc.freenod.org #for-oct
>>if you need any pointers.
> 
> 
> The one thing left that keeps me from getting started is that I don't
> know how to contribute the plugin.  I don't have any commit rights to
> Forrest, so I guess I am limited to (1) create an issue and attach my
> changes there, or (2) host the plugin myself.  The names would be
> org.apache.forrest.plugin.internal.cprofile or
> org.rblasch.forrest.plugin.internal.cprofile, respectively.

Ron and I are discussing this on IRC, you can see the logs at 20:09 
(http://casa.che-che.com/~bot/forrest/forrest.log.04Oct2005)

This includes the idea of making this profiling plugin into a wider 
ranging test plugin. Worth a read (or contribution if you are around at 
the right time).

Ross

Re: Facilitating profile use (Re: Fixing a Howto)

Posted by Ron Blaschke <ma...@rblasch.org>.

Ross Gardler wrote:
> Ron Blaschke wrote:
>> Ross Gardler wrote:
>> Most things are straightforward.  I'm only puzzled by the mix of input
>> plugin (ie, take data from the profiler data store and render it) and
>> internal plugin (ie, profile stuff and store results in profiler data
>> store).  Can a plugin be both, input and internal?

> An internal plugin sitemap gets mounted before anything else is done. So
> you can indeed make an internal plugin be both input and output as well
> as internal. It is easy to break everything with internal plugins, much
> harder with input and output plugins.

Good, so this shouldn't be too hard, either.

> Today is Forrest-Tuesday [1] you can find us on irc.freenod.org #for-oct
> if you need any pointers.

The one thing left that keeps me from getting started is that I don't
know how to contribute the plugin.  I don't have any commit rights to
Forrest, so I guess I am limited to (1) create an issue and attach my
changes there, or (2) host the plugin myself.  The names would be
org.apache.forrest.plugin.internal.cprofile or
org.rblasch.forrest.plugin.internal.cprofile, respectively.

Any thoughts?

 >> Second, the profiler result page must be rendered last, or it wouldn't
 >> contain all profile results.  Don't know how to do this, yet.

> Hmmm... that's an interesting one. I don't think we can make the CLI 
> process pages in a particular order, although I may be wrong.

I've tried adding the profile page as an argument to ...cocoon.Main after
${project.start-uri} in main/targets/site.xml, and this seemed to
work.  Whether by design or by accident I don't know.
So it might actually be a good fit to enable profiling via a command
line option.

Ron

Re: Facilitating profile use (Re: Fixing a Howto)

Posted by Ross Gardler <rg...@apache.org>.

Ron Blaschke wrote:
> Ross Gardler wrote:
> 
>>David, I am afraid to say I have not checked this out yet. Despite
>>noting its existence and the effort you put into this (and the docs).
> 
> 
>>Can I make a suggestion? (I'll do this the first time I find the need to
>>use this if it hasn't been done before me):
> 
> 
>>If we make an internal plugin of the profiler we can replace the 
>>necessary pipelines in that plugin. Then, to enable profiling all we 
>>need to do is add the plugin to forrest.properties.
> 
> 
> I am about to start working on a plugin.  Actually, I thought I
> could start last weekend, but didn't quite make it.
> 
> Most things are straightforward.  I'm only puzzled by the mix of input
> plugin (ie, take data from the profiler data store and render it) and
> internal plugin (ie, profile stuff and store results in profiler data
> store).  Can a plugin be both, input and internal?

An internal plugin sitemap gets mounted before anything else is done. So 
you can indeed make an internal plugin be both input and output as well 
as internal. It is easy to break everything with internal plugins, much 
harder with input and output plugins.

Today is Forrest-Tuesday [1] you can find us on irc.freenod.org #for-oct 
if you need any pointers.

 > Second, the profiler result page must be rendered last, or it wouldn't
 > contain all profile results.  Don't know how to do this, yet.

Hmmm... that's an interesting one. I don't think we can make the CLI 
process pages in a particular order, although I may be wrong.

If we can't do that then we could add Anteater or WebTest to the block 
and have some standard profiling scripts.

Ross

[1] http://forrest.apache.org/forrest-tuesday.html

Re: Facilitating profile use (Re: Fixing a Howto)

Posted by Ron Blaschke <ma...@rblasch.org>.

Ross Gardler wrote:
> David, I am afraid to say I have not checked this out yet. Despite
> noting its existence and the effort you put into this (and the docs).

> Can I make a suggestion? (I'll do this the first time I find the need to
> use this if it hasn't been done before me):

> If we make an internal plugin of the profiler we can replace the 
> necessary pipelines in that plugin. Then, to enable profiling all we 
> need to do is add the plugin to forrest.properties.

I am about to start working on a plugin.  Actually, I thought I
could start last weekend, but didn't quite make it.

Most things are straightforward.  I'm only puzzled by the mix of input
plugin (ie, take data from the profiler data store and render it) and
internal plugin (ie, profile stuff and store results in profiler data
store).  Can a plugin be both, input and internal?
Second, the profiler result page must be rendered last, or it wouldn't
contain all profile results.  Don't know how to do this, yet.

> We could later enhance this by adding a command line switch to run with
> the profiler enabled.

> In addition to making it easier to run the profiler, it will also keep
> the sitemaps a little more tidy.

And forrest-core.xconf and lib/core, too, but only slightly.

Ron

Facilitating profile use (Re: Fixing a Howto)

Posted by Ross Gardler <rg...@apache.org>.

David Crossley wrote:
> Ferdinand Soethe wrote:
> 
>>Ross Gardler wrote:
>>
>>
>>>>It's not actually what I was looking for (Which was a mechanism of
>>>>logging the way of requests through a sitemap as a way of debugging
>>>>them; would have been very useful for my current html-problem).
>>
>>>I think the profiler does that - I've never played with it but David and
>>>Ron have done some work with it.
>>
>>Hmmm. That's interesting. I'll try and figure out how to run that to
>>follow my html-processing.
> 
> 
> There is various discussion in the mail archives.
> Most of it arises due to investigating
> http://issues.apache.org/jira/browse/FOR-572
> 
> A while ago I documented how to use it here:
> http://forrest.apache.org/docs_0_80/howto/howto-dev.html#debug
> 
> It was praised here:
> http://marc.theaimsgroup.com/?t=112650986400002
> Re: Using the Cocoon sitemap profiler
> 
> If you have any observations to add, then please
> add to that thread or the xdoc.
> 
> I cannot suitably stress how useful it is as a
> development tool.

David, I am afraid to say I have not checked this out yet. Despite 
noting its existence and the effort you put into this (and the docs).

Can I make a suggestion? (I'll do this the first time I find the need to 
use this if it hasn't been done before me):

If we make an internal plugin of the profiler we can replace the 
necessary pipelines in that plugin. Then, to enable profiling all we 
need to do is add the plugin to forrest.properties.

We could later enhance this by adding a command line switch to run with 
the profiler enabled.

In addition to making it easier to run the profiler, it will also keep 
the sitemaps a little more tidy.

Ross

Re: Fixing a Howto

Posted by David Crossley <cr...@apache.org>.

Ferdinand Soethe wrote:
> Ross Gardler wrote:
> 
> >> It's not actually what I was looking for (Which was a mechanism of
> >> logging the way of requests through a sitemap as a way of debugging
> >> them; would have been very useful for my current html-problem).
> 
> > I think the profiler does that - I've never played with it but David and
> > Ron have done some work with it.
> 
> Hmmm. That's interesting. I'll try and figure out how to run that to
> follow my html-processing.

There is various discussion in the mail archives.
Most of it arises due to investigating
http://issues.apache.org/jira/browse/FOR-572

A while ago I documented how to use it here:
http://forrest.apache.org/docs_0_80/howto/howto-dev.html#debug

It was praised here:
http://marc.theaimsgroup.com/?t=112650986400002
Re: Using the Cocoon sitemap profiler

If you have any observations to add, then please
add to that thread or the xdoc.

I cannot suitably stress how useful it is as a
development tool.

-David

Re: Fixing a Howto

Posted by Ferdinand Soethe <fe...@apache.org>.

Ross Gardler wrote:

>> It's not actually what I was looking for (Which was a mechanism of
>> logging the way of requests through a sitemap as a way of debugging
>> them; would have been very useful for my current html-problem).

> I think the profiler does that - I've never played with it but David and
> Ron have done some work with it.

Hmmm. That's interesting. I'll try and figure out how to run that to
follow my html-processing.

>> I looked at it anyway and briefly wondered why the documentation
>> happens in SGML-comments rather than xml-elements with their own
>> namespace.

> I guess it will be because you might use it to document other file types
> where namespaces are meaningless, but comments are available everywhere.

That does make sense and it doesn't. If you're implementing it for
different languages you'd have to have different parsers to accommodate
the different comment syntaxes anyway, so why not use one for xml-type
documents that is standard in their world?

But I realize that that's off topic and beyond the point ...

>> Apart from that I see not much progress from using DOCTOR (except that
>> we should of course use the documentation structure that this
>> requires) because we don't have much documentation in our sitemaps
>> right now.

> OK

>> And once that documentation is added, that would be easy enough to
>> read w/o having to extract it into html-documents. So I'd consider
>> that stuff nice to have but nothing to put much effort into.

> Fair enough. Thanks.

That said I'm very much in favour of adding documentation there (the
documentation of the html-pipeline should really be the only extensive
document of that kind) and will do so where I come across useful
pieces of info.

--
Ferdinand Soethe

Re: Fixing a Howto

Posted by Ross Gardler <rg...@apache.org>.

Ferdinand Soethe wrote:
> Ross Gardler wrote:
> 
> 
>>I'd like to consider using the recent Google SoC project over in Cocoon
>>that created auto documenting sitemaps. If I understand correctly it 
>>does what you wanted to do some time ago, in that we can annotate the 
>>sitemaps with Javadoc like comments for generating documentation.
> 
> 
>>Do you have the time/inclination to take a look at it? [2] and [3]
> 
> 
> Hi Ross,
> 
> thanks for that pointer.
> 
> It's not actually what I was looking for (Which was a mechanism of
> logging the way of requests through a sitemap as a way of debugging
> them; would have been very useful for my current html-problem).

I think the profiler does that - I've never played with it but David and 
Ron have done some work with it.

> I looked at it anyway and briefly wondered why the documentation
> happens in SGML-comments rather than xml-elements with their own
> namespace.

I guess it will be because you might use it to document other file types 
where namespaces are meaningless, but comments are available everywhere.

> Apart from that I see not much progress from using DOCTOR (except that
> we should of course use the documentation structure that this
> requires) because we don't have much documentation in our sitemaps
> right now.

OK

> And once that documentation is added, that would be easy enough to
> read w/o having to extract it into html-documents. So I'd consider
> that stuff nice to have but nothing to put much effort into.

Fair enough. Thanks.

Ross

Re: Fixing a Howto

Posted by Ferdinand Soethe <fe...@apache.org>.

Ross Gardler wrote:

> I'd like to consider using the recent Google SoC project over in Cocoon
> that created auto documenting sitemaps. If I understand correctly it 
> does what you wanted to do some time ago, in that we can annotate the 
> sitemaps with Javadoc like comments for generating documentation.

> Do you have the time/inclination to take a look at it? [2] and [3]

Hi Ross,

thanks for that pointer.

It's not actually what I was looking for (Which was a mechanism of
logging the way of requests through a sitemap as a way of debugging
them; would have been very useful for my current html-problem).

I looked at it anyway and briefly wondered why the documentation
happens in SGML-comments rather than xml-elements with their own
namespace.

Apart from that I see not much progress from using DOCTOR (except that
we should of course use the documentation structure that this
requires) because we don't have much documentation in our sitemaps
right now.

And once that documentation is added, that would be easy enough to
read w/o having to extract it into html-documents. So I'd consider
that stuff nice to have but nothing to put much effort into.

--
Ferdinand Soethe

Re: Fixing a Howto

Posted by Ross Gardler <rg...@apache.org>.

Ferdinand Soethe wrote:
> In trying to track down the processing of html-files (to fix the
> problems with attributes disappearing and other unwanted ones
> reappearing) I have gone through my own documentation and found
> that it is broken as soon as we move from sitemap.xmap to
> forrest.xmap.
>
>> Open the file 'forrest.xmap' and continue the search for a matching pattern.
>
> Could somebody please check and correct the following analysis of what
> currently happens in forrest.xmap?
> 
> 1. We go through forrest.xmap looking for a match for **.xml
> 2. First match is
> 
>      <map:match type="wildcard" pattern="**.xml">
>        <map:select type="exists">
>          <map:when test="{project:temp-dir}/input.xmap">
>              <map:mount uri-prefix=""
>                         src="{project:temp-dir}/input.xmap"
>                         check-reload="yes"
>                         pass-through="true"/>
>          </map:when> 
>        </map:select>
> 
>    This will load input.xmap (normally found in build\tmp) and
>    process it if it exists.
>    I guess this is to allow the input plugins to intercept processing
>    by defining their own matchers in there?

Yes, the 'pass-through="true"' attribute indicates that if no match is 
found within the mounted sitemap then we should continue checking the 
mounting sitemap from this point.

In other words, we will only continue processing in this sitemap if 
there is a map:generate that is processed in input.xmap.

> 3. If input.xmap has no matches for us the next match could be
> 
>        <map:match type="i18n" pattern="{project:content.xdocs}{1}.*.html">
>          <map:generate src="{source}" type="html" />
>          <map:transform src="{forrest:stylesheets}/html2document.xsl" />
>          <map:transform type="idgen" />
>          <map:serialize type="xml-document"/>
>        </map:match>
> 
>     but I'm not sure whether this is actually executed.
> 
>     If so, what does {source} stand for in
> 
>     <map:generate src="{source}" type="html" />
> 
>     I understand it is a variable but I don't understand where it is
>     defined.


"{source}: The URI of the source that matched" [1] (much more detail there)
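
As a rough illustration of how a locale-aware matcher can arrive at
{source}, here is a toy model in Python. It is not the LocaleMatcher
implementation: the fallback order (full locale, then language only,
then no locale), the handling of the blank locale, and the names
candidate_locales and match_source are all assumptions made for the
sketch; [1] has the real rules.

```python
def candidate_locales(locale):
    """Return locales to try, most specific first, ending with the blank locale."""
    parts = locale.split("_") if locale else []
    candidates = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    candidates.append("")
    return candidates


def match_source(pattern, locale, exists):
    """Return (source, matched_locale) for the first candidate that exists, else None.

    `exists` is any callable that says whether a candidate URI is present.
    """
    for loc in candidate_locales(locale):
        # Assumption: a blank locale drops the "*." segment of the pattern.
        replacement = loc + "." if loc else ""
        source = pattern.replace("*.", replacement, 1)
        if exists(source):
            return source, loc
    return None


# Hypothetical content directory holding a German and a default page.
files = {"xdocs/index.de.html", "xdocs/index.html"}
result = match_source("xdocs/index.*.html", "de_DE", files.__contains__)
```

Here the first element of result plays the role of {source}: the URI of
whichever candidate was actually found, which is then handed to
map:generate.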

NOTE: due to the work I am currently doing on the xmaps to integrate the 
locationmap some of these match elements are changing. I am *not* 
currently updating documentation as well, however, we do need to do that.

I'd like to consider using the recent Google SoC project over in Cocoon 
that created auto documenting sitemaps. If I understand correctly it 
does what you wanted to do some time ago, in that we can annotate the 
sitemaps with Javadoc like comments for generating documentation.

Do you have the time/inclination to take a look at it? [2] and [3]

Ross

[1] 
http://cocoon.apache.org/2.1/apidocs/org/apache/cocoon/matching/LocaleMatcher.html

[2] http://wiki.apache.org/cocoon/CocoonRefDocProject

[3] http://svn.apache.org/repos/asf/cocoon/gsoc/rgraham/refdoc/