You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@forrest.apache.org by Nicola Ken Barozzi <ni...@apache.org> on 2003/06/16 19:52:40 UTC

Speed (was Re: Forrest 0.5 outstanding issues...))

Nicola Ken Barozzi wrote, On 16/06/2003 17.25:

> Nicola Ken Barozzi wrote, On 16/06/2003 16.13:
> 
> ...
> 
>>>  - Speed!  Need to haul out the profiler and find out why normal Cocoon
>>>    CLI rendering is 5-10x faster than Forrest.
>>
>>
>>
>> +1
>>
>> I don't know if/when I'll have time, but I'll surely want to work on 
>> this.
> 
> 
> Ok, so I ran hprof on it, with the krysalis-site skin ans PDFS, and it 
> seems to tell me that most of the time is used in - guess what - in Xalan.
> 
> So I though, hey, it's our bl--dy skin that sucks, with wierd templates.
> Guess what, this is what I got (all with PDF generation on):
> 
>  template:     31 seconds
>  krysalis:     35 seconds
>  forrest:      34 seconds
>  forrest-css:  36 seconds
> 
> Wait a sec, it's not only the kind of skin then. It's about the number 
> and types of xsl transformations that it has to go through!
> 
> So, what I will have to do now is to do a profile on Cocoon, because the 
> java one has quite clearly shown that the problem is not in link 
> translation or other Forrest-specific component, but in the xsl 
> transformations, where they are, and what they do.
> 
> Also, IIUC we are still using the old crawling method, is this correct?

Ok, done some profiling.
Attached is the result for a page.

First of all, there is a lot of difference between the first invocation
and the 4,5th one, that takes 50% less time. This happens also with the
non-caching pipeline, even if in the report there is not anymore as I
exceeded 10 calls (believe me).

But the major thing that I see is this in the non-caching body-index.html.

   20+1+1+39+49+3 != 355 !!!!

There are 242ms missing! Wierd, huh?
And if you look in the caching one, the divide seems even worse!

So, the real question to answer now is:

      What takes so long in the creation of the page body?

Then there are the tabs and the menu, that can be reused a lot with the
caching pipeline, and that shows in the numbers. So I guess that the
problem lies somewhere in body-xxx.html...

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------

Re: Speed (was Re: Forrest 0.5 outstanding issues...))

Posted by Stefano Mazzocchi <st...@apache.org>.
on 6/17/03 7:42 AM Jeff Turner wrote:

> On Mon, Jun 16, 2003 at 07:52:40PM +0200, Nicola Ken Barozzi wrote:
> ....
> 
>>Ok, done some profiling.
>>Attached is the result for a page.
>>
>>First of all, there is a lot of difference between the first invocation
>>and the 4,5th one, that takes 50% less time. This happens also with the
>>non-caching pipeline, even if in the report there is not anymore as I
>>exceeded 10 calls (believe me).
>>
>>But the major thing that I see is this in the non-caching body-index.html.
>>
>>  20+1+1+39+49+3 != 355 !!!!
>>
>>There are 242ms missing! Wierd, huh?
> 
> 
> Even weirder, if you add up the times:
> 
> 130 body-index.html
> 50  menu-index.html
> 28  tab-index.html
> 210 site2xhtml
> 
> That adds up to 418ms, but the page total is around 1s.  So over half our
> time is being lost somewhere inside Cocoon or Jetty.

Or the profiler's precision sucks ass (which would be the one I would
bet on).

-- 
Stefano.



Re: Speed (was Re: Forrest 0.5 outstanding issues...))

Posted by Jeff Turner <je...@apache.org>.
On Mon, Jun 16, 2003 at 07:52:40PM +0200, Nicola Ken Barozzi wrote:
...
> Ok, done some profiling.
> Attached is the result for a page.
> 
> First of all, there is a lot of difference between the first invocation
> and the 4,5th one, that takes 50% less time. This happens also with the
> non-caching pipeline, even if in the report there is not anymore as I
> exceeded 10 calls (believe me).
> 
> But the major thing that I see is this in the non-caching body-index.html.
> 
>   20+1+1+39+49+3 != 355 !!!!
> 
> There are 242ms missing! Wierd, huh?

Even weirder, if you add up the times:

130 body-index.html
50  menu-index.html
28  tab-index.html
210 site2xhtml

That adds up to 418ms, but the page total is around 1s.  So over half our
time is being lost somewhere inside Cocoon or Jetty.

Wild hypothesising..

You said that hprof indicated that Xalan was taking a lot of time.
Perhaps Avalon is doing lots of Xalan setups/teardowns, instead of
properly reusing the transformer, and this time isn't logged, as it is
done before the XSLT transformer is invoked...

The performance issue I was _expecting_ to cause trouble is the frequent
use of 'cocoon:' URLs.  Flattened out, we have:

<map:match pattern="*.html">
  <map:aggregate>
    <map:part src="cocoon:/body-{1}.html"/>
    ...
  <map:serialize type="html"/>
</map:match>

<map:match pattern="body-*.html">
  <map:generate src="cocoon://{1}.xml"/>
  ...
  <map:serialize type="xml"/>
</map:match>

<map:match pattern="*.xml">
  <map:generate src="content/xdocs/{0}"/>
  ...
  <map:serialize type="xml"/>
</map:match>

So for every URL, we create, serialize and reparse a DOM, twice.

In general, I think that the number of XSLTs applied is only going to
increase in the future.  In particular, I was thinking we could implement
a macro facility (a modern replacement for entities), eg,
<cocoon:version/> being replaced with '2.1-dev'.  Each macro can be
implemented as a stylesheet (probably STX).


--Jeff