Posted to dev@cocoon.apache.org by "Eric E. Meyer" <er...@quoininc.com> on 2005/03/17 18:30:56 UTC

Cocoon Performance Woes, Is it flow? I don't know!

Hello all,

My team developed and deployed a web application for a client which is 
built on top of Cocoon (www.fivestaralliance.com).

I have spent over two weeks now attempting to improve the
performance and scalability of the application – with some real
improvements. However, I continue to feel like I'm flying blind –
because the apparent bottlenecks are somewhere outside of my code. I
have read a number of previous debates about javascript flow being slow,
and we are using CForms with Javascript flow, but my main concern is
that I cannot determine where the bottlenecks are.

Even when the application is bogging down, the components that my team
wrote are performing their tasks (generation, transformation,
flowscripting) at very fast rates (as seen by tracing/logging), but
something else in the framework/process is bogging down the requests.
The analysis tools don't show any obvious resource limitation even at
the highest loading levels.

Some pages in the site scale quite well, while one in particular does
not scale well at all. While I know that I'm close to having a system
that runs blindingly fast, I'm currently faced with a situation where I
cannot effectively argue that the architecture isn't "fundamentally
flawed" and I'm unable to address a major scalability concern for my
client. I would welcome any concrete suggestions on how to better
determine my bottlenecks and any additional tuning advice.

Brief description of the application architecture

CForms, Javascript flow, mix of JX template generators, XInclude
transformer and custom generator transformers. Core application
components implemented in Java. Hibernate persistence.

Profiling and Monitoring

My biggest problem is that I've only been able to determine where the
problem isn't at this point. I've used a variety of tools to attempt to
see what's going on, and why the application is bogging down, but I
cannot seem to get a comprehensive picture of what delays/bottlenecks
there are within Cocoon.

Specifically, it would be extremely helpful to monitor the number of
generators/transformers/other pooled components in use, allocated, freed
while under load. Additionally, it would be useful to see the time taken
up by each of the steps in the process of servicing a request – not just
the set-up and generation/transform times as shown by the Cocoon profiler.
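As a rough illustration of the counters described above (in use,
allocated, freed), here is a minimal sketch of a pool wrapper that tracks
them. It is written against a current JDK rather than the JDK 1.4.2 used
in this thread, and CountingPool is a hypothetical class, not part of
Cocoon or Avalon. Wiring something like it into Cocoon's real component
pools is not shown.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not a Cocoon or Avalon API.
final class CountingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created;   // instances allocated so far
    private int inUse;     // instances currently handed out
    private int peakInUse; // highest value inUse has reached

    CountingPool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Hands out an idle instance, or allocates a new one if none is idle. */
    synchronized T acquire() {
        T item = idle.pollFirst();
        if (item == null) {
            item = factory.get();
            created++;
        }
        inUse++;
        peakInUse = Math.max(peakInUse, inUse);
        return item;
    }

    /** Returns an instance to the idle list. */
    synchronized void release(T item) {
        inUse--;
        idle.addFirst(item);
    }

    /** One-line snapshot suitable for a log statement. */
    synchronized String stats() {
        return "created=" + created + " inUse=" + inUse
                + " peak=" + peakInUse + " idle=" + idle.size();
    }
}
```

Logging stats() periodically during a load test would show whether the
peak approaches a configured pool maximum, which is the kind of signal
the pool-size tuning needs.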

These are the tools/approaches that I have used:

Multi-thread load test with JMeter.
Profiling of application code using JProbe (CPU and memory analysis).
Profiling of Cocoon components using Cocoon pipeline profiler.
Monitoring of Cocoon components using the Cocoon instrumentation client.
Monitoring of Cocoon server using Status generator.
Monitored Linux system activity with SAR, iostat, mpstat and vmstat
during load-testing.
Profiling of custom generators, transformers and flowscript using
Jakarta Commons StopWatch and log statements.

Tweaks made thus far

Adjusted Java virtual machine parameters
	-server -Xms512M -Xmx512M
Adjusted logging levels - turned down logging
Adjusted thread pool sizes in Tomcat
	150 -> 350 max
Adjusted database connection pool size up to 50
Adjusted sitemap component pool sizes up
Optimized some Java code based upon JProbe profiling
Added additional objects to the in-memory cache to reduce database queries
Turned off reloading of sitemaps and javascript files
Replaced default Cocoon JCS cache with Whirlycache
Replaced default Cocoon Xalan XSL transformer with faster Saxon XSL
transformer
Configured Cocoon to reuse XML parsers
Removed Cocoon store janitor
Preloading key OO javascript flowscript at server startup

Observations

Windows XP
Pentium 4 1.8 GHz
JDK 1.4.2_06
Tomcat 5.0.28
Cocoon 2.1.5.1
512MB physical RAM
JVM -server -Xms256M -Xmx256M

Load test with N user threads (the "users" column below), each making 5
successive requests in a loop with approximately 3 seconds of think time
between requests. No derived resources were requested, only the main page.

users  overall   home  search1  search2  search3  detail     total
        avg ms   page                                     num reqs

   10      486    208      637      588      644     355       500
   20     1704    378     2684     1837     2875     745       500
   30     3725    682     5987     4626     6270    1059       450
   40    19461   1411    36021    23089    34726    2059       600
   50    72942   3213   130482    90993   131666    8356       500

home page: /
search1: /luxury_hotels/europe__france__paris/index.html
search2: /luxury_hotels/bahamas_%26_the_caribbean/beach_resort/index.html
search3:
/luxury_hotels/europe__france__paris/city_centre_location/index.html
detail: /luxury_hotel/new_york,_ny/the_carlyle
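
One way to read the table is to convert it into throughput. For a closed
load test with N users, mean response time R and think time Z, the
interactive response time law gives throughput X = N / (R + Z). The
sketch below applies it to the "overall avg ms" column, assuming the
think time is exactly 3000 ms (the post only says "approximately"). The
class name is made up for illustration.

```java
// Sketch: throughput implied by the load-test table, using X = N / (R + Z).
final class LoadTestThroughput {

    /** Requests per second for the given user count, response time and think time. */
    static double throughputPerSecond(int users, double responseMs, double thinkMs) {
        return users / ((responseMs + thinkMs) / 1000.0);
    }

    public static void main(String[] args) {
        int[] users = {10, 20, 30, 40, 50};
        double[] overallAvgMs = {486, 1704, 3725, 19461, 72942}; // from the table
        double thinkMs = 3000; // assumption: "approximately 3 second think time"

        for (int i = 0; i < users.length; i++) {
            System.out.printf("%d users: %.2f req/s%n", users[i],
                    throughputPerSecond(users[i], overallAvgMs[i], thinkMs));
        }
    }
}
```

Under that assumption the implied throughput peaks at roughly 4.5
requests per second around 30 users and drops below 1 request per second
at 50 users. Throughput that falls as load rises is more typical of
contention or thrashing than of a resource that is merely saturated,
though this arithmetic alone cannot say which.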

Platform:

Development
Windows XP
JDK 1.4.2_06
Tomcat 5.0.28
Cocoon 2.1.5.1

Deployment
Linux 2.6.x

We see similar degradation on Linux as on Windows.

The home page has no flowscript or cforms, but does have jxtemplate
generation, xinclude, xslt, and a custom generator.

The search and detail pages include a cform, and are therefore driven
with flowscript at the top-level matching (and create continuations in
the process of displaying their forms). These pages use jxtemplate
generation, xinclude, xslt, custom generation, custom transformation,
and internal-only sub pipelines. When looking at the pipeline times with
a profiling pipeline, the total times (while under load) are much higher
than the displayed times for the setup and generation steps -- so where
is the time going?

Regards,
Eric Meyer





Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Ralph Goers <Ra...@dslextreme.com>.
Doesn't JProbe give you an indication of which methods most of the 
time is spent in?  That should be extremely helpful.

Eric E. Meyer wrote:

>
> These are the tools/approaches that I have used:
>
> Multi-thread load test with JMeter.
> Profiling of application code using JProbe (CPU and memory analysis).
> Profiling of Cocoon components using Cocoon pipeline profiler.
> Monitoring of Cocoon components using the Cocoon instrumentation client.
> Monitoring of Cocoon server using Status generator.
> Monitored Linux system activity with SAR, iostat, mpstat and vmstat
> during load-testing.
> Profiling of custom generators, transformers and flowscript using
> Jakarta Commons StopWatch and log statements.
>
> Regards,
> Eric Meyer
>
>
>
>


Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by "Eric E. Meyer" <er...@quoininc.com>.
>>Haha :-) calls for "instrumentation"
>>
>>Cheers
>>
>>Giacomo
>>
>>PS: Eric, sorry, this is an insider joke you might not understand, but I
>>couldn't resist, as the Instrumentation code was removed 6 months ago.
>>
>>    
>>
>Yes, but only in 2.2 - it should still work in 2.1.x (but I have not
>tested it)
>
>Carsten
>  
>
I actually used the cocoon instrumentation client, but it didn't show me 
the allocation of the various generators, transformers, and serializers 
-- which would be helpful to tune my component pool sizes.

Regards,
Eric Meyer

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Carsten Ziegeler <cz...@apache.org>.
Giacomo Pati wrote:
> 

> 
> Haha :-) calls for "instrumentation"
> 
> Cheers
> 
> Giacomo
> 
> PS: Eric, sorry, this is an insider joke you might not understand, but I
> couldn't resist, as the Instrumentation code was removed 6 months ago.
> 
Yes, but only in 2.2 - it should still work in 2.1.x (but I have not
tested it)

Carsten

-- 
Carsten Ziegeler - Open Source Group, S&N AG
http://www.s-und-n.de
http://www.osoco.org/weblogs/rael/

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Giacomo Pati <gi...@apache.org>.

Eric E. Meyer wrote:
> Bertrand Delacretaz wrote:
> 
>> On 17 March 2005, at 18:30, Eric E. Meyer wrote:
>>
>>> ...Some pages in the site scale quite well, while one in particular does
>>> not scale well at all..
>>
>>
>>
>> Did you try running tests on just that page, with many identical 
>> requests, and removing components from the page generation path one 
>> at a time?
>>
>> This should help you isolate the component that causes the problem, to 
>> narrow your search.
>>
>> You could also create a request-specific logger, one for each request, 
>> store it in the request context and use it to log times at various 
>> stages, with a unique request ID. This would help find worst case times.
>>
>> -Bertrand
> 
> 
> These are great ideas. Any other suggestions as to how to get a better 
> picture of what's going on? It's been a bit frustrating trying to get a 
> complete picture. For example, is there a way to monitor the Avalon 
> component pool sizes and utilization?

Haha :-) calls for "instrumentation"

Cheers

Giacomo

PS: Eric, sorry, this is an insider joke you might not understand, but I
couldn't resist, as the Instrumentation code was removed 6 months ago.

-- 
Giacomo Pati
Otego AG, Switzerland - http://www.otego.com
Orixo, the XML business alliance - http://www.orixo.com

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by "Eric E. Meyer" <er...@quoininc.com>.
Bertrand Delacretaz wrote:

> On 17 March 2005, at 18:30, Eric E. Meyer wrote:
>
>> ...Some pages in the site scale quite well, while one in particular does
>> not scale well at all..
>
>
> Did you try running tests on just that page, with many identical 
> requests, and removing components from the page generation path one 
> at a time?
>
> This should help you isolate the component that causes the problem, to 
> narrow your search.
>
> You could also create a request-specific logger, one for each request, 
> store it in the request context and use it to log times at various 
> stages, with a unique request ID. This would help find worst case times.
>
> -Bertrand

These are great ideas. Any other suggestions as to how to get a better 
picture of what's going on? It's been a bit frustrating trying to get a 
complete picture. For example, is there a way to monitor the Avalon 
component pool sizes and utilization?

Regards,
Eric Meyer

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Bertrand Delacretaz <bd...@apache.org>.
On 17 March 2005, at 18:30, Eric E. Meyer wrote:
> ...Some pages in the site scale quite well, while one in particular 
> does
> not scale well at all..

Did you try running tests on just that page, with many identical requests, 
and removing components from the page generation path one at a time?

This should help you isolate the component that causes the problem, to 
narrow your search.

You could also create a request-specific logger, one for each request, 
store it in the request context and use it to log times at various 
stages, with a unique request ID. This would help find worst case 
times.
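
The request-specific logger suggested above could look roughly like the
following sketch. RequestTimer and its methods are hypothetical names,
not an existing Cocoon class, and the code targets a current JDK rather
than JDK 1.4.2. How the timer is attached to the request (for example as
a request attribute) is left out. The overloads that take an explicit
timestamp exist only to make the behaviour easy to check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; create one instance per request.
// Not thread-safe: it assumes a single thread services the request.
final class RequestTimer {
    private static final AtomicLong NEXT_ID = new AtomicLong(1);

    private final long id = NEXT_ID.getAndIncrement(); // unique request ID
    private final long startNanos;
    private long lastNanos;
    private final List<String> entries = new ArrayList<>();

    RequestTimer() {
        this(System.nanoTime());
    }

    RequestTimer(long nowNanos) {
        this.startNanos = nowNanos;
        this.lastNanos = nowNanos;
    }

    /** Records the time since the previous mark and since the request started. */
    void mark(String stage) {
        mark(stage, System.nanoTime());
    }

    void mark(String stage, long nowNanos) {
        entries.add("req=" + id + " stage=" + stage
                + " stepMs=" + (nowNanos - lastNanos) / 1_000_000
                + " totalMs=" + (nowNanos - startNanos) / 1_000_000);
        lastNanos = nowNanos;
    }

    /** Lines to hand to whatever logger is in use once the request completes. */
    List<String> entries() {
        return entries;
    }
}
```

Calling mark() at the entry of the flowscript, before and after each
sendPage, and at the end of each custom generator or transformer would
give per-request step times that can be sorted to find the worst cases.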

-Bertrand

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Jorg Heymans <jh...@domek.be>.

Eric E. Meyer wrote:
> 
> Also, for curiosity, I re-ran the tests with the embedded form in the 
> search results page removed. That had a positive impact on performance:
> 

A while ago I had performance problems using the bindings. My post at 
[1] contains a few things to watch out for when combining CForms with 
database retrieval code.

HTH
Jorg

[1] http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=109603234405613&w=2


Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by "Eric E. Meyer" <er...@quoininc.com>.
Eric E. Meyer wrote:

> Sylvain Wallez wrote:
>
>> Eric E. Meyer wrote:
>>
>>> Also, for curiosity, I re-ran the tests with the embedded form in 
>>> the search results page removed. That had a positive impact on 
>>> performance:
>>>
>>> With refine-search form:
>>>
>>> users overall home  search1 search2 search3  detail  total
>>>      avg ms  page                                   num reqs
>>>
>>> 10      486    208      637    588     644     355    500
>>> 20     1704    378     2684   1837    2875     745    500
>>> 30     3725    682     5987   4626    6270    1059    450
>>> 40    19461   1411    36021  23089   34726    2059    600
>>> 50    72942   3213   130482  90993  131666    8356    500
>>>
>>> Without refine-search form:
>>> 10     645     518      646    725     711     627    500
>>> 20     800     266     1116    963    1024     629    500
>>> 30    2539     634     4451   3003    3908     698    450
>>> 40    4716    1430     6936   4829    8040    2347    600
>>> 50    7978    1328    13063   8669   14122    2710    500
>>
>>
>>
>>
>> The difference is impressive. Am I reading it correctly that it's 10 
>> times faster with 50 users without forms? How are the parallel users 
>> defined?
>
>
> Using JMeter, 50 threads each requesting the five pages in sequence - 
> with an approximately 3 second wait time between page requests. 
> Looking with Ethereal, I can see that the client is not using 
> cookies, and that each thread appears to have its own keep-alive 
> connection.
>
> Note that flow is used even in the case when I removed the 
> refine-search form, as the top-level configuration of the search based 
> upon the URL is done by some flowscript that ultimately calls 
> cocoon.sendPage() to serve the results page. That second pipeline 
> normally has top-level flow to create and bind the refine search form, 
> and that's what I removed.
>
>> Execution of a flowscript is synchronized on the global variable 
>> scope, which is bound to the session. Although this shouldn't be a 
>> problem in real world as a single user is not very likely to send 
>> parallel requests, you should verify that your load testing engine 
>> uses different sessions (or no session at all) for the simulated 
>> concurrent users. That may explain these numbers.
>>
So could there be contention in this scenario?
The top-level sitemap match for /luxury_hotels/**/*.html invokes 
flowscript, which sets some request and session parameters and then calls 
cocoon.sendPage("/find_luxury_hotels/hotelSearchResults_1.html"). The 
match for /find_luxury_hotels/hotelSearchResults_*.html in a separately 
mounted sub-sitemap calls flowscript that pulls some things out of the 
request and session and uses CForms to create and bind the refine-search 
form, ultimately calling form.showForm with an XHTML template that 
presents the search results page with the embedded form xincluded.

Regards,
Eric

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by "Eric E. Meyer" <er...@quoininc.com>.
Ralph Goers wrote:

> So does this mean the problem you are having seems to be with cocoon 
> forms or that specific form? It sounds like you have eliminated flow 
> as the culprit?

I don't think it means that completely... I have other pages in my load 
test that also have Cocoon forms (though somewhat simpler) that don't 
exhibit the same scalability problems (the hotel detail page, 
specifically). That's why I'm wondering if there is an interaction 
when flow calls cocoon.sendPage(), which invokes flow in another 
sitemap.

Regards,
Eric

>
>>
>> Using JMeter, 50 threads each requesting the five pages in sequence - 
>> with an approximately 3 second wait time between page requests. 
>> Looking with Ethereal, I can see that the client is not using 
>> cookies, and that each thread appears to have its own keep-alive 
>> connection.
>>
>> Note that flow is used even in the case when I removed the 
>> refine-search form, as the top-level configuration of the search 
>> based upon the URL is done by some flowscript that ultimately calls 
>> cocoon.sendPage() to serve the results page. That second pipeline 
>> normally has top-level flow to create and bind the refine search 
>> form, and that's what I removed.
>>
>> Regards,
>> Eric
>
>
>


Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Ralph Goers <Ra...@dslextreme.com>.
So does this mean the problem you are having seems to be with cocoon 
forms or that specific form? It sounds like you have eliminated flow as 
the culprit?

Ralph
 
Eric E. Meyer wrote:

>
> Using JMeter, 50 threads each requesting the five pages in sequence - 
> with an approximately 3 second wait time between page requests. 
> Looking with Ethereal, I can see that the client is not using 
> cookies, and that each thread appears to have its own keep-alive 
> connection.
>
> Note that flow is used even in the case when I removed the 
> refine-search form, as the top-level configuration of the search based 
> upon the URL is done by some flowscript that ultimately calls 
> cocoon.sendPage() to serve the results page. That second pipeline 
> normally has top-level flow to create and bind the refine search form, 
> and that's what I removed.
>
> Regards,
> Eric



Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by "Eric E. Meyer" <er...@quoininc.com>.
Sylvain Wallez wrote:

> Eric E. Meyer wrote:
>
>> Also, for curiosity, I re-ran the tests with the embedded form in the 
>> search results page removed. That had a positive impact on performance:
>>
>> With refine-search form:
>>
>> users overall home  search1 search2 search3  detail  total
>>      avg ms  page                                   num reqs
>>
>> 10      486    208      637    588     644     355    500
>> 20     1704    378     2684   1837    2875     745    500
>> 30     3725    682     5987   4626    6270    1059    450
>> 40    19461   1411    36021  23089   34726    2059    600
>> 50    72942   3213   130482  90993  131666    8356    500
>>
>> Without refine-search form:
>> 10     645     518      646    725     711     627    500
>> 20     800     266     1116    963    1024     629    500
>> 30    2539     634     4451   3003    3908     698    450
>> 40    4716    1430     6936   4829    8040    2347    600
>> 50    7978    1328    13063   8669   14122    2710    500
>
>
>
> The difference is impressive. Am I reading it correctly that it's 10 
> times faster with 50 users without forms? How are the parallel users 
> defined?

Using JMeter, 50 threads each requesting the five pages in sequence - 
with an approximately 3 second wait time between page requests. Looking 
with Ethereal, I can see that the client is not using cookies, and that 
each thread appears to have its own keep-alive connection.

Note that flow is used even in the case when I removed the refine-search 
form, as the top-level configuration of the search based upon the URL is 
done by some flowscript that ultimately calls cocoon.sendPage() to serve 
the results page. That second pipeline normally has top-level flow to 
create and bind the refine search form, and that's what I removed.

> Execution of a flowscript is synchronized on the global variable 
> scope, which is bound to the session. Although this shouldn't be a 
> problem in real world as a single user is not very likely to send 
> parallel requests, you should verify that your load testing engine 
> uses different sessions (or no session at all) for the simulated 
> concurrent users. That may explain these numbers.
>
> Sylvain
>
Regards,
Eric

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Sylvain Wallez <sy...@apache.org>.
Eric E. Meyer wrote:

> Also, for curiosity, I re-ran the tests with the embedded form in the 
> search results page removed. That had a positive impact on performance:
>
> With refine-search form:
>
> users overall home  search1 search2 search3  detail  total
>      avg ms  page                                   num reqs
>
> 10      486    208      637    588     644     355    500
> 20     1704    378     2684   1837    2875     745    500
> 30     3725    682     5987   4626    6270    1059    450
> 40    19461   1411    36021  23089   34726    2059    600
> 50    72942   3213   130482  90993  131666    8356    500
>
> Without refine-search form:
> 10     645     518      646    725     711     627    500
> 20     800     266     1116    963    1024     629    500
> 30    2539     634     4451   3003    3908     698    450
> 40    4716    1430     6936   4829    8040    2347    600
> 50    7978    1328    13063   8669   14122    2710    500


The difference is impressive. Am I reading it correctly that it's 10 times 
faster with 50 users without forms? How are the parallel users defined?

Execution of a flowscript is synchronized on the global variable scope, 
which is bound to the session. Although this shouldn't be a problem in 
real world as a single user is not very likely to send parallel 
requests, you should verify that your load testing engine uses different 
sessions (or no session at all) for the simulated concurrent users. That 
may explain these numbers.
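
To make the effect of that synchronization concrete, here is a toy model
in Java. It is not Cocoon's flow interpreter; SessionScopeModel and its
methods are made-up names, and the only thing it takes from the
explanation above is that scripts lock on an object tied to the session.
It runs one thread per simulated user and reports how many "scripts"
were ever executing at the same moment.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: NOT Cocoon's flow interpreter. All names are hypothetical.
final class SessionScopeModel {
    private final ConcurrentMap<String, Object> scopes = new ConcurrentHashMap<>();

    /** One lock object per session id, standing in for the session-bound global scope. */
    Object scopeFor(String sessionId) {
        return scopes.computeIfAbsent(sessionId, id -> new Object());
    }

    /** Runs a "flowscript" while holding that session's scope lock. */
    void runScript(String sessionId, Runnable script) {
        synchronized (scopeFor(sessionId)) {
            script.run();
        }
    }

    /** Starts one thread per session id; returns the peak number of scripts running at once. */
    static int peakConcurrency(String... sessionIds) {
        SessionScopeModel model = new SessionScopeModel();
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        Thread[] threads = new Thread[sessionIds.length];
        for (int i = 0; i < threads.length; i++) {
            String id = sessionIds[i];
            threads[i] = new Thread(() -> model.runScript(id, () -> {
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                sleepQuietly(20); // simulated script work
                running.decrementAndGet();
            }));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return peak.get();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared session:    " + peakConcurrency("s1", "s1", "s1", "s1"));
        System.out.println("separate sessions: " + peakConcurrency("s1", "s2", "s3", "s4"));
    }
}
```

With a shared session id the peak is always 1, because every script
waits for the same lock. With separate ids the scripts are free to
overlap, so the peak depends only on thread scheduling.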

Sylvain

-- 
Sylvain Wallez                        Anyware Technologies
http://apache.org/~sylvain            http://anyware-tech.com
Apache Software Foundation Member     Research & Technology Director


Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by "Eric E. Meyer" <er...@quoininc.com>.
Reinhard Poetz wrote:

> Eric E. Meyer wrote:
>
>> Cocoon Performance Woes, Is it flow? I don't know!
>
>
>> CForms, Javascript flow, mix of JX template generators, XInclude
>> transformer and custom generator transformers. Core application
>> components implemented in Java. Hibernate persistence.
>
>
> Do you use jxmacros in your cforms templates? I remember some 
> performance tests a few months ago that revealed problems when 
> jxmacros are used. As I've never used jxmacros in production, I'm not 
> sure if this was a problem in my tests or if it is a real, existing 
> problem.
>
No, we are not using jxmacros. The JX is really just to parameterize 
xincludes into our top-level page template.

Ralph Goers wrote:

> Also, that seems like a high number of users for such a small amount 
> of memory.  You may just be garbage collecting a lot. 

On Linux, with 1 GB physical RAM, -Xms512M -Xmx512M and -verbose:gc, I 
did not see any excessively long garbage collection cycles. On Windows, 
with -Xmx256M, I'm generally seeing short incremental GC cycles of around 
50 ms, but I did see one 11 second full GC during my last run.  But the 
deployment machine is much beefier (and the VM memory allocation twice 
the size). I'll re-verify on the Linux server.

Also, for curiosity, I re-ran the tests with the embedded form in the 
search results page removed. That had a positive impact on performance:

With refine-search form:

users overall home  search1 search2 search3  detail  total
      avg ms  page                                   num reqs

10      486    208      637    588     644     355    500
20     1704    378     2684   1837    2875     745    500
30     3725    682     5987   4626    6270    1059    450
40    19461   1411    36021  23089   34726    2059    600
50    72942   3213   130482  90993  131666    8356    500

Without refine-search form:
10     645     518      646    725     711     627    500
20     800     266     1116    963    1024     629    500
30    2539     634     4451   3003    3908     698    450
40    4716    1430     6936   4829    8040    2347    600
50    7978    1328    13063   8669   14122    2710    500
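
For reference, the ratio between the two runs can be worked out directly
from the "overall avg ms" column of each table. This is plain arithmetic
on the posted numbers; the class name is made up for illustration.

```java
// Sketch: ratio of overall average response times, with form vs. without form.
final class FormSpeedup {

    /** How many times slower the run with the form was than the run without it. */
    static double ratio(double withFormMs, double withoutFormMs) {
        return withFormMs / withoutFormMs;
    }

    public static void main(String[] args) {
        int[] users = {10, 20, 30, 40, 50};
        double[] withForm = {486, 1704, 3725, 19461, 72942};  // first table
        double[] withoutForm = {645, 800, 2539, 4716, 7978};  // second table

        for (int i = 0; i < users.length; i++) {
            System.out.printf("%d users: %.2fx%n", users[i],
                    ratio(withForm[i], withoutForm[i]));
        }
    }
}
```

At 50 users the run with the form is about 9.1 times slower, while at 10
users the run without the form was actually the slower of the two (ratio
about 0.75), so the gap only opens up under load.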

Regards,
Eric Meyer


Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Reinhard Poetz <re...@apache.org>.
Eric E. Meyer wrote:
> Cocoon Performance Woes, Is it flow? I don't know!

> CForms, Javascript flow, mix of JX template generators, XInclude
> transformer and custom generator transformers. Core application
> components implemented in Java. Hibernate persistence.

Do you use jxmacros in your cforms templates? I remember some performance 
tests a few months ago that revealed problems when jxmacros are used. As I've 
never used jxmacros in production, I'm not sure if this was a problem in my 
tests or if it is a real, existing problem.

-- 
Reinhard Pötz           Independent Consultant, Trainer & (IT)-Coach 

{Software Engineering, Open Source, Web Applications, Apache Cocoon}

                                        web(log): http://www.poetz.cc
--------------------------------------------------------------------

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Antonio Gallardo <ag...@agssa.net>.
Hi Eric:

I have no time now to read all the answers you got. Some points:

1. Java 1.4.2_06 is buggy -> update to 1.4.2_07
2. "Flow is slow" is FUD. I recently read a comparison between different
scripting languages, and Rhino was the fastest one. The performance is very
good, so I don't think Flow can be a bottleneck.
3. Maybe this could help, but I'm not sure: yesterday I compiled a new
version of Xalan. I also set the target of all the code to Java 1.3 instead
of the default (Java 1.1), and just by replacing the lib the same
application seems noticeably faster. I want to update Cocoon with this newer
Xalan version but cannot do so right now (freeze time before release).

Maybe later today I can find more time to look at the problem more closely
and perhaps give you some more advice.

Please give my greetings to JP. ;-)

Best Regards,

Antonio Gallardo.




Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by peter royal <pr...@apache.org>.
On Mar 18, 2005, at 5:19 PM, Antonio Gallardo wrote:
> Try switching to ehcache. I have heard complaints about the slowness of 
> the JCS cache (the default cache system in 2.1.5.1). BTW, in 2.1.6 I 
> think the default cache system is ehcache, one more reason to move to 
> 2.1.6 ;_).

he's already using whirlycache, which is fastest ;)

http://whirlycache.dev.java.net/

-pete


Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Antonio Gallardo <ag...@agssa.net>.
Eric:

Two more tips:

Try switching to ehcache. I have heard complaints about the slowness of the
JCS cache (the default cache system in 2.1.5.1). BTW, in 2.1.6 I think the
default cache system is ehcache, one more reason to move to 2.1.6 ;-).

If possible, switch from xinclude to cinclude. AFAIK xinclude is still not
cached today, while cinclude can be cached, so this should improve the
application's performance. On the other hand, using cinclude you are
"married" to Cocoon. All in all, I don't think this is a minus for cinclude:
by using other Cocoon-related technology you are already "married".
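
For illustration, here is a minimal sketch of what the switch might look
like, assuming the stock CInclude transformer that ships with Cocoon 2.1.
The match pattern, the file names, the "jx" generator name and the internal
pipeline URI are made up for the example; only the transformer class and the
cinclude namespace come from Cocoon itself.

```xml
<!-- sitemap.xmap: declare the transformer
     (normally already present in the default sitemap) -->
<map:transformer name="cinclude"
                 src="org.apache.cocoon.transformation.CIncludeTransformer"/>

<!-- hypothetical pipeline: type="xinclude" becomes type="cinclude" -->
<map:match pattern="example.html">
  <map:generate type="jx" src="example.jx"/>
  <map:transform type="cinclude"/>
  <map:transform src="example.xsl"/>
  <map:serialize type="html"/>
</map:match>

<!-- in the generated document: a cinclude element instead of xi:include
     (the src URI is a placeholder for one of your internal pipelines) -->
<cinclude:include xmlns:cinclude="http://apache.org/cocoon/include/1.0"
                  src="cocoon:/internal/example-fragment"/>
```

Whether the result is really served from the cache still depends on using a
caching pipeline and on every component in it being cacheable, so it is
worth measuring before and after.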

BTW, I am wondering about the changes you made. Did they help performance
at all?

Best Regards,

Antonio Gallardo.




Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by "Eric E. Meyer" <er...@quoininc.com>.
Antonio Gallardo wrote:

>Can you post the sizes of the returned pages in the thread? I guess this
>is an important point too.

The pages are roughly similar in size:

home page: / 
17.54 KB (17964 bytes)

search1: /luxury_hotels/europe__france__paris/index.html
29.88 KB (30599 bytes)

search2: /luxury_hotels/bahamas_%26_the_caribbean/beach_resort/index.html
29.33 KB (30037 bytes)

search3: /luxury_hotels/europe__france__paris/city_centre_location/index.html
30.29 KB (31017 bytes)

detail: /luxury_hotel/new_york,_ny/the_carlyle
25.65 KB (26267 bytes)


>Also, consider moving to Cocoon 2.1.6; here is why:
>
>http://issues.apache.org/bugzilla/show_bug.cgi?id=31760
>
>I saw you changed some pools inside Tomcat. Did you change any Cocoon
>pools? Did you adjust Cocoon memory usage inside cocoon.xconf?

Hm - I didn't (until just now) see that there were pool sizes specified
in the cocoon.xconf. These are the values that I have:

  <xml-parser class="org.apache.excalibur.xml.impl.JaxpParser"
              logger="core.xml-parser"
              pool-grow="4" pool-max="32" pool-min="8">
    <parameter name="validate" value="false"/>
    <parameter name="namespace-prefixes" value="false"/>
    <parameter name="stop-on-warning" value="true"/>
    <parameter name="stop-on-recoverable-error" value="true"/>
    <parameter name="reuse-parsers" value="true"/> <!-- EM: changed -->
    <parameter name="drop-dtd-comments" value="true"/>
  </xml-parser>

  <xml-serializer class="org.apache.cocoon.components.sax.XMLByteStreamCompiler"
                  logger="core.xml-serializer"
                  pool-grow="4" pool-max="32" pool-min="8"/>
  <xml-deserializer class="org.apache.cocoon.components.sax.XMLByteStreamInterpreter"
                    logger="core.xml-deserializer"
                    pool-grow="4" pool-max="32" pool-min="8"/>

I removed the store janitor and switched both the store and 
transient-store to Whirly Cache.

  <transient-store logger="core.store.transient">
    <backend>com.whirlycott.cache.impl.ConcurrentHashMapImpl</backend>
    <tuner-sleeptime>10</tuner-sleeptime>
    <!-- evicts least frequently used items when pruning -->
    <policy>com.whirlycott.cache.policy.LFUMaintenancePolicy</policy>
    <maxsize>10000</maxsize>
  </transient-store>

  <store logger="core.store">
    <parameter name="use-cache-directory" value="true"/>
    <backend>com.whirlycott.cache.impl.ConcurrentHashMapImpl</backend>
    <tuner-sleeptime>10</tuner-sleeptime>
    <!-- evicts least frequently used items when pruning -->
    <policy>com.whirlycott.cache.policy.LFUMaintenancePolicy</policy>
    <maxsize>10000</maxsize>
  </store>

Regards,
Eric Meyer

Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Antonio Gallardo <ag...@agssa.net>.
Hi Eric,

I have now read your post carefully, and I see you are already using
instrumentation. ;-)

Can you post the sizes of the returned pages in the thread? I guess this
is an important point too.

Also, consider moving to Cocoon 2.1.6; here is why:

http://issues.apache.org/bugzilla/show_bug.cgi?id=31760

I saw you changed some pools inside Tomcat. Did you change any Cocoon
pools? Did you adjust Cocoon memory usage inside cocoon.xconf?

Best Regards,

Antonio Gallardo.




Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Ralph Goers <Ra...@dslextreme.com>.
Also, that seems like a high number of users for such a small amount of
memory. You may just be garbage collecting a lot.

Eric E. Meyer wrote:

>
> Observations
>
> Windows XP
> Pentium 4 1.8Ghz
> JDK 1.4.2_06
> Tomcat 5.0.28
> Cocoon 2.1.5.1
> 512MB physical RAM
> JVM -server -Xms256M -Xmx256M
>
> Load test with num users threads each making 5 successive request in a
> loop with approximately 3 second think time between requests. No derived
> resources – only the main page.
>
> users overall home  search1 search2 search3  detail  total
>       avg ms  page                                   num reqs
>
> 10      486    208      637    588     644     355    500
> 20     1704    378     2684   1837    2875     745    500
> 30     3725    682     5987   4626    6270    1059    450
> 40    19461   1411    36021  23089   34726    2059    600
> 50    72942   3213   130482  90993  131666    8356    500
>
> home page: /
> search1: /luxury_hotels/europe__france__paris/index.html
> search2: /luxury_hotels/bahamas_%26_the_caribbean/beach_resort/index.html
> search3:
> /luxury_hotels/europe__france__paris/city_centre_location/index.html
> detail: /luxury_hotel/new_york,_ny/the_carlyle
>
> Platform:
>
> Development
> Windows XP
> JDK 1.4.2_06
> Tomcat 5.0.28
> Cocoon 2.1.5.1
>
> Deployment
> Linux 2.6.x
>
> We see similar degradation on Linux as on Windows.
>
> The home page has no flowscript or cforms, but does have jxtemplate
> generation, xinclude, xslt, and a custom generator.
>
> The search and detail pages include a cform, and are therefore driven
> with flowscript at the top-level matching (and create continuations in
> the process of displaying their forms). These pages use jxtemplate
> generation, xinclude, xslt, custom generation, custom transformation,
> and internal-only sub pipelines. When looking at the pipeline times with
> a profiling pipeline, the total times (while under load) are much higher
> that the displayed times for the setup and generation steps -- so where
> is the time going?
>
> Regards,
> Eric Meyer
>
>
>
>


Re: Cocoon Performance Woes, Is it flow? I don't know!

Posted by Colin Paul Adams <co...@colina.demon.co.uk>.
Rhyme error:

this should be:

"Cocoon Performance Woe, Is it flow? I don't know!"

:-)
-- 
Colin Paul Adams
Preston Lancashire