Posted to fop-users@xmlgraphics.apache.org by David Le Strat <Da...@iis.com> on 2002/03/12 21:05:20 UTC

FOP Performance Limitations?

All,

I am currently working on a project where we dynamically create PDF
documents based on user input.  When a user selects a specific period of
time, we pull the matching records from the database, convert the dataset
to XML and render a PDF report from that dataset.  Everything works fine
when we are manipulating up to 200 records (we get the result in 1 or 2
minutes).  However, some reports manipulate 7,000 or 8,000 records, and in
those cases performance degrades significantly (no report had been rendered
after 40 minutes).
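
(For illustration only: the intermediate XML for such a report might look
roughly like the sketch below.  The element and attribute names here are
hypothetical, not the actual schema; the point is that the large reports
emit one record element per matching database row, i.e. 7,000-8,000 of
them.)

    <report period-start="2002-01-01" period-end="2002-01-31">
      <record id="1">
        <date>2002-01-03</date>
        <description>First matching row</description>
        <amount>125.00</amount>
      </record>
      <!-- ... one record element per matching database row ... -->
    </report>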

Do any of you have any ideas or input on how to improve FOP's performance
in such cases, and what kind of performance should we expect for the above
examples?

Thank you for your help.

David Le Strat.

Re: FOP Performance Limitations?

Posted by alex <al...@yahoo.com>.
Actually, the FAQ (at http://www.owal.co.uk/cgi-bin/fopfaq.cgi since I have
given up on the interactive Jyve program) doesn't contain much on memory
consumption. I will add the "new page sequence" advice when I (or someone
else) come up with an example.



Alex


At 20:22 12/03/2002, Chuck Paussa wrote:
>David,
>
>Most likely you're running out of memory. You should start a new 
>page-sequence every so often (at roughly 60 lines of report, that would be 
>every page). See the FAQ entry on memory consumption.
>Chuck




Re: FOP Performance Limitations?

Posted by Chuck Paussa <Ch...@systems.dhl.com>.
David,

Most likely you're running out of memory. You should start a new 
page-sequence every so often (at roughly 60 lines of report, that would be 
every page). See the FAQ entry on memory consumption.

Chuck
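
A minimal sketch of the suggestion above, assuming the data arrives as flat
record elements (the report/record/description names and the 60-record
chunk size are illustrative only, not from this thread): an XSLT 1.0
stylesheet can start a fresh fo:page-sequence for every chunk of records
instead of pouring the whole report into a single sequence.

    <?xml version="1.0"?>
    <!-- Sketch only: "report", "record" and "description" are hypothetical. -->
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:fo="http://www.w3.org/1999/XSL/Format">

      <xsl:template match="/report">
        <fo:root>
          <fo:layout-master-set>
            <fo:simple-page-master master-name="page"
                page-height="29.7cm" page-width="21cm" margin="2cm">
              <fo:region-body/>
            </fo:simple-page-master>
          </fo:layout-master-set>

          <!-- One fo:page-sequence per 60 records instead of one for the
               whole report, so the formatter can lay out and release each
               sequence before starting the next. -->
          <xsl:for-each select="record[position() mod 60 = 1]">
            <fo:page-sequence master-reference="page">
              <fo:flow flow-name="xsl-region-body">
                <xsl:for-each
                    select=". | following-sibling::record[position() &lt; 60]">
                  <fo:block font-size="10pt">
                    <xsl:value-of select="description"/>
                  </fo:block>
                </xsl:for-each>
              </fo:flow>
            </fo:page-sequence>
          </xsl:for-each>
        </fo:root>
      </xsl:template>

    </xsl:stylesheet>

The win comes from FOP only having to hold the layout state for one
page-sequence at a time; with a single page-sequence covering 8,000
records, everything has to fit in memory before any pages can be written.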




Re: FOP Performance Limitations?

Posted by David Wood <ob...@panix.com>.
I've been making pretty big PDFs with a similar system and can share a few
off-the-cuff comments.

It's obvious to me that the structure of your FO document (page sequences,
page layout, flows, etc.) can make a significant difference in memory usage
and speed. However, I don't have enough concrete conclusions about what
exactly does what, and how, to offer useful advice on that level...
perhaps others can help you there.

The big thing I noticed was page numbers. If you use them, and especially
if you make references to them (e.g. to build a table of contents), you'll
see a significant speed and memory impact; perhaps because unresolved
references have to be kept live, which works against garbage collection.
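
(For illustration, the kind of construct I mean is a table-of-contents
entry along these lines; the id and text are made up:)

    <fo:block text-align-last="justify">
      Section 1: Results
      <fo:leader leader-pattern="dots"/>
      <fo:page-number-citation ref-id="sec1"/>
    </fo:block>

    <!-- ...much later in the document... -->
    <fo:block id="sec1" font-size="14pt">Section 1: Results</fo:block>

Every fo:page-number-citation is a forward reference that can't be resolved
until the page containing the target id has been laid out, which is
presumably where the extra memory and time go.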

Basically you need to tune your fop script to give FOP the maximum
possible heap size (i.e. fop.bat starts with "java -Xmx256M ..."). If you
make it too large, you'll discover that between Java and FOP the memory
access patterns will thrash your virtual memory once the heap exceeds the
available RAM and you start to swap. Some observation of your machine's
memory availability during normal use, and some experimentation, should
get you to the right number.

Your experience of 40+ minute rendering times strongly suggests that the
process is bound up in swapping. Practically speaking, you need to make
your heap small enough that Java never swaps, and limit your recordset
size on the front end to make sure that you never hit that memory ceiling.

Between this and throwing hardware at the problem (multiple Xeons and
1 GB+ of RAM), we've made a go of it for 1000+ page documents. But of
course every recordset and template combination is different, so that page
count isn't necessarily meaningful at all.

One thing that I haven't tried yet, but am very curious to experiment
with, is FOP on the IBM JVM. Conventional wisdom has it that the IBM VM is
significantly superior to Sun's VM in both CPU and RAM efficiency.

If I manage to get to this, I'll post my results. If anyone else has tried
it, or happens to get to it first, I'd love to hear what happens. I mean,
after your "write-once, porting-is-slightly-less-painful" experience.
