Posted to fop-users@xmlgraphics.apache.org by Chuck Paussa <Ch...@systems.dhl.com> on 2001/11/14 21:30:56 UTC
Re: How to avoid using too much memory to create relatively large PDF file

Xingjian,

80 MB is not bad. We allocate 250 MB to process my documents. I've been 
monitoring the conversations on memory usage on fop-dev and the short 
story is this:

They're working on several schemes to reduce memory usage. Those schemes 
won't bear fruit for a while. Until then, the only suggestion I've seen 
is to do the following:

Create new page sequences as often as you can. I'm able to calculate my 
page sizes before calling FOP (it's a line-oriented report), so I'm 
manually creating a page sequence for each page. Other people are 
creating a page sequence for each section, etc. The drawback to creating 
a page sequence is that it forces a page break. (Oh well, I'm not busted 
up about this because I'm manually calculating my pages.) The plus is 
that you can now process huge HUGE files. My old process was crashing 
whenever it encountered a report with more than 70 pages or so. Now I 
can process reports of arbitrary size.

With the page sequences you will notice that FOP consumes all available 
memory and then starts garbage collecting when it runs out. Garbage 
collection takes time, so processing speed is not improved. Your ability 
to process more pages is improved though.

How we calculate page sequences:

We process the original XML into a new XML that holds all the lines.

    <line>
        <col1>gdsj</col1>
        <col2>e8whe</col2>
        ...
    </line>
    <line type="subtotal">
    ...

If the number of lines would create a report that's bigger than 40 pages 
(meaning it might crash FOP), we reprocess the thing to add in <page> tags:

    <page>
        <line>
            <col1>gdsj</col1>
            <col2>e8whe</col2>
            ...
        </line>
        <line type="subtotal">
        ...
    </page>
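
The regrouping step could be sketched roughly as follows. This is an 
illustration in Python, not the actual process described above. The root 
element name `report`, the function name `paginate`, and a fixed 
`lines_per_page` are all assumptions made for the example; a real report 
would size its pages from its own layout.

```python
import xml.etree.ElementTree as ET


def paginate(root, lines_per_page):
    """Return a new root whose <line> children are grouped into <page> elements.

    Hypothetical helper: assumes every page holds the same number of lines.
    """
    if lines_per_page < 1:
        raise ValueError("lines_per_page must be at least 1")
    new_root = ET.Element(root.tag, root.attrib)
    lines = root.findall("line")
    # Walk the lines in fixed-size chunks, one <page> wrapper per chunk.
    for start in range(0, len(lines), lines_per_page):
        page = ET.SubElement(new_root, "page")
        page.extend(lines[start:start + lines_per_page])
    return new_root


# Tiny made-up input: three lines, two lines per page.
sample = ET.fromstring(
    "<report>"
    "<line><col1>a</col1></line>"
    "<line><col1>b</col1></line>"
    "<line type='subtotal'><col1>c</col1></line>"
    "</report>"
)
paged = paginate(sample, lines_per_page=2)
```

With the sample above, `paged` ends up with two <page> elements, the 
first holding two lines and the second holding the subtotal line.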

And then we reprocess that XML to generate a new page sequence per page. 
An example of the code inserted at each page break follows:

            </fo:table-body>
         </fo:table>
      </fo:flow>
   </fo:page-sequence>
<!-- page break -->
   <fo:page-sequence master-name="master-sequence">
      <fo:static-content flow-name="header">
         <fo:block> pg. <fo:page-number/> of <fo:page-number-citation ref-id="terminator"/> </fo:block>
      </fo:static-content>
      <fo:flow flow-name="xsl-region-body">
         <fo:table font-family="Helvetica" font-size="7pt" font-weight="normal">
            <fo:table-column column-width="18mm" column-number="1"/>
            <fo:table-column column-width="13mm" column-number="2"/>
            <fo:table-column column-width="20mm" column-number="3"/>
            .  more columns here
            . . .
            <fo:table-header>
               <fo:table-row>
                  <fo:table-cell display-align="after" column-number="1">
                     <fo:block text-align="start">DATE </fo:block>
                  </fo:table-cell>
                  <fo:table-cell display-align="after" column-number="2">
                     <fo:block text-align="start"> CODE </fo:block>
                  </fo:table-cell>
                  . more columns here
                  . . .
               </fo:table-row>
            </fo:table-header>
            <fo:table-body>
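
The idea of emitting one page sequence per page can be sketched like 
this. It is a simplified illustration in Python, not the stylesheet that 
produced the fragment above: the function name `page_sequences` and the 
input shape (a list of pages, each a list of rows, each a list of cell 
strings) are assumptions, and the static content, table columns and table 
header shown above are left out to keep it short.

```python
from xml.sax.saxutils import escape

# Opening and closing markup repeated for every page. The master-name and
# flow-name values are the ones used in the fragment above.
SEQUENCE_OPEN = (
    '<fo:page-sequence master-name="master-sequence">\n'
    '<fo:flow flow-name="xsl-region-body">\n'
    '<fo:table>\n'
    '<fo:table-body>'
)
SEQUENCE_CLOSE = (
    '</fo:table-body>\n'
    '</fo:table>\n'
    '</fo:flow>\n'
    '</fo:page-sequence>'
)


def page_sequences(pages):
    """Build one fo:page-sequence per page (hypothetical helper)."""
    parts = []
    for rows in pages:
        if not rows:
            continue  # skip empty pages rather than emit an empty table body
        parts.append(SEQUENCE_OPEN)
        for cells in rows:
            parts.append("<fo:table-row>")
            for text in cells:
                # Escape &, < and > so cell data cannot break the markup.
                parts.append(
                    "<fo:table-cell><fo:block>%s</fo:block></fo:table-cell>"
                    % escape(text)
                )
            parts.append("</fo:table-row>")
        parts.append(SEQUENCE_CLOSE)
    return "\n".join(parts)


# Made-up data: two pages with one row each.
fo = page_sequences([[["2001-11-14", "A&B"]], [["2001-11-15", "C"]]])
```

Because each page closes its own sequence before the next one opens, the 
page break falls exactly where the input says it should.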



Xingjian Shangguan wrote:

> Hi,
>
> I am using most recent FOP, Xalan and Xerces to create PDF from the 
> relatively large amount of data from database.
>
> The program works like this: I first get data from database, produce 
> XML and use Xalan to process xml/xsl to create FO, then I use FOP to 
> create PDF file.  I am able to create PDF files without any problem, 
> however, I need to manually allocate a lot of memory to process the FO 
> (like 80M).  The vast amount of memory is needed exactly when FO is 
> processed to create PDF file.  My FO create from my XML data and XSL 
> is as large as 1.5M.  Does any one have any idea how can I avoid 
> occupying too much memory.
>
> Thanks in advance for your suggestion.
>
> Sean
>
>
> ====================
> Xingjian Shangguan
> (732) 424-3980 (H)
> (732) 718-9522 (C)
> (212) 622-3098 (O)
> ====================
>
>