Posted to fop-users@xmlgraphics.apache.org by Stephen Clouse <st...@gmail.com> on 2009/10/02 00:38:45 UTC

Excessive PDF output file size

I am currently working on replacing PDFLib with FOP in an existing
application and have hit a roadblock due to the file sizes of the PDFs FOP
is producing.

If you point your web browser to http://warpcore.org/fop/ you will find two
versions of the first page of a report, one from the existing
PDFLib-generated file, the other as rendered by FOP.  The FOP version is
almost 4 times larger (16KB vs 4.2KB).

The full report (51 pages of the same style output) is even worse, 633KB vs.
86KB (over 7 times larger).  I thought it might have something to do with
the table borders, with FOP rendering individual border segments whereas the
PDFLib version is basically lines being drawn manually, but even with all
borders suppressed (reducing it to text-only) the FOP version is close to
400KB.  I have used FOP plenty in the past and not encountered such an
issue, but I have also never done anything this heavy on tables.

Is there anything you can recommend to get this file down to a reasonable
size?  If it's something where FOP needs to be optimized, where can I start
looking?  (I'm definitely not opposed to doing some development work on FOP
but I haven't had need to work with the source code at all to date.)

-- 
Stephen Clouse <st...@gmail.com>

Re: Excessive PDF output file size

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
First of all, I don't think there's anything you can do short of
changing FOP to bring the PDF sizes down. I don't really feel like going
into all the details but:

- XSL-FO offers quite a bit of functionality that results in a lot of PDF
commands to paint the borders unless border segments are merged
(which could be quite difficult to do and is pretty much impossible
before a certain bug [1] is fixed).

- Each table-cell is specified to produce a reference area which
currently results in a q/cm/<content>/Q sequence. State handling then
causes additional commands (like font selection) for each reference area
which could be avoided if no q/Q were used. But getting rid of that
would have a fallout of its own.

The end result would be a considerable increase in the complexity of the
rendering source code at the very least.
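
To make the second point a bit more concrete, here is a rough sketch of
the overhead. It is not FOP's actual output: the operators (q, cm, BT, Tf,
Tm, Tj, ET, Q) are standard PDF, but the font name /F1, the coordinates,
the cell texts and both helper functions are made up for illustration. It
builds the same grid of cells twice, once with every cell wrapped in its
own q/cm/.../Q sequence with the font selected again, and once as a single
text object with the font selected only once, and compares the raw
(uncompressed) operator text.

```python
# Illustration only: hand-written PDF operators, not what FOP really emits.
# The font name /F1, the sizes and the cell texts are hypothetical.

def cell_isolated(x, y, text, font="/F1", size=8):
    """One table cell wrapped in its own graphics state (q ... Q)."""
    return (
        "q\n"                        # save graphics state
        f"1 0 0 1 {x} {y} cm\n"      # translate into the cell's reference area
        "BT\n"
        f"{font} {size} Tf\n"        # font selected again for this cell
        f"({text}) Tj\n"
        "ET\n"
        "Q\n"                        # restore graphics state
    )


def cells_shared(cells, font="/F1", size=8):
    """All cells in one text object, each positioned with Tm."""
    out = ["BT\n", f"{font} {size} Tf\n"]    # font selected once
    for x, y, text in cells:
        out.append(f"1 0 0 1 {x} {y} Tm\n")  # set the text matrix per cell
        out.append(f"({text}) Tj\n")
    out.append("ET\n")
    return "".join(out)


# A hypothetical 40-row by 8-column table.
cells = [
    (72 + 60 * col, 700 - 12 * row, f"r{row}c{col}")
    for row in range(40)
    for col in range(8)
]

isolated = "".join(cell_isolated(x, y, t) for x, y, t in cells)
shared = cells_shared(cells)

print("per-cell q/Q:", len(isolated), "bytes")
print("shared state:", len(shared), "bytes")
```

The numbers only cover the raw operator text, so they say nothing about
the size after stream compression, and the shared variant ignores
everything a reference area is actually needed for (clipping, borders,
backgrounds).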

Anyway, I guess with the new intermediate format [2] you could actually
write an optimizer (specialized for your use case) on the IF XML level
to get rid of many <g> elements generated by the reference areas from
the table-cells.
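
A minimal sketch of what such an optimizer could look like, under several
assumptions: the element and attribute names (<content>, <g> with a
transform="translate(dx,dy)" attribute, <text> with integer x and y) are
simplified stand-ins and should be checked against the real IF XML [2],
namespaces are ignored, and flatten_groups is a hypothetical helper, not
part of FOP. It only removes groups that contain nothing but text and
folds the translation into the text coordinates.

```python
import re
import xml.etree.ElementTree as ET

# Only pure integer translations are handled; anything else is left alone.
TRANSLATE = re.compile(r"^translate\((-?\d+),\s*(-?\d+)\)$")


def flatten_groups(parent):
    """Replace <g transform="translate(dx,dy)"> wrappers that contain only
    <text> children with those children, shifted by (dx, dy)."""
    new_children = []
    for child in list(parent):
        flatten_groups(child)  # bottom-up, so inner groups are folded first
        match = None
        if child.tag == "g":
            match = TRANSLATE.match(child.get("transform", ""))
        if match and len(child) > 0 and all(c.tag == "text" for c in child):
            dx, dy = int(match.group(1)), int(match.group(2))
            for text in child:
                text.set("x", str(int(text.get("x", "0")) + dx))
                text.set("y", str(int(text.get("y", "0")) + dy))
                new_children.append(text)
        else:
            new_children.append(child)  # keep everything else untouched
    parent[:] = new_children


# Hypothetical input, loosely modelled on nested table-cell reference areas.
SAMPLE = """<content>
  <g transform="translate(1000,2000)">
    <g transform="translate(10,20)">
      <text x="0" y="8000">Cell A</text>
    </g>
  </g>
  <g transform="translate(5000,2000)">
    <text x="0" y="8000">Cell B</text>
  </g>
</content>"""

root = ET.fromstring(SAMPLE)
flatten_groups(root)
print(ET.tostring(root, encoding="unicode"))
```

Groups that carry clipping, borders or anything other than plain text are
deliberately skipped here, since removing those would change the rendered
output.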

[1] http://markmail.org/message/2jmh4pvwae2kjgkw
[2] http://xmlgraphics.apache.org/fop/trunk/intermediate.html

Jeremias Maerki

