Posted to fop-users@xmlgraphics.apache.org by Mario Madunic <Ma...@newflyer.com> on 2010/06/23 14:45:45 UTC

Compression of output question

Well first off, I do not know much if anything about PDF compression methods, techniques, or technologies.

Using FOP 0.94, Windows XP SP3, 2 GB RAM.
My FOP config file's compression is set to
<value>flate</value>

(At the moment sticking with 0.94 as I have some fix-ups to do with 0.95 and am unable to get a proper build of the trunk happening here at work)

Here is the issue with PDF output file size.

With the old workflow for creating our manuals (FM to PDF, images are all WMF), the PDF file is around 20 MB. In the workflow under development it is DITA (the DITA Open Toolkit is not used; it is a custom transformation) to PDF, using SVGs for images and BGs, and the PDF is around 70 MB. FYI, there are 500+ full-page images, ranging from very simple ones like a shock absorber breakdown (around 20 KB) to some really complex breakdowns of 900 KB to 4 MB.

So my question is what can be done to bring the file size down? Am I misunderstanding PDF compression and what is possible? Just so you know how I'm thinking of compression: when I'm working in Photoshop and exporting for the web, I use the percentage slider to see how much pixelation I'm getting. Or is PDF compression more like zipping a file?

Any insight will be appreciated.

Thanks

Marijan (Mario) Madunic
Publishing Specialist
New Flyer Industries

--------------------------------------------------------------------
Please consider the environment before printing this e-mail.

CONFIDENTIALITY STATEMENT: This communication (and  any and all information or material transmitted with this communication) is confidential, may be privileged and is intended only for the use of the intended recipient. If you are not the intended recipient, any review, retransmission, circulation, distribution, reproduction, conversion to hard copy, copying or other use of this communication, information or material is strictly prohibited and may be illegal. If you received this communication in error or if it is forwarded to you without the express authorization of New Flyer, please notify us immediately by telephone or by return email and permanently delete the communication, information and material from any computer, disk drive, diskette or other storage device or media. Thank you.


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org


Re: Compression of output question

Posted by Craig Ringer <cr...@postnewspapers.com.au>.
On 23/06/10 20:45, Mario Madunic wrote:

> So my question is what can be done to bring the file size down? 

How'd you go with this? Did you have any luck figuring out where the
size came from and how to reduce it?

If you like you can pop the PDF on a private FTP/HTTP server and lob me
the URL by private mail. I can run it through Adobe Acrobat (and the
Enfocus PitStop plugin) here and see what it says about the space use.

-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/



Re: AW: Compression of output question

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
You got that right.

On 28.06.2010 15:22:17 Eric Douglas wrote:
> I'm printing an image on every page of every document (company logo).  I
> couldn't figure out a format to reference an external file (BMP/JPG/etc)
> which works regardless of which machine the transform runs on, so I
> embedded the image as SVG.  I was wondering if there's a way to embed it
> once and reference it on each page as a link to that one embedded image.
> If I'm reading you right, this is not currently possible with SVG but
> this is the default for the external images so I'll need to revisit how
> to get that working.



Jeremias Maerki




RE: AW: Compression of output question

Posted by Eric Douglas <ed...@blockhouse.com>.
I'm printing an image on every page of every document (company logo).  I
couldn't figure out a format to reference an external file (BMP/JPG/etc)
which works regardless of which machine the transform runs on, so I
embedded the image as SVG.  I was wondering if there's a way to embed it
once and reference it on each page as a link to that one embedded image.
If I'm reading you right, this is not currently possible with SVG but
this is the default for the external images so I'll need to revisit how
to get that working.
 


Re: AW: Compression of output question

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Georg,

no, that's not the case. Bitmap images identified by the same URI are
always embedded just once in a PDF file, irrespective of page sequences
(which can't be mapped to PDF 1.4 anyway). Only with SVG is that different.

SVG is a bit tricky concerning Form XObjects since it supports links
which have to be embedded in PDF with absolute page coordinates. So far
nobody has tackled that problem. Without the links, this would be quite
easy. But even with the links it is not a big deal, I think. Just a bit
more effort.

On 24.06.2010 11:11:13 Georg Datterl wrote:
> Hi Jeremias,
> 
> > Besides that, FOP currently always embeds images as is. It doesn't
> > support resampling.
> 
> Does that imply that if I have a fo file with lots of small page-sequences, each having the same background image, the image is included again for each page-sequence?
> 
> Regards,
> 
> Georg Datterl



Jeremias Maerki




AW: Compression of output question

Posted by Georg Datterl <ge...@geneon.de>.
Hi Jeremias,

> Besides that, FOP currently always embeds images as is. It doesn't
> support resampling.

Does that imply that if I have a fo file with lots of small page-sequences, each having the same background image, the image is included again for each page-sequence?

Regards,

Georg Datterl

------ Contact ------

Georg Datterl

Geneon media solutions gmbh
Gutenstetter Straße 8a
90449 Nürnberg

HRB Nürnberg: 17193
Managing Director: Yong-Harry Steiert

Tel.: 0911/36 78 88 - 26
Fax: 0911/36 78 88 - 20

www.geneon.de

Other members of the Willmy MediaGroup:

IRS Integrated Realization Services GmbH:    www.irs-nbg.de
Willmy PrintMedia GmbH:                            www.willmy.de
Willmy Consult & Content GmbH:                 www.willmycc.de


Re: Compression of output question

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 23.06.2010 15:34:28 Craig Ringer wrote:
<snip/>
> I don't know much about fop's PDF output capabilities, so I can't tell
> you much about what it's doing with its input and how it produces the
> output. In particular, I haven't the foggiest if it can render SVG
> directly to PDF or if it's flattening it, and if it does try to render
> it to PDF what cases it has to fall back to flattening for.

FOP produces decent vector graphics output by default from SVG.
Flattening to a bitmap is available with an extension attribute. But FOP
doesn't produce Form XObjects so when the same SVG is used over and over,
it will be output to the PDF as many times.
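
As a rough illustration of what that repetition costs, here is a back-of-the-envelope sketch in Python. It is not FOP code: the function name, the 40 KB stream size and the 200 bytes assumed per Form XObject reference are all made-up figures for illustration.

```python
def estimated_svg_bytes(stream_bytes, uses, shared, reference_bytes=200):
    """Rough size contribution of one vector graphic drawn `uses` times.

    shared=False: the drawing operations are written out again for every
    use, as described above.
    shared=True: a hypothetical Form XObject stores the operations once,
    and each use costs only a small reference (reference_bytes is a guess).
    """
    if shared:
        return stream_bytes + uses * reference_bytes
    return stream_bytes * uses


# Assumed example: a 40 KB drawing placed on each of 500 pages.
repeated = estimated_svg_bytes(40_000, 500, shared=False)
shared_once = estimated_svg_bytes(40_000, 500, shared=True)
print(repeated, shared_once)
```

With these assumed figures the repeated case comes to 20,000,000 bytes and the shared case to 140,000 bytes. This only matters for graphics that really are reused, such as a logo or a page background.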

Besides that, FOP currently always embeds images as is. It doesn't
support resampling.

There is currently room for improvement concerning Flate compression: FOP
currently doesn't support the PNG predictors. They could improve the
compression factor a bit (!) for images that are not compressed with
JPEG compression. TIFF images with CCITT compression are embedded as is
in many cases (but not all). I've got a clean-room CCITT algorithm
implementation but that's not published yet. But of course, that would
only help with bi-level images. However, for bi-level images a JBIG2
implementation would be even better.
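
To show why a predictor can help Flate, here is a small self-contained Python sketch. It applies the PNG "Sub" filter (each byte is replaced by its difference from the byte to its left) to a synthetic 8-bit grayscale gradient with a little noise, then compares zlib output sizes. The image data and helper names are invented for the demonstration; it says nothing about how FOP would implement this.

```python
import random
import zlib

WIDTH, HEIGHT = 256, 64


def make_rows(seed=0):
    """Synthetic 8-bit grayscale rows: a diagonal gradient plus small noise."""
    rng = random.Random(seed)
    return [
        bytes((x + y + rng.randint(0, 3)) % 256 for x in range(WIDTH))
        for y in range(HEIGHT)
    ]


def sub_filter(row):
    """PNG filter type 1 (Sub), 1 byte per pixel, prefixed by the type byte."""
    out = bytearray([1])
    previous = 0
    for value in row:
        out.append((value - previous) % 256)
        previous = value
    return bytes(out)


def sub_unfilter(filtered):
    """Invert sub_filter, showing that the predictor itself is lossless."""
    out = bytearray()
    previous = 0
    for delta in filtered[1:]:
        previous = (previous + delta) % 256
        out.append(previous)
    return bytes(out)


rows = make_rows()
plain = zlib.compress(b"".join(rows), 9)
predicted = zlib.compress(b"".join(sub_filter(row) for row in rows), 9)
print("raw:", WIDTH * HEIGHT, "flate:", len(plain), "sub+flate:", len(predicted))
```

The gain depends heavily on the image: smooth, continuous-tone data benefits, while data that is already noisy or already JPEG-compressed does not.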


Jeremias Maerki




RE: Compression of output question

Posted by Mario Madunic <Ma...@newflyer.com>.
Craig,

I haven't even finished reading your message but it is a keeper. Just wanted to let you know how much I appreciate you taking the time to go into such depth on this subject.

Thanks

Marijan (Mario) Madunic
Publishing Specialist
New Flyer Industries



Re: Compression of output question

Posted by Craig Ringer <cr...@postnewspapers.com.au>.
On 23/06/10 20:45, Mario Madunic wrote:
> Well first off, I do not know much if anything about PDF compression methods, techniques, or technologies.

[snip]

> So my question is what can be done to bring the file size down? Am I misunderstanding PDF compression and what is possible? Just so you know how I'm thinking of compression: when I'm working in Photoshop and exporting for the web, I use the percentage slider to see how much pixelation I'm getting. Or is PDF compression more like zipping a file?

PDF compression is ... complicated.

PDF files contain a series of 'objects', which describe pages, page
contents, fonts, images, etc. These objects may have 'streams' of
data associated with them, such as JPEG image data, the contents of a
font, sequences of PDF drawing operations in a simplified
PostScript-like text notation, etc.

PDF streams may be filtered in a number of ways to compress or otherwise
process them. Some types of content, like image data, have special
compression options that apply only to that data type - say JPEG
compression, CCITT fax compression, etc. Other streams can only be
compressed with general-purpose lossless compression algorithms; in
practice that means Flate (i.e. Deflate, the algorithm behind zip and
gzip), although PDF also defines older LZW and run-length filters. This
is true of the PDF content streams that contain the drawing operations,
and of fonts, among other things.
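
A minimal Python sketch of the "like zipping a file" case, assuming a made-up content stream of simple line-drawing operators. Python's zlib module produces the kind of data a Flate filter stores; the stream contents here are invented for illustration.

```python
import zlib

# A made-up page content stream: 500 horizontal lines, written as plain-text
# PDF drawing operators (moveto, lineto, stroke).
content_stream = b"".join(
    b"%d %d m %d %d l S\n" % (72, 72 + i, 540, 72 + i) for i in range(500)
)

compressed = zlib.compress(content_stream)  # what would be stored in the file
restored = zlib.decompress(compressed)      # what a viewer gets back

print(len(content_stream), "->", len(compressed), "bytes")
print("lossless:", restored == content_stream)
```

Unlike a JPEG quality slider, there is no quality trade-off here: the data comes back byte for byte, and the only question is how well it happens to compress.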

Additionally, for extra space savings the PDF object structure itself
may be compressed into "object streams" (PDF 1.5 or newer only) if the
PDF producer supports them.

As you're probably beginning to see, it's not as simple as "compressing"
the PDF. You need to know what parts of the PDF are taking up the space
before you can make reasonable decisions.

In most cases, it'll turn out that most of the space is taken up by big
raster (i.e. bitmap) images. In this case your options for shrinking the
PDF are limited to using stronger/lossier image compression, and/or
resampling the images to lower resolutions. This is best done at
production time if at all possible, since resampling and recompressing
after the PDF is originally produced is generally lossy and can result
in lower image quality for a given file size. If you can't set image
resolution limits at PDF production time, you can use a post-processing
tool like Adobe Acrobat (**NOT** Adobe Reader) to "optimize" the PDF by
resampling and recompressing images, but this can degrade quality
significantly.

Another thing that can make a PDF big is fonts - particularly large Asian
fonts - being embedded in their entirety into the PDF rather than being
subset so that only the glyphs that are actually used get included.

Yet another possible reason for big PDFs is if you have complex vector
graphics that can't be expressed directly in PDF drawing operations, or
graphics that your PDF producer doesn't know how to convert to PDF
drawing operations. In this situation the PDF producer has to "flatten"
the vector artwork to a raster image - which can, depending on the PDF's
required resolution, be quite huge - and include the raster image. This
is particularly common when dealing with vector graphics that use alpha
transparency while producing PDF 1.3 or older, which do not support it.
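
To get a feel for "quite huge", here is a quick worked calculation in Python. The page size (US Letter), the resolutions and the 3 bytes per pixel (8-bit RGB) are assumptions chosen for illustration, and the result is the size before any image compression is applied.

```python
def flattened_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Uncompressed size of a page-sized raster at the given resolution."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed example: one full-page graphic on US Letter (8.5 x 11 inches).
at_300_dpi = flattened_bytes(8.5, 11, 300)  # 2550 x 3300 pixels
at_150_dpi = flattened_bytes(8.5, 11, 150)  # 1275 x 1650 pixels
print(at_300_dpi, at_150_dpi)
```

That is roughly 25 MB per page at 300 dpi and about 6.3 MB at 150 dpi before compression, which is why flattening even a modest number of full-page graphics can dominate the file size.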

To find out how much space different parts of the PDF use, you're best
off opening it up with a tool like Adobe Acrobat (**NOT** Adobe Reader,
the crippled view-only version) and using the PDF size analysis tools of
its PDF Optimizer feature. There are other tools that do PDF size
analysis too, though.
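
For a rough first look without Acrobat, something like the following Python sketch can tally stream sizes by kind. It is a crude scanner written for illustration, not a PDF parser: it assumes an unencrypted PDF with classic "N 0 obj ... endobj" objects, it will miss anything stored inside PDF 1.5 object streams, and it can be confused by binary data that happens to contain the keywords. The category names and the /Length1 heuristic for font files are choices made for this sketch.

```python
import re
from collections import Counter

OBJECT_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.DOTALL)
STREAM_RE = re.compile(rb"(.*?)stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def classify(dictionary):
    """Guess what a stream holds from the text of its dictionary."""
    compact = re.sub(rb"\s+", b"", dictionary)
    if b"/Subtype/Image" in compact:
        return "images"
    if b"/Subtype/Form" in compact:
        return "form xobjects"
    if b"/Length1" in compact:
        return "embedded font files"
    return "content and other streams"


def tally_stream_bytes(pdf_bytes):
    """Sum the stored (possibly compressed) stream bytes per category."""
    totals = Counter()
    for obj in OBJECT_RE.finditer(pdf_bytes):
        match = STREAM_RE.match(obj.group(1))
        if match:
            dictionary, data = match.groups()
            totals[classify(dictionary)] += len(data)
    return totals


# Tiny synthetic example: one image stream, one content stream, one
# object without a stream.
sample = (
    b"1 0 obj\n<< /Type /XObject /Subtype /Image /Length 10 >>\n"
    b"stream\n" + b"\x00" * 10 + b"\nendstream\nendobj\n"
    b"2 0 obj\n<< /Length 8 >>\nstream\n0 0 m S\n\nendstream\nendobj\n"
    b"3 0 obj\n<< /Type /Page >>\nendobj\n"
)
print(tally_stream_bytes(sample))
```

To try it on a real file, read the file in binary mode and pass the bytes to tally_stream_bytes. If vector graphics dominate, the "content and other streams" total will be large; if flattened graphics dominate, "images" will be.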

If you have vector graphics like SVG included, they'll be reported as
part of the "image" byte count if they've been flattened, and as part of
the "pdf content streams" or "pdf objects" count if they're not
flattened. It can be hard to figure out whether they've been flattened
to raster or not - the best approach is really to either zoom in on them
to a crazy level (1600 times or more) on the PDF and see if they're
pixelated, or to use the object editor in a tool like Acrobat to see if
they're a single image or a bunch of individual lines/arcs/areas. If you
have access to more advanced PDF tools like Enfocus PitStop (which is
buggy, but amazingly useful) you can find out more.

I don't know much about fop's PDF output capabilities, so I can't tell
you much about what it's doing with its input and how it produces the
output. In particular, I haven't the foggiest if it can render SVG
directly to PDF or if it's flattening it, and if it does try to render
it to PDF what cases it has to fall back to flattening for.

--
Craig Ringer



RE: Compression of output question

Posted by Mario Madunic <Ma...@newflyer.com>.
Eric, I'll get back to you on that as soon as I can. Some quick info though: the build completes, but once I use the trunk instead of 0.94 or 0.95 as the FOP parser in Ant, that is where the error occurs.

Sorry, I just have a couple of other priorities at the moment.

Marijan (Mario) Madunic
Publishing Specialist
New Flyer Industries

-----Original Message-----
From: Eric Douglas [mailto:edouglas@blockhouse.com] 
Sent: Wednesday, June 23, 2010 8:19 AM
To: fop-users@xmlgraphics.apache.org
Subject: RE: Compression of output question

What errors are you getting on the trunk build? 



RE: Compression of output question

Posted by Eric Douglas <ed...@blockhouse.com>.
What errors are you getting on the trunk build? 

-----Original Message-----
From: Mario Madunic [mailto:Mario_Madunic@newflyer.com] 
Sent: Wednesday, June 23, 2010 9:16 AM
To: fop-users@xmlgraphics.apache.org
Subject: RE: Compression of output question

More info,

After talking with a colleague, we are going to run a test by removing all
links. There are 17000+ links in the new workflow and around 500 in the
old. We think this could be the issue and not an SVG thing.

Any thoughts or insight?
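
For a rough sense of scale, the Python sketch below serializes one plausible internal-link annotation and multiplies by the link counts mentioned above. The annotation layout, object numbers and coordinates are assumptions for illustration; what FOP actually writes per link may be larger or smaller. In a PDF 1.4 file such objects are stored uncompressed, and each one also takes a 20-byte cross-reference table entry.

```python
XREF_ENTRY_BYTES = 20  # fixed width of one classic cross-reference entry


def link_annotation_object(obj_number, page_ref, rect, dest_y):
    """Serialize one hypothetical internal-link annotation as a PDF object."""
    x0, y0, x1, y1 = rect
    return (
        f"{obj_number} 0 obj\n"
        f"<< /Type /Annot /Subtype /Link /Rect [{x0} {y0} {x1} {y1}] "
        f"/Border [0 0 0] "
        f"/A << /S /GoTo /D [{page_ref} 0 R /XYZ 72 {dest_y} null] >> >>\n"
        "endobj\n"
    ).encode("ascii")


# Made-up object number, target page and rectangle.
sample = link_annotation_object(12345, 678, (72.0, 700.25, 210.5, 712.25), 640)
per_link = len(sample) + XREF_ENTRY_BYTES

estimate_new = 17_000 * per_link  # new workflow, 17000+ links
estimate_old = 500 * per_link     # old workflow, around 500 links
print(per_link, estimate_new, estimate_old)
```

Under these assumptions 17,000 links add up to a few megabytes, which on its own would not explain a difference of about 50 MB. Comparing the test result with and without links against a size report for images and content streams should show where the rest comes from.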

Marijan (Mario) Madunic
Publishing Specialist
New Flyer Industries

-----Original Message-----
From: Mario Madunic [mailto:Mario_Madunic@newflyer.com]
Sent: Wednesday, June 23, 2010 7:46 AM
To: fop-users@xmlgraphics.apache.org
Subject: Compression of output question

Well first off, I do not know much if anything about PDF compression
methods, techniques, or technologies.

Using FOP 0.94, Windows XP SP3, 2Gigs ram My FOP config files
compression is set to <value>flate</value>

(At the moment sticking with 0.94 as I have some fix-ups to do with 0.95
and am unable to get a proper build of the trunk happening here at work)

Here is the issue with PDF output file size.

When the old workflow of creating our manuals (FM to PDF, images are all
WMF) the PDF file is around 20megs. In the workflow under development it
is DITA (DITA Open Toolkit not used, it is a custom transformation) to
PDF using SVGs for images and BGs, the PDF is around 70megs. FYI there
are 500+ full page images, some very simple like a shock absorber
breakdown (around 20k) to some really complex breakdowns 900k to 4megs.

So my question is what can be done to bring the file size down? Am I
misunderstanding PDF compression and what is possible? Just so you know
how I'm thinking of compression, when I'm working in Photoshop and
exporting for the web and using the percentage slider to see the quality
of pixilation I'm getting. Or is PDF compression more like Zipping a
file?

Any insight will be appreciated.

Thanks

Marijan (Mario) Madunic
Publishing Specialist
New Flyer Industries


RE: Compression of output question

Posted by Mario Madunic <Ma...@newflyer.com>.
Apparently it is not the 17000+ links. With the links suppressed, the file size decreases by only 4 megs. I guess removing all the @ids associated with the links might free up another 2 megs.

Marijan (Mario) Madunic
Publishing Specialist
New Flyer Industries


Re: Compression of output question

Posted by Craig Ringer <cr...@postnewspapers.com.au>.
On 23/06/10 21:57, Eric Douglas wrote:
> I managed to compile the trunk but haven't tested it yet.  PDFs getting
> created by FOP 0.95 are showing as v1.4.

Even if it were 1.5, that wouldn't guarantee that it was using object
streams. Object streams are an optional feature that most PDF producers
do not support.

Thankfully, documents can be post-processed (though, AFAIK, only by
commercial tools) to convert regular objects and xref tables into object
streams + xref streams in a lossless manner.

In any case, before worrying too much about PDF version and whether
object + xref streams are in use, it's best to figure out where the
space use is actually coming from.
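
One rough way to see where the bytes go is to tally object sizes by kind straight from the file. The following is a minimal Python sketch, not a FOP or PDFBox feature: the names `classify` and `tally`, the four bucket names, and the hand-made sample are all assumptions made for illustration. The regex scan only suits PDF 1.4-style files whose objects are not packed into object streams, and it can be fooled by binary stream data that happens to contain the bytes `endobj`.

```python
import re
from collections import Counter

# One indirect object: "<num> <gen> obj ... endobj" (crude, regex-based scan).
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.DOTALL)
# Keyword that separates an object's dictionary from its stream data.
STREAM_RE = re.compile(rb"stream\r?\n")


def classify(body: bytes) -> str:
    """Put one object body into a coarse bucket (bucket names are made up)."""
    parts = STREAM_RE.split(body, maxsplit=1)
    head = parts[0]  # dictionary only, so binary stream data is not inspected
    if re.search(rb"/Subtype\s*/Image", head):
        return "image"
    if re.search(rb"/Subtype\s*/Link", head):
        return "link annotation"
    return "other stream" if len(parts) > 1 else "other object"


def tally(pdf_bytes: bytes) -> Counter:
    """Sum the on-disk size of every object, grouped by bucket."""
    sizes = Counter()
    for match in OBJ_RE.finditer(pdf_bytes):
        sizes[classify(match.group(1))] += len(match.group(0))
    return sizes


# Tiny hand-made sample standing in for a real PDF.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Annot /Subtype /Link /Rect [0 0 10 10] >>\nendobj\n"
    b"2 0 obj\n<< /Type /XObject /Subtype /Image /Length 4 >>\n"
    b"stream\nABCD\nendstream\nendobj\n"
    b"3 0 obj\n<< /Length 5 >>\nstream\nq Q n\nendstream\nendobj\n"
    b"4 0 obj\n<< /Type /Catalog >>\nendobj\n"
)

for bucket, size in tally(SAMPLE).most_common():
    print(f"{bucket:16s} {size:6d} bytes")
```

For a real file, something like `tally(open("manual.pdf", "rb").read())` (hypothetical file name) would give the same breakdown. If the link annotation bucket is small next to the stream buckets, the links are unlikely to be the main cause of the size.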

-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/


RE: Compression of output question

Posted by Eric Douglas <ed...@blockhouse.com>.
I managed to compile the trunk but haven't tested it yet.  PDFs getting
created by FOP 0.95 are showing as v1.4.

I did have some errors when using FOP to send output directly to a
printer (it went to the wrong printer, or the wrong paper tray), so I
worked around it by using FOP's PDF Mime on the format with the direct
print Mime for the output, which created output in PDF format but sent
it to a ByteArrayOutputStream instead of a PDF file.  I used that output
as input to Apache's PDFBox, which converts it to a document format that
can be printed from the java.awt.print.PrinterJob using
javax.print.PrintService. PDFBox can also be used to create a PDF file,
or to read one in.  If FOP trunk can't create a PDF v1.5, maybe PDFBox
can?


Re: Compression of output question

Posted by Craig Ringer <cr...@postnewspapers.com.au>.
On 23/06/10 21:16, Mario Madunic wrote:
> More info,
> 
> While talking to a colleague we are going to run a test by removing all links. There are 17000+ links in the new workflow and around 500 in the old. We think this could be the issue and not an SVG thing.
> 
> Any thoughts or insight?

If links are the issue, you really need to generate PDF 1.5 with
compressed object streams. PDF 1.5's object streams feature was
primarily designed with things like links and other numerous annotations
in mind.

No idea if fop can do that, but if not there are post-processing options
that can convert a PDF to use object streams without any degradation of
quality.
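
To illustrate why object streams help with many small objects: in PDF 1.4 and earlier, Flate (the same DEFLATE algorithm zip uses) applies only to stream objects, so dictionaries such as link annotations are written as plain text. The Python sketch below compares 17000 made-up link annotations written that way with the same bytes compressed together. The layout of each annotation and the helper name `link_annotation` are assumptions for illustration, not what FOP actually emits.

```python
import zlib


def link_annotation(obj_num: int) -> bytes:
    """Build one link annotation object (field values are invented)."""
    y = 700 - (obj_num % 50) * 12
    page_ref = 1 + obj_num % 500
    return (
        f"{obj_num} 0 obj\n"
        f"<< /Type /Annot /Subtype /Link /Rect [72 {y} 300 {y + 10}] "
        f"/Border [0 0 0] /A << /S /GoTo /D [{page_ref} 0 R /XYZ 0 {y} null] >> >>\n"
        "endobj\n"
    ).encode("ascii")


# 17000 links, matching the count mentioned in the thread.
objects = [link_annotation(n) for n in range(1, 17001)]

# PDF 1.4 style: each dictionary is written to the file as plain text.
plain = b"".join(objects)

# Rough stand-in for a PDF 1.5 object stream: many objects Flate-compressed together.
packed = zlib.compress(plain, 9)

print(f"plain:  {len(plain):>9,d} bytes")
print(f"packed: {len(packed):>9,d} bytes")
print(f"lossless round trip: {zlib.decompress(packed) == plain}")
```

A real object stream drops the `obj`/`endobj` wrappers and adds an offset table, so the figures are indicative only. The round-trip check shows the compression is lossless, like zipping a file rather than like a JPEG quality slider.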

--
Craig Ringer


RE: Compression of output question

Posted by Mario Madunic <Ma...@newflyer.com>.
More info,

After talking to a colleague, we are going to run a test with all links removed. There are 17000+ links in the new workflow and around 500 in the old. We think this could be the issue and not an SVG thing.

Any thoughts or insight?

Marijan (Mario) Madunic
Publishing Specialist
New Flyer Industries
