Posted to fop-users@xmlgraphics.apache.org by Bill Gamble <ga...@gmail.com> on 2009/09/11 04:58:09 UTC

large image embedding problems

Hello Everyone,
We are generating PDFs which are very graphics-intensive. A typical PDF has
50 pages with four 4000x4000 images per page, and the images can
have transparency.

We are using Batik to generate each page as an SVG file, and then
referencing the SVG with <fox:external-document> when converting to PDF.

We run into performance problems when the images embedded in the SVG file
are anything but JPEGs. JPEGs are lightning fast and produce a PDF
file 10X smaller than any other format. Unfortunately the embedded
images can have transparency, so standard JPEG format cannot be used, and
all other file formats run into memory problems and generate enormous PDF
files (300MB+).
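
For a sense of scale, here is a rough estimate of what these numbers mean once
an image is fully decoded. It assumes 4 bytes per pixel (8 bits each for red,
green, blue and alpha), which is an assumption and not something stated above;
the class and method names are illustrative only. Under that assumption one
4000x4000 image takes 64,000,000 bytes (about 61 MiB), one page of four images
takes 256,000,000 bytes, and all 50 pages together take 12,800,000,000 bytes.

```java
// Back-of-the-envelope estimate; page and image counts come from the post.
final class DecodedImageFootprint {

    /** Bytes needed to hold one fully decoded raster of the given size. */
    static long decodedBytes(int width, int height, int bytesPerPixel) {
        // Widen to long first so large images cannot overflow int arithmetic.
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        int bytesPerPixel = 4;   // assumption: 8 bits each for R, G, B and alpha
        int imagesPerPage = 4;   // from the post
        int pages = 50;          // from the post

        long perImage = decodedBytes(4000, 4000, bytesPerPixel);
        long perPage = perImage * imagesPerPage;
        long perDocument = perPage * pages;

        System.out.println("per image:    " + perImage + " bytes");
        System.out.println("per page:     " + perPage + " bytes");
        System.out.println("per document: " + perDocument + " bytes");
    }
}
```

This only counts the decoded pixels; any extra copies made while converting to
the target format would come on top of it.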

After finding that PDF has supported JPXDecode (for JPEG 2000) since
version 1.5, I was hoping that JPEG 2000 could be injected into the PDF
without decoding the image, but that does not appear to be the case (we
run into the same performance problems with JPEG 2000).

Can anyone comment on:

1) Is this a limitation of the PDF format, or of how FOP renders the PDF?
2) Any suggestions or other approaches for solving our problem?

Thanks in advance!

Re: large image embedding problems

Posted by "dan.mccabe" <mc...@gmail.com>.

The solution we're using definitely isn't ideal, but at the point we're at,
we just needed something that would work reliably.  I may explore an
fo:external-graphic extension at some point if it's a piece we're planning
on leaving in place, but that will depend on when I get time to
revisit it.  If we come up with a more permanent, non-hackish solution, I
will post it here and most likely submit a patch.  At the very least, I
wanted to post how we got around the problem so that if other people run
into the same issue, it might give them some ideas.

I would agree that the JP2 implementation is probably not in high demand
(I certainly didn't find much information about it searching through the user
list), and with the extra dependencies that are involved, it would probably
make a lot more sense to offer it as an external download for FOP.  I'll have
to get back to you on how we might make it available to others though.
There would still be additional work to be done anyway, because I haven't
built in the switch for the PDF version number yet; I was just trying
to prototype a solution (it wouldn't have made sense to put that in if I
didn't end up needing it).
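
As a rough idea of what such a version switch might look like, here is a
minimal sketch. Everything in it is hypothetical: the enum, the method names
and the strategy strings are invented for illustration and are not part of FOP
or of the prototype. The only fact taken from this thread is that JPXDecode
requires PDF 1.5.

```java
// Hypothetical sketch of gating raw JPEG 2000 embedding on the target PDF version.
final class PdfVersionGate {

    /** Hypothetical stand-in for whatever version setting FOP would expose. */
    enum PdfVersion { V1_4, V1_5 }

    /** JPXDecode is a PDF 1.5 feature, so raw JPEG 2000 needs at least that. */
    static boolean allowsRawJpeg2000(PdfVersion target) {
        return target.compareTo(PdfVersion.V1_5) >= 0;
    }

    /** Picks an embedding strategy; the strategy names are illustrative only. */
    static String embeddingFor(PdfVersion target) {
        return allowsRawJpeg2000(target)
                ? "embed undecoded (JPXDecode)"
                : "decode first, then re-encode";
    }

    public static void main(String[] args) {
        for (PdfVersion v : PdfVersion.values()) {
            System.out.println(v + ": " + embeddingFor(v));
        }
    }
}
```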

Dan




-- 
View this message in context: http://www.nabble.com/large-image-embedding-problems-tp25394304p25480579.html
Sent from the FOP - Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org

Re: large image embedding problems

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.

Thanks for the extensive feedback, Dan. It's good to have that in the
archives.

A BSD dependency is generally not a problem. However, JJ2000 has a
peculiar license which would probably have to be reviewed by the ASF's
legal team. Anyway, in the past, we were quite cautious about adding new
dependencies. In this case it's not exactly mainstream functionality, so
only a few people would actually use this.

Did you also do the switch for the PDF version I mentioned? If not, that
means some additional work before it could be included. I guess if there
are voices who want that in FOP, we can certainly take a closer look at
the licensing situation. I have only very limited time for FOP at the
moment, so I can't promise anything. At any rate, you could simply put
the code somewhere on your company's website (and we would link to it) or
into a Bugzilla issue. You could even start a Google Code project for the
plug-in, for example. After all, the plug-in parts can live outside of
FOP in a separate JAR that can be added if someone wants JPEG 2000
support.

As for the memory situation, I went to great lengths when creating
the image loading framework to ensure that memory consumption is kept
to a minimum and that images are only held in memory as long as needed.
Of course, with special decoders (bypassing BufferedImage), we could
bring the memory consumption down for PNG, TIFF & Co. Image data could
be decoded in stripes or line by line and immediately converted to the
target format. But this would be quite some work, as you'd essentially
be writing new decoders from scratch in some cases.
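
To illustrate the "decode in stripes" idea with the standard javax.imageio API
only, here is a minimal sketch. It is not how FOP or XML Graphics Commons
works; the class, interface and method names are made up for the example. It
asks an ImageReader for one horizontal strip at a time via
ImageReadParam.setSourceRegion, so only one strip's raster is handed to the
caller at once. Whether a given reader avoids extra decoding work internally
depends on the format and the reader, so treat the memory benefit as an
assumption to be measured.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Illustrative only: reads an image strip by strip instead of all at once.
final class StripReader {

    /** Receives each decoded strip; y is the strip's top row in the full image. */
    interface StripHandler {
        void handle(int y, BufferedImage strip) throws IOException;
    }

    /** Reads the first image in the stream in strips and returns the strip count. */
    static int readInStrips(InputStream source, int stripHeight, StripHandler handler)
            throws IOException {
        if (stripHeight <= 0) {
            throw new IllegalArgumentException("stripHeight must be positive");
        }
        try (ImageInputStream in = ImageIO.createImageInputStream(source)) {
            if (in == null) {
                throw new IOException("Cannot open image stream");
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No ImageIO reader for this format");
            }
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly=false so the same image can be read repeatedly.
                reader.setInput(in, false, true);
                int width = reader.getWidth(0);
                int height = reader.getHeight(0);
                int strips = 0;
                for (int y = 0; y < height; y += stripHeight) {
                    int h = Math.min(stripHeight, height - y);
                    ImageReadParam param = reader.getDefaultReadParam();
                    param.setSourceRegion(new Rectangle(0, y, width, h));
                    // The returned image is only as large as the requested region.
                    handler.handle(y, reader.read(0, param));
                    strips++;
                }
                return strips;
            } finally {
                reader.dispose();
            }
        }
    }
}
```

A caller would pass a handler that converts each strip to the target format and
writes it out before the next strip is requested.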

Your solution with JPEG is interesting but probably not ideal, since you
essentially have to fork FOP (and keep it synchronized in the future) as
it's a proprietary hack. But it could be possible to define an extension
attribute for fo:external-graphic with an additional URL to an 8-bit
image containing the soft mask. That could give you a chance to smooth the
rough edges and turn your changes into a patch to FOP. Just an idea.
Still a very special use case.
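
The two-file approach discussed here (one image for the colour data, one 8-bit
image for the mask) can be sketched with plain Java 2D. This is only an
illustration under stated assumptions: the class and method names are made up,
it is not the code from the prototype, and it does not touch FOP's
ImageRawJPEGAdapter. It splits a BufferedImage into an opaque RGB image and a
grayscale image holding the alpha channel, then writes both as JPEG files whose
names the caller chooses.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Illustrative only: separates colour and alpha so both can be stored as JPEG.
final class AlphaSplitter {

    /** Copies the colour channels into an opaque RGB image (alpha is dropped). */
    static BufferedImage colorPart(BufferedImage source) {
        int w = source.getWidth();
        int h = source.getHeight();
        BufferedImage rgb = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                rgb.setRGB(x, y, source.getRGB(x, y));
            }
        }
        return rgb;
    }

    /** Copies the alpha channel into an 8-bit grayscale image (255 = opaque). */
    static BufferedImage alphaPart(BufferedImage source) {
        int w = source.getWidth();
        int h = source.getHeight();
        BufferedImage mask = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster raster = mask.getRaster();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Write the sample directly to avoid any colour conversion.
                raster.setSample(x, y, 0, source.getRGB(x, y) >>> 24);
            }
        }
        return mask;
    }

    /** Writes the colour image and the mask as two separate JPEG files. */
    static void writePair(BufferedImage source, File imageFile, File maskFile)
            throws IOException {
        if (!ImageIO.write(colorPart(source), "jpg", imageFile)
                || !ImageIO.write(alphaPart(source), "jpg", maskFile)) {
            throw new IOException("No JPEG writer available");
        }
    }
}
```

Because JPEG is lossy, a mask stored this way can pick up compression artifacts
along sharp transparency edges, which is one reason to treat this as a
workaround rather than a general solution.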

On 16.09.2009 20:59:54 dan.mccabe wrote:
> 
> Hey Jeremias,
> 
> First and foremost, I want to thank you for all your help.  I was able to
> follow along with your instructions and get an implementation that could
> embed raw JP2 images into a PDF in a pretty short amount of time, which I
> definitely couldn't have done without your help.  If you have an interest in
> the source code for this, I would be more than happy to share it.  The one
> caveat with it currently is that it relies on the JAI ImageIO project
> (https://jai-imageio-core.dev.java.net/) to be able to parse the header for
> the JP2 file in my PreloadJP2 class, so I'm not sure if it would be possible
> or not to distribute it with the graphics commons library (it's under the
> BSD license).  I took a look under the hood and it appears that it uses
> JJ2000 (http://jj2000.epfl.ch/; http://jpeg2000.epfl.ch/) to do some of the
> heavy lifting, so if licensing is a problem, there may be some alternatives
> that can be explored.
> 
> However, after getting familiar with the FOP source code, I think I've found
> the root cause of the problem we were experiencing, which has caused us to
> go with a slightly different implementation.  I came across the issue after
> I had implemented the JP2 support and I was still not seeing the
> transparency we needed in the resulting PDF.  I compared the resulting PDF
> with one we had generated using PNGs and noticed that all of the PNGs had a
> soft-mask associated with them while none of the JP2s did.  After taking a
> look at the implementation, I found that I needed to add code to my
> ImageRawJP2Adapter to return a soft-mask reference when there was
> transparency in the image.
> 
> This all worked fine, but it dawned on me that because the transparency was
> controlled by the soft-mask and not by the type of image itself, there was
> no reason we couldn't use JPG files as long as we found a way to specify an
> accompanying mask for it.  Because of some issues with rendering JP2 files
> (we were using im4java as an interface to ImageMagick to generate the
> images, but there are some hoops you have to jump through to get ImageMagick
> to run on some machines), it was definitely preferable to use JPGs if
> possible.  The solution we settled on was this: inside our custom image
> handler for generating the images in the SVG, we took the BufferedImage that
> needed to be saved and wrote out two JPG files for it, one for the image and
> one for the mask.  In the setup method in ImageRawJPEGAdapter, I put in some
> custom code to check for this accompanying mask file and add a soft-mask
> using it if it was available.  This is definitely a bit of a hack, but it'll
> work for now, so we should be good.
> 
> I spent some time looking into the relationship between RenderedImage,
> ImageRendered, ImageRenderedAdapter, PDFDocument, and PDFImageXObject, and
> it appears that the images should get garbage collected properly under
> normal circumstances (thanks to the overloaded output( OutputStream ) method
> in PDFImageXObject).  I also spent some time tracing through the code for
> rendering SVGs to PDF, and it looked like the images got cleaned up
> correctly there too.  However, what is clear is that whenever
> ImageRenderedAdapter gets used with our application, OutOfMemoryErrors will
> ensue.  The images shouldn't be too large to fit in memory altogether
> though, so I'm not entirely sure what was causing the issue.  When it does
> go down this path, the program usually gets through a couple pages before it
> errors out, which originally made me think that it was maybe holding images
> in memory for longer than they needed to be, but now I'm not so sure.
> 
> This may not be any news to you, but I figured as long as I had done some
> research into figuring out what the problem was, I would share it in case it
> was helpful.  Thanks again for all the help!
> 
> Dan
> 
> 
> Jeremias Maerki-2 wrote:
> > 
> > Hi Dan,
> > 
> > I'm afraid I don't see any other possibility than to implement this
> > properly. At least the good news is that with JPEG you've got a full
> > example of how to embed that format undecoded into a PDF. Here are
> > some pointers on what needs to be done:
> > 
> > XML Graphics Commons:
> > http://xmlgraphics.apache.org/commons/image-loader.html
> > 
> > [1]
> > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageRawJPEG.java?view=markup
> > [2]
> > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageLoaderRawJPEG.java?view=markup
> > [3]
> > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/PreloaderJPEG.java?view=markup
> > [4]
> > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageLoaderFactoryRaw.java?view=markup
> > [5]
> > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/resources/META-INF/services/org.apache.xmlgraphics.image.loader.spi.ImagePreloader?view=markup
> > [6]
> > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/resources/META-INF/services/org.apache.xmlgraphics.image.loader.spi.ImageLoaderFactory?view=markup
> > 
> > FOP:
> > 
> > [7]
> > http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFImageHandlerRawJPEG.java?view=markup
> > [8]
> > http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/ImageRawJPEGAdapter.java?view=markup
> > [9]
> > http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/META-INF/services/org.apache.fop.render.ImageHandler?view=markup
> > [10]
> > http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/pdf/
> > [11]
> > http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/pdf/PDFDocument.java?view=markup
> > 
> > 
> > First of all, you need a plug-in for the image loading framework in XML
> > Graphics Commons. The "preloader" [3] is responsible for detecting the
> > file format and extracting some basic information about the image
> > (hopefully without loading the full image already). That way the layout
> > engine doesn't have to load the full image into memory. Only the
> > renderer needs access to the full image. In the case of JPEG, it doesn't
> > even have to be loaded in memory. Hopefully, the same will be possible
> > for JPEG 2000. The preloader needs to be registered in [5].
> > 
> > The second step is providing an image class representing the undecoded
> > JPEG 2000 image [1]. Then you need a loader that builds that
> > representation [2] and a factory with metadata for the loader [4].
> > 
> > Once you have that, FOP will be able to provide a JPEG 2000 image in its
> > raw format. At this point, you'll have to teach FOP how to make use of
> > that. A PDF-specific image handler [7] (which is also a plug-in [9])
> > needs to be built. Its presence will tell the image loading framework
> > that it can provide JPEG 2000 images in raw format. Otherwise, it will
> > simply check if ImageIO has a codec for JPEG 2000 (but this means the
> > image gets decoded). The image handler then uses an image adapter [8] to
> > finally embed the image into the PDF. I assume you will also need a few
> > modifications in FOP's PDF library to support the JPXDecode filter [10].
> > 
> > Since JPXDecode is a PDF 1.5 feature, you will also need to introduce a
> > switch [11] between PDF 1.4 and 1.5. That is necessary because of PDF/A
> > and PDF/X functionality which require keeping PDF on version 1.4. So
> > JPEG 2000 should only be available when PDF 1.5 is enabled.
> > 
> > I guess one of the first steps should also be studying the JPEG 2000
> > specification and the PDF specification so you can decide whether the
> > direct embedding of JPEG 2000 images is possible in the first place.
> > Otherwise, you might spend a lot of time on something that may not work
> > in the end. I don't know the JPEG 2000 format so I can't tell if it's
> > possible without diving into this myself.
> > 
> > HTH and good luck!
> > 
> > On 12.09.2009 00:24:41 dan.mccabe wrote:
> >> 
> >> Hey Jeremias,
> >> 
> >> I'm working on this problem with Bill, and it looks like we may be
> >> reaching
> >> a point where we need to try to tackle embedding JPEG 2000 images. 
> >> Assuming
> >> we do need to go down that path, do you have any recommendations for
> >> where
> >> we should start?
> >> 
> >> However, this is assuming that we can't find another way to do what we
> >> need
> >> to do.  Based on your description, it certainly doesn't sound like an
> >> easy
> >> task to get this implemented, so we really only want to do this as a last
> >> resort.  Based on the description of what we are trying to do, do you
> >> have
> >> any suggestions for an alternative approach that might help us reach our
> >> goal?
> >> 
> >> Thanks.
> >> 
> >> 
> >> Jeremias Maerki-2 wrote:
> >> > 
> >> > FOP currently produces PDF 1.4 so there's no support for JPEG 2000,
> >> yet.
> >> > One could (probably) add support for embedding undecoded JPEG 2000
> >> > images (JPXDecode) to FOP and add an option with which to control the
> >> > PDF version produced by FOP. Of course, that means digging into the
> >> > source code of FOP and XML Graphics Commons. I can give you pointers if
> >> > you decide to do that.
> >> > 
> >> > However, I haven't investigated if it's as simple as with JPEG to also
> >> > embed JPEG 2000 images. I mention that since I've once tried to get
> >> > undecoded PNG graphics directly into PDF. After all, the FlateDecode
> >> > filter supports about the same predictors as PNG but I couldn't make
> >> > this work in reasonable time. This just as a caveat.
> >> > 
> > 
> > 
> > 
> > Jeremias Maerki
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
> > For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org
> > 
> > 
> > 
> 
> -- 
> View this message in context: http://www.nabble.com/large-image-embedding-problems-tp25394304p25478390.html
> Sent from the FOP - Users mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
> For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org




Jeremias Maerki



Re: large image embedding problems

Posted by "dan.mccabe" <mc...@gmail.com>.

Hey Jeremias,

First and foremost, I want to thank you for all your help.  I was able to
follow along with your instructions and get an implementation that could
embed raw JP2 images into a PDF in a pretty short amount of time, which I
definitely couldn't have done without your help.  If you have an interest in
the source code for this, I would be more than happy to share it.  The one
caveat with it currently is that it relies on the JAI ImageIO project
(https://jai-imageio-core.dev.java.net/) to be able to parse the header for
the JP2 file in my PreloadJP2 class, so I'm not sure if it would be possible
or not to distribute it with the graphics commons library (it's under the
BSD license).  I took a look under the hood and it appears that it uses
JJ2000 (http://jj2000.epfl.ch/; http://jpeg2000.epfl.ch/) to do some of the
heavy lifting, so if licensing is a problem, there may be some alternatives
that can be explored.

However, after getting familiar with the FOP source code, I think I've found
the root cause of the problem we were experiencing, which has caused us to
go with a slightly different implementation.  I came across the issue after
I had implemented the JP2 support and I was still not seeing the
transparency we needed in the resulting PDF.  I compared the resulting PDF
with one we had generated using PNGs and noticed that all of the PNGs had a
soft-mask associated with them while none of the JP2s did.  After taking a
look at the implementation, I found that I needed to add code to my
ImageRawJP2Adapter to return a soft-mask reference when there was
transparency in the image.
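
For readers unfamiliar with how PDF expresses this, here is a minimal sketch of the two image XObject dictionaries involved. It is not FOP code: the class and method names are hypothetical, and the object number "12 0 R" is a made-up indirect reference. The point it illustrates is that the /SMask entry sits on the color image and points at a separate grayscale image, independent of which /Filter the color data uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of FOP: formats the dictionary of a PDF image XObject.
final class ImageXObjectSketch {

    /** Builds the dictionary text; smaskRef is an indirect reference such as "12 0 R", or null. */
    static String imageDictionary(int width, int height, String filter,
                                  String colorSpace, String smaskRef) {
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("Type", "/XObject");
        entries.put("Subtype", "/Image");
        entries.put("Width", Integer.toString(width));
        entries.put("Height", Integer.toString(height));
        entries.put("ColorSpace", "/" + colorSpace);
        entries.put("BitsPerComponent", "8");
        entries.put("Filter", "/" + filter);
        if (smaskRef != null) {
            // The soft mask is a separate grayscale image object referenced from here.
            entries.put("SMask", smaskRef);
        }
        StringBuilder sb = new StringBuilder("<<");
        for (Map.Entry<String, String> e : entries.entrySet()) {
            sb.append(" /").append(e.getKey()).append(' ').append(e.getValue());
        }
        return sb.append(" >>").toString();
    }

    public static void main(String[] args) {
        // The mask itself: grayscale, no /SMask of its own.
        System.out.println(imageDictionary(4000, 4000, "DCTDecode", "DeviceGray", null));
        // The color image, pointing at the mask (assumed here to be object 12).
        System.out.println(imageDictionary(4000, 4000, "DCTDecode", "DeviceRGB", "12 0 R"));
    }
}
```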

This all worked fine, but it dawned on me that because the transparency was
controlled by the soft-mask and not by the type of image itself, there was
no reason we couldn't use JPG files as long as we found a way to specify an
accompanying mask for it.  Because of some issues with rendering JP2 files
(we were using im4java as an interface to ImageMagick to generate the
images, but there are some hoops you have to jump through to get ImageMagick
to run on some machines), it was definitely preferable to use JPGs if
possible.  The solution we settled on lives inside our custom image handler
for generating the images in the SVG: we take the BufferedImage that needs
to be saved and write out two JPG files for it, one for the image and one
for the mask.  In the setup method in ImageRawJPEGAdapter, I put in some
custom code to check for this accompanying mask file and add a soft-mask
using it if it is available.  This is definitely a bit of a hack, but it'll
work for now, so we should be good.
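
A rough sketch of the "two JPG files" step, using only the standard javax.imageio API. This is not the code described above: the class name, method names, and the ".mask.jpg" naming convention are assumptions made for illustration. It splits an ARGB BufferedImage into an opaque RGB image and an 8-bit grayscale image that holds the alpha channel.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper: splits an image with alpha into a color JPEG and a mask JPEG.
final class AlphaSplitSketch {

    /** Copies the color channels into an opaque RGB image; the alpha is dropped. */
    static BufferedImage colorPlane(BufferedImage argb) {
        BufferedImage rgb = new BufferedImage(argb.getWidth(), argb.getHeight(),
                BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < argb.getHeight(); y++) {
            for (int x = 0; x < argb.getWidth(); x++) {
                rgb.setRGB(x, y, argb.getRGB(x, y) | 0xFF000000);
            }
        }
        return rgb;
    }

    /** Copies the alpha channel into an 8-bit grayscale image (0 = transparent). */
    static BufferedImage alphaPlane(BufferedImage argb) {
        BufferedImage mask = new BufferedImage(argb.getWidth(), argb.getHeight(),
                BufferedImage.TYPE_BYTE_GRAY);
        WritableRaster raster = mask.getRaster();
        for (int y = 0; y < argb.getHeight(); y++) {
            for (int x = 0; x < argb.getWidth(); x++) {
                // Write the raw alpha value, bypassing any color conversion.
                raster.setSample(x, y, 0, argb.getRGB(x, y) >>> 24);
            }
        }
        return mask;
    }

    /** Writes base.jpg and base.mask.jpg (the naming is an assumption of this sketch). */
    static void writePair(BufferedImage argb, File dir, String base) throws IOException {
        ImageIO.write(colorPlane(argb), "jpg", new File(dir, base + ".jpg"));
        ImageIO.write(alphaPlane(argb), "jpg", new File(dir, base + ".mask.jpg"));
    }
}
```

Since JPEG is lossy, a mask stored this way may show slight artifacts along sharp transparency edges; whether that matters depends on the artwork.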

I spent some time looking into the relationship between RenderedImage,
ImageRendered, ImageRenderedAdapter, PDFDocument, and PDFImageXObject, and
it appears that the images should get garbage collected properly under
normal circumstances (thanks to the overridden output( OutputStream ) method
in PDFImageXObject).  I also spent some time tracing through the code for
rendering SVGs to PDF, and it looked like the images got cleaned up
correctly there too.  However, what is clear is that whenever
ImageRenderedAdapter gets used with our application, OutOfMemoryErrors will
ensue.  The images shouldn't be too large to fit in memory all together
though, so I'm not entirely sure what was causing the issue.  When it does
go down this path, the program usually gets through a couple pages before it
errors out, which originally made me think that it was maybe holding images
in memory for longer than they needed to be, but now I'm not so sure.
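
For a sense of scale only, here is a back-of-the-envelope calculation of what fully decoded rasters cost. It uses the figures from the original post (4000x4000 images, 4 per page, 50 pages) and assumes 4 bytes per pixel (ARGB), which is an assumption about the decoded representation, not something measured in FOP. It does not explain the OutOfMemoryErrors; it only shows the order of magnitude involved when an image goes through the decoded path.

```java
// Rough arithmetic sketch; the 4 bytes per pixel figure is an assumption.
final class RasterMemorySketch {

    /** Size in bytes of a fully decoded raster. */
    static long decodedBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        long perImage = decodedBytes(4000, 4000, 4); // 64,000,000 bytes
        long perPage = 4 * perImage;                 // four images on a page
        long perDocument = 50 * perPage;             // fifty pages
        System.out.printf("per image: %d bytes%n", perImage);
        System.out.printf("per page: %d bytes%n", perPage);
        System.out.printf("per document: %d bytes%n", perDocument);
    }
}
```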

This may not be any news to you, but I figured as long as I had done some
research into figuring out what the problem was, I would share it in case it
was helpful.  Thanks again for all the help!

Dan



Re: large image embedding problems

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.

Hi Dan,

I'm afraid I don't see any other possibility than to implement this
properly. At least the good news is that with JPEG you've got a full
example of how to embed that format undecoded into a PDF. Here are
some pointers on what needs to be done:

XML Graphics Commons: http://xmlgraphics.apache.org/commons/image-loader.html

[1] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageRawJPEG.java?view=markup
[2] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageLoaderRawJPEG.java?view=markup
[3] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/PreloaderJPEG.java?view=markup
[4] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/loader/impl/ImageLoaderFactoryRaw.java?view=markup
[5] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/resources/META-INF/services/org.apache.xmlgraphics.image.loader.spi.ImagePreloader?view=markup
[6] http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/resources/META-INF/services/org.apache.xmlgraphics.image.loader.spi.ImageLoaderFactory?view=markup

FOP:

[7] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/PDFImageHandlerRawJPEG.java?view=markup
[8] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/render/pdf/ImageRawJPEGAdapter.java?view=markup
[9] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/META-INF/services/org.apache.fop.render.ImageHandler?view=markup
[10] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/pdf/
[11] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/pdf/PDFDocument.java?view=markup

First of all, you need a plug-in for the image loading framework in XML
Graphics Commons. The "preloader" [3] is responsible for detecting the
file format and extracting some basic information about the image
(hopefully without loading the full image already). That way the layout
engine doesn't have to load the full image into memory. Only the
renderer needs access to the full image. In the case of JPEG, it doesn't
even have to be loaded in memory. Hopefully, the same will be possible
for JPEG 2000. The preloader needs to be registered in [5].
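
To illustrate what "detecting the file format and extracting some basic information" could look like for JP2, here is a standalone sketch that does not use the XML Graphics Commons API at all. The class and method names are hypothetical. It relies on the JP2 box layout as I understand it: a fixed 12-byte signature box, followed by boxes with a 4-byte length and 4-byte type, with the image header box ("ihdr") coming first inside the "jp2h" box and carrying height and then width. Only the first bytes of the file are needed, not the image data.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: recognizes a JP2 file and reads its size from the header boxes.
final class Jp2HeaderSketch {

    // JP2 signature box: length 12, type "jP  ", content 0D 0A 87 0A.
    private static final byte[] SIGNATURE = {
        0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20, 0x0D, 0x0A, (byte) 0x87, 0x0A
    };
    private static final int JP2H = 0x6A703268; // "jp2h"
    private static final int IHDR = 0x69686472; // "ihdr"

    static boolean isJp2(byte[] data) {
        if (data.length < SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (data[i] != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns {width, height}, or null if the header is not found in the given bytes. */
    static int[] readSize(byte[] data) {
        if (!isJp2(data)) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default
        buf.position(SIGNATURE.length);
        while (buf.remaining() >= 8) {
            int boxStart = buf.position();
            long length = buf.getInt() & 0xFFFFFFFFL;
            int type = buf.getInt();
            int headerSize = 8;
            if (length == 1) { // extended 8-byte length follows
                if (buf.remaining() < 8) {
                    return null;
                }
                length = buf.getLong();
                headerSize = 16;
            }
            if (type == JP2H) {
                if (buf.remaining() < 16) {
                    return null;
                }
                buf.getInt(); // length of the ihdr box
                if (buf.getInt() != IHDR) {
                    return null;
                }
                int height = buf.getInt();
                int width = buf.getInt();
                return new int[] {width, height};
            }
            // Length 0 means "to end of file"; anything below the header size is malformed.
            if (length < headerSize || boxStart + length > data.length) {
                return null;
            }
            buf.position((int) (boxStart + length));
        }
        return null;
    }
}
```

A real preloader would feed this the first few kilobytes of the stream and would also want the resolution, which this sketch ignores.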

The second step is providing an image class representing the undecoded
JPEG 2000 image [1]. Then you need a loader that builds that
representation [2] and a factory with metadata for the loader [4].

Once you have that, FOP will be able to provide a JPEG 2000 image in its
raw format. At this point, you'll have to teach FOP how to make use of
that. A PDF-specific image handler [7] (which is also a plug-in [9])
needs to be built. Its presence will tell the image loading framework
that it can provide JPEG 2000 images in raw format. Otherwise, it will
simply check if ImageIO has a codec for JPEG 2000 (but this means the
image gets decoded). The image handler then uses an image adapter [8] to
finally embed the image into the PDF. I assume you will also need a few
modifications in FOP's PDF library to support the JPXDecode filter [10].

Since JPXDecode is a PDF 1.5 feature, you will also need to introduce a
switch [11] between PDF 1.4 and 1.5. That is necessary because the PDF/A
and PDF/X functionality requires keeping PDF on version 1.4. So
JPEG 2000 should only be available when PDF 1.5 is enabled.
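
The rule in the previous paragraph can be sketched as a small gate. None of these names exist in FOP; the enum, the method names, and the boolean flag are purely illustrative of the logic: an active PDF/A or PDF/X profile pins the output to 1.4, and JPXDecode is only allowed from 1.5 on.

```java
// Hypothetical sketch of the version switch; not FOP's actual API.
final class PdfVersionGateSketch {

    enum PdfVersion { V1_4, V1_5 }

    /** An active PDF/A or PDF/X profile forces the document back to PDF 1.4. */
    static PdfVersion effectiveVersion(PdfVersion requested, boolean pdfAOrPdfXActive) {
        return pdfAOrPdfXActive ? PdfVersion.V1_4 : requested;
    }

    /** JPXDecode needs PDF 1.5; every other filter is allowed through unchanged here. */
    static boolean mayUseFilter(String filter, PdfVersion version) {
        return !"JPXDecode".equals(filter) || version.compareTo(PdfVersion.V1_5) >= 0;
    }

    public static void main(String[] args) {
        PdfVersion v = effectiveVersion(PdfVersion.V1_5, true);
        System.out.println(v + " allows JPXDecode: " + mayUseFilter("JPXDecode", v));
    }
}
```

When the gate says no, the image loading framework would have to fall back to a decoded image, which is the slow path this thread is trying to avoid.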

I guess one of the first steps should also be studying the JPEG 2000
specification and the PDF specification so you can decide whether the
direct embedding of JPEG 2000 images is possible in the first place.
Otherwise, you might spend a lot of time on something that may not work
in the end. I don't know the JPEG 2000 format so I can't tell if it's
possible without diving into this myself.

HTH and good luck!


Jeremias Maerki


Re: large image embedding problems

Posted by "dan.mccabe" <mc...@gmail.com>.

Hey Jeremias,

I'm working on this problem with Bill, and it looks like we may be reaching
a point where we need to try to tackle embedding JPEG 2000 images.  Assuming
we do need to go down that path, do you have any recommendations for where
we should start?

However, this is assuming that we can't find another way to do what we need
to do.  Based on your description, it certainly doesn't sound like an easy
task to get this implemented, so we really only want to do this as a last
resort.  Based on the description of what we are trying to do, do you have
any suggestions for an alternative approach that might help us reach our
goal?

Thanks.




Re: large image embedding problems

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.

FOP currently produces PDF 1.4, so there's no support for JPEG 2000 yet.
One could (probably) add support for embedding undecoded JPEG 2000
images (JPXDecode) to FOP and add an option with which to control the
PDF version produced by FOP. Of course, that means digging into the
source code of FOP and XML Graphics Commons. I can give you pointers if
you decide to do that.

However, I haven't investigated whether embedding JPEG 2000 images is as
simple as it is with JPEG. I mention that since I once tried to get
undecoded PNG graphics directly into PDF. After all, the FlateDecode
filter supports about the same predictors as PNG, but I couldn't make
this work in reasonable time. This is just meant as a caveat.


Jeremias Maerki
