Posted to users@pdfbox.apache.org by David Newton <da...@wrycan.com> on 2011/03/07 23:57:27 UTC

Garbled output for page images in PDFToImage

Hi,

I've looked back through the message archives on this and found a few 
questions related to this, but no solutions. Through both the PDFToImage 
command line tool and the convertToImage() method, some pictures in PDFs 
come out with multicoloured 'bands' across them, or are degraded so 
badly that they are illegible. From iterating through the properties of 
the images in a sample document, it looks like this happens to PNG images 
which use a DeviceRGB ColorSpace, but those that use an Indexed 
ColorSpace are fine.

An issue that demonstrates the problem, with 'before' and 'after' 
images, is https://issues.apache.org/jira/browse/PDFBOX-958 .

Does anyone have any further ideas as to what might be going on, or what 
could be done to fix or work around this? The issue seems to be 
somewhere in or below the getRGBImage() method, as images written 
straight out from here already show the deterioration, before they're 
included in a complete PDF page image.

Thanks
David

Re: Garbled output for page images in PDFToImage

Posted by David Newton <da...@wrycan.com>.
Thanks a lot, Andreas - I'd traced it down to a problem with the 
FlateFilter, but was at a loss as to how to correct it. Your fix sorts 
out the tearing I was seeing on my own PDFs, as well as that one in the 
issue :)

David

Andreas Lehmkuehler wrote:
> Hi,
> This issue is fixed in the current trunk. See PDFBOX-958 for further 
> details.
>
> BR
> Andreas Lehmkühler
>

Re: Garbled output for page images in PDFToImage

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,

On 07.03.2011 23:57, David Newton wrote:
> Hi,
>
> I've looked back through the message archives on this and found a few questions
> related to this, but no solutions. Through both the PDFToImage command line tool
> and the convertToImage() method, some pictures in PDFs come out with
> multicoloured 'bands' across them, or are degraded so badly that they are
> illegible. From iterating through the properties of the images in a sample
> document, it looks like this happens to PNG images which use a DeviceRGB
> ColorSpace, but those that use an Indexed ColorSpace are fine.
>
> An issue that demonstrates the problem, with 'before' and 'after' images, is
> https://issues.apache.org/jira/browse/PDFBOX-958 .
>
> Does anyone have any further ideas as to what might be going on, or what could
> be done to fix or work around this? The issue seems to be somewhere in or below
> the getRGBImage() method, as images written straight out from here already show
> the deterioration, before they're included in a complete PDF page image.
This issue is fixed in the current trunk. See PDFBOX-958 for further details.

BR
Andreas Lehmkühler
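
Editor's sketch (not from the thread): the messages above place the fault in the FlateFilter and note that DeviceRGB images were affected while Indexed ones were not, but they do not say what the actual defect was. One way a Flate decoder can produce that pattern is by undoing a PNG-style predictor with the wrong pixel width. The Java below is a minimal, self-contained illustration of that general mechanism only. It does not use or reproduce PDFBox code; the class and method names (SubPredictorDemo, encodeSub, decodeSub) are hypothetical, and the assumption that the images in question used a PNG "Sub" predictor is mine.

```java
import java.util.Arrays;

public class SubPredictorDemo {

    // PNG "Sub" filter for one row: each byte is stored as the difference
    // to the byte one whole pixel (bytesPerPixel positions) to its left.
    public static byte[] encodeSub(byte[] raw, int bytesPerPixel) {
        byte[] out = new byte[raw.length];
        for (int i = 0; i < raw.length; i++) {
            int left = i >= bytesPerPixel ? raw[i - bytesPerPixel] & 0xFF : 0;
            out[i] = (byte) ((raw[i] & 0xFF) - left);
        }
        return out;
    }

    // Reverse of encodeSub. The result is only correct when bytesPerPixel
    // matches the value that was used for encoding.
    public static byte[] decodeSub(byte[] filtered, int bytesPerPixel) {
        byte[] out = new byte[filtered.length];
        for (int i = 0; i < filtered.length; i++) {
            int left = i >= bytesPerPixel ? out[i - bytesPerPixel] & 0xFF : 0;
            out[i] = (byte) ((filtered[i] & 0xFF) + left);
        }
        return out;
    }

    public static void main(String[] args) {
        // Three identical RGB pixels, 3 bytes per pixel (DeviceRGB-like row).
        byte[] rgbRow = { (byte) 200, 100, 50, (byte) 200, 100, 50, (byte) 200, 100, 50 };
        byte[] rgbFiltered = encodeSub(rgbRow, 3);

        // Decoding with the right pixel width restores the row.
        System.out.println(Arrays.equals(rgbRow, decodeSub(rgbFiltered, 3)));

        // Decoding as if each pixel were 1 byte wide mixes the colour channels.
        System.out.println(Arrays.equals(rgbRow, decodeSub(rgbFiltered, 1)));

        // Palette indices, 1 byte per pixel (Indexed-like row): a decoder that
        // always assumes 1 byte per pixel happens to be right here.
        byte[] indexedRow = { 5, 5, 7, 7 };
        System.out.println(Arrays.equals(indexedRow, decodeSub(encodeSub(indexedRow, 1), 1)));
    }
}
```

In this sketch a decoder that always treats a pixel as one byte wide would handle 8-bit palette images correctly and corrupt 24-bit RGB rows, which matches the symptom described in the thread. Whether that is what PDFBOX-958 actually fixed should be checked against the issue itself.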