Posted to dev@pdfbox.apache.org by "Michael Klink (Jira)" <ji...@apache.org> on 2019/10/25 09:11:00 UTC

[jira] [Comment Edited] (PDFBOX-4674) PDF Page Render Background Image has Gray Smudges

    [ https://issues.apache.org/jira/browse/PDFBOX-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959558#comment-16959558 ] 

Michael Klink edited comment on PDFBOX-4674 at 10/25/19 9:10 AM:
-----------------------------------------------------------------

When I try to open your PDF in Adobe Reader, it warns me that an error exists on the page and that it may not be displayed correctly.

Errors on a page mean that different viewers may display it differently: PDFBox renders the scanned page with those gray shadows, while Adobe Reader here does not display the scanned page at all.

Garbage in - garbage out.

You should consider using non-broken PDFs.

In more detail:

The scanned images on the pages of your PDF are invalid. Their dictionaries have these values:

{noformat}
/Filter/DCTDecode
/BitsPerComponent 5
{noformat}

According to the PDF specification, though:

{panel:title=My title}
*BitsPerComponent* - integer - _(Required except for image masks and images that use the *JPXDecode* filter)_ The number of bits used to represent each colour component. Only a single value shall be specified; the number of bits shall be the same for all colour components. The value shall be _1, 2, 4, 8,_ or (from PDF 1.5) _16_. If *ImageMask* is _true_, this entry is optional, but if specified, its value shall be _1_.
If the image stream uses a filter, the value of *BitsPerComponent* shall be consistent with the size of the data samples that the filter delivers. In particular, a *CCITTFaxDecode* or *JBIG2Decode* filter shall always deliver 1-bit samples, a *RunLengthDecode* or *DCTDecode* filter shall always deliver 8-bit samples, and an *LZWDecode* or *FlateDecode* filter shall deliver samples of a specified size if a predictor function is used.
{panel}

Thus, *BitsPerComponent* must be one of _1, 2, 4, 8,_ and _16_ anyway, and with a *DCTDecode* filter it must be _8_.

In your case it is _5_, i.e. invalid.
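
To make the rule concrete, here is a minimal sketch of the check described in the quoted specification text. It is plain Java with no PDFBox dependency, and the class and method names ({{BitsPerComponentCheck}}, {{isValid}}) are made up for illustration; this is not how PDFBox validates images. It also ignores the image mask and *JPXDecode* exceptions mentioned in the quote.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: encodes only the rule quoted above.
class BitsPerComponentCheck {
    // Values the quoted text allows in general (16 only from PDF 1.5 on).
    private static final List<Integer> ALLOWED = Arrays.asList(1, 2, 4, 8, 16);

    /**
     * @param filterName       the image's /Filter name without the slash, or null if unfiltered
     * @param bitsPerComponent the image's /BitsPerComponent value
     * @return true if the combination is consistent with the quoted rule
     */
    static boolean isValid(String filterName, int bitsPerComponent) {
        if (!ALLOWED.contains(bitsPerComponent)) {
            return false; // e.g. 5 is never allowed
        }
        if (filterName == null) {
            return true; // no filter, so no further constraint from the quote
        }
        switch (filterName) {
            case "CCITTFaxDecode":
            case "JBIG2Decode":
                return bitsPerComponent == 1; // these filters deliver 1-bit samples
            case "RunLengthDecode":
            case "DCTDecode":
                return bitsPerComponent == 8; // these filters deliver 8-bit samples
            default:
                return true; // LZWDecode, FlateDecode etc.: any allowed size
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("DCTDecode", 5)); // the values from the attached PDF
        System.out.println(isValid("DCTDecode", 8)); // what a DCTDecode image should declare
    }
}
```

Running {{main}} prints {{false}} for the combination found in the attached PDF and {{true}} for the value a *DCTDecode* image is expected to declare.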

PDFBox apparently tries to render it nonetheless and the output is the garbage you observed:

 !es-page-image2455431271065294360.png! 


was (Author: mkl):
When I try to open your PDF in Adobe Reader, it warns me that an error exists on the page and that it may not be displayed correctly.

Errors on the page imply that the page may be displayed differently on different viewers. You should consider using non-broken PDFs.

> PDF Page Render Background Image has Gray Smudges
> -------------------------------------------------
>
>                 Key: PDFBOX-4674
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4674
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.17
>            Reporter: Joseph Jezerinac
>            Priority: Major
>         Attachments: bad_page_image.pdf, es-page-image2455431271065294360.png
>
>
> The following test produces a PNG that has gray smudges in it.  I've attached the PDF and the PNG that is produced.
>  
> {code:java}
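> import java.awt.image.BufferedImage;
> import java.io.File;
> import java.io.IOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import javax.imageio.ImageIO;
> import org.apache.commons.io.FileUtils;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.rendering.PDFRenderer;
> import org.junit.Assert;
> import org.junit.Test;
>
> public class TestPdfPageImage {
>     @Test
>     public void testGetPageImage() throws IOException {
>         // Load the attached PDF from the test classpath
>         try (PDDocument pdDocument = PDDocument.load(FileUtils.toFile(getClass().getResource("/bad_page_image.pdf")))) {
>             final PDFRenderer pdfRenderer = new PDFRenderer(pdDocument);
>             // Render the first page (index 0)
>             final BufferedImage bufferedImage = pdfRenderer.renderImage(0);
>             final Path tempPath = Files.createTempFile("es-page-image", ".png");
>             try {
>                 // Write the rendering to a temporary PNG and check that it is non-empty
>                 final File tempFile = tempPath.toFile();
>                 ImageIO.write(bufferedImage, "png", tempFile);
>                 Assert.assertTrue(Files.size(tempPath) > 0);
>             } finally {
>                 Files.delete(tempPath);
>             }
>         }
>     }
> }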
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
