Posted to dev@pdfbox.apache.org by "Petr Slaby (JIRA)" <ji...@apache.org> on 2016/05/04 13:08:12 UTC

[jira] [Created] (PDFBOX-3338) CCITT Fax decoder fails

Petr Slaby created PDFBOX-3338:
----------------------------------

             Summary: CCITT Fax decoder fails
                 Key: PDFBOX-3338
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3338
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 2.0.1, 1.8.12
            Reporter: Petr Slaby


I have a PDF that does not render in PDFBox. It contains scanned pages encoded as CCITT Fax TIFFs. On every page, the decoder fails with IOException("TIFFFaxDecoder: EOL encountered in black run.") (or the same message with "white" instead of "black"). Unfortunately, the PDF contains sensitive data, so I cannot share it.
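
For readers who want to check where end-of-line codes occur in such a stream, here is a minimal diagnostic sketch. It relies on the T.4 definition of EOL as at least eleven 0 bits followed by a single 1 bit. The class and method names (EolScan, countEol) are hypothetical and not part of PDFBox, and the sketch assumes MSB-first bit order, which may not match every file.

```java
// Hypothetical helper, not part of PDFBox or the attached test program.
final class EolScan {

    // Counts T.4 EOL codes: a 1 bit preceded by at least eleven 0 bits.
    // Assumption: bits are read MSB-first within each byte.
    public static int countEol(byte[] data) {
        int zeros = 0;   // length of the current run of 0 bits
        int count = 0;   // number of EOL codes seen so far
        for (byte b : data) {
            for (int bit = 7; bit >= 0; bit--) {
                if (((b >> bit) & 1) == 0) {
                    zeros++;
                } else {
                    if (zeros >= 11) {
                        count++;
                    }
                    zeros = 0;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // 0x00 0x10 is eleven 0 bits followed by a 1 bit, i.e. one EOL.
        byte[] sample = {0x00, 0x10};
        System.out.println("EOL codes found: " + countEol(sample));
    }
}
```

This only locates EOL codes. It does not decode runs, so it cannot tell by itself whether an EOL is misplaced.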

As a test, I replaced TIFFFaxDecoder with the CCITTFaxDecoderStream class from the Twelve Monkeys ImageIO library. After that everything worked, and PDFToImage produced the expected result.

I have extracted the first few bytes of the TIFF to show the problem without sharing the confidential content. See the attached test program and test file.
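
A truncated sample of this kind can be produced along the following lines. This is only a sketch, not the attached test program. The class name StreamPrefix, the method readPrefix and the command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for cutting a small, shareable prefix out of a larger file.
final class StreamPrefix {

    // Reads at most maxBytes from the start of source.
    // Returns fewer bytes if the file is shorter than maxBytes.
    public static byte[] readPrefix(Path source, int maxBytes) throws IOException {
        try (InputStream in = Files.newInputStream(source)) {
            byte[] buf = new byte[maxBytes];
            int total = 0;
            while (total < maxBytes) {
                int n = in.read(buf, total, maxBytes - total);
                if (n < 0) {
                    break; // end of file reached
                }
                total += n;
            }
            return Arrays.copyOf(buf, total);
        }
    }

    // Usage: java StreamPrefix <source> <target> <byteCount>
    public static void main(String[] args) throws IOException {
        Path source = Paths.get(args[0]);
        Path target = Paths.get(args[1]);
        int count = Integer.parseInt(args[2]);
        Files.write(target, readPrefix(source, count));
    }
}
```

A prefix of a CCITT stream ends in the middle of the image, so a decoder is expected to stop early on it. Such a sample is only useful if the failure occurs within the bytes that were kept.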

I tested this against the latest trunk version of PDFBox, but I believe the decoder implementation is essentially the same in all versions.




