Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2014/03/26 18:23:21 UTC

[jira] [Commented] (PDFBOX-1086) Error when decoding CCITT compressed data that contains EOLs, fill bits etc.

    [ https://issues.apache.org/jira/browse/PDFBOX-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948169#comment-13948169 ] 

Tilman Hausherr commented on PDFBOX-1086:
-----------------------------------------

I fixed two of the three decoders with respect to fill bits. (I could fix the third one too, but would prefer to have a test file first.) Now only PDFBOX-457 is left. It could be an EOL, but I can neither prove nor disprove that theory. An EOL would make no sense in a G4-encoded document, at least according to Wikipedia:
https://en.wikipedia.org/wiki/Group_4_compression
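
For readers unfamiliar with the terms: in T.4 an EOL is the 12-bit code 000000000001, and fill bits are extra 0 bits placed in front of it. A decoder that tolerates fill therefore has to accept eleven or more 0 bits followed by a 1. The following is only a minimal sketch of that check, not the code committed to PDFBox; the class and method names (EolScanner, findEndOfEol) are hypothetical, and MSB-first bit order is assumed.

```java
/** Hypothetical helper, not part of PDFBox: locates a T.4 EOL, tolerating fill bits. */
final class EolScanner {
    private EolScanner() {
    }

    /** Reads one bit, assuming MSB-first order within each byte. */
    static int bitAt(byte[] data, int bitPos) {
        return (data[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
    }

    /**
     * Scans forward from startBit for the next EOL: a run of at least
     * eleven 0 bits (any surplus zeros are fill) terminated by a 1 bit.
     *
     * @return the bit position just after the EOL, or -1 if none is found
     */
    static int findEndOfEol(byte[] data, int startBit) {
        int zeros = 0;
        int totalBits = data.length * 8;
        for (int pos = startBit; pos < totalBits; pos++) {
            if (bitAt(data, pos) == 0) {
                zeros++;
            } else {
                if (zeros >= 11) {
                    return pos + 1; // EOL ends here, fill bits included
                }
                zeros = 0; // run too short, this 1 belongs to other data
            }
        }
        return -1;
    }
}
```

With this sketch, the bytes 0x00 0x10 (an EOL without fill) and 0x00 0x01 (the same EOL preceded by four fill bits) are both recognized, while 0x00 0x20 (only ten zeros before the 1) is not.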

> Error when decoding CCITT compressed data that contains EOLs, fill bits etc.
> ----------------------------------------------------------------------------
>
>                 Key: PDFBOX-1086
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1086
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>            Reporter: Jeremias Maerki
>            Assignee: Jeremias Maerki
>              Labels: CCITTFaxDecode, ccitt
>
> The TIFFFaxDecoder class (originally coming from JAI via XML Graphics Commons) does not handle cases like EOLs between lines and in front. But the PDF CCITTFaxDecode filter needs to allow many different variants of the encoding. Apparently, TIFF has a relatively restricted way of encoding CCITT data, so TIFFFaxDecoder was not written to be as flexible as we need it. Ideally, PDFBox should handle anything that gets thrown at it.
> It appears that it would be rather difficult to retrofit TIFFFaxDecoder with the necessary flexibility. So, new decoders for T.4 and T.6 should probably be written.


