Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/06/13 01:27:03 UTC
[jira] [Comment Edited] (PDFBOX-2133) Parsing of a Type1 font fails with a NumberFormatException

    [ https://issues.apache.org/jira/browse/PDFBOX-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029998#comment-14029998 ] 

John Hewson edited comment on PDFBOX-2133 at 6/12/14 11:26 PM:
---------------------------------------------------------------

{quote}
I am not sure whether the parser or the PDF is wrong, but the fact that it renders fine in Acrobat and in 1.7.x indicates that the former is true.
{quote}

Ha, I wish! Unfortunately Acrobat is able to parse all manner of corrupt Type 1 font files. In this case, as Tilman observed, BlueShift is supposed to be an integer, yet in this file it is a float.

However, if Acrobat can read it, then we want PDFBox to be able to read it too, so I've modified the Type 1 parser to always parse integers as floats and then truncate them to ints. That way we avoid similar issues in the future. I've added this to the trunk in [r1602311|http://svn.apache.org/r1602311].
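
To illustrate the parse-as-float-then-truncate idea, here is a minimal sketch. It is not the code from the commit: the class and method names (LenientIntParser, parseLenientInt) are hypothetical and not part of FontBox, and the sample tokens are made up.

```java
final class LenientIntParser {

    // Hypothetical helper, not the FontBox API: accepts tokens that should be
    // integers but may have been written as floats (for example "7.0").
    static int parseLenientInt(String token) {
        // Parse as a float first so a decimal point does not raise
        // NumberFormatException, then truncate toward zero with a cast.
        // Assumption: values are small enough to be represented exactly as a float.
        return (int) Float.parseFloat(token);
    }

    public static void main(String[] args) {
        System.out.println(parseLenientInt("7"));    // well-formed integer token
        System.out.println(parseLenientInt("7.0"));  // integer written as a float
        System.out.println(parseLenientInt("-3.9")); // fraction is dropped, not rounded

        try {
            // A strict integer parse rejects the float form.
            Integer.parseInt("7.0");
        } catch (NumberFormatException e) {
            System.out.println("strict parse failed: " + e.getMessage());
        }
    }
}
```

The cast truncates toward zero rather than rounding, so "-3.9" becomes -3. A token that is not a number at all still throws NumberFormatException, so only the integer-versus-float mismatch is tolerated.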


was (Author: jahewson):
{quote}
I am not sure whether the parser or the PDF is wrong, but the fact that it renders fine in Acrobat and in 1.7.x indicates that the former is true.
{quote}

Ha, I wish! Unfortunately Acrobat is able to parse all manner of corrupt Type 1 font files. In this case, as Tilman observed, BlueShift is supposed to be an integer, yet in this file it is a float.

However, if Acrobat can read it then we want PDFBox to be able to read it, so I've modified the Type 1 parser so that it will always parse integers as floats and then truncate them to ints - that way we avoid any future similar issues. I've added this to the trunk in [r1602311.|http://svn.apache.org/r1602311.].

> Parsing of a Type1 font fails with a NumberFormatException
> ----------------------------------------------------------
>
>                 Key: PDFBOX-2133
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2133
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.0
>            Reporter: Petr Slaby
>            Assignee: John Hewson
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: 000116.pdf, 000304.pdf, testrun.log
>
>
> When rendering the attached PDF, parsing of a font fails with a NumberFormatException. Many NullPointerExceptions and "missing fonts" messages are then reported. The PDF rendered fine in our modified 1.7.x, where fonts were read using AWT. I did not try with the current 1.8.x. Stack traces are attached.
> Note: This is just a file from my test suite, not a production problem. I am not sure whether the parser or the PDF is wrong, but the fact that it renders fine in Acrobat and in 1.7.x indicates that the former is true. The offending font is F2; if I catch and ignore the runtime exception in PDResources#getFonts(), it is then reported as missing in the PageDrawer.


