Posted to dev@pdfbox.apache.org by "Andrea Vacondio (JIRA)" <ji...@apache.org> on 2015/07/23 19:30:05 UTC

[jira] [Commented] (PDFBOX-2845) Error parsing PDF

    [ https://issues.apache.org/jira/browse/PDFBOX-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639199#comment-14639199 ] 

Andrea Vacondio commented on PDFBOX-2845:
-----------------------------------------

I took a quick look. The issue is that object 515 is a stream whose length is an indirect object (554) that is defined in an object stream. PDFBox currently requires that the stream length not be defined in an object stream, as per PDF spec chap. 7.5.7.
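
To illustrate the kind of check that rejects this file, here is a minimal, self-contained sketch. It is not PDFBox's actual code: the class StreamLengthCheck, the XrefEntry type, the byte offset 1000 and the object stream number 600 are all hypothetical, chosen only to mirror the situation above (stream 515 stored directly in the file, its length object 554 stored inside an object stream). Only the exception message is taken from the report.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not PDFBox code: models a cross-reference table in which
// an object is either at a byte offset in the file or inside an object stream.
class StreamLengthCheck {

    static final class XrefEntry {
        final boolean compressed; // true if the object lives inside an object stream
        final long value;         // byte offset, or number of the containing object stream

        private XrefEntry(boolean compressed, long value) {
            this.compressed = compressed;
            this.value = value;
        }

        static XrefEntry atOffset(long offset) {
            return new XrefEntry(false, offset);
        }

        static XrefEntry inObjectStream(long objectStreamNumber) {
            return new XrefEntry(true, objectStreamNumber);
        }
    }

    // Returns the byte offset of the object, or fails if the object is missing
    // or compressed. A strict parser would apply this to a stream's /Length.
    static long requireUncompressedOffset(Map<Integer, XrefEntry> xref, int objNumber)
            throws IOException {
        XrefEntry entry = xref.get(objNumber);
        if (entry == null || entry.compressed) {
            throw new IOException(
                "Object must be defined and must not be compressed object: " + objNumber + ":0");
        }
        return entry.value;
    }

    public static void main(String[] args) {
        Map<Integer, XrefEntry> xref = new HashMap<>();
        xref.put(515, XrefEntry.atOffset(1000L));      // the stream itself (assumed offset)
        xref.put(554, XrefEntry.inObjectStream(600L)); // its length (assumed object stream number)
        try {
            requireUncompressedOffset(xref, 515); // passes: stored directly in the file
            requireUncompressedOffset(xref, 554); // fails: stored in an object stream
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A more lenient parser could instead resolve object 554 by decoding the object stream that contains it before reading the stream body; the sketch only shows the strict behaviour.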

> Error parsing PDF
> -----------------
>
>                 Key: PDFBOX-2845
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2845
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.0
>            Reporter: Christopher Clark
>             Fix For: 2.0.0
>
>
> I get the following error when parsing this pdf:  http://jmlr.csail.mit.edu/proceedings/papers/v28/ranganath13.pdf
> java.io.IOException: Object must be defined and must not be compressed object: 554:0
> Stack trace:
> Exception in thread "main" java.io.IOException: Object must be defined and must not be compressed object: 554:0
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:682)
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:646)
>         at org.apache.pdfbox.pdfparser.COSParser.getLength(COSParser.java:847)
>         at org.apache.pdfbox.pdfparser.COSParser.parseCOSStream(COSParser.java:906)
>         at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:732)
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:693)
>         at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:646)
>         at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:607)
>         at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:198)
>         at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:225)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:848)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:793)
>         at org.apache.pdfbox.tools.ExtractText.startExtraction(ExtractText.java:192)
>         at org.apache.pdfbox.tools.ExtractText.main(ExtractText.java:81)
>         at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:55)
> Note this problem does not occur in 1.8.9


