Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2018/11/24 14:08:00 UTC

[jira] [Comment Edited] (PDFBOX-4385) IOException "expected number, actual=COSFloat{18446744073430152624}" when loading PDF

    [ https://issues.apache.org/jira/browse/PDFBOX-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697832#comment-16697832 ] 

Tilman Hausherr edited comment on PDFBOX-4385 at 11/24/18 2:07 PM:
-------------------------------------------------------------------

Of course the PDF is invalid. 18446744073430152624 is not a valid page object number, and it indicates that the software your client used to create the file has a bug. Parsing on demand is a strategy which we don't support yet (although it would have its advantages). It means parsing only what we need. The part with the bad object number is in the structure tree, which isn't needed unless you're blind (very simplified, there are other uses too). PDFBox doesn't use the structure tree for what we usually do (rendering, text extraction, signing, etc.), although a basic API exists.
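
To illustrate why that number cannot be a real object number, here is a minimal sketch using only java.math.BigInteger. The class and method names (ObjectNumberCheck, fitsInLong, asSigned64) are made up for this example and are not part of PDFBox. The value is just below 2^64, so it does not fit into a signed 64-bit long. Read as an unsigned 64-bit number it corresponds to a small negative value, which may hint at a signed/unsigned mix-up in the creator software, although that is only a guess.

```java
import java.math.BigInteger;

public class ObjectNumberCheck {

    /** True if the token is a non-negative integer that fits in a signed 64-bit long. */
    static boolean fitsInLong(String token) {
        BigInteger v = new BigInteger(token);
        // A non-negative value fits in a long only if it needs at most 63 bits.
        return v.signum() >= 0 && v.bitLength() <= 63;
    }

    /** Keeps only the low 64 bits, i.e. reinterprets the value as a signed 64-bit long. */
    static long asSigned64(String token) {
        return new BigInteger(token).longValue();
    }

    public static void main(String[] args) {
        String token = "18446744073430152624"; // the object number from the report
        System.out.println("fits in long: " + fitsInLong(token));
        System.out.println("as signed 64-bit: " + asSigned64(token));
    }
}
```

For the number in this report, fitsInLong returns false and asSigned64 returns -279398992, because 18446744073430152624 = 2^64 - 279398992.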


was (Author: tilman):
Of course the PDF is invalid. 18446744073430152624 is not a valid page object number and it indicates the creator software of your client has a bug. Parsing on demand is a strategy which we don't support (although it has its advantages): parse only the stuff we need, and the part with the bad object number is in the structure tree which isn't needed unless you're blind (very simplified, there are other uses too). But the structure tree isn't used by PDFBox for what we usually do (rendering, text extraction, signing, etc) although a basic API exists.

> IOException "expected number, actual=COSFloat{18446744073430152624}" when loading PDF 
> --------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-4385
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4385
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.12
>         Environment: Mac OS 10.14.1
>            Reporter: Kasper Schnack
>            Priority: Major
>
> On a PDF document, which opens fine with Adobe Reader and Preview on Mac OS, the PDDocument.load() method throws the following:
> java.io.IOException: expected number, actual=COSFloat{18446744073430152624} at offset 33182
>  at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:166)
>  at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:279)
>  at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
>  at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:862)
>  at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:905)
>  at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:874)
>  at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:794)
>  at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:754)
>  at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:185)
>  at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:220)
>  at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1160)
>  at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1057)
> Sorry the material is sensitive so I can't attach it :(
>  
> However if I cat the file it looks like this around the offset:
> 48 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 15 >>
> endobj
> 49 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 16 >>
> endobj
> 50 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 17 >>
> endobj
> 51 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 18 >>
> endobj
> 52 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 18446744073430152624 0 R /K [ 99 0 R
> 100 0 R ] >>
> endobj
> 99 0 obj
> << /Type /StructElem /S /Span /P 52 0 R /Pg 2 0 R /K 19 >>
> endobj
> 100 0 obj
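
When a file cannot be shared, a rough way to locate references like the one in object 52 is to scan the raw text for indirect references with implausibly long object numbers. The following is a sketch only: the class name SuspiciousRefScanner and the 10-digit threshold are assumptions made for this example, and the approach only sees objects stored uncompressed, as in the cat output above. It will miss anything inside compressed object streams.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SuspiciousRefScanner {

    // Matches an indirect reference of the form "<object number> <generation> R".
    private static final Pattern REF = Pattern.compile("(\\d+)\\s+(\\d+)\\s+R\\b");

    // Hypothetical threshold: object numbers with more digits than this are reported.
    private static final int MAX_DIGITS = 10;

    /** Returns every reference in the text whose object number has more than MAX_DIGITS digits. */
    static List<String> findSuspicious(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = REF.matcher(text);
        while (m.find()) {
            if (m.group(1).length() > MAX_DIGITS) {
                hits.add(m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "<< /Type /StructElem /S /P /P 30 0 R "
                + "/Pg 18446744073430152624 0 R /K [ 99 0 R\n100 0 R ] >>";
        for (String hit : findSuspicious(sample)) {
            System.out.println("suspicious reference: " + hit);
        }
    }
}
```

Run against the dictionary of object 52, this flags only the /Pg reference. The references to objects 30, 99 and 100 pass.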



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
