Posted to dev@pdfbox.apache.org by "Pierre Huttin (JIRA)" <ji...@apache.org> on 2013/03/21 09:17:14 UTC

[jira] [Commented] (PDFBOX-1544) Not able to loadNonSeq document larger than 2GB

    [ https://issues.apache.org/jira/browse/PDFBOX-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608738#comment-13608738 ] 

Pierre Huttin commented on PDFBOX-1544:
---------------------------------------

After a quick patch (changing the BaseParser.readInt method into BaseParser.readLong and fixing the references to it), I'm able to open my 21GB file.

However, it took 3h30 to open the document, and the scratch file was around the same size as the document.
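
To illustrate the kind of change described above, here is a minimal sketch. It is not the actual patch and not PDFBox source: the class XrefOffsetSketch and both helper methods are hypothetical names used only for illustration, and the error text for the long case is an assumption. It only shows why the startxref offset from the stack trace cannot be returned as an int but parses fine as a long.

```java
import java.io.IOException;

// Hypothetical sketch, not PDFBox source: all names here are illustrative only.
class XrefOffsetSketch {

    // Mirrors the failing behaviour: the offset is forced into a 32-bit int.
    static int parseOffsetAsInt(String digits) throws IOException {
        try {
            return Integer.parseInt(digits);
        } catch (NumberFormatException e) {
            throw new IOException("Error: Expected an integer type, actual='" + digits + "'");
        }
    }

    // Mirrors the patched behaviour: the offset is read as a 64-bit long.
    static long parseOffsetAsLong(String digits) throws IOException {
        try {
            return Long.parseLong(digits);
        } catch (NumberFormatException e) {
            // Assumed message wording, modelled on the int case above.
            throw new IOException("Error: Expected a long type, actual='" + digits + "'");
        }
    }

    public static void main(String[] args) throws IOException {
        String startXref = "22580639698"; // offset taken from the stack trace in this issue
        System.out.println("as long: " + parseOffsetAsLong(startXref));
        try {
            parseOffsetAsInt(startXref);
        } catch (IOException e) {
            System.out.println("as int:  " + e.getMessage());
        }
    }
}
```

The real change also has to widen every caller that stores the returned offset (the "fixing references" part), which this sketch does not attempt to show.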
                
> Not able to loadNonSeq document larger than 2GB
> -----------------------------------------------
>
>                 Key: PDFBOX-1544
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1544
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, PDModel
>    Affects Versions: 1.7.1
>            Reporter: Pierre Huttin
>
> When I try to open a document larger than 2GB (I tested with a 21GB document) using the method PDDocument.loadNonSeq(), the PDFParser throws the following error:
> Exception in thread "main" java.io.IOException: Error: Expected an integer type, actual='22580639698'
> 	at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1608)                            
> 	at org.apache.pdfbox.pdfparser.PDFParser.parseStartXref(PDFParser.java:677)                        
> 	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:237)
> 	at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:574)       
> 	at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1124)                           
> 	at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1107)                           
> 	
> The problem seems to come from BaseParser.readInt, which tries to return the value as an int; 22580639698 is larger than the maximum int value (2147483647).
