Posted to dev@pdfbox.apache.org by "Manoj Patel (JIRA)" <ji...@apache.org> on 2013/01/23 10:50:12 UTC

[jira] [Comment Edited] (PDFBOX-1498) Index Out Of Bounds Exception while reading large PDF Document

    [ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560504#comment-13560504 ] 

Manoj Patel edited comment on PDFBOX-1498 at 1/23/13 9:48 AM:
--------------------------------------------------------------

Sorry, but I cannot share the document with anyone. I have created a new document of around 700 MB. When I run the same program against it, I get the Java heap space exception below, even though I have set the -Xmx1024 parameter:

Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:243)
	at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
	at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
	at imageData.ReadLargeFile.main(ReadLargeFile.java:13)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
	at org.apache.pdfbox.cos.COSStream.createFilteredStream(COSStream.java:415)
	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:452)
	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:566)
	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:187)
	... 3 more

Is there any way to read it?
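
A quick way to confirm how much heap the JVM actually received is a small standalone check like the one below. This is only a minimal sketch and not part of the original program: the class name HeapCheck and the helper toMiB are hypothetical, and it uses nothing beyond java.lang.Runtime. Note that -Xmx normally takes a unit suffix (for example -Xmx1024m); without a suffix the value is read as bytes.

```java
public class HeapCheck {

    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        // Upper bound the JVM will try to use for the heap (reflects -Xmx).
        System.out.println("max heap   (MiB): " + toMiB(rt.maxMemory()));

        // Heap currently reserved by the JVM, and the unused part of it.
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        System.out.println("free heap  (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

Running it with the same JVM options as the failing program (for example "java -Xmx1024m HeapCheck") shows whether the heap limit was applied as intended.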
                
> Index Out Of Bounds Exception while reading large PDF Document 
> ---------------------------------------------------------------
>
>                 Key: PDFBOX-1498
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1498
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Manoj Patel
>            Assignee: Andreas Lehmkühler
>
> I am getting a java.lang.IndexOutOfBoundsException while reading a large PDF document (800 MB).
> Below is the full stack trace:
> Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
> 	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:243)
> 	at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
> 	at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
> 	at imageData.AddFooter.main(AddFooter.java:26)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3377, Size: 3377
> 	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> 	at java.util.ArrayList.get(ArrayList.java:322)
> 	at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
> 	at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
> 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> 	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
> 	at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:606)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:566)
> 	at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:187)
> 	... 3 more
