Posted to dev@pdfbox.apache.org by "James Green (JIRA)" <ji...@apache.org> on 2014/11/19 16:54:34 UTC

[jira] [Commented] (PDFBOX-1586) IndexOutOfBoundsException when saving a document (at random)

    [ https://issues.apache.org/jira/browse/PDFBOX-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218041#comment-14218041 ] 

James Green commented on PDFBOX-1586:
-------------------------------------

[~lehmi] We've just had another random crash. I say random because several builds passed without failure, but Jenkins has just performed a release of the project and the next snapshot build failed, so I'm guessing it's another garbage collection issue. The stack trace follows; do you want this filed as a new JIRA issue? We are using PDFBox 1.8.7.

org.apache.pdfbox.exceptions.COSVisitorException: java.lang.NullPointerException
	at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1367)
	at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:238)
	at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:206)
	at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:530)
	at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:436)
	at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1135)
	at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:568)
	at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1517)
	at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1346)
[ our app ... ]
Caused by: java.lang.NullPointerException
	at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:94)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1350)
	... 41 more
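
To make the garbage-collection guess concrete: one way such a failure could arise is a reader that keeps using a backing buffer after whatever owned it has released it. The sketch below is a hypothetical model only. ChunkedBuffer and SharedBufferDemo are made-up names, not PDFBox classes, and I am assuming, not asserting, that PDFBox behaves like this internally. It shows how a read against a chunk list that has been cleared fails with an IndexOutOfBoundsException on a non-zero index and an empty list, the same shape as the "Index: 28, Size: 0" trace in the original report quoted below.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a chunked in-memory buffer; NOT PDFBox code. */
final class ChunkedBuffer {
    static final int CHUNK_SIZE = 1024;
    private final List<byte[]> chunks = new ArrayList<byte[]>();

    ChunkedBuffer(int chunkCount) {
        for (int i = 0; i < chunkCount; i++) {
            chunks.add(new byte[CHUNK_SIZE]);
        }
    }

    /** Reads one byte; locates the chunk by index, as a paged buffer would. */
    int read(long position) {
        int index = (int) (position / CHUNK_SIZE);
        // Once close() has emptied the list, this lookup throws
        // IndexOutOfBoundsException for any index.
        byte[] chunk = chunks.get(index);
        return chunk[(int) (position % CHUNK_SIZE)] & 0xFF;
    }

    /** Releases the chunks, as an owner might on close or finalization. */
    void close() {
        chunks.clear();
    }
}

final class SharedBufferDemo {
    /** Returns true if a read that worked before close() fails afterwards. */
    static boolean failsAfterClose() {
        ChunkedBuffer buffer = new ChunkedBuffer(40);
        long position = 28L * ChunkedBuffer.CHUNK_SIZE;
        buffer.read(position);      // succeeds while the buffer is open
        buffer.close();             // owner goes away and releases the data
        try {
            buffer.read(position);  // reader still holds a reference
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("read fails after close: " + failsAfterClose());
    }
}
```

If something like this is what is happening, the timing would depend on when the owner is collected, which would fit the failures appearing at random.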

> IndexOutOfBoundsException when saving a document (at random)
> ------------------------------------------------------------
>
>                 Key: PDFBOX-1586
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1586
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Writing
>    Affects Versions: 1.8.1
>            Reporter: James Green
>            Assignee: Andreas Lehmkühler
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: TestBuildNewDocumentFromMultipleSources.java
>
>
> Getting the following stacktrace:
> org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 28, Size: 0
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1245)
>     at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:201)
>     at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:206)
>     at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:524)
>     at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:434)
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1056)
>     at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:496)
>     at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1392)
>     at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1157)
>     at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1138)
> ...
> Caused by: java.lang.IndexOutOfBoundsException: Index: 28, Size: 0
>     at java.util.ArrayList.rangeCheck(ArrayList.java:604)
>     at java.util.ArrayList.get(ArrayList.java:382)
>     at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
>     at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1232)
> I'll add some context. We have a "data pipeline" in which a Windows Print Monitor sends PostScript to a servlet, which then uses Ghostscript 9.05 to convert it in memory to PDF. This PDF is then loaded into PDFBox using PDDocument.load().
> We then split the original PDF into multiple smaller ones, each of which is saved to a ByteArrayOutputStream. It is at the point of save() that we are having serious reliability issues.
> We saved an original PDF from Ghostscript into a unit test to try to replicate the problem, without success. If we re-execute the pipeline to take the original PDF and split it, an apparently random percentage of the documents is saved.
> For instance, on a 990-page document (text, no images), to be split into 990 one-page documents using Tomcat 7 with -Xmx512m:
> Pass 1: 50% were saved, 50% ended with stack traces
> Pass 2: 100% were saved
> Pass 3: 100% were saved
> The same test with -Xmx128m ended several times with just 1 document saved; the rest produced stack traces.
> We have also seen this randomly hit a sample document of four pages being split into two two-page documents, so it does not appear to be memory related. We also added code to catch the IndexOutOfBoundsException and retry up to ten times, but it seems save() either works the first time or not at all.
> We're thinking there are environmental factors here, but we're now focused on getting this nailed. Any advice or assistance will be welcome.
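
For reference, the retry attempt mentioned in the description can be sketched roughly as follows. This is a minimal illustration, not the code from our application: SaveRetry, retry and causedByIndexError are hypothetical names, and the Callable stands in for whatever wraps the save() call. Because the IndexOutOfBoundsException arrives wrapped in a COSVisitorException, the sketch walks the cause chain instead of catching it directly.

```java
import java.util.concurrent.Callable;

/** Hypothetical retry helper; names are illustrative, not from PDFBox. */
final class SaveRetry {

    /** True if the throwable or any of its causes is an IndexOutOfBoundsException. */
    static boolean causedByIndexError(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof IndexOutOfBoundsException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the task up to maxAttempts times. Only failures caused by an
     * IndexOutOfBoundsException are retried; anything else is rethrown at once.
     */
    static <T> T retry(Callable<T> task, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (!causedByIndexError(e)) {
                    throw e;
                }
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed the same way
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated save that fails twice before succeeding.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException(new IndexOutOfBoundsException("simulated"));
            }
            return "saved";
        }, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A loop like this only helps with transient failures. Since save() either works the first time or not at all, retrying does not help here, which would fit the underlying data already being gone by the time save() runs.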


