Posted to dev@pdfbox.apache.org by "Sean Bridges (JIRA)" <ji...@apache.org> on 2009/05/13 00:01:46 UTC
[jira] Created: (PDFBOX-468) index out of bounds exception
index out of bounds exception
-----------------------------
Key: PDFBOX-468
URL: https://issues.apache.org/jira/browse/PDFBOX-468
Project: PDFBox
Issue Type: Bug
Reporter: Sean Bridges
Fix For: 0.8.0-incubator
This is with svn revision 773978.
I get an index out of bounds exception when parsing PDF files. I can't give you the file, but the exception is:
Caused by: org.apache.pdfbox.exceptions.WrappedIOException
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:228)
at message_analyzer.extractor.PDFExtractor.getContent(PDFExtractor.java:32)
... 19 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at org.apache.pdfbox.pdfparser.BaseParser.cmpCircularBuffer(BaseParser.java:398)
at org.apache.pdfbox.pdfparser.BaseParser.readUntilEndStream(BaseParser.java:355)
at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:322)
at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:490)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:169)
... 20 more
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PDFBOX-468) index out of bounds exception
Posted by "Sean Bridges (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Bridges updated PDFBOX-468:
--------------------------------
Attachment: patch
This patch fixes the issue. The problem is that the first read,
int nextIdx = pdfSource.read(buffer) % buffer.length;
may return -1 when no bytes are read, in which case nextIdx is also -1. This causes an index out of bounds exception on the first call to cmpCircularBuffer.
To make the parser more reliable in the face of invalid input, it might be good to always do
pdfSource.unread( ENDSTREAM ); or pdfSource.unread( ENDOBJ );
in this method.
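A minimal sketch of the failure mode, not the attached patch: it assumes a plain java.io.PushbackInputStream stands in for pdfSource, and the class EmptyReadSketch and helper firstIndex are hypothetical names used only for illustration. It shows that Java's % keeps the sign of the dividend, so a read that returns -1 yields an index of -1 unless the result is checked first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

class EmptyReadSketch {
    // Hypothetical helper: returns the next circular-buffer index,
    // or -1 if the source had no bytes left to read.
    static int firstIndex(PushbackInputStream source, byte[] buffer) throws IOException {
        int bytesRead = source.read(buffer);
        if (bytesRead == -1) {
            return -1; // caller must stop here rather than index the buffer
        }
        return bytesRead % buffer.length;
    }

    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[9];

        // Unguarded form from the report: an exhausted source gives -1 % 9.
        PushbackInputStream empty =
                new PushbackInputStream(new ByteArrayInputStream(new byte[0]), 16);
        int unguarded = empty.read(buffer) % buffer.length;
        System.out.println("unguarded index: " + unguarded);

        // Guarded form: the -1 is detected before any modulo or array access.
        PushbackInputStream data =
                new PushbackInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}), 16);
        System.out.println("guarded index: " + firstIndex(data, buffer));
    }
}
```

With the guard in place, the caller can treat -1 as "end of input" and skip the buffer comparison entirely.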
[jira] Resolved: (PDFBOX-468) index out of bounds exception
Posted by "Brian Carrier (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Carrier resolved PDFBOX-468.
----------------------------------
Resolution: Fixed
Checked into trunk. The extra unread() is not correct, though: -1 is returned when no bytes were read, so there is nothing that needs to be unread().
Sending trunk/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
Transmitting file data .
Committed revision 778883.
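One way to check the reasoning about unread(), sketched under assumptions: java.io.PushbackInputStream stands in for pdfSource, and UnreadSketch, drain and afterEmptyRead are hypothetical names. This is not the committed change. It shows that when a read returns -1, nothing was consumed, so pushing a marker back would add bytes the input never contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

class UnreadSketch {
    // Reads whatever is left in the stream and returns it as text.
    static String drain(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Attempts one bulk read on an empty source, optionally pushes a marker
    // back afterwards, and returns what a later reader would then see.
    static String afterEmptyRead(boolean unreadMarker) throws IOException {
        byte[] marker = "endstream".getBytes(StandardCharsets.ISO_8859_1);
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(new byte[0]), marker.length);
        int n = in.read(new byte[marker.length]);
        if (n == -1 && unreadMarker) {
            in.unread(marker); // nothing was consumed, so this injects new bytes
        }
        return drain(in);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("without unread: [" + afterEmptyRead(false) + "]");
        System.out.println("with unread:    [" + afterEmptyRead(true) + "]");
    }
}
```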