Posted to dev@pdfbox.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2008/08/04 19:46:44 UTC
[jira] Commented: (PDFBOX-343) java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot
[ https://issues.apache.org/jira/browse/PDFBOX-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619614#action_12619614 ]
Jukka Zitting commented on PDFBOX-343:
--------------------------------------
[Comment on SourceForge]
Date: 2008-05-27 15:43
Sender: danielwilson
Logged In: YES
user_id=1737686
Originator: NO
German, I cannot replicate the issue with the current code.
Please try the current code, which has been significantly modified since
version 0.7.3, and see whether it works for you.
Thanks for including a good test case!
> java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot
> ------------------------------------------------------------
>
> Key: PDFBOX-343
> URL: https://issues.apache.org/jira/browse/PDFBOX-343
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1901534
> Originally submitted by nobody on 2008-02-25 09:10.
> I'm working with PDFBox 0.7.3.
> I'm extracting text from PDF files and it works fine, but I found a PDF file that crashes the extraction (PDF file attached).
> The code I wrote is:
> stream = new FileInputStream(file);
> pdfDocument = PDDocument.load(stream);
> if (pdfDocument.isEncrypted()) {
>     pdfDocument.decrypt("");
> }
> StringWriter writer = new StringWriter();
> PDFTextStripper stripper = new PDFTextStripper();
> stripper.writeText(pdfDocument, writer);
> contents = writer.getBuffer().toString();
> When trying to extract text from this file I'm getting the following exception:
> java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot be cast to org.pdfbox.cos.COSDictionary
> at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:70)
> at org.pdfbox.cos.COSStream.doDecode(COSStream.java:319)
> at org.pdfbox.cos.COSStream.doDecode(COSStream.java:261)
> at org.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:173)
> at org.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:91)
> at org.pdfbox.cos.COSStream.getStreamTokens(COSStream.java:135)
> at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:189)
> at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:160)
> at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:355)
> at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:268)
> at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)
> Thanks
> german.gf@gmail.com
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1901534&file_id=267915
> attachment.pdf (application/pdf), 85947 bytes
> PDF file that does not work
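
The trace in the quoted report fails on a cast from COSArray to COSDictionary inside FlateFilter.decode. The thread does not say which stream entry held the array. One possibility, offered here only as an assumption, is the stream's /DecodeParms entry, which the PDF specification allows to be either a single dictionary or an array of dictionaries (one per filter). The minimal Java sketch below uses plain java.util collections as stand-ins for COSDictionary and COSArray; DecodeParmsSketch, naive, and forFilter are hypothetical names and not part of the PDFBox API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT PDFBox code: a Map models a dictionary and a
// List models an array. A /DecodeParms-like value may arrive as either one.
final class DecodeParmsSketch {

    // Unchecked cast, analogous to the failure in the trace: throws
    // ClassCastException when the value is a List rather than a Map.
    @SuppressWarnings("unchecked")
    static Map<String, Object> naive(Object decodeParms) {
        return (Map<String, Object>) decodeParms;
    }

    // Defensive lookup: accepts a single Map, or a List holding one Map per
    // filter. Returns an empty map when nothing usable is found.
    @SuppressWarnings("unchecked")
    static Map<String, Object> forFilter(Object decodeParms, int filterIndex) {
        Object candidate = decodeParms;
        if (decodeParms instanceof List) {
            List<?> perFilter = (List<?>) decodeParms;
            boolean inRange = filterIndex >= 0 && filterIndex < perFilter.size();
            candidate = inRange ? perFilter.get(filterIndex) : null;
        }
        if (candidate instanceof Map) {
            return (Map<String, Object>) candidate;
        }
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        Map<String, Object> parms = new HashMap<>();
        parms.put("Predictor", 12);

        // Array form: one parameter dictionary for the first (only) filter.
        List<Object> arrayForm = new ArrayList<>();
        arrayForm.add(parms);

        try {
            naive(arrayForm);
        } catch (ClassCastException e) {
            System.out.println("naive cast failed: " + e.getClass().getName());
        }
        System.out.println("guarded lookup: " + forFilter(arrayForm, 0));
    }
}
```

The guarded version checks the runtime type before casting, so a value in either form is handled without an exception. Whether the fix in the current PDFBox code takes this shape is not stated in the thread.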