Posted to dev@pdfbox.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2008/08/04 19:46:44 UTC
[jira] Commented: (PDFBOX-343) java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot

    [ https://issues.apache.org/jira/browse/PDFBOX-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619614#action_12619614 ] 

Jukka Zitting commented on PDFBOX-343:
--------------------------------------

[Comment on SourceForge]
Date: 2008-05-27 15:43
Sender: danielwilson
Logged In: YES 
user_id=1737686
Originator: NO

German, I cannot replicate the issue with the current code.

Please try the current code, which has been significantly modified since
version 0.7.3, and see if it works for you.

Thanks for including a good test case!


> java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot
> ------------------------------------------------------------
>
>                 Key: PDFBOX-343
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-343
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1901534
> Originally submitted by nobody on 2008-02-25 09:10.
> I'm working with PDFBox 0.7.3.
> I'm extracting text from PDF files and it works fine, but I found a PDF file that crashes the extraction (file attached).
> The code I wrote is:
> stream = new FileInputStream(file);
> pdfDocument = PDDocument.load(stream);
> if (pdfDocument.isEncrypted()) {
>     pdfDocument.decrypt("");
> }
> StringWriter writer = new StringWriter();
> PDFTextStripper stripper = new PDFTextStripper();
> stripper.writeText(pdfDocument, writer);
> contents = writer.getBuffer().toString();
> When trying to extract text from this file I'm getting the following exception:
> java.lang.ClassCastException: org.pdfbox.cos.COSArray cannot be cast to org.pdfbox.cos.COSDictionary
>         at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:70)
>         at org.pdfbox.cos.COSStream.doDecode(COSStream.java:319)
>         at org.pdfbox.cos.COSStream.doDecode(COSStream.java:261)
>         at org.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:173)
>         at org.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:91)
>         at org.pdfbox.cos.COSStream.getStreamTokens(COSStream.java:135)
>         at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:189)
>         at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:160)
>         at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:355)
>         at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:268)
>         at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)
> Thanks
> german.gf@gmail.com
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1901534&file_id=267915
> attachment.pdf (application/pdf), 85947 bytes
> PDF file that does not work
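
The trace above shows FlateFilter.decode casting a COSArray to a COSDictionary. One plausible cause, not confirmed anywhere in this thread, is that the stream's decode parameters are stored as an array (the PDF format allows one entry per filter) rather than as a single dictionary. The sketch below shows a type-checked lookup that tolerates both shapes. It is not PDFBox code: java.util.Map and java.util.List are hypothetical stand-ins for COSDictionary and COSArray, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustration only. Map stands in for a COS dictionary, List for a COS array.
final class DecodeParmsSketch {

    /**
     * Returns the parameter dictionary for the filter at filterIndex,
     * or an empty map when none is available.
     */
    static Map<String, Object> paramsForFilter(Object decodeParms, int filterIndex) {
        if (decodeParms instanceof Map) {
            // Single dictionary: returned as-is, whatever the index.
            @SuppressWarnings("unchecked")
            Map<String, Object> dict = (Map<String, Object>) decodeParms;
            return dict;
        }
        if (decodeParms instanceof List) {
            // Array form: one entry per filter; an entry may be null.
            List<?> array = (List<?>) decodeParms;
            if (filterIndex >= 0 && filterIndex < array.size()) {
                Object entry = array.get(filterIndex);
                if (entry instanceof Map) {
                    @SuppressWarnings("unchecked")
                    Map<String, Object> dict = (Map<String, Object>) entry;
                    return dict;
                }
            }
        }
        // Missing, null, or unexpected type: fall back to no parameters
        // instead of casting blindly.
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        Map<String, Object> predictor =
                Collections.<String, Object>singletonMap("Predictor", 12);
        Object asArray = Arrays.asList(null, predictor);

        System.out.println(paramsForFilter(predictor, 0)); // dictionary form
        System.out.println(paramsForFilter(asArray, 1));   // array form, second filter
        System.out.println(paramsForFilter(asArray, 0));   // array form, null entry
    }
}
```

Checking the type before casting turns an unexpected parameter shape into an empty parameter set rather than a ClassCastException. Whether the current PDFBox code handles this case the same way is not something this thread establishes.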

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.