Posted to dev@tika.apache.org by "Joachim Zittmayr (JIRA)" <ji...@apache.org> on 2009/08/13 13:42:15 UTC

[jira] Commented: (TIKA-223) PDFParser causes Problems when using encrypted PDF documents

    [ https://issues.apache.org/jira/browse/TIKA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742806#action_12742806 ] 

Joachim Zittmayr commented on TIKA-223:
---------------------------------------

Sorry, guys, for not responding to this issue sooner. I recently downloaded the new 0.4 release, which still has this bug.
Could you tell me how you would like this patch sent, filed, or committed? I am completely new to filing bugs against open source projects.

> PDFParser causes Problems when using encrypted PDF documents
> ------------------------------------------------------------
>
>                 Key: TIKA-223
>                 URL: https://issues.apache.org/jira/browse/TIKA-223
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.3
>         Environment: Java 1.5.x on MAC, WIN, LIN
>            Reporter: Joachim Zittmayr
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The PDFParser.parse() method already decrypts the document in order to read the metadata, and then passes it to PDF2XHTML.process(), which in turn calls the inherited getText(). getText() calls writeText(), which tries to decrypt the PDDocument a second time; this fails because the document is already decrypted. A possible fix is to override writeText() without the document.isEncrypted check.
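
The call chain described in the issue can be modelled with a small, self-contained Java sketch. FakeDocument, BaseStripper, NoDecryptStripper and DoubleDecryptDemo are hypothetical stand-ins invented for illustration; they are not the real PDFBox or Tika classes, and the real decryption API may differ. The sketch only assumes what the description states: the document still reports itself as encrypted after the first decryption, and a second decryption attempt fails.

```java
// Hypothetical stand-in for a PDF document; NOT the real PDDocument.
class FakeDocument {
    private final boolean encrypted;
    private boolean decrypted = false;

    FakeDocument(boolean encrypted) {
        this.encrypted = encrypted;
    }

    // Assumption: the flag stays true even after decrypt() has been called.
    boolean isEncrypted() {
        return encrypted;
    }

    // Assumption: decrypting twice is an error.
    void decrypt(String password) {
        if (decrypted) {
            throw new IllegalStateException("document is already decrypted");
        }
        decrypted = true;
    }
}

// Hypothetical stand-in for the inherited text stripper.
class BaseStripper {
    String getText(FakeDocument doc) {
        return writeText(doc);
    }

    String writeText(FakeDocument doc) {
        if (doc.isEncrypted()) {
            doc.decrypt(""); // second decryption attempt
        }
        return extract(doc);
    }

    String extract(FakeDocument doc) {
        return "text";
    }
}

// The proposed fix: override writeText() and skip the isEncrypted check.
class NoDecryptStripper extends BaseStripper {
    @Override
    String writeText(FakeDocument doc) {
        return extract(doc);
    }
}

class DoubleDecryptDemo {
    // Mimics the parser: decrypt once for metadata, then extract text.
    static String parse(BaseStripper stripper) {
        FakeDocument doc = new FakeDocument(true);
        doc.decrypt(""); // first decryption, done by the parser
        try {
            return stripper.getText(doc);
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(new BaseStripper()));      // second decrypt fails
        System.out.println(parse(new NoDecryptStripper())); // extraction succeeds
    }
}
```

In this model the unmodified stripper fails on the second decrypt() call, while the overriding subclass extracts the text from the already decrypted document.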

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.