Posted to dev@tika.apache.org by "Bertrand Caron (Jira)" <ji...@apache.org> on 2021/03/18 12:34:00 UTC

[jira] [Updated] (TIKA-3331) Return a more informative error when trying to parse an encrypted file

     [ https://issues.apache.org/jira/browse/TIKA-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bertrand Caron updated TIKA-3331:
---------------------------------
    Priority: Minor  (was: Major)

> Return a more informative error when trying to parse an encrypted file
> ----------------------------------------------------------------------
>
>                 Key: TIKA-3331
>                 URL: https://issues.apache.org/jira/browse/TIKA-3331
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.24.1
>         Environment: See enclosed picture.
>            Reporter: Bertrand Caron
>            Priority: Minor
>         Attachments: encrypte.odt, system.png
>
>
> When parsing an encrypted PDF or ODF file, Tika returns a long, cryptic error message. A more informative message would help the user: it should at least mention that the file is encrypted, and perhaps name the algorithm used.
>  
> I enclose a fabricated example, but real-world examples can be found in a similar issue for the JHOVE tool: [https://github.com/openpreserve/jhove/issues/640]
>  
> The error log obtained:
>  
> Apache Tika was unable to parse the document
> at /home/bertrand/Téléchargements/Toponymic guidelines_Instituto geografico nacional_2011.pdf.
> The full exception stack trace is included below:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@5e7e878d
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:293)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
>     at org.apache.tika.parser.DigestingParser.parse(DigestingParser.java:84)
>     at org.apache.tika.gui.TikaGUI.handleStream(TikaGUI.java:358)
>     at org.apache.tika.gui.TikaGUI.openFile(TikaGUI.java:309)
>     at org.apache.tika.gui.TikaGUI.actionPerformed(TikaGUI.java:267)
>     at java.desktop/javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1967)
>     at java.desktop/javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2308)
>     at java.desktop/javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:405)
>     at java.desktop/javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:262)
>     at java.desktop/javax.swing.AbstractButton.doClick(AbstractButton.java:369)
>     at java.desktop/javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1020)
>     at java.desktop/javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:1064)
>     at java.desktop/java.awt.Component.processMouseEvent(Component.java:6636)
>     at java.desktop/javax.swing.JComponent.processMouseEvent(JComponent.java:3342)
>     at java.desktop/java.awt.Component.processEvent(Component.java:6401)
>     at java.desktop/java.awt.Container.processEvent(Container.java:2263)
>     at java.desktop/java.awt.Component.dispatchEventImpl(Component.java:5012)
>     at java.desktop/java.awt.Container.dispatchEventImpl(Container.java:2321)
>     at java.desktop/java.awt.Component.dispatchEvent(Component.java:4844)
>     at java.desktop/java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4919)
>     at java.desktop/java.awt.LightweightDispatcher.processMouseEvent(Container.java:4548)
>     at java.desktop/java.awt.LightweightDispatcher.dispatchEvent(Container.java:4489)
>     at java.desktop/java.awt.Container.dispatchEventImpl(Container.java:2307)
>     at java.desktop/java.awt.Window.dispatchEventImpl(Window.java:2764)
>     at java.desktop/java.awt.Component.dispatchEvent(Component.java:4844)
>     at java.desktop/java.awt.EventQueue.dispatchEventImpl(EventQueue.java:772)
>     at java.desktop/java.awt.EventQueue$4.run(EventQueue.java:721)
>     at java.desktop/java.awt.EventQueue$4.run(EventQueue.java:715)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:391)
>     at java.base/java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:85)
>     at java.base/java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:95)
>     at java.desktop/java.awt.EventQueue$5.run(EventQueue.java:745)
>     at java.desktop/java.awt.EventQueue$5.run(EventQueue.java:743)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:391)
>     at java.base/java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:85)
>     at java.desktop/java.awt.EventQueue.dispatchEvent(EventQueue.java:742)
>     at java.desktop/java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:203)
>     at java.desktop/java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:124)
>     at java.desktop/java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:113)
>     at java.desktop/java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:109)
>     at java.desktop/java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
>     at java.desktop/java.awt.EventDispatchThread.run(EventDispatchThread.java:90)
> Caused by: java.lang.NullPointerException
>     at org.apache.tika.parser.pdf.AbstractPDF2XHTML.extractXMPXFA(AbstractPDF2XHTML.java:209)
>     at org.apache.tika.parser.pdf.AbstractPDF2XHTML.endDocument(AbstractPDF2XHTML.java:678)
>     at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:267)
>     at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:96)
>     at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:174)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     ... 44 more
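
As an illustration of the kind of message requested above, here is a minimal sketch for the ODF case only. It is not Tika's implementation: the class and method names (OdfEncryptionCheck, describeEncryption) are hypothetical, and it uses only the JDK. It relies on the fact that an encrypted ODF package lists its encrypted entries in META-INF/manifest.xml with an encryption-data element, and it assumes the conventional "manifest:" namespace prefix and a plain text scan rather than a real XML parser.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Tika.
public class OdfEncryptionCheck {

    // Captures the algorithm-name attribute of a manifest:algorithm element.
    // Assumption: the manifest uses the usual "manifest:" prefix.
    private static final Pattern ALGORITHM = Pattern.compile(
            "<manifest:algorithm\\b[^>]*manifest:algorithm-name=\"([^\"]*)\"");

    /**
     * Returns a short, readable message if the ODF package declares
     * encrypted entries in its manifest, or an empty Optional otherwise.
     */
    public static Optional<String> describeEncryption(InputStream odf) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"META-INF/manifest.xml".equals(entry.getName())) {
                    continue; // only the manifest describes encryption
                }
                // readAllBytes() stops at the end of the current zip entry.
                String manifest = new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                if (!manifest.contains("<manifest:encryption-data")) {
                    return Optional.empty();
                }
                Matcher m = ALGORITHM.matcher(manifest);
                String algorithm = m.find() ? m.group(1) : "unknown algorithm";
                return Optional.of("Document is encrypted (" + algorithm + ")");
            }
        }
        return Optional.empty(); // no manifest found
    }
}
```

A caller could run such a check before, or after a failed, parse and show the returned message instead of the full stack trace. The PDF case would need a different check and is not covered by this sketch.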



--
This message was sent by Atlassian Jira
(v8.3.4#803005)