Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2018/10/12 20:34:00 UTC

[jira] [Commented] (TIKA-2752) Tika-App RTFParser crashes with NullPointerException

    [ https://issues.apache.org/jira/browse/TIKA-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648422#comment-16648422 ] 

Tim Allison commented on TIKA-2752:
-----------------------------------

These are the tags I get before the NPE is thrown:

{noformat}
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
<meta name="X-Parsed-By" content="org.apache.tika.parser.rtf.RTFParser" />
<meta name="Content-Type" content="application/rtf" />
<title></title>
</head>
<body><p>
Test 1</li>
</ul>
<p />
<p />
<p />
<p />
<p />
<p>
</p>
<p>

</p>
<p>
{noformat}
When I try to open the document in MS Word, it reports that something in the tables is corrupt and offers to repair the document.

I think this file is corrupt.  I'm not sure what we can/should do about it.  Recommendations?
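
One option, offered only as a rough sketch: the output above contains {{</li>}} and {{</ul>}} end tags with no matching start tags, so a guard in front of the downstream handler could drop end tags that do not match the innermost open element. Whether those stray end tags are what triggers the NPE in ToHTMLStream is an assumption, not something confirmed here. The class name BalancedTagGuard and the method droppedEndTags() are hypothetical; the code uses only the JDK's SAX classes (XMLFilterImpl, DefaultHandler), not any existing Tika API, and it does not close elements that are left open at the end of the document.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical guard: forwards SAX events downstream, but drops any
// endElement whose qName does not match the innermost open element.
final class BalancedTagGuard extends XMLFilterImpl {
    private final Deque<String> open = new ArrayDeque<>();
    private int dropped = 0;

    BalancedTagGuard(ContentHandler downstream) {
        setContentHandler(downstream);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        open.push(qName);
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // peek() returns null on an empty stack, so a stray end tag
        // at the top level is dropped as well.
        if (!qName.equals(open.peek())) {
            dropped++;
            return;
        }
        open.pop();
        super.endElement(uri, localName, qName);
    }

    // Number of end tags that were not forwarded downstream.
    int droppedEndTags() {
        return dropped;
    }

    public static void main(String[] args) throws SAXException {
        StringBuilder seen = new StringBuilder();
        // Records the element events that actually reach the downstream handler.
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String q, Attributes a) {
                seen.append('<').append(q).append('>');
            }

            @Override
            public void endElement(String u, String l, String q) {
                seen.append("</").append(q).append('>');
            }
        };

        BalancedTagGuard guard = new BalancedTagGuard(recorder);
        guard.startDocument();
        guard.startElement("", "p", "p", new AttributesImpl());
        guard.endElement("", "li", "li"); // stray, dropped
        guard.endElement("", "ul", "ul"); // stray, dropped
        guard.endElement("", "p", "p");   // matches, forwarded
        guard.endDocument();

        System.out.println(seen + " dropped=" + guard.droppedEndTags());
    }
}
```

A guard like this would only hide the imbalance from downstream handlers; it would not address why the parser emits the unmatched end tags for this file in the first place.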

> Tika-App RTFParser crashes with NullPointerException
> ----------------------------------------------------
>
>                 Key: TIKA-2752
>                 URL: https://issues.apache.org/jira/browse/TIKA-2752
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.19.1
>         Environment: tika-app 1.19.1 macOS
> tika-core lib java
>            Reporter: Vicky Chawda
>            Priority: Critical
>              Labels: RTFParser, tika-app
>         Attachments: sample.rtf
>
>
> The RTFParser seems to crash on RTF files. The attached file produces the following stacktrace:
> Apache Tika was unable to parse the document
> at sample.rtf.
> Apache Tika was unable to parse the document
> at /Users/vchawda/Desktop/sample.rtf.
> The full exception stack trace is included below:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.rtf.RTFParser@3d8af52b
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>  at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>  at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
>  at org.apache.tika.parser.DigestingParser.parse(DigestingParser.java:84)
>  at org.apache.tika.gui.TikaGUI.handleStream(TikaGUI.java:358)
>  at org.apache.tika.gui.TikaGUI.openFile(TikaGUI.java:309)
>  at org.apache.tika.gui.TikaGUI.actionPerformed(TikaGUI.java:267)
>  at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022)
>  at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2348)
>  at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
>  at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
>  at javax.swing.AbstractButton.doClick(AbstractButton.java:376)
>  at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:833)
>  at com.apple.laf.AquaMenuItemUI.doClick(AquaMenuItemUI.java:157)
>  at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:877)
>  at java.awt.Component.processMouseEvent(Component.java:6533)
>  at javax.swing.JComponent.processMouseEvent(JComponent.java:3324)
>  at java.awt.Component.processEvent(Component.java:6298)
>  at java.awt.Container.processEvent(Container.java:2236)
>  at java.awt.Component.dispatchEventImpl(Component.java:4889)
>  at java.awt.Container.dispatchEventImpl(Container.java:2294)
>  at java.awt.Component.dispatchEvent(Component.java:4711)
>  at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4888)
>  at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4525)
>  at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4466)
>  at java.awt.Container.dispatchEventImpl(Container.java:2280)
>  at java.awt.Window.dispatchEventImpl(Window.java:2746)
>  at java.awt.Component.dispatchEvent(Component.java:4711)
>  at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
>  at java.awt.EventQueue.access$500(EventQueue.java:97)
>  at java.awt.EventQueue$3.run(EventQueue.java:709)
>  at java.awt.EventQueue$3.run(EventQueue.java:703)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
>  at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:90)
>  at java.awt.EventQueue$4.run(EventQueue.java:731)
>  at java.awt.EventQueue$4.run(EventQueue.java:729)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
>  at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
>  at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:201)
>  at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
>  at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
>  at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
>  at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
>  at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
> Caused by: java.lang.NullPointerException
>  at com.sun.org.apache.xml.internal.serializer.ToHTMLStream.endElement(ToHTMLStream.java:911)
>  at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerHandlerImpl.endElement(TransformerHandlerImpl.java:284)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.gui.TikaGUI$2.endElement(TikaGUI.java:581)
>  at org.apache.tika.sax.TeeContentHandler.endElement(TeeContentHandler.java:94)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:256)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:274)
>  at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:232)
>  at org.apache.tika.parser.rtf.TextExtractor.extract(TextExtractor.java:481)
>  at org.apache.tika.parser.rtf.TextExtractor.extract(TextExtractor.java:442)
>  at org.apache.tika.parser.rtf.RTFParser.parse(RTFParser.java:98)
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>  ... 46 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)