Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/04 14:06:00 UTC

[jira] [Commented] (TIKA-2497) Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser

    [ https://issues.apache.org/jira/browse/TIKA-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276815#comment-16276815 ] 

Tim Allison commented on TIKA-2497:
-----------------------------------

*WARNING* You'll hit a binary incompatibility if you swap the POI 3.17 jars in for the 3.17-beta1 jars without also upgrading Tika.

{noformat}
Exception in thread "Thread-17" java.lang.NoSuchMethodError: org.apache.poi.hwmf.record.HwmfFont.getCharSet()Lorg/apache/poi/hwmf/record/HwmfFont$WmfCharset;
        at org.apache.tika.parser.microsoft.WMFParser.parse(WMFParser.java:74)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
{noformat}

The only solution is to upgrade Tika to 1.17, which is on the way.
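
For anyone who wants to check their own classpath before or after swapping jars, here is a minimal sketch. The JVM links a call by method name plus descriptor, return type included, so a method whose return type changed between releases produces NoSuchMethodError even though a method with the same name still exists. The class name BinaryCompatCheck and the helper returnTypeOf are hypothetical and not part of Tika or POI; the POI class, method, and expected return type are taken from the trace above. The helper uses reflection with string names only, so it compiles and runs without POI on the classpath.

```java
import java.lang.reflect.Method;

// Hypothetical diagnostic, not part of Tika or POI.
public class BinaryCompatCheck {

    /**
     * Returns the fully qualified return type of a public no-arg method,
     * or null if the class or the method cannot be found on the classpath.
     */
    public static String returnTypeOf(String className, String methodName) {
        try {
            // initialize=false: look the class up without running its static initializers
            Class<?> cls = Class.forName(className, false,
                    BinaryCompatCheck.class.getClassLoader());
            Method m = cls.getMethod(methodName);
            return m.getReturnType().getName();
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Names copied from the NoSuchMethodError in the trace above.
        String cls = "org.apache.poi.hwmf.record.HwmfFont";
        String expected = "org.apache.poi.hwmf.record.HwmfFont$WmfCharset";

        String actual = returnTypeOf(cls, "getCharSet");
        if (actual == null) {
            System.out.println("HwmfFont.getCharSet() not found on the classpath");
        } else if (actual.equals(expected)) {
            System.out.println("Return type matches the descriptor in the trace");
        } else {
            System.out.println("Return type is " + actual
                    + "; callers compiled against " + expected
                    + " will fail with NoSuchMethodError");
        }
    }
}
```

Run it with the same jars your application loads. A mismatch or a missing method means the Tika and POI jars on that classpath were not built against each other.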

> Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser
> -----------------------------------------------------------------------------------
>
>                 Key: TIKA-2497
>                 URL: https://issues.apache.org/jira/browse/TIKA-2497
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.16
>            Reporter: Advokat
>             Fix For: 1.17
>
>         Attachments: BugExample.pptx, fileupload_passt_configset.zip, solr.log
>
>
> Getting this exception when parsing certain pptx files. Example included.
> <response>
> <lst name="responseHeader"><int name="status">500</int><int name="QTime">204</int></lst><lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str><str name="root-error-class">java.lang.IllegalStateException</str></lst><str name="msg">org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@3225ac62</str><str name="trace">org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@3225ac62
> 	at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:234)
> 	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> 	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> 	at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> 	at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> 	at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> 	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> 	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> 	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> 	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> 	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> 	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> 	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> 	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> 	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> 	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> 	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> 	at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> 	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> 	at org.eclipse.jetty.server.Server.handle(Server.java:534)
> 	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> 	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> 	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> 	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> 	at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> 	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> 	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> 	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> 	at java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@3225ac62
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> 	at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
> 	... 34 more
> Caused by: java.lang.IllegalStateException: Schemas (*.xsb) for CTTable can't be loaded - usually this happens when OSGI loading is used and the thread context classloader has no reference to the xmlbeans classes - use POIXMLTypeLoader.setClassLoader() to set the loader, e.g. with CTTable.class.getClassLoader()
> 	at org.apache.poi.xslf.usermodel.XSLFTable.&lt;init&gt;(XSLFTable.java:76)
> 	at org.apache.poi.xslf.usermodel.XSLFGraphicFrame.create(XSLFGraphicFrame.java:90)
> 	at org.apache.poi.xslf.usermodel.XSLFSheet.buildShapes(XSLFSheet.java:112)
> 	at org.apache.poi.xslf.usermodel.XSLFSheet.initDrawingAndShapes(XSLFSheet.java:173)
> 	at org.apache.poi.xslf.usermodel.XSLFSheet.getShapes(XSLFSheet.java:157)
> 	at org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator.buildXHTML(XSLFPowerPointExtractorDecorator.java:110)
> 	at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:139)
> 	at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:142)
> 	at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106)
> 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> 	... 37 more
> </str><int name="code">500</int></lst>
> </response>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)