Posted to solr-user@lucene.apache.org by "Strucken, Michael" <M....@binserv.de> on 2013/05/15 00:01:58 UTC
TIKA 1.3
Hi,
I upgraded TIKA to 1.3 in Solr 3.6.2 myself. Everything else seemed to work fine, but extraction from exe files is now broken. I also tried Solr 4.3.0, where TIKA 1.3 is already integrated (SOLR-4416), and the two nightly builds solr-4.4-2013-05-13_21-22-06 and solr-5.0-2013-05-14_16-36-39, but ended up with the very same result:
org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.executable.ExecutableParser@26fd2d76
at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:225)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1832)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:647)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.executable.ExecutableParser@26fd2d76
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
... 32 more
Caused by: java.lang.IllegalStateException: NoWriterSupplied: No writer supplied for serializer.
at org.apache.xml.serialize.XMLSerializer.startElement(Unknown Source)
at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
at org.apache.tika.sax.SecureContentHandler.startElement(SecureContentHandler.java:250)
at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
at org.apache.tika.sax.SafeContentHandler.startElement(SafeContentHandler.java:264)
at org.apache.tika.sax.XHTMLContentHandler.lazyStartHead(XHTMLContentHandler.java:131)
at org.apache.tika.sax.XHTMLContentHandler.lazyEndHead(XHTMLContentHandler.java:149)
at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:209)
at org.apache.tika.parser.executable.ExecutableParser.parse(ExecutableParser.java:82)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
... 35 more
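For anyone triaging a trace like this programmatically, the innermost "Caused by" entry (here the IllegalStateException) is usually the most informative one. The following is a minimal sketch of walking the cause chain with plain JDK classes. The class name RootCauseDemo and the stand-in exceptions are hypothetical and only mirror the nesting of the trace; they are not the actual Solr or Tika classes.

```java
public class RootCauseDemo {

    /** Follows getCause() links until the innermost throwable is reached. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // getCause() returns null once there is no further nested cause.
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for the three levels seen in the trace:
        // SolrException -> TikaException -> IllegalStateException.
        Throwable inner = new IllegalStateException(
                "NoWriterSupplied: No writer supplied for serializer.");
        Throwable middle = new Exception(
                "Unexpected RuntimeException from ExecutableParser", inner);
        Throwable outer = new RuntimeException(middle);

        Throwable root = rootCause(outer);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In a real setup the same helper could be applied to whatever exception the client or a servlet filter catches, so that logs show the serializer failure directly rather than only the outer wrapper.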
TIKA 1.3 as a standalone jar (tika-app-1.3.jar) works as expected:
X:>java -jar tika-app-1.3.jar "D:\apps\apache-2.2.23\bin\ApacheMonitor.exe"
<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="machine:machineType" content="x86-32"/>
<meta name="Creation-Date" content="2012-08-24T09:27:34Z"/>
<meta name="machine:endian" content="Little"/>
<meta name="machine:platform" content="Windows"/>
<meta name="machine:architectureBits" content="32"/>
<meta name="Content-Length" content="35840"/>
<meta name="Content-Type" content="application/x-msdownload; format=pe32"/>
<meta name="Content-Type" content="application/x-msdownload"/>
<meta name="resourceName" content="ApacheMonitor.exe"/>
<title/>
</head>
<body/></html>
The failure does not depend on a specific exe file.
Any comments and feedback would be greatly appreciated!
Regards,
Michael Strucken