Posted to solr-user@lucene.apache.org by "Strucken, Michael" <M....@binserv.de> on 2013/05/15 00:01:58 UTC

TIKA 1.3

Hi,

I updated TIKA to 1.3 in Solr 3.6.2 myself. Everything seemed to work fine, but extracting exe files is now broken. I also tried Solr 4.3.0, where TIKA 1.3 is already integrated (SOLR-4416), and the two nightly builds solr-4.4-2013-05-13_21-22-06 and solr-5.0-2013-05-14_16-36-39, but all of them gave the very same result:

	org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.executable.ExecutableParser@26fd2d76
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:225)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1832)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:368)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
        at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:647)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
        at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Unknown Source)
	Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.executable.ExecutableParser@26fd2d76
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
        ... 32 more
	Caused by: java.lang.IllegalStateException: NoWriterSupplied: No writer supplied for serializer.
        at org.apache.xml.serialize.XMLSerializer.startElement(Unknown Source)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.SecureContentHandler.startElement(SecureContentHandler.java:250)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
        at org.apache.tika.sax.SafeContentHandler.startElement(SafeContentHandler.java:264)
        at org.apache.tika.sax.XHTMLContentHandler.lazyStartHead(XHTMLContentHandler.java:131)
        at org.apache.tika.sax.XHTMLContentHandler.lazyEndHead(XHTMLContentHandler.java:149)
        at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:209)
        at org.apache.tika.parser.executable.ExecutableParser.parse(ExecutableParser.java:82)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        ... 35 more

The standalone TIKA 1.3 jar (tika-app-1.3.jar) works as expected:

	X:>java -jar tika-app-1.3.jar "D:\apps\apache-2.2.23\bin\ApacheMonitor.exe"
	<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
	<head>
	<meta name="machine:machineType" content="x86-32"/>
	<meta name="Creation-Date" content="2012-08-24T09:27:34Z"/>
	<meta name="machine:endian" content="Little"/>
	<meta name="machine:platform" content="Windows"/>
	<meta name="machine:architectureBits" content="32"/>
	<meta name="Content-Length" content="35840"/>
	<meta name="Content-Type" content="application/x-msdownload; format=pe32"/>
	<meta name="Content-Type" content="application/x-msdownload"/>
	<meta name="resourceName" content="ApacheMonitor.exe"/>
	<title/>
	</head>
	<body/></html>

The error in Solr does not depend on a specific exe file.
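
For anyone who wants to try this against their own Solr instance, here is a minimal sketch of a request that exercises the extraction handler. The post does not show the actual request, so several things are assumptions: that the ExtractingRequestHandler is mapped to /update/extract, that extractOnly=true is used (the XMLSerializer frames in the trace suggest an extract-only request, but that is a guess), and the helper names build_extract_url and post_file are hypothetical. The base URL and file path are supplied on the command line.

```python
import sys
import urllib.error
import urllib.parse
import urllib.request


def build_extract_url(solr_base, extract_only=True):
    """Build the request URL, assuming the handler lives at /update/extract."""
    params = {"extractOnly": "true" if extract_only else "false", "wt": "json"}
    return solr_base.rstrip("/") + "/update/extract?" + urllib.parse.urlencode(params)


def post_file(url, path, content_type="application/octet-stream"):
    """POST the raw file bytes and return (HTTP status, response body as text)."""
    with open(path, "rb") as f:
        data = f.read()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": content_type}, method="POST"
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as e:
        # A failed extraction is expected to come back as an HTTP error
        # response; return its status and body instead of raising.
        return e.code, e.read().decode("utf-8", "replace")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Arguments: <solr base url> <path to exe file>
    status, body = post_file(build_extract_url(sys.argv[1]), sys.argv[2])
    print(status)
    print(body)
```

Saved as, say, extract_test.py (a hypothetical name), it would be run with the Solr base URL and the path to an exe file as its two arguments. If the problem reproduces, the response body should contain the TikaException shown above.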

Any comments and feedback would be greatly appreciated!

Regards,
Michael Strucken