Posted to user@tika.apache.org by Kaspar Fischer <ka...@dreizak.com> on 2010/08/13 17:02:32 UTC
TikaConfig problems
Hi everybody,
I am using Tika in a GWT project. My JUnit tests run fine and so does my GWT project in the so-called "GWT production mode". However, when I run GWT in development mode from within Eclipse, the statement
Tika tika = new Tika();
internally creates a TikaConfig with no parsers. Looking at the sources, I see that the ServiceRegistry is used to find the parsers, and it finds none at all.
I tried to do
Tika tika = new Tika(new TikaConfig(ClassLoader.getSystemClassLoader()));
but when I do this, I get
java.lang.ClassCastException: org.apache.tika.parser.asm.ClassParser cannot be cast to org.apache.tika.parser.Parser
I am not sure where exactly the problem lies. Obviously, my code finds the class Tika, so I suppose that everything would work fine if only TikaConfig had its parser list set up properly. So is there a way to populate this list with defaults WITHOUT calling the ServiceRegistry? (I do not have any custom parsers in my application.)
For the sake of completeness, I attach some information about my GWT config.
Many thanks in advance for any pointers or ideas; I am pretty lost here.
Thanks,
Kaspar
--
I have a Maven project that has dependencies to include version 0.8-SNAPSHOT of tika-core and tika-parsers. The Maven project uses the gwt-maven-plugin with the following setup (which according to http://mojo.codehaus.org/gwt-maven-plugin/eclipse/google_plugin.html is standard):
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>gwt-maven-plugin</artifactId>
  <version>1.2</version>
  <executions>
    <execution>
      <goals>
        <goal>compile</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <runTarget>org.myproject.web.Application/Application.html</runTarget>
    <extraJvmArgs>-Xmx512M -Xss1024k -ea -enableassertions</extraJvmArgs>
  </configuration>
</plugin>
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-war-plugin</artifactId>
  <version>2.0.2</version>
  <configuration>
    <warSourceDirectory>war</warSourceDirectory>
    <webXml>src/main/webapp/WEB-INF/web.xml</webXml>
  </configuration>
</plugin>
Re: TikaConfig problems
Posted by Kaspar Fischer <ka...@dreizak.com>.
On 16.08.2010, at 10:03, Jukka Zitting wrote:
> On Fri, Aug 13, 2010 at 5:02 PM, Kaspar Fischer <ka...@dreizak.com> wrote:
>
>> org.apache.tika.parser.asm.ClassParser cannot be cast to org.apache.tika.parser.Parser
>
> Sounds like you have multiple copies of the tika-core jar in your classpath.
I thought so, too, but could not see any second Tika JAR on the classpath (as reported via 'ps').
What helped was adding the line
Parser.class.isAssignableFrom(Class.forName("org.apache.tika.parser.asm.ClassParser"));
in my startup code. Without this,
final Iterator<Parser> it = ServiceRegistry.lookupProviders(org.apache.tika.parser.Parser.class);
while (it.hasNext())
{
System.err.println(it.next().getClass());
}
produces no results; with the above line, I see all parsers from Tika. (It was not enough to simply define a ClassParser instance "ClassParser dummy".)
If you have any idea what the reason could be, I am happy to learn about it.
Best regards and thanks, Jukka, for your feedback,
Kaspar
P.S. For the sake of completeness I posted this also to the Google Web Toolkit group:
http://groups.google.com/group/google-web-toolkit/browse_thread/thread/fb9aa8b02223df03/5ac8dd79b0285d54#5ac8dd79b0285d54
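One possible explanation, which this thread does not confirm, is that the single-argument ServiceRegistry.lookupProviders call resolves providers through the thread's context class loader, and that GWT development mode installs a different one than a plain JVM does. The sketch below compares what two class loaders can see. It is only an illustration: it uses java.util.ServiceLoader as a stand-in for the ServiceRegistry call, and the JDK's CharsetProvider as a stand-in service type so that it runs without Tika. The names ExplicitLoaderLookup and providerNames are hypothetical.

```java
import java.nio.charset.spi.CharsetProvider;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ExplicitLoaderLookup {

    /** Collects the class names of all providers of 'service' visible to 'loader'. */
    public static <T> List<String> providerNames(Class<T> service, ClassLoader loader) {
        List<String> names = new ArrayList<>();
        // ServiceLoader reads META-INF/services entries through the given loader only.
        for (T provider : ServiceLoader.load(service, loader)) {
            names.add(provider.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Stand-in service type (assumption); with Tika on the classpath this
        // would be org.apache.tika.parser.Parser.class instead.
        Class<CharsetProvider> service = CharsetProvider.class;

        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader system = ClassLoader.getSystemClassLoader();

        // If the two lists differ, the lookup result depends on which loader is used.
        System.err.println("context loader " + context + ": " + providerNames(service, context));
        System.err.println("system loader  " + system + ": " + providerNames(service, system));
    }
}
```

If the context loader's list is empty while another loader's is not, passing the loader explicitly (for Tika, possibly Parser.class.getClassLoader()) may be worth trying.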
Re: TikaConfig problems
Posted by Jukka Zitting <ju...@gmail.com>.
Hi,
On Fri, Aug 13, 2010 at 5:02 PM, Kaspar Fischer
<ka...@dreizak.com> wrote:
> org.apache.tika.parser.asm.ClassParser cannot be cast to org.apache.tika.parser.Parser
Sounds like you have multiple copies of the tika-core jar in your classpath.
BR,
Jukka Zitting
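The "multiple copies" hypothesis can be probed from inside the running application rather than from the 'ps' output. A "cannot be cast" error between a class and an interface it implements usually means the interface was loaded twice, by different class loaders. The following is a minimal sketch, not part of Tika: ClassOriginCheck, describe and sameLoader are hypothetical helper names, and java.util.List / java.util.ArrayList stand in for Parser / ClassParser so that the example runs on its own.

```java
import java.net.URL;
import java.security.CodeSource;

public class ClassOriginCheck {

    /** Describes which class loader defined a class and where its bytes came from. */
    public static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader(); // null means the bootstrap loader
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        return type.getName()
                + " loaded by " + (loader == null ? "bootstrap" : loader.toString())
                + " from " + (location == null ? "unknown" : location.toString());
    }

    /** True if both classes were defined by the same class loader instance. */
    public static boolean sameLoader(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader();
    }

    public static void main(String[] args) {
        // Stand-ins (assumption); in the Tika case these would be
        // org.apache.tika.parser.Parser and org.apache.tika.parser.asm.ClassParser.
        Class<?> iface = java.util.List.class;
        Class<?> impl = java.util.ArrayList.class;

        System.err.println(describe(iface));
        System.err.println(describe(impl));
        System.err.println("same loader: " + sameLoader(iface, impl));
        System.err.println("assignable:  " + iface.isAssignableFrom(impl));
    }
}
```

If the interface and the implementation report different loaders or different JAR locations, a second copy of tika-core is probably visible somewhere, even if it does not appear on the command-line classpath.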