Posted to user@tika.apache.org by Kaspar Fischer <ka...@dreizak.com> on 2010/08/13 17:02:32 UTC

TikaConfig problems

Hi everybody,

I am using Tika in a GWT project. My JUnit tests run fine, and so does my GWT project in the so-called "GWT production mode". However, when I run GWT in development mode from within Eclipse, the statement

  Tika tika = new Tika();

internally creates a TikaConfig with no parsers. Looking at the sources, I see that the ServiceRegistry is used to find the parsers, and in this setup it finds none at all.

I tried to do 

  Tika tika = new Tika(new TikaConfig(ClassLoader.getSystemClassLoader()));

but when I do this, I get

  java.lang.ClassCastException: org.apache.tika.parser.asm.ClassParser cannot be cast to org.apache.tika.parser.Parser

I am not sure where exactly the problem lies. Obviously, my code finds the class Tika, so I suppose that everything would work fine if only TikaConfig had its parser list set up properly. So is there a way to populate this list with the defaults WITHOUT calling the ServiceRegistry? (I do not have any custom parsers in my application.)

For the sake of completeness, I attach some information about my GWT config.

Many thanks in advance for any pointers or ideas; I am pretty lost here.

Thanks,
Kaspar

--

I have a Maven project that depends on version 0.8-SNAPSHOT of tika-core and tika-parsers. The Maven project uses the gwt-maven-plugin with the following setup (which, according to http://mojo.codehaus.org/gwt-maven-plugin/eclipse/google_plugin.html, is standard):

      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>gwt-maven-plugin</artifactId>
        <version>1.2</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <runTarget>org.myproject.web.Application/Application.html</runTarget>
          <extraJvmArgs>-Xmx512M -Xss1024k -ea -enableassertions</extraJvmArgs>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-war-plugin</artifactId>
        <version>2.0.2</version>
        <configuration>
          <warSourceDirectory>war</warSourceDirectory>
          <webXml>src/main/webapp/WEB-INF/web.xml</webXml>
        </configuration>
      </plugin>


Re: TikaConfig problems

Posted by Kaspar Fischer <ka...@dreizak.com>.
On 16.08.2010, at 10:03, Jukka Zitting wrote:

> On Fri, Aug 13, 2010 at 5:02 PM, Kaspar Fischer <ka...@dreizak.com> wrote:
> 
>> org.apache.tika.parser.asm.ClassParser cannot be cast to org.apache.tika.parser.Parser
> 
> Sounds like you have multiple copies of the tika-core jar in your classpath.

I thought so, too, but could not see any second Tika JAR on the classpath (as reported via 'ps').
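A more direct check than 'ps' might be to ask the JVM itself which class loader defined a class and where it was read from. The following is only a minimal sketch using standard JDK calls; the class name ClassOrigin and its helper methods are made up for illustration. In the real application one would pass Parser.class and ClassParser.class instead of the JDK classes used here.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Tika or GWT.
final class ClassOrigin {

    /** Describes which class loader defined a class and where it was loaded from. */
    public static String describe(Class<?> type) {
        // A null loader means the class was defined by the bootstrap loader.
        ClassLoader loader = type.getClassLoader();
        // The code source can be null, for example for JDK core classes.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String loaderName = (loader == null) ? "bootstrap" : loader.toString();
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();
        return type.getName() + " loaded by " + loaderName + " from " + location;
    }

    /** True only if both classes were defined by the very same class loader. */
    public static boolean sameLoader(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader();
    }

    public static void main(String[] args) {
        // Stand-ins for Parser.class and ClassParser.class.
        System.out.println(describe(String.class));
        System.out.println(describe(ClassOrigin.class));
        System.out.println("same loader: " + sameLoader(String.class, ClassOrigin.class));
    }
}
```

If the two Tika classes report different loaders or different locations, that would point to a class loader problem even when only one JAR shows up on the command line.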

What helped was adding the line

  Parser.class.isAssignableFrom(Class.forName("org.apache.tika.parser.asm.ClassParser"));

in my startup code. Without this, 

  final Iterator<Parser> it = ServiceRegistry.lookupProviders(org.apache.tika.parser.Parser.class); 
  while (it.hasNext()) 
  { 
    System.err.println(it.next().getClass()); 
  } 

produces no results; with the above line, I see all parsers from Tika. (It was not enough to simply define a ClassParser instance "ClassParser dummy".) 
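Since a provider lookup is always done relative to some class loader, it may help to compare what different loaders can see. The sketch below uses java.util.ServiceLoader from the JDK as a stand-in; that ServiceRegistry behaves the same way is an assumption, as is the guess that GWT development mode installs its own class loaders. ProviderLookup and DemoService are hypothetical names for illustration; in the real application the service type would be org.apache.tika.parser.Parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical diagnostic helper, not part of Tika or GWT.
final class ProviderLookup {

    /** Placeholder service type with no registered providers (stand-in for Parser). */
    public interface DemoService { }

    /** Lists the class names of all providers of a service visible to the given loader. */
    public static <T> List<String> providerClassNames(Class<T> service, ClassLoader loader) {
        List<String> names = new ArrayList<>();
        // ServiceLoader reads META-INF/services entries through the given loader.
        for (T provider : ServiceLoader.load(service, loader)) {
            names.add(provider.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Loader consulted by default when no loader is passed explicitly.
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        // Loader that actually defined the service interface.
        ClassLoader defining = DemoService.class.getClassLoader();
        System.out.println("via context loader:  " + providerClassNames(DemoService.class, context));
        System.out.println("via defining loader: " + providerClassNames(DemoService.class, defining));
    }
}
```

If the two lists differ for the real service type, the lookup is probably going through a loader that cannot see the provider configuration files.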

If you have any idea what the reason could be, I am happy to learn about it.

Best regards and thanks, Jukka, for your feedback,
Kaspar

P.S. For the sake of completeness I posted this also to the Google Web Toolkit group:

  http://groups.google.com/group/google-web-toolkit/browse_thread/thread/fb9aa8b02223df03/5ac8dd79b0285d54#5ac8dd79b0285d54


Re: TikaConfig problems

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Aug 13, 2010 at 5:02 PM, Kaspar Fischer
<ka...@dreizak.com> wrote:
> org.apache.tika.parser.asm.ClassParser cannot be cast to org.apache.tika.parser.Parser

Sounds like you have multiple copies of the tika-core jar in your classpath.

BR,

Jukka Zitting