Posted to user@tika.apache.org by LAA <la...@gmail.com> on 2009/05/24 22:58:53 UTC

building a custom tika library

Hi!

I'm trying to use Tika in my Eclipse RCP application, which has a Lucene
search implemented. I am using the tika 0.3-standalone.jar, built with the
default Maven build from the apache.org download.
I run into a problem when I run this code:
{.....
    ContentHandler textHandler = new BodyContentHandler();
    Metadata metadata = new Metadata();
    metadata.add(Metadata.RESOURCE_NAME_KEY, f.getName());
    AutoDetectParser parser = new AutoDetectParser();

    // get error here: java.lang.LinkageError
    parser.parse(istream, textHandler, metadata);
.....}

I get an error in the parser.parse(istream, textHandler, metadata) call
because my application, which depends on Java 1.6, uses the
org.xml.sax.ContentHandler found in Java 1.6, while the internals of Tika
use the ContentHandler found in the standalone Tika jar.

Here's the error:

java.lang.LinkageError: loader constraint violation: when resolving method
"org.apache.tika.parser.AutoDetectParser.parse(Ljava/io/InputStream;Lorg/xml/sax/ContentHandler;Lorg/apache/tika/metadata/Metadata;)V"
the class loader (instance of
org/eclipse/osgi/internal/baseadaptor/DefaultClassLoader) of the current
class, no/telenor/services/search/logic/CreateTikaIndex, and the class
loader (instance of
org/eclipse/osgi/internal/baseadaptor/DefaultClassLoader) for resolved
class, org/apache/tika/parser/AutoDetectParser, have different Class objects
for the type org/xml/sax/ContentHandler used in the signature
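
A small diagnostic along these lines can help confirm where each copy of
org.xml.sax.ContentHandler is loaded from. This is only a sketch: the class
name WhichJar and the describe helper are made up for illustration, and in an
OSGi application it would have to be called from inside each bundle involved
to show that bundle's view of the class.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Tika or Eclipse.
class WhichJar {

    // Reports the location (jar or directory) and class loader of a class.
    static String describe(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        // Classes that ship with the JRE usually have no code source.
        String where = (src == null || src.getLocation() == null)
                ? "JRE (no code source)"
                : src.getLocation().toString();
        return c.getName() + " <- " + where
                + " (loader: " + c.getClassLoader() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(org.xml.sax.ContentHandler.class));
    }
}
```

If the same call reports two different locations or loaders in the
application bundle and in the bundle holding the Tika jar, that matches the
loader constraint violation shown above.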


Does anyone know how to build Tika without including the org.xml..... classes
in the build? Isn't it a little redundant to include them in the build, since
they are already present in Java?

Hope someone can answer!

Best regards!
LAA

Re: building a custom tika library

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Sun, May 24, 2009 at 10:58 PM, LAA <la...@gmail.com> wrote:
> Anyone know how to build tika without including the org.xml..... in the
> build? is it not a little redundant to include this in the build since it is
> already present in java?

The extra XML classes are coming from the transitive xml-apis
dependency through poi-ooxml and dom4j. You could work around the
issue for now by adding an exclusion rule for xml-apis in the pom.xml
file before building the standalone jar.
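
A sketch of what such an exclusion rule might look like, assuming xml-apis
is pulled in through the poi-ooxml dependency as described above. The exact
dependency entry in the Tika 0.3 pom.xml is not shown in this thread, so the
existing entry should be kept as it is and only the exclusions element added:

```xml
<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml</artifactId>
  <!-- keep the existing <version> element from the pom.xml unchanged -->
  <exclusions>
    <!-- assumption: this is the transitive jar that bundles org.xml.sax -->
    <exclusion>
      <groupId>xml-apis</groupId>
      <artifactId>xml-apis</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running mvn dependency:tree before and after the change should show whether
xml-apis is still being pulled in through another path.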

Going forward, I think we should handle this better in Tika 0.4. Can
you file a Jira issue for that?

BR,

Jukka Zitting