Posted to dev@uima.apache.org by "Greg Holmberg (JIRA)" <de...@uima.apache.org> on 2012/09/27 22:35:07 UTC

[jira] [Created] (UIMA-2472) TikaAnnotator can't find XML parser when used in a PEAR file with Java 1.5 or later

Greg Holmberg created UIMA-2472:
-----------------------------------

             Summary: TikaAnnotator can't find XML parser when used in a PEAR file with Java 1.5 or later
                 Key: UIMA-2472
                 URL: https://issues.apache.org/jira/browse/UIMA-2472
             Project: UIMA
          Issue Type: Bug
          Components: addons
    Affects Versions: 2.3.1Addons
         Environment: Java 1.5 and later
            Reporter: Greg Holmberg
            Priority: Critical


When TikaAnnotator is packaged in a PEAR file, calling UIMAFramework.produceAnalysisEngine() fails at the point where Tika asks the system for an XML parser. The exception is:

javax.xml.parsers.FactoryConfigurationError: Provider for javax.xml.parsers.DocumentBuilderFactory cannot be found 

This happens because the XML parser is now built into Java, but the UIMA classloader (used with PEAR files) finds the XML classes in xml-apis.jar first, and these are older and incompatible with the current XML interfaces.  xml-apis.jar ends up in the PEAR because it is one of the transitive Maven dependencies of Tika 0.7.  See this issue for more information:

https://issues.apache.org/jira/browse/TIKA-412

This was fixed in Tika 0.8.
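
To confirm that a given PEAR is affected, it can help to print where the JAXP classes are actually being loaded from. The following is only a minimal diagnostic sketch, not part of TikaAnnotator: the class name JaxpOrigin and the method originOf are hypothetical, and it assumes the class is packaged inside the PEAR (or called from the annotator's own code) so that it is resolved by the same classloader as Tika.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper (not part of UIMA or Tika): reports which jar, if any,
// supplied a class. Classes from the JDK's bootstrap loader have no code source.
public class JaxpOrigin {

    public static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "no code source (typically the JDK's own classes)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Where the JAXP API class itself comes from (the JDK, or a jar such as xml-apis).
        System.out.println("API class from:      " + originOf(DocumentBuilderFactory.class));

        // Which provider the factory lookup selects, and where that class comes from.
        // In the failing setup this call is where FactoryConfigurationError would surface.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("Provider class:      " + factory.getClass().getName());
        System.out.println("Provider class from: " + originOf(factory.getClass()));
    }
}
```

If the first line of output points at an xml-apis jar rather than at the JDK, the older API classes are shadowing the built-in ones, which matches the situation described above.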

A workaround for UIMA users who want to use TikaAnnotator in PEAR files with Java 1.6 is to keep xml-apis out of their PEAR file by adding an exclusion to the TikaAnnotator Maven dependency:

<dependency> 
  <groupId>org.apache.uima</groupId> 
  <artifactId>TikaAnnotator</artifactId> 
  <exclusions> 
    <exclusion> 
      <groupId>xml-apis</groupId> 
      <artifactId>xml-apis</artifactId> 
    </exclusion> 
  </exclusions> 
</dependency>

However, a better fix would be to update the version of Tika used in TikaAnnotator.
