Posted to user@uima.apache.org by Greg Holmberg <ho...@comcast.net> on 2012/09/25 00:46:40 UTC

ClassLoader problems when using PEAR files

Hi UIMA users-- 


When I use PEAR files, the XML parser can't find its DocumentBuilderFactory. I think it's a ClassLoader issue. Has anyone else seen this?

I install the PEAR as described in the docs: 

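    import org.apache.uima.UIMAFramework;
    import org.apache.uima.analysis_engine.AnalysisEngine;
    import org.apache.uima.pear.tools.PackageBrowser;
    import org.apache.uima.pear.tools.PackageInstaller;
    import org.apache.uima.resource.ResourceManager;
    import org.apache.uima.resource.ResourceSpecifier;
    import org.apache.uima.util.XMLInputSource;

    // Install the PEAR into myDir (third argument: false = skip verification).
    PackageBrowser pkg = PackageInstaller.installPackage(myDir, pearFile, false);
    // Path of the PEAR descriptor generated by the installer.
    String pearDescPath = pkg.getComponentPearDescPath();
    // Parse the PEAR descriptor into a resource specifier.
    ResourceSpecifier specifier =
        UIMAFramework.getXMLParser().parseResourceSpecifier(
            new XMLInputSource(pearDescPath));
    // getResourceManager() and params are my own code, not shown here.
    ResourceManager resmgr = getResourceManager();
    // This is the call that fails with the exception below.
    AnalysisEngine engine = UIMAFramework.produceAnalysisEngine(specifier, resmgr, params);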

My PEAR includes TikaAnnotator, and I get the exception shown at the end of this email. Summary: TikaConfig asks for an XML parser, but the system can't find one. 

Outside the analysis engine, it's possible to find an implementation of DocumentBuilderFactory, but inside it seems that the ClassLoader in use doesn't have one. 

javax.xml.parsers.DocumentBuilderFactory.newInstance() has a complicated way of finding the implementation (quoting the JavaDoc):

=======================

    Obtain a new instance of a DocumentBuilderFactory. This static method creates a new factory instance. This method
    uses the following ordered lookup procedure to determine the DocumentBuilderFactory implementation class to load:

    * Use the javax.xml.parsers.DocumentBuilderFactory system property.
    * Use the properties file "lib/jaxp.properties" in the JRE directory. This configuration file is in standard java.util.Properties format and contains the fully qualified name of the implementation class with the key being the system property defined above. The jaxp.properties file is read only once by the JAXP implementation and it's values are then cached for future use. If the file does not exist when the first attempt is made to read from it, no further attempts are made to check for its existence. It is not possible to change the value of any property in jaxp.properties after it has been read for the first time.
    *  Use the Services API (as detailed in the JAR specification), if available, to determine the classname. The Services API will look for a classname in the file META-INF/services/javax.xml.parsers.DocumentBuilderFactory in jars available to the runtime.
    * Platform default DocumentBuilderFactory instance.

=========================

So it seems like the ClassLoader used in the analysis engine prevents DocumentBuilderFactory from finding even the platform default implementation.
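
To see which implementation the lookup procedure quoted above actually selects, a small diagnostic can help. This is only a sketch: the class name WhichFactory and the method describe() are made up for illustration, and it uses nothing beyond standard JDK calls. It reports the concrete factory class and, where available, the location it was loaded from.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

public class WhichFactory {

    /** Returns the concrete factory class name and where it was loaded from. */
    static String describe() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Class<?> impl = factory.getClass();
        CodeSource source = impl.getProtectionDomain().getCodeSource();
        // Classes that ship with the JDK may report no code source at all.
        String origin = (source == null || source.getLocation() == null)
                ? "JDK runtime"
                : source.getLocation().toString();
        return impl.getName() + " loaded from " + origin;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Calling describe() once from the application and once from code running inside the PEAR (for example from an annotator's initialize method) should show whether the two environments resolve the factory differently.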

Does anyone know how to work around this?  Add something to my metadata/install.xml file perhaps?

Thanks,


Greg Holmberg



org.apache.uima.resource.ResourceInitializationException: Error initializing "org.apache.uima.analysis_engine.impl.PearAnalysisEngineWrapper" from descriptor file:/tmp/taservice/pear/SAPAnalysisEngine/SAPAnalysisEngine_pear.xml. 
    at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:144)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:314)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:425)
    at com.sap.taservice.controller.UimaPipeline.createAnalysisEngine(UimaPipeline.java:343)
    at com.sap.taservice.controller.UimaPipeline.execute(UimaPipeline.java:151)
    at com.sap.taservice.controller.TAServiceWork.execute(TAServiceWork.java:44)
    at com.sap.job.impl.TaskImpl.execute(TaskImpl.java:104)
    at com.sap.taservice.job.impl.remote.RemoteWorker.iteration(RemoteWorker.java:52)
    at com.sap.util.DaemonRunnable.run(DaemonRunnable.java:117)
    at java.lang.Thread.run(Thread.java:662)
Caused by: javax.xml.parsers.FactoryConfigurationError: Provider for javax.xml.parsers.DocumentBuilderFactory cannot be found
    at javax.xml.parsers.DocumentBuilderFactory.newInstance(Unknown Source)
    at org.apache.tika.config.TikaConfig.getBuilder(TikaConfig.java:228)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:66)
    at org.apache.uima.tika.MarkupAnnotator.initialize(MarkupAnnotator.java:96)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:252)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:158)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:255)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:429)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:373)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:186)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
    at org.apache.uima.internal.util.ResourcePool.fillPool(ResourcePool.java:243)
    at org.apache.uima.internal.util.ResourcePool.<init>(ResourcePool.java:100)
    at org.apache.uima.internal.util.AnalysisEnginePool.<init>(AnalysisEnginePool.java:91)
    at org.apache.uima.analysis_engine.impl.MultiprocessingAnalysisEngine_impl.initialize(MultiprocessingAnalysisEngine_impl.java:118)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:314)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:425)
    at org.apache.uima.analysis_engine.impl.PearAnalysisEngineWrapper.initialize(PearAnalysisEngineWrapper.java:269)
    at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123)
    ... 11 more


Re: ClassLoader problems when using PEAR files

Posted by Greg Holmberg <ho...@comcast.net>.
This Tika issue confirms the problem.

https://issues.apache.org/jira/browse/TIKA-412 

Jukka Zitting reports:

POI depends on dom4j that in turn depends on the xml-apis jar for some XML-related interfaces that are nowadays a part of the JRE. Normally having such an extra jar around doesn't harm anything as normal class loaders will always use the classes provided by the JRE. However some application servers like JBoss allow applications to override javax.* interfaces, which causes all sorts of trouble. Thus it's better if we exclude the xml-apis dependency from Tika.

This issue was fixed in Tika 0.8.

So upgrading the version of Tika in TikaAnnotator would definitely solve the problem for UIMA PEAR users.

Greg


----- Original Message ----- 
From: "Greg Holmberg" <ho...@comcast.net> 
To: msa@schor.com 
Cc: user@uima.apache.org 
Sent: Wednesday, September 26, 2012 6:11:55 PM 
Subject: Re: ClassLoader problems when using PEAR files 

Hi Marshall-- 


I did try that. What it told me is that DocumentBuilderFactory.newInstance() was able to find an implementation many times right up to the point that Tika tried within the PEAR analysis engine, when it couldn't find an implementation. Which I already knew :-) 

Before that point, it was able to find several different implementations, but mostly com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl (the platform fallback). Since this class exists in rt.jar (i.e. it's built into the JDK installation), I was perplexed about how the class loader could fail to find it, especially since I even called ResourceManager.setExtensionClassPath(Thread.currentThread().getContextClassLoader(), ...). That should have allowed the UIMA class loader to fall back to the system class loader, which should be able to find classes in rt.jar. But it didn't.

After extensive experimenting and googling (I hate to admit how many days I spent on this), I finally figured it out. The conditions are that one is using: 

* Java 1.6 or later (including 1.7) 
* UIMA Addons 2.3.1, specifically the TikaAnnotator and Tika 0.7. 
* PEAR Installer. 

As you know, when you use PEAR files (PackageInstaller), then UIMAFramework.produceAnalysisEngine() creates a new class loader in order to provide an insulated environment based on the classpath instructions in the PEAR's metadata/install.xml file. 

In my case, the PEAR file was built by Maven, which I configured (using the "assembly" plug-in) to unpack the .class files of all the dependencies into the "lib" dir. I wanted to create an "all in one" PEAR file with all the necessary classes, so I set useTransitiveDependencies to true. (By the way, you have to exclude org.apache.uima:uimaj-*:jar from the assembly.)

Here's where it goes wrong. Maven smartly follows all the dependencies: TikaAnnotator 2.3.1 -> tika-parsers 0.7 -> poi-ooxml 3.6 -> dom4j 1.6.1 -> xml-apis. The problem is that xml-apis includes an implementation of the javax.xml package (I think, or some part of it, anyway). Apparently, dom4j pre-dates JDK 1.6, because since JDK 1.6 the javax.xml package is built into the JDK, and one doesn't need xml-apis. So what happens, I think, is some implementation of DocumentBuilderFactory is found in xml-apis, and it is somehow incompatible with the interface, and can't be instantiated. So DocumentBuilderFactory gives up, and doesn't even try the one in rt.jar (even though the classloader could find it, if asked). 

In short, due to xml-apis being in the PEAR file, the system can't find the good DocumentBuilderFactory in rt.jar. 

Solution: remove xml-apis from the PEAR file. 

I did it by changing my pom.xml: 

    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>TikaAnnotator</artifactId>
      <exclusions>
        <exclusion>
          <groupId>xml-apis</groupId>
          <artifactId>xml-apis</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

=========== 

May I suggest that UIMA Add-ons upgrade to a newer version of Tika? 0.7 dates to April 2010; the current version is 1.2. I'm guessing that a more current version, using a more current POI and DOM4J, wouldn't have the dependency on xml-apis (since that package is now included in the JDK). I think that would be the best solution to allow using TikaAnnotator in PEAR files on Java 1.6 and later.


Hope this helps someone. Can I be the only one using TikaAnnotator in PEAR files on Java 1.6? 


Greg Holmberg 


----- Original Message ----- 
From: "Marshall Schor" <ms...@schor.com> 
To: user@uima.apache.org 
Sent: Wednesday, September 26, 2012 3:57:07 PM 
Subject: Re: ClassLoader problems when using PEAR files 

Hi Greg, 

Did you try troubleshooting this using the "Tip" in the Javadocs for the 
DocumentBuilderFactory class (add -Djaxp.debug=1 to the "java" command line)? 

-Marshall 


Re: ClassLoader problems when using PEAR files

Posted by Greg Holmberg <ho...@comcast.net>.
Hi Marshall-- 


I did try that. What it told me is that DocumentBuilderFactory.newInstance() was able to find an implementation many times right up to the point that Tika tried within the PEAR analysis engine, when it couldn't find an implementation. Which I already knew :-) 

Before that point, it was able to find several different implementations, but mostly com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl (the platform fallback). Since this class exists in rt.jar (i.e. it's built into the JDK installation), I was perplexed about how the class loader could fail to find it, especially since I even called ResourceManager.setExtensionClassPath(Thread.currentThread().getContextClassLoader(), ...). That should have allowed the UIMA class loader to fall back to the system class loader, which should be able to find classes in rt.jar. But it didn't.

After extensive experimenting and googling (I hate to admit how many days I spent on this), I finally figured it out. The conditions are that one is using: 

* Java 1.6 or later (including 1.7)
* UIMA Addons 2.3.1, specifically the TikaAnnotator and Tika 0.7.
* PEAR Installer.

As you know, when you use PEAR files (PackageInstaller), then UIMAFramework.produceAnalysisEngine() creates a new class loader in order to provide an insulated environment based on the classpath instructions in the PEAR's metadata/install.xml file.
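
Because the JAXP provider lookup consults the thread context class loader, one way to probe this kind of isolation is to run the lookup under a specific loader and record what happens. The following is a rough sketch, not anything from UIMA: FactoryUnderLoader and factoryClassUnder are hypothetical names, and only standard JDK calls are used.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

public class FactoryUnderLoader {

    /**
     * Runs DocumentBuilderFactory.newInstance() with the given loader installed
     * as the thread context class loader, then restores the previous loader.
     * Returns the implementation class name, or a FAILED marker on error.
     */
    static String factoryClassUnder(ClassLoader loader) {
        Thread current = Thread.currentThread();
        ClassLoader saved = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return DocumentBuilderFactory.newInstance().getClass().getName();
        } catch (FactoryConfigurationError e) {
            return "FAILED: " + e.getMessage();
        } finally {
            // Always put the original context class loader back.
            current.setContextClassLoader(saved);
        }
    }

    public static void main(String[] args) {
        System.out.println("system loader:  "
                + factoryClassUnder(ClassLoader.getSystemClassLoader()));
        System.out.println("context loader: "
                + factoryClassUnder(Thread.currentThread().getContextClassLoader()));
    }
}
```

To be meaningful for a PEAR, the helper would have to be called from code that itself runs inside the PEAR (so that DocumentBuilderFactory is resolved by the PEAR's class loader), passing in the loader of interest.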

In my case, the PEAR file was built by Maven, which I configured (using the "assembly" plug-in) to unpack the .class files of all the dependencies into the "lib" dir. I wanted to create an "all in one" PEAR file with all the necessary classes, so I set useTransitiveDependencies to true. (By the way, you have to exclude org.apache.uima:uimaj-*:jar from the assembly.)

Here's where it goes wrong.  Maven smartly follows all the dependencies: TikaAnnotator 2.3.1 -> tika-parsers 0.7 -> poi-ooxml 3.6 -> dom4j 1.6.1 -> xml-apis.  The problem is that xml-apis includes an implementation of the javax.xml package (I think, or some part of it, anyway).  Apparently, dom4j pre-dates JDK 1.6, because since JDK 1.6 the javax.xml package is built into the JDK, and one doesn't need xml-apis.  So what happens, I think, is some implementation of DocumentBuilderFactory is found in xml-apis, and it is somehow incompatible with the interface, and can't be instantiated.  So DocumentBuilderFactory gives up, and doesn't even try the one in rt.jar (even though the classloader could find it, if asked).

In short, due to xml-apis being in the PEAR file, the system can't find the good DocumentBuilderFactory in rt.jar.
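
If, as suspected above, xml-apis carries its own copy of the javax.xml.parsers classes, that copy should be visible as a class-file resource on the affected class loader. Here is a minimal sketch for checking this; FindJaxpCopies and locate are illustrative names, and the code relies only on ClassLoader.getResources from the JDK.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class FindJaxpCopies {

    /** Resource path of the JAXP factory class file. */
    static final String JAXP_FACTORY = "javax/xml/parsers/DocumentBuilderFactory.class";

    /** Lists every location from which the loader can see the named resource. */
    static List<URL> locate(ClassLoader loader, String resourceName) {
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        // Print each visible copy of the class file, one URL per line.
        for (URL url : locate(loader, JAXP_FACTORY)) {
            System.out.println(url);
        }
    }
}
```

Run against the PEAR's class loader, any URL pointing into the PEAR's "lib" dir would indicate a bundled copy of the JAXP API that competes with the one in the JDK.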

Solution: remove xml-apis from the PEAR file.

I did it by changing my pom.xml:

    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>TikaAnnotator</artifactId>
      <exclusions>
        <exclusion>
          <groupId>xml-apis</groupId>
          <artifactId>xml-apis</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

===========

May I suggest that UIMA Add-ons upgrade to a newer version of Tika? 0.7 dates to April 2010; the current version is 1.2. I'm guessing that a more current version, using a more current POI and DOM4J, wouldn't have the dependency on xml-apis (since that package is now included in the JDK). I think that would be the best solution to allow using TikaAnnotator in PEAR files on Java 1.6 and later.


Hope this helps someone.  Can I be the only one using TikaAnnotator in PEAR files on Java 1.6?


Greg Holmberg




Re: ClassLoader problems when using PEAR files

Posted by Marshall Schor <ms...@schor.com>.
Hi Greg,

Did you try troubleshooting this using the "Tip" in the Javadocs for the
DocumentBuilderFactory class (add -Djaxp.debug=1 to the "java" command line)? 

-Marshall

On 9/24/2012 6:46 PM, Greg Holmberg wrote:
> Hi UIMA users-- 
>
>
> When I use PEAR files, the XML parser can't find its DocumentBuilderFactory. I think it's a ClassLoader issue. Has anyone else seen this?
>
> I install the PEAR as described in the docs: 
>
>     PackageBrowser pkg = PackageInstaller.installPackage(myDir, pearFile, false); 
>
>     String pearDescPath = pkg.getComponentPearDescPath(); 
>
>     ResourceSpecifier specifier = 
>         UIMAFramework.getXMLParser().parseResourceSpecifier( 
>             new XMLInputSource(pearDescPath)); 
>
>     ResourceManager resmgr = getResourceManager(); 
>
>     AnalysisEngine engine = UIMAFramework.produceAnalysisEngine(specifier, resmgr, params); 
>
> My PEAR includes TikaAnnotator, and I get the exception shown at the end of this email. Summary: TikaConfig asks for an XML parser, but the system can't find one. 
>
> Outside the analysis engine, it's possible to find an implementation of DocumentBuilderFactory, but inside it seems that the ClassLoader in use doesn't have one. 
>
> javax.xml.parsers.DocumentBuilderFactory.newInstance() has a complicated way of finding the implementation (quoting the JavaDoc):
>
> =======================
>
>     Obtain a new instance of a DocumentBuilderFactory. This static method creates a new factory instance. This method
>     uses the following ordered lookup procedure to determine the DocumentBuilderFactory implementation class to load:
>
>     * Use the javax.xml.parsers.DocumentBuilderFactory system property.
>     * Use the properties file "lib/jaxp.properties" in the JRE directory. This configuration file is in standard java.util.Properties format and contains the fully qualified name of the implementation class with the key being the system property defined above. The jaxp.properties file is read only once by the JAXP implementation and it's values are then cached for future use. If the file does not exist when the first attempt is made to read from it, no further attempts are made to check for its existence. It is not possible to change the value of any property in jaxp.properties after it has been read for the first time.
>     *  Use the Services API (as detailed in the JAR specification), if available, to determine the classname. The Services API will look for a classname in the file META-INF/services/javax.xml.parsers.DocumentBuilderFactory in jars available to the runtime.
>     * Platform default DocumentBuilderFactory instance.
>
> =========================
>
> So it seems like the ClassLoader used in the analysis engine prevents DocumentBuilderFactory from finding even the platform default implementation.
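
To narrow down which step of that lookup is failing, a small probe along the following lines could be run both outside the engine and from inside an annotator's initialize(). This is only a sketch: the class name FactoryLookupProbe and its describe() method are made up for illustration, and it reports only what the standard JAXP lookup sees from the calling thread, nothing UIMA-specific.

```java
import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

// Hypothetical diagnostic helper, not part of UIMA or Tika.
final class FactoryLookupProbe {
    static final String KEY = "javax.xml.parsers.DocumentBuilderFactory";

    // Reports the inputs to the JAXP lookup as seen from the current thread.
    static String describe() {
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        StringBuilder sb = new StringBuilder();
        sb.append("contextClassLoader=").append(tccl).append('\n');
        // Step 1 of the lookup: the system property.
        sb.append("systemProperty=").append(System.getProperty(KEY)).append('\n');
        // Step 3 of the lookup: a services file visible to the context class loader.
        URL services = (tccl == null) ? null : tccl.getResource("META-INF/services/" + KEY);
        sb.append("servicesFile=").append(services).append('\n');
        try {
            // Runs the full lookup and names the class that was actually chosen.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            sb.append("impl=").append(factory.getClass().getName());
        } catch (FactoryConfigurationError e) {
            sb.append("impl=<lookup failed: ").append(e.getMessage()).append('>');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the two reports differ in contextClassLoader or servicesFile, that would point at the class loader rather than at the JRE configuration.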
>
> Does anyone know how to work around this? Perhaps by adding something to my metadata/install.xml file?
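
One workaround that may be worth trying, on the assumption (not confirmed in this thread) that the failure comes from the thread context class loader in effect while the engine initializes, is to swap that loader temporarily around the failing call and restore it afterwards. The helper below is a generic sketch; ContextClassLoaderSwap, Body, and callWith are illustrative names, not UIMA API.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper: runs a body with a chosen thread context class loader.
final class ContextClassLoaderSwap {

    // A unit of work that may throw a checked exception of type E.
    interface Body<T, E extends Exception> {
        T run() throws E;
    }

    static <T, E extends Exception> T callWith(ClassLoader loader, Body<T, E> body) throws E {
        Thread current = Thread.currentThread();
        ClassLoader saved = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return body.run();
        } finally {
            // Always restore the previous loader, even if the body throws.
            current.setContextClassLoader(saved);
        }
    }

    public static void main(String[] args) {
        // Example: perform the JAXP lookup with the system class loader as context.
        DocumentBuilderFactory factory = callWith(
            ClassLoader.getSystemClassLoader(),
            new Body<DocumentBuilderFactory, RuntimeException>() {
                public DocumentBuilderFactory run() {
                    return DocumentBuilderFactory.newInstance();
                }
            });
        System.out.println(factory.getClass().getName());
    }
}
```

In the setup quoted above, the body would wrap the UIMAFramework.produceAnalysisEngine(specifier, resmgr, params) call, with E bound to ResourceInitializationException. Which loader to pass in is itself a guess; the system class loader is just one candidate.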
>
> Thanks,
>
>
> Greg Holmberg
>
>
>
> org.apache.uima.resource.ResourceInitializationException: Error initializing "org.apache.uima.analysis_engine.impl.PearAnalysisEngineWrapper" from descriptor file:/tmp/taservice/pear/SAPAnalysisEngine/SAPAnalysisEngine_pear.xml. 
> at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:144) 
> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) 
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) 
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:314) 
> at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:425) 
> at com.sap.taservice.controller.UimaPipeline.createAnalysisEngine(UimaPipeline.java:343) 
> at com.sap.taservice.controller.UimaPipeline.execute(UimaPipeline.java:151) 
> at com.sap.taservice.controller.TAServiceWork.execute(TAServiceWork.java:44) 
> at com.sap.job.impl.TaskImpl.execute(TaskImpl.java:104) 
> at com.sap.taservice.job.impl.remote.RemoteWorker.iteration(RemoteWorker.java:52) 
> at com.sap.util.DaemonRunnable.run(DaemonRunnable.java:117) 
> at java.lang.Thread.run(Thread.java:662) 
> Caused by: javax.xml.parsers.FactoryConfigurationError: Provider for javax.xml.parsers.DocumentBuilderFactory cannot be found 
> at javax.xml.parsers.DocumentBuilderFactory.newInstance(Unknown Source) 
> at org.apache.tika.config.TikaConfig.getBuilder(TikaConfig.java:228) 
> at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:66) 
> at org.apache.uima.tika.MarkupAnnotator.initialize(MarkupAnnotator.java:96) 
> at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:252) 
> at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:158) 
> at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) 
> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) 
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) 
> at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387) 
> at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:255) 
> at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:429) 
> at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:373) 
> at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:186) 
> at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) 
> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) 
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) 
> at org.apache.uima.internal.util.ResourcePool.fillPool(ResourcePool.java:243) 
> at org.apache.uima.internal.util.ResourcePool.<init>(ResourcePool.java:100) 
> at org.apache.uima.internal.util.AnalysisEnginePool.<init>(AnalysisEnginePool.java:91) 
> at org.apache.uima.analysis_engine.impl.MultiprocessingAnalysisEngine_impl.initialize(MultiprocessingAnalysisEngine_impl.java:118) 
> at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94) 
> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62) 
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269) 
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:314) 
> at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:425) 
> at org.apache.uima.analysis_engine.impl.PearAnalysisEngineWrapper.initialize(PearAnalysisEngineWrapper.java:269) 
> at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123) 
> ... 11 more 
>
>