Posted to user@uima.apache.org by Bonnie MacKellar <bk...@gmail.com> on 2016/06/22 19:55:12 UTC

problems integrating Ruta and uimaFit

I am still trying to figure out how to count Ruta annotations across a
bunch of input files. There doesn't seem to be any Workbench way to do it.
So now I am trying to call Ruta from UimaFit so I can do the job in Java.

However, I am having serious configuration problems, plus I have a question
on how to bring in PlainTextAnnotator.

I am using Maven, with the jcasgen-maven-plugin, the ruta-maven-plugin, and
the uimafit-maven-plugin. I will include the pom file at the end of this
post.

I want my Java code to be aware of the types declared in the Ruta script -
that is the whole point - I want to count those annotations.

My Ruta script also uses PlainTextAnnotator. The problem is that I can't
figure out where to put its descriptors. In a Workbench-based Ruta project,
PlainTextAnnotator.xml and PlainTextTypeSystem.xml get put automatically
into descriptor/utils, along with a number of other descriptors that seem
to be built into Ruta. But when I create a project using Maven, there is no
such location, and these descriptors do not get put anywhere. I tried a
number of places but could not get my script to see the type system for
PlainTextAnnotator. Eventually I hit on putting the files in
target/generated-sources/ruta/descriptor/utils, and now my script is able
to see the types and I can run it. This is good because at that point the
ruta-maven-plugin does its job and generates the descriptors for my script.
However, I suspect this is not a good place to put the PlainTextAnnotator
files, since a Maven clean deletes everything under target. Where should
they go? Is there an entry needed in the pom file?

The second problem is that although my Ruta script works nicely on its own,
the Java code fails. I get the following exception:

Exception in thread "main" org.apache.uima.cas.CASRuntimeException: JCas type "org.apache.uima.examples.SourceDocumentInformation" used in Java code, but was not declared in the XML type descriptor.
    at org.apache.uima.jcas.impl.JCasImpl.getTypeInit(JCasImpl.java:435)
    at org.apache.uima.jcas.impl.JCasImpl.getType(JCasImpl.java:408)
    at org.apache.uima.jcas.cas.TOP.<init>(TOP.java:96)
    at org.apache.uima.jcas.cas.AnnotationBase.<init>(AnnotationBase.java:66)
    at org.apache.uima.jcas.tcas.Annotation.<init>(Annotation.java:54)
    at org.apache.uima.examples.SourceDocumentInformation.<init>(SourceDocumentInformation.java:80)
    at org.apache.uima.examples.cpe.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:162)
    at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:149)
    at PipelineSystem.<init>(PipelineSystem.java:59)
    at PipelineSystem.main(PipelineSystem.java:73)

I am guessing that I need to put some other descriptor somewhere but I
can't figure out what it might be.  Here is the code that causes the problem
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
import java.io.IOException;
import java.util.Iterator;

import org.apache.uima.UIMAException;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.TypeSystem;
import org.apache.uima.collection.CollectionReaderDescription;
import org.apache.uima.examples.cpe.FileSystemCollectionReader;
import org.apache.uima.fit.component.CasDumpWriter;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.CollectionReaderFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.ResourceInitializationException;
import org.apache.uima.ruta.engine.RutaEngine;

public class PipelineSystem {

    public PipelineSystem() throws IOException, UIMAException {
        try {
            CollectionReaderDescription readerDesc =
                CollectionReaderFactory.createReaderDescription(
                    FileSystemCollectionReader.class,
                    FileSystemCollectionReader.PARAM_INPUTDIR,
                    "/home/bonnie/Research/eclipse-uima-projects/PipeLineWithRuta/input",
                    FileSystemCollectionReader.PARAM_ENCODING, "UTF-8",
                    FileSystemCollectionReader.PARAM_LANGUAGE, "English");
            AnalysisEngine rae = AnalysisEngineFactory.createEngine(
                RutaEngine.class,
                RutaEngine.PARAM_MAIN_SCRIPT, "ecClassifierRules");
            AnalysisEngineDescription rutaEngineDesc =
                AnalysisEngineFactory.createEngineDescription(
                    RutaEngine.class,
                    RutaEngine.PARAM_MAIN_SCRIPT, "ecClassifierRules");
            AnalysisEngineDescription writerDesc =
                AnalysisEngineFactory.createEngineDescription(
                    CasDumpWriter.class,
                    CasDumpWriter.PARAM_OUTPUT_FILE, "dump.txt");
            JCas jCas = rae.newJCas();
            SimplePipeline.runPipeline(readerDesc, rutaEngineDesc);
            displayRutaResults(jCas);
        } catch (ResourceInitializationException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (AnalysisEngineProcessException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException, UIMAException {
        PipelineSystem p = new PipelineSystem();
    }

    public void displayRutaResults(JCas jCas) {
        System.out.println("in display ruta results");
        TypeSystem ts = jCas.getTypeSystem();
        Iterator<Type> typeItr = ts.getTypeIterator();
        while (typeItr.hasNext()) {
            Type type = (Type) typeItr.next();
            if (type.getName().equals("INCL")) {
                System.out.println("INCL was found");
            }
        }
    }
}
------------------------------------------------------------------------------------------------------------------------------------------------

Yes, I know the code doesn't actually count annotations yet - this is
strictly a test of the configuration. The type INCL is declared in the
script

ENGINE utils.PlainTextAnnotator;
TYPESYSTEM utils.PlainTextTypeSystem;
Document{-> RETAINTYPE(BREAK)};
Document{-> EXEC(PlainTextAnnotator, {Line})};

DECLARE INCL;
"INCLUSION" -> INCL;

And finally, here is the pom file. I note that the ruta plugin and the
jcasgen plugin are correctly generating the descriptor files for the
script and the Java classes for the types. I have this set up so that the
jcasgen plugin reads the type descriptors from the folder that is generated
by the ruta-maven-plugin (I saw this in one of the examples mentioned
elsewhere on this mailing list).
However, the uimafit plugin does not generate anything.

Thanks for any help. It is really hard to figure out all these moving parts.

Bonnie MacKellar

---------------------------------------------------------------------------------------------------------------------------------

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>PipeLineWithRuta</groupId>
  <artifactId>PipeLineWithRuta</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>PipeLineWithRuta</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <build>
    <sourceDirectory>src/main/java</sourceDirectory>
    <resources>
      <resource>
        <directory>src/main/ruta</directory>
      </resource>
      <resource>
        <directory>src/desc</directory>
      </resource>
    </resources>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.3</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>

      <plugin>
        <groupId>org.apache.uima</groupId>
        <artifactId>jcasgen-maven-plugin</artifactId>
        <version>2.4.1</version> <!-- change this to the latest version -->
        <executions>
          <execution>
            <goals>
              <goal>generate</goal>
            </goals> <!-- this is the only goal -->
            <!-- runs in phase process-resources by default -->
            <configuration>
              <!-- REQUIRED -->
              <typeSystemIncludes>
                <!-- one or more ant-like file patterns identifying top level descriptors -->
                <typeSystemInclude>target/generated-sources/ruta/descriptor/ecClassifierRulesTypeSystem.xml</typeSystemInclude>
              </typeSystemIncludes>
              <!-- OPTIONAL -->
              <!-- a sequence of ant-like file patterns to exclude from the above include list -->
              <typeSystemExcludes>
              </typeSystemExcludes>
              <!-- OPTIONAL -->
              <!-- where the generated files go -->
              <!-- default value: ${project.build.directory}/generated-sources/jcasgen -->
              <outputDirectory>
              </outputDirectory>
              <!-- true or false, default = false -->
              <!-- if true, then although the complete merged type system will be
                   created internally, only those types whose definition is contained
                   within this maven project will be generated. The others will be
                   presumed to be available via other projects. -->
              <!-- OPTIONAL -->
              <limitToProject>true</limitToProject>
            </configuration>
          </execution>
        </executions>
      </plugin>

      <plugin>
        <groupId>org.apache.uima</groupId>
        <artifactId>ruta-maven-plugin</artifactId>
        <version>2.3.1</version>
        <configuration>
          <scriptPaths>
            <scriptPath>src/main/ruta/</scriptPath>
          </scriptPaths>
          <!-- Descriptor paths of the generated analysis engine descriptor. -->
          <!-- default value: none -->
          <descriptorPaths>
            <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
          </descriptorPaths>
          <!-- Resource paths of the generated analysis engine descriptor. -->
          <!-- default value: none -->
          <resourcePaths>
            <resourcePath>${project.build.directory}/generated-sources/ruta/resources/</resourcePath>
          </resourcePaths>
          <analysisEngineSuffix>Engine</analysisEngineSuffix>
          <typeSystemSuffix>TypeSystem</typeSystemSuffix>
          <!-- Type of type system imports. false = import by location. -->
          <!-- default value: false -->
          <importByName>false</importByName>
          <!-- Option to resolve imports while building. -->
          <!-- default value: false -->
          <resolveImports>false</resolveImports>
          <!-- List of packages with language extensions -->
          <!-- default value: none -->
          <extensionPackages>
            <extensionPackage>org.apache.uima.ruta</extensionPackage>
          </extensionPackages>
          <!-- Add UIMA Ruta nature to .project -->
          <!-- default value: false -->
          <addRutaNature>true</addRutaNature>
          <!-- Buildpath of the UIMA Ruta Workbench (IDE) for this project -->
          <!-- default value: none -->
          <buildPaths>
            <buildPath>script:src/main/ruta/</buildPath>
            <buildPath>descriptor:target/generated-sources/ruta/descriptor/</buildPath>
            <buildPath>resources:src/main/resources/</buildPath>
          </buildPaths>
        </configuration>
        <executions>
          <execution>
            <id>default</id>
            <phase>process-classes</phase>
            <goals>
              <goal>generate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>

      <plugin>
        <groupId>org.apache.uima</groupId>
        <artifactId>uimafit-maven-plugin</artifactId>
        <version>2.2.0</version> <!-- change to latest version -->
        <configuration>
          <!-- OPTIONAL -->
          <!-- Path where the generated resources are written. -->
          <outputDirectory>${project.build.directory}/generated-sources/uimafit</outputDirectory>
          <!-- OPTIONAL -->
          <!-- Skip generation of META-INF/org.apache.uima.fit/components.txt -->
          <skipComponentsManifest>false</skipComponentsManifest>
          <!-- OPTIONAL -->
          <!-- Source file encoding. -->
          <encoding>${project.build.sourceEncoding}</encoding>
        </configuration>
        <executions>
          <execution>
            <id>default</id>
            <phase>process-classes</phase>
            <goals>
              <goal>generate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <dependencies>
    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>uimafit-core</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>uimaj-core</artifactId>
      <version>2.8.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>ruta-maven-plugin</artifactId>
      <version>2.3.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>uimaj-cpe</artifactId>
      <version>2.8.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.uima</groupId>
      <artifactId>uimaj-examples</artifactId>
      <version>2.8.1</version>
    </dependency>
  </dependencies>
</project>

Re: problems integrating Ruta and uimaFit

Posted by Bonnie MacKellar <bk...@gmail.com>.
Hi,
I just wanted to say thanks - your description gave me enough clues that I
finally got this to work. I think I have some questions, though, about WHY
certain things work, but since I am preparing to go out of town, I will
wait on those. I need to understand what I did better so I can configure
these things faster in the future.

thanks,
Bonnie MacKellar

On Thu, Jun 23, 2016 at 5:07 AM, Peter Klügl <pe...@averbis.com>
wrote:

> Hi,
>
>
> sorry, here's just a short reply since  I am currently travelling. If
> the problem still exists I will try to reproduce it and reply with more
> details next week.
>
>
> Yes, in simple UIMA Ruta projects, these descriptors are copied to
> descriptor/utils when you create the project. The descriptor folder is
> listed in the buildpath as a "descriptor" folder, which is where imported
> descriptors are searched for.
>
> UIMA Ruta currently supports two ways to find the descriptors: the
> absolute paths specified in the descriptorPaths configuration parameter
> and the classpath. Thus, the simplest way for you would be to use the
> classpath to find the descriptor instead of the descriptorPaths (which
> points to the descriptor folder of your ruta project).
>
> Changing the imports to something like: UIMAFIT
> org.apache.uima.ruta.engine.PlainTextAnnotator should do the trick (you
> need also to adapt the TYPESYSTEM import). Then the script does not
> depend on the project structure.
>
>
> If you use the SourceDocumentInformation type system in your ruta
> script, then you need to include it separately. In some situations, the
> Ruta Workbench does that automatically for you. However, it is not
> mentioned in types.txt in ruta-core, so you need to add it to a types.txt
> in your Maven project so that the type system scanning of uimaFIT finds it.
>
>
> If you create the analysis engine (descriptor) for a ruta script
> programmatically, there are sometimes additional configuration
> parameters that need to be set. In your use case, you import an additional
> analysis engine in your script. Such engines need to be mentioned in the
> corresponding configuration parameters, e.g., PARAM_ADDITIONAL_ENGINES
> or PARAM_ADDITIONAL_UIMAFIT_ENGINES. Since several of these parameters
> are rather technical, I normally use the generated descriptor in
> the uimaFIT factory.
>
>
> Best,
>
>
> Peter
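
A note on the types.txt point in the reply above: uimaFIT's type system scanning reads the file META-INF/org.apache.uima.fit/types.txt from the classpath, so one way to see whether a Maven build actually put that file where it can be found is to ask the class loader directly. The following is only a sketch using the plain JDK. The class name TypesManifestCheck and its method are made up for illustration, and it merely lists what is visible; it does not validate the entries.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of uimaFIT or Ruta.
final class TypesManifestCheck {

    // Classpath location that uimaFIT scans for type system import patterns.
    static final String MANIFEST = "META-INF/org.apache.uima.fit/types.txt";

    /** Maps each visible copy of the resource (by URL) to its non-blank, trimmed lines. */
    static Map<String, List<String>> findManifests(ClassLoader loader, String resource) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        try {
            for (URL url : Collections.list(loader.getResources(resource))) {
                List<String> entries = new ArrayList<>();
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        String entry = line.trim();
                        if (!entry.isEmpty()) {
                            entries.add(entry);
                        }
                    }
                }
                result.put(url.toString(), entries);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        ClassLoader loader = TypesManifestCheck.class.getClassLoader();
        Map<String, List<String>> found = findManifests(loader, MANIFEST);
        if (found.isEmpty()) {
            System.out.println("No " + MANIFEST + " visible on the classpath");
        }
        found.forEach((location, entries) -> System.out.println(location + " -> " + entries));
    }
}
```

Run with the same classpath as the failing program, this prints one line per types.txt that the class loader can see, which shows quickly whether the copy under src/main/resources made it into target/classes.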

Re: problems integrating Ruta and uimaFit

Posted by Bonnie MacKellar <bk...@gmail.com>.
Actually, one update: after doing a Maven update project, the lines
UIMAFIT org.apache.uima.ruta.engine.PlainTextAnnotator;
TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;

generate a different exception - basically it can't find BasicTypeSystem.xml

Exception in thread "main" org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "org.apache.uima.ruta.engine.RutaEngine" failed. (Descriptor: file:/home/bonnie/Research/eclipse-uima-projects/PipeLineWithRuta/target/classes/ecClassifierRulesEngine.xml)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:264)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:169)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:371)
    at org.apache.uima.ruta.engine.Ruta.wrapAnalysisEngine(Ruta.java:95)
    at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:123)
Caused by: org.apache.uima.resource.ResourceInitializationException
    at org.apache.uima.ruta.engine.RutaEngine.initialize(RutaEngine.java:519)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:262)
    ... 7 more
Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.uima.ruta.engine.RutaEngine.initializeScript(RutaEngine.java:767)
    at org.apache.uima.ruta.engine.RutaEngine.initialize(RutaEngine.java:517)
    ... 8 more
Caused by: org.apache.uima.resource.ResourceInitializationException
    at org.apache.uima.fit.internal.MetaDataUtil.resolve(MetaDataUtil.java:106)
    at org.apache.uima.fit.internal.MetaDataUtil.scanDescriptors(MetaDataUtil.java:170)
    at org.apache.uima.fit.factory.TypeSystemDescriptionFactory.scanTypeDescriptors(TypeSystemDescriptionFactory.java:131)
    at org.apache.uima.fit.factory.TypeSystemDescriptionFactory.createTypeSystemDescription(TypeSystemDescriptionFactory.java:102)
    at org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription(AnalysisEngineFactory.java:967)
    at org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine(AnalysisEngineFactory.java:278)
    at org.apache.uima.ruta.engine.RutaEngine.initializeScript(RutaEngine.java:763)
    ... 9 more
Caused by: java.io.FileNotFoundException: class path resource [classpath*:resources/BasicTypeSystem.xml] cannot be resolved to URL because it does not exist
    at org.springframework.core.io.ClassPathResource.getURL(ClassPathResource.java:177)
    at org.apache.uima.fit.internal.MetaDataUtil.resolve(MetaDataUtil.java:101)
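
The last "Caused by" names the entry that could not be resolved: classpath*:resources/BasicTypeSystem.xml. A rough way to check such an entry by hand, assuming it contains no wildcards, is to strip the classpath*: prefix and ask the class loader for what remains. This is a sketch with the plain JDK only; ClasspathEntryCheck and its methods are hypothetical names, and it does not reproduce uimaFIT's own pattern matching.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of uimaFIT or Ruta.
final class ClasspathEntryCheck {

    /** Strips a leading "classpath*:" or "classpath:" and any leading slashes. */
    static String toResourcePath(String entry) {
        String path = entry.trim();
        if (path.startsWith("classpath*:")) {
            path = path.substring("classpath*:".length());
        } else if (path.startsWith("classpath:")) {
            path = path.substring("classpath:".length());
        }
        while (path.startsWith("/")) {
            path = path.substring(1);
        }
        return path;
    }

    /** Returns every classpath location where the (wildcard-free) entry is found. */
    static List<URL> locate(ClassLoader loader, String entry) {
        try {
            return Collections.list(loader.getResources(toResourcePath(entry)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default is the entry quoted in the stack trace; pass another as the first argument.
        String entry = args.length > 0 ? args[0] : "classpath*:resources/BasicTypeSystem.xml";
        List<URL> hits = locate(ClasspathEntryCheck.class.getClassLoader(), entry);
        System.out.println(hits.isEmpty()
                ? "not on classpath: " + toResourcePath(entry)
                : "found at: " + hits);
    }
}
```

One thing worth keeping in mind when reading the result: Maven copies the contents of each resource directory to the root of target/classes, so a file kept directly in src/main/resources is visible as BasicTypeSystem.xml rather than resources/BasicTypeSystem.xml. Whether that is what happened here cannot be told from the trace alone.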

On Thu, Jun 23, 2016 at 9:21 AM, Bonnie MacKellar <bk...@gmail.com>
wrote:

> Hi,
>
> Thanks.
>
> I am not using SourceDocumentInformation in my Ruta script. There is no
> dependency there - in the version that is in a regular Ruta Workbench
> project, I can remove it and everything is fine.  I believe, from looking
> at the exception, that the dependency is in UimaFit - it seems to be coming
> from SimplePipeline.runPipeline.   I have tried adding it in UimaFit
> fashion, listing it in
> src/main/resources/META-INF/org.apache.uima.fit/types.txt, but I cannot
> seem to get UimaFit to find this file in the Maven version of this project,
> even though it works fine in the non-Maven project.  I just cannot figure
> out why this is happening.
>
> I also don't understand this
> "Changing the imports to something like: UIMAFIT
> org.apache.uima.ruta.engine.PlainTextAnnotator should do the trick (you
> need also to adapt the TYPESYSTEM import). Then the script does not
> depend on the project structure."
>
> Change which imports? Is this something in the pom file? UIMAFIT brings in
> additional UimaFit annotation engines to the Ruta script, right? I am not
> calling or using any UimaFit annotation engines in my Ruta script. I am
> just trying to bring in PlainTextAnnotator. That isn't a UimaFit annotator
> - it is something built in to Ruta.
> I tried changing the lines in the script to
> ENGINE org.apache.uima.ruta.engine.PlainTextAnnotator;
> TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;
>
> but that doesn't work - I get a
> "org.apache.uima.ruta.engine.PlainTextAnnotator not found " on the line
> ENGINE org.apache.uima.ruta.engine.PlainTextAnnotator;
>
> I then tried changing to
> UIMAFIT org.apache.uima.ruta.engine.PlainTextAnnotator;
> TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;
>
> No compile error, but when I run the script, I get
> Found no script/block: PlainTextAnnotator
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.uima.ruta.engine.RutaEngine.batchProcessComplete(RutaEngine.java:1122)
> at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.batchProcessComplete(PrimitiveAnalysisEngine_impl.java:321)
> at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.batchProcessComplete(AnalysisEngineImplBase.java:447)
> at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:133)
>
> Clearly it isn't finding PlainTextAnnotator - but that is the crux of my
> problem. Where do I put it?
>
> I think my problem is that I don't understand what these plugins are all
> doing or how they affect each other: ruta-maven-plugin,
> jcasgen-maven-plugin, and uimafit-maven-plugin.  They all seem to copy
> and/or generate different things to target/classes and
> target/generated-sources, but it is hard to tell exactly which files each
> one is responsible for. I don't have a good mental model of the process!
>
> thanks,
> Bonnie MacKellar

Re: problems integrating Ruta and uimaFit

Posted by Bonnie MacKellar <bk...@gmail.com>.
Hi,

Thanks.

I am not using SourceDocumentInformation in my Ruta script. There is no
dependency there - in the version that is in a regular Ruta Workbench
project, I can remove it and everything is fine.  I believe, from looking
at the exception, that the dependency is in UimaFit - it seems to be coming
from SimplePipeline.runPipeline.   I have tried adding it in UimaFit
fashion, listing it in
src/main/resources/META-INF/org.apache.uima.fit/types.txt, but I cannot
seem to get UimaFit to find this file in the Maven version of this project,
even though it works fine in the non-Maven project.  I just cannot figure
out why this is happening.
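
A quick way to narrow this down is to check whether the file is visible on the runtime classpath at all. The sketch below uses only the JDK; the class name ClasspathProbe is made up for illustration and is not part of uimaFIT or Ruta. It asks a class loader for every location of META-INF/org.apache.uima.fit/types.txt, the resource that uimaFIT's type system scanning looks for. One thing that may be worth checking in the Maven build: an explicit <resources> section in a pom replaces Maven's default src/main/resources directory, so files placed there only reach target/classes if that directory is listed as well.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic helper; not part of uimaFIT or UIMA Ruta. */
class ClasspathProbe {

    /** Returns every classpath location at which the named resource is visible. */
    static List<URL> find(ClassLoader loader, String resourceName) {
        try {
            // getResources yields one URL per classpath entry that contains the name.
            return new ArrayList<>(Collections.list(loader.getResources(resourceName)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default to the manifest file that uimaFIT scans for type descriptors.
        String name = args.length > 0 ? args[0] : "META-INF/org.apache.uima.fit/types.txt";
        List<URL> hits = find(ClasspathProbe.class.getClassLoader(), name);
        if (hits.isEmpty()) {
            System.out.println("NOT on classpath: " + name);
        } else {
            hits.forEach(url -> System.out.println("found: " + url));
        }
    }
}
```

Run with the same classpath as the pipeline, it prints either the locations that were found or a "NOT on classpath" line, which at least separates a packaging problem from a uimaFIT configuration problem.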

I also don't understand this
"Changing the imports to something like: UIMAFIT
org.apache.uima.ruta.engine.PlainTextAnnotator should do the trick (you
need also to adapt the TYPESYSTEM import). Then the script does not
depend on the project structure."

Change which imports? Is this something in the pom file? UIMAFIT brings in
additional UimaFit annotation engines to the Ruta script, right? I am not
calling or using any UimaFit annotation engines in my Ruta script. I am
just trying to bring in PlainTextAnnotator. That isn't a UimaFit annotator
- it is something built in to Ruta.
I tried changing the lines in the script to
ENGINE org.apache.uima.ruta.engine.PlainTextAnnotator;
TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;

but that doesn't work - I get a
"org.apache.uima.ruta.engine.PlainTextAnnotator not found " on the line
ENGINE org.apache.uima.ruta.engine.PlainTextAnnotator;

I then tried changing to
UIMAFIT org.apache.uima.ruta.engine.PlainTextAnnotator;
TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;

No compile error, but when I run the script, I get
Found no script/block: PlainTextAnnotator
Exception in thread "main" java.lang.NullPointerException
at org.apache.uima.ruta.engine.RutaEngine.batchProcessComplete(RutaEngine.java:1122)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.batchProcessComplete(PrimitiveAnalysisEngine_impl.java:321)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.batchProcessComplete(AnalysisEngineImplBase.java:447)
at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:133)

Clearly it isn't finding PlainTextAnnotator - but that is the crux of my
problem. Where do I put it?

I think my problem is that I don't understand what these plugins are all
doing or how they affect each other: ruta-maven-plugin,
jcasgen-maven-plugin, and uimafit-maven-plugin.  They all seem to copy
and/or generate different things to target/classes and
target/generated-sources, but it is hard to tell exactly which files each
one is responsible for. I don't have a good mental model of the process!
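
To see what each plugin actually wrote, it can help to list the generated files grouped by output folder. The following is a minimal JDK-only sketch (GeneratedFiles is a hypothetical helper name). It assumes the output locations from the pom in this thread: target/generated-sources/ruta for the ruta-maven-plugin, target/generated-sources/jcasgen as the jcasgen default, and target/generated-sources/uimafit for the uimafit-maven-plugin.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: groups build output by the folder each plugin writes to. */
class GeneratedFiles {

    /** Groups '/'-separated relative paths by their first path segment. */
    static Map<String, List<String>> groupByFirstSegment(List<String> relativePaths) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String path : relativePaths) {
            int slash = path.indexOf('/');
            String top = slash < 0 ? "(root)" : path.substring(0, slash);
            groups.computeIfAbsent(top, key -> new ArrayList<>()).add(path);
        }
        return groups;
    }

    /** Lists regular files below root as sorted, '/'-separated relative paths. */
    static List<String> listFiles(Path root) {
        if (!Files.isDirectory(root)) {
            return new ArrayList<>(); // nothing generated yet, or wrong directory
        }
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .map(file -> root.relativize(file).toString().replace('\\', '/'))
                       .sorted()
                       .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed default: the generated-sources folder of the current Maven project.
        Path root = Paths.get(args.length > 0 ? args[0] : "target/generated-sources");
        groupByFirstSegment(listFiles(root)).forEach((dir, files) -> {
            System.out.println(dir + " (" + files.size() + " files)");
            files.forEach(file -> System.out.println("  " + file));
        });
    }
}
```

Running it once after each build phase (for example after process-resources and again after process-classes) shows which folder gained files, and so which plugin ran in that phase.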

thanks,
Bonnie MacKellar

On Thu, Jun 23, 2016 at 5:07 AM, Peter Klügl <pe...@averbis.com>
wrote:

> Hi,
>
>
> Sorry, here's just a short reply since I am currently travelling. If
> the problem still exists I will try to reproduce it and reply with more
> details next week.
>
>
> Yes, in simple UIMA Ruta projects, these descriptors are copied to
> descriptor/utils when you create the project. The descriptor folder is
> listed in the buildpath as a "descriptor" folder, which is where imported
> descriptors are searched for.
>
> UIMA Ruta currently supports two ways to find the descriptors: the
> absolute paths specified in the descriptorPaths configuration parameter
> and the classpath. Thus, the simplest way for you would be to use the
> classpath to find the descriptor instead of the descriptorPaths (which
> points to the descriptor folder of your ruta project).
>
> Changing the imports to something like: UIMAFIT
> org.apache.uima.ruta.engine.PlainTextAnnotator should do the trick (you
> need also to adapt the TYPESYSTEM import). Then the script does not
> depend on the project structure.
>
>
> If you use the SourceDocumentInformation type system in your ruta
> script, then you need to include it separately. In some situations, the
> Ruta Workbench does that automatically for you. However, it is not
> mentioned in types.txt in ruta-core. So you need to add it there in your
> maven project so that the typesystem scanning of uimaFIT finds it.
>
>
> If you create the analysis engine (descriptor) for a ruta script
> programmatically, there are sometimes additional configuration
> parameters that need to be set. In your use case, you import additional
> analysis engines in your script. These need to be mentioned in the
> corresponding configuration parameters, e.g., PARAM_ADDITIONAL_ENGINES
> or PARAM_ADDITIONAL_UIMAFIT_ENGINES. Since several of these parameters
> are rather technical, I normally use the generated descriptor in
> the uimaFIT factory.
>
>
> Best,
>
>
> Peter
>
>
> On 22.06.2016 at 21:55, Bonnie MacKellar wrote:
> > I am still trying to figure out how to count Ruta annotations across a
> > bunch of input files. There doesn't seem to be any Workbench way to do it.
> > So now I am trying to call Ruta from UimaFit so I can do the job in Java.
> >
> > However, I am having serious configuration problems, plus I have a question
> > on how to bring in PlainTextAnnotator.
> >
> > I am using Maven, with the jcasgen-maven-plugin, the ruta-maven-plugin, and
> > the uimafit-maven-plugin. I will include the pom file at the end of this
> > post.
> >
> > I want my Java code to be aware of the types declared in the Ruta script -
> > that is the whole point - I want to count those annotations.
> >
> > My Ruta script also uses PlainTextAnnotator. The problem with this is that
> > I can't figure out where to put it. In a Workbench based Ruta project,
> > PlainTextAnnotator.xml and PlainTextAnnotatorTypeSystem get put
> > automatically into descriptor/utils, along with a number of other
> > descriptors that seem to be built into Ruta. But when I create a project
> > using maven, there is no such location, and these descriptors do not get
> > put anywhere. I tried a number of places but could not get my script to see
> > the type system for PlainTextAnnotator. Finally, I hit on putting the files
> > in target/generated-sources/ruta/descriptor/utils, and finally my script is
> > able to see the types and I can run it. This is good because at that point,
> > the ruta-maven-plugin does its job and generates the descriptors for my
> > script. However, I suspect this is not a good place to put the
> > PlainTextAnnotator files since doing a clean overwrites them. Where should
> > they go? Is there any entry in the pom file that is needed?
> >
> > The second problem is that although my Ruta script works nicely on its own,
> > the Java code fails.  I get the following exception
> > Exception in thread "main" org.apache.uima.cas.CASRuntimeException: JCas
> > type "org.apache.uima.examples.SourceDocumentInformation" used in Java
> > code,  but was not declared in the XML type descriptor.
> > at org.apache.uima.jcas.impl.JCasImpl.getTypeInit(JCasImpl.java:435)
> > at org.apache.uima.jcas.impl.JCasImpl.getType(JCasImpl.java:408)
> > at org.apache.uima.jcas.cas.TOP.<init>(TOP.java:96)
> > at org.apache.uima.jcas.cas.AnnotationBase.<init>(AnnotationBase.java:66)
> > at org.apache.uima.jcas.tcas.Annotation.<init>(Annotation.java:54)
> > at org.apache.uima.examples.SourceDocumentInformation.<init>(SourceDocumentInformation.java:80)
> > at org.apache.uima.examples.cpe.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:162)
> > at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:149)
> > at PipelineSystem.<init>(PipelineSystem.java:59)
> > at PipelineSystem.main(PipelineSystem.java:73)
> >
> > I am guessing that I need to put some other descriptor somewhere but I
> > can't figure out what it might be.  Here is the code that causes the problem
> >
> > ------------------------------------------------------------------------
> > import java.io.IOException;
> > import java.util.Iterator;
> >
> > import org.apache.uima.UIMAException;
> > import org.apache.uima.analysis_engine.AnalysisEngine;
> > import org.apache.uima.analysis_engine.AnalysisEngineDescription;
> > import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> > import org.apache.uima.cas.Type;
> > import org.apache.uima.cas.TypeSystem;
> > import org.apache.uima.collection.CollectionReaderDescription;
> > import org.apache.uima.examples.cpe.FileSystemCollectionReader;
> > import org.apache.uima.fit.component.CasDumpWriter;
> > import org.apache.uima.fit.factory.AnalysisEngineFactory;
> > import org.apache.uima.fit.factory.CollectionReaderFactory;
> > import org.apache.uima.fit.pipeline.SimplePipeline;
> > import org.apache.uima.jcas.JCas;
> > import org.apache.uima.resource.ResourceInitializationException;
> > import org.apache.uima.ruta.engine.RutaEngine;
> >
> > public class PipelineSystem {
> >
> >     public PipelineSystem() throws IOException, UIMAException {
> >         try {
> >             CollectionReaderDescription readerDesc =
> >                     CollectionReaderFactory.createReaderDescription(
> >                             FileSystemCollectionReader.class,
> >                             FileSystemCollectionReader.PARAM_INPUTDIR,
> >                             "/home/bonnie/Research/eclipse-uima-projects/PipeLineWithRuta/input",
> >                             FileSystemCollectionReader.PARAM_ENCODING, "UTF-8",
> >                             FileSystemCollectionReader.PARAM_LANGUAGE, "English");
> >             AnalysisEngine rae = AnalysisEngineFactory.createEngine(
> >                     RutaEngine.class,
> >                     RutaEngine.PARAM_MAIN_SCRIPT, "ecClassifierRules");
> >             AnalysisEngineDescription rutaEngineDesc =
> >                     AnalysisEngineFactory.createEngineDescription(
> >                             RutaEngine.class,
> >                             RutaEngine.PARAM_MAIN_SCRIPT, "ecClassifierRules");
> >             AnalysisEngineDescription writerDesc =
> >                     AnalysisEngineFactory.createEngineDescription(
> >                             CasDumpWriter.class,
> >                             CasDumpWriter.PARAM_OUTPUT_FILE, "dump.txt");
> >             JCas jCas = rae.newJCas();
> >             SimplePipeline.runPipeline(readerDesc, rutaEngineDesc);
> >             displayRutaResults(jCas);
> >         } catch (ResourceInitializationException e) {
> >             // TODO Auto-generated catch block
> >             e.printStackTrace();
> >         } catch (AnalysisEngineProcessException e) {
> >             // TODO Auto-generated catch block
> >             e.printStackTrace();
> >         }
> >     }
> >
> >     public static void main(String[] args) throws IOException, UIMAException {
> >         PipelineSystem p = new PipelineSystem();
> >     }
> >
> >     public void displayRutaResults(JCas jCas) {
> >         System.out.println("in display ruta results");
> >         TypeSystem ts = jCas.getTypeSystem();
> >         Iterator<Type> typeItr = ts.getTypeIterator();
> >         while (typeItr.hasNext()) {
> >             Type type = (Type) typeItr.next();
> >             if (type.getName().equals("INCL")) {
> >                 System.out.println("INCL was found");
> >             }
> >         }
> >     }
> > }
> >
> > ------------------------------------------------------------------------
> >
> > Yes, I know the code doesn't actually count annotations yet - this is
> > strictly a test of the configuration. The type INCL is declared in the
> > script
> >
> > ENGINE utils.PlainTextAnnotator;
> > TYPESYSTEM utils.PlainTextTypeSystem;
> > Document{-> RETAINTYPE(BREAK)};
> > Document{-> EXEC(PlainTextAnnotator, {Line})};
> >
> > DECLARE INCL;
> > "INCLUSION" -> INCL;
> >
> > And finally, here is the pom file. I note that the ruta plugin and the
> > jcasgen plugin are correctly generating the descriptor files for the
> > script and the Java classes for the types. I have this set up so that the
> > jcasgen plugin reads the type descriptors from the folder that is generated
> > by the ruta-maven-plugin (I saw this in one of the examples mentioned
> > elsewhere on this mailing list).
> > However, the uimafit plugin does not generate anything.
> >
> > Thanks for any help. It is really hard to figure out all these moving parts.
> >
> > Bonnie MacKellar
> >
> >
> > ------------------------------------------------------------------------
> >
> > <project xmlns="http://maven.apache.org/POM/4.0.0"
> >     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
> >   <modelVersion>4.0.0</modelVersion>
> >   <groupId>PipeLineWithRuta</groupId>
> >   <artifactId>PipeLineWithRuta</artifactId>
> >   <version>0.0.1-SNAPSHOT</version>
> >   <packaging>jar</packaging>
> >   <name>PipeLineWithRuta</name>
> >   <url>http://maven.apache.org</url>
> >   <properties>
> >     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
> >   </properties>
> >   <build>
> >     <sourceDirectory>src/main/java</sourceDirectory>
> >     <resources>
> >       <resource>
> >         <directory>src/main/ruta</directory>
> >       </resource>
> >       <resource>
> >         <directory>src/desc</directory>
> >       </resource>
> >     </resources>
> >     <plugins>
> >       <plugin>
> >         <artifactId>maven-compiler-plugin</artifactId>
> >         <version>3.3</version>
> >         <configuration>
> >           <source>1.8</source>
> >           <target>1.8</target>
> >         </configuration>
> >       </plugin>
> >       <plugin>
> >         <groupId>org.apache.uima</groupId>
> >         <artifactId>jcasgen-maven-plugin</artifactId>
> >         <version>2.4.1</version> <!-- change this to the latest version -->
> >         <executions>
> >           <execution>
> >             <goals>
> >               <goal>generate</goal>
> >             </goals> <!-- this is the only goal -->
> >             <!-- runs in phase process-resources by default -->
> >             <configuration>
> >               <!-- REQUIRED -->
> >               <typeSystemIncludes>
> >                 <!-- one or more ant-like file patterns identifying
> >                      top level descriptors -->
> >                 <typeSystemInclude>target/generated-sources/ruta/descriptor/ecClassifierRulesTypeSystem.xml</typeSystemInclude>
> >               </typeSystemIncludes>
> >               <!-- OPTIONAL -->
> >               <!-- a sequence of ant-like file patterns to exclude from
> >                    the above include list -->
> >               <typeSystemExcludes>
> >               </typeSystemExcludes>
> >               <!-- OPTIONAL -->
> >               <!-- where the generated files go -->
> >               <!-- default value:
> >                    ${project.build.directory}/generated-sources/jcasgen" -->
> >               <outputDirectory> </outputDirectory>
> >               <!-- true or false, default = false -->
> >               <!-- if true, then although the complete merged type system
> >                    will be created internally, only those types whose
> >                    definition is contained within this maven project will
> >                    be generated. The others will be presumed to be
> >                    available via other projects. -->
> >               <!-- OPTIONAL -->
> >               <limitToProject>true</limitToProject>
> >             </configuration>
> >           </execution>
> >         </executions>
> >       </plugin>
> >       <plugin>
> >         <groupId>org.apache.uima</groupId>
> >         <artifactId>ruta-maven-plugin</artifactId>
> >         <version>2.3.1</version>
> >         <configuration>
> >           <scriptPaths>
> >             <scriptPath>src/main/ruta/</scriptPath>
> >           </scriptPaths>
> >           <!-- Descriptor paths of the generated analysis engine descriptor. -->
> >           <!-- default value: none -->
> >           <descriptorPaths>
> >             <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
> >           </descriptorPaths>
> >           <!-- Resource paths of the generated analysis engine descriptor. -->
> >           <!-- default value: none -->
> >           <resourcePaths>
> >             <resourcePath>${project.build.directory}/generated-sources/ruta/resources/</resourcePath>
> >           </resourcePaths>
> >           <analysisEngineSuffix>Engine</analysisEngineSuffix>
> >           <typeSystemSuffix>TypeSystem</typeSystemSuffix>
> >           <!-- Type of type system imports. false = import by location. -->
> >           <!-- default value: false -->
> >           <importByName>false</importByName>
> >           <!-- Option to resolve imports while building. -->
> >           <!-- default value: false -->
> >           <resolveImports>false</resolveImports>
> >           <!-- List of packages with language extensions -->
> >           <!-- default value: none -->
> >           <extensionPackages>
> >             <extensionPackage>org.apache.uima.ruta</extensionPackage>
> >           </extensionPackages>
> >           <!-- Add UIMA Ruta nature to .project -->
> >           <!-- default value: false -->
> >           <addRutaNature>true</addRutaNature>
> >           <!-- Buildpath of the UIMA Ruta Workbench (IDE) for this project -->
> >           <!-- default value: none -->
> >           <buildPaths>
> >             <buildPath>script:src/main/ruta/</buildPath>
> >             <buildPath>descriptor:target/generated-sources/ruta/descriptor/</buildPath>
> >             <buildPath>resources:src/main/resources/</buildPath>
> >           </buildPaths>
> >         </configuration>
> >         <executions>
> >           <execution>
> >             <id>default</id>
> >             <phase>process-classes</phase>
> >             <goals>
> >               <goal>generate</goal>
> >             </goals>
> >           </execution>
> >         </executions>
> >       </plugin>
> >       <plugin>
> >         <groupId>org.apache.uima</groupId>
> >         <artifactId>uimafit-maven-plugin</artifactId>
> >         <version>2.2.0</version> <!-- change to latest version -->
> >         <configuration>
> >           <!-- OPTIONAL -->
> >           <!-- Path where the generated resources are written. -->
> >           <outputDirectory>${project.build.directory}/generated-sources/uimafit</outputDirectory>
> > <!-- OPTIONAL --> <!-- Skip generation of
> > META-INF/org.apache.uima.fit/components.txt -->
> > <skipComponentsManifest>false</skipComponentsManifest> <!-- OPTIONAL -->
> > <!-- Source file encoding. -->
> > <encoding>${project.build.sourceEncoding}</encoding> </configuration>
> > <executions> <execution> <id>default</id> <phase>process-classes</phase>
> > <goals> <goal>generate</goal> </goals> </execution> </executions>
> </plugin>
> > </plugins> </build> <dependencies> <dependency>
> > <groupId>org.apache.uima</groupId> <artifactId>uimafit-core</artifactId>
> > <version>2.2.0</version> </dependency> <dependency>
> > <groupId>org.apache.uima</groupId> <artifactId>uimaj-core</artifactId>
> > <version>2.8.1</version> </dependency> <dependency>
> > <groupId>org.apache.uima</groupId>
> > <artifactId>ruta-maven-plugin</artifactId> <version>2.3.1</version>
> > </dependency> <dependency> <groupId>org.apache.uima</groupId>
> > <artifactId>uimaj-cpe</artifactId> <version>2.8.1</version> </dependency>
> > <dependency> <groupId>org.apache.uima</groupId>
> > <artifactId>uimaj-examples</artifactId> <version>2.8.1</version>
> > </dependency> </dependencies> </project>
> >
>
>

Re: problems integrating Ruta and uimaFit

Posted by Peter Klügl <pe...@averbis.com>.
Hi,


Sorry, this is just a short reply since I am currently travelling. If
the problem persists, I will try to reproduce it and reply with more
details next week.


Yes, in simple UIMA Ruta projects, these descriptors are copied to
descriptor/utils when you create the project. The descriptor folder is
listed in the build path as a "descriptor" folder, which is where
imported descriptors are searched for.

UIMA Ruta currently supports two ways to find the descriptors: the
absolute paths specified in the descriptorPaths configuration parameter
and the classpath. Thus, the simplest way for you would be to use the
classpath to find the descriptor instead of the descriptorPaths (which
point to the descriptor folder of your Ruta project).

Changing the import to something like: UIMAFIT
org.apache.uima.ruta.engine.PlainTextAnnotator should do the trick (you
also need to adapt the TYPESYSTEM import). Then the script does not
depend on the project structure.


If you use the SourceDocumentInformation type system in your Ruta
script, then you need to include it separately. In some situations, the
Ruta Workbench does that automatically for you. However, it is not
mentioned in types.txt in ruta-core, so you need to list it in a
types.txt in your own Maven project so that the type system scanning of
uimaFIT finds it.
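
To check what the uimaFIT type system scanning can actually see, a small
diagnostic along the following lines may help. It is only a sketch using
the plain JDK: the class and method names (TypesTxtCheck,
declaredPatterns) are made up for illustration, and it assumes the usual
uimaFIT convention that the scan reads location patterns from
META-INF/org.apache.uima.fit/types.txt on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of uimaFIT or Ruta.
public class TypesTxtCheck {

    // Assumed manifest location used by the uimaFIT type system scan.
    static final String TYPES_MANIFEST = "META-INF/org.apache.uima.fit/types.txt";

    // Collects every non-blank line from all types.txt files visible to the loader.
    static List<String> declaredPatterns(ClassLoader loader) throws IOException {
        List<String> patterns = new ArrayList<>();
        Enumeration<URL> manifests = loader.getResources(TYPES_MANIFEST);
        while (manifests.hasMoreElements()) {
            URL url = manifests.nextElement();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    line = line.trim();
                    if (!line.isEmpty()) {
                        patterns.add(line);
                    }
                }
            }
        }
        return patterns;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // Prints one declared type system location pattern per line.
        for (String pattern : declaredPatterns(loader)) {
            System.out.println(pattern);
        }
    }
}
```

If no printed pattern covers the SourceDocumentInformation descriptor,
the scan will not pick it up. A types.txt under
src/main/resources/META-INF/org.apache.uima.fit/ with a line such as
classpath*:org/apache/uima/examples/SourceDocumentInformation.xml would
be one way to add it; that resource path is my assumption about where
uimaj-examples keeps the descriptor, so please verify it against the jar.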


If you create the analysis engine (descriptor) for a Ruta script
programmatically, there are sometimes additional configuration
parameters that need to be set. In your use case, you import an
additional analysis engine in your script. Such engines need to be
mentioned in the corresponding configuration parameters, e.g.,
PARAM_ADDITIONAL_ENGINES or PARAM_ADDITIONAL_UIMAFIT_ENGINES. Since
several of these parameters are rather technical, I normally use the
generated descriptor in the uimaFIT factory.


Best,


Peter


On 22.06.2016 at 21:55, Bonnie MacKellar wrote:
> I am still trying to figure out how to count Ruta annotations across a
> bunch of input files. There doesn't seem to be any Workbench way to do it.
> So now I am trying to call Ruta from UimaFit so I can do the job in Java.
>
> However, I am having serious configuration problems, plus I have a question
> on how to bring in PlainTextAnnotator.
>
> I am using Maven, with the jcasgen-maven-plugin, the ruta-maven-plugin, and
> the uimafit-maven-plugin. I will include the pom file at the end of this
> post.
>
> I want my Java code to be aware of the types declared in the Ruta script -
> that is the whole point - I want to count those annotations.
>
> My Ruta script also uses PlainTextAnnotator. The problem with this is that
> I can't figure out where to put it. In a Workbench based Ruta project,
> PlainTextAnnotator.xml and PlainTextAnnotatorTypeSystem get put
> automatically into descriptor/utils, along with a number of other
> descriptors that seem to be built into Ruta. But when I create a project
> using maven, there is no such location, and these descriptors do not get
> put anywhere. I tried a number of places but could not get my script to see
> the type system for PlainTextAnnotator. Finally, I hit on putting the files
> in target/generated-sources/ruta/descriptor/utils, and finally my script is
> able to see the types and I can run it. This is good because at that point,
> the ruta-maven-plugin does its job and generates the descriptors for my
> script. However, I suspect this is not a good place to put the
> PlainTextAnnotator files since doing a clean overwrites them. Where should
> they go? Is there any entry in the pom file that is needed?
>
> The second problem is that although my Ruta script works nicely on its own,
> the Java code fails.  I get the following exception
> Exception in thread "main" org.apache.uima.cas.CASRuntimeException: JCas
> type "org.apache.uima.examples.SourceDocumentInformation" used in Java
> code,  but was not declared in the XML type descriptor.
> at org.apache.uima.jcas.impl.JCasImpl.getTypeInit(JCasImpl.java:435)
> at org.apache.uima.jcas.impl.JCasImpl.getType(JCasImpl.java:408)
> at org.apache.uima.jcas.cas.TOP.<init>(TOP.java:96)
> at org.apache.uima.jcas.cas.AnnotationBase.<init>(AnnotationBase.java:66)
> at org.apache.uima.jcas.tcas.Annotation.<init>(Annotation.java:54)
> at
> org.apache.uima.examples.SourceDocumentInformation.<init>(SourceDocumentInformation.java:80)
> at
> org.apache.uima.examples.cpe.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:162)
> at
> org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:149)
> at PipelineSystem.<init>(PipelineSystem.java:59)
> at PipelineSystem.main(PipelineSystem.java:73)
>
> I am guessing that I need to put some other descriptor somewhere but I
> can't figure out what it might be.  Here is the code that causes the problem
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> import java.io.IOException;
> import java.util.Iterator;
>
> import org.apache.uima.UIMAException;
> import org.apache.uima.analysis_engine.AnalysisEngine;
> import org.apache.uima.analysis_engine.AnalysisEngineDescription;
> import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> import org.apache.uima.cas.Type;
> import org.apache.uima.cas.TypeSystem;
> import org.apache.uima.collection.CollectionReaderDescription;
> import org.apache.uima.examples.cpe.FileSystemCollectionReader;
> import org.apache.uima.fit.component.CasDumpWriter;
> import org.apache.uima.fit.factory.AnalysisEngineFactory;
> import org.apache.uima.fit.factory.CollectionReaderFactory;
> import org.apache.uima.fit.pipeline.SimplePipeline;
> import org.apache.uima.jcas.JCas;
> import org.apache.uima.resource.ResourceInitializationException;
> import org.apache.uima.ruta.engine.RutaEngine;
>
> public class PipelineSystem  {
> public PipelineSystem() throws IOException, UIMAException
> {
> try {
> CollectionReaderDescription readerDesc =
> CollectionReaderFactory.createReaderDescription(
> FileSystemCollectionReader.class,
>            FileSystemCollectionReader.PARAM_INPUTDIR,
>  "/home/bonnie/Research/eclipse-uima-projects/PipeLineWithRuta/input",
>            FileSystemCollectionReader.PARAM_ENCODING,  "UTF-8",
>            FileSystemCollectionReader.PARAM_LANGUAGE,  "English");
> AnalysisEngine rae = AnalysisEngineFactory.createEngine(RutaEngine.class,
> RutaEngine.PARAM_MAIN_SCRIPT,
>            "ecClassifierRules");
> AnalysisEngineDescription rutaEngineDesc =
> AnalysisEngineFactory.createEngineDescription(RutaEngine.class,
> RutaEngine.PARAM_MAIN_SCRIPT,
>            "ecClassifierRules");
> AnalysisEngineDescription writerDesc =
> AnalysisEngineFactory.createEngineDescription(CasDumpWriter.class,
> CasDumpWriter.PARAM_OUTPUT_FILE, "dump.txt");
> JCas jCas = rae.newJCas();
> SimplePipeline.runPipeline(readerDesc, rutaEngineDesc);
> displayRutaResults(jCas);
> } catch (ResourceInitializationException e) {
> // TODO Auto-generated catch block
> e.printStackTrace();
> } catch (AnalysisEngineProcessException e) {
> // TODO Auto-generated catch block
> e.printStackTrace();
> }
> }
>
> public static void main(String[] args) throws IOException, UIMAException  {
> PipelineSystem p = new PipelineSystem();
>
> }
>
> public void displayRutaResults(JCas jCas)
> {
> System.out.println("in display ruta results");
> TypeSystem ts = jCas.getTypeSystem();
> Iterator<Type> typeItr = ts.getTypeIterator();
> while (typeItr.hasNext()) {
> Type type = (Type) typeItr.next();
> if (type.getName().equals("INCL")) {
> System.out.println("INCL was found");
> }
> }
> }
> }
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
> Yes, I know the code doesn't actually count annotations yet - this is
> strictly a test of the configuration. The type INCL is declared in the
> script
>
> ENGINE utils.PlainTextAnnotator; TYPESYSTEM utils.PlainTextTypeSystem;
> Document{-> RETAINTYPE(BREAK)}; Document{-> EXEC(PlainTextAnnotator,
> {Line})};
>
> DECLARE INCL; "INCLUSION" -> INCL;
>
> And finally, here is the pom file. I note that the ruta plugin and the
> jcasgen plugin are correctly generating the descriptor files for the
> script and the Java classes for the types. I have this set up so that the
> jcasgen plugin reads the type descriptors from the folder that is generated
> by the ruta-maven-plugin (I saw this in one of the examples mentioned
> elsewhere on this mailing list).
> However, the uimafit plugin does not generate anything.
>
> thanks for any help. It is really hard to figure out all these moving parts.
>
> Bonnie MacKellar
>
> ---------------------------------------------------------------------------------------------------------------------------------
>
> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="
> http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="
> http://maven.apache.org/POM/4.0.0
> http://maven.apache.org/xsd/maven-4.0.0.xsd">
> <modelVersion>4.0.0</modelVersion> <groupId>PipeLineWithRuta</groupId>
> <artifactId>PipeLineWithRuta</artifactId> <version>0.0.1-SNAPSHOT</version>
> <packaging>jar</packaging> <name>PipeLineWithRuta</name> <url>
> http://maven.apache.org</url> <properties>
> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
> </properties> <build> <sourceDirectory>src/main/java</sourceDirectory>
> <resources> <resource> <directory>src/main/ruta</directory> </resource>
> <resource> <directory>src/desc</directory> </resource> </resources>
> <plugins> <plugin> <artifactId>maven-compiler-plugin</artifactId>
> <version>3.3</version> <configuration> <source>1.8</source>
> <target>1.8</target> </configuration> </plugin> <plugin>
> <groupId>org.apache.uima</groupId>
> <artifactId>jcasgen-maven-plugin</artifactId> <version>2.4.1</version> <!--
> change this to the latest version --> <executions> <execution> <goals>
> <goal>generate</goal> </goals> <!-- this is the only goal --> <!-- runs in
> phase process-resources by default --> <configuration> <!-- REQUIRED -->
> <typeSystemIncludes> <!-- one or more ant-like file patterns identifying
> top level descriptors -->
> <typeSystemInclude>target/generated-sources/ruta/descriptor/ecClassifierRulesTypeSystem.xml</typeSystemInclude>
> </typeSystemIncludes> <!-- OPTIONAL --> <!-- a sequence of ant-like file
> patterns to exclude from the above include list --> <typeSystemExcludes>
> </typeSystemExcludes> <!-- OPTIONAL --> <!-- where the generated files go
> --> <!-- default value:
> ${project.build.directory}/generated-sources/jcasgen" --> <outputDirectory>
> </outputDirectory> <!-- true or false, default = false --> <!-- if true,
> then although the complete merged type system will be created internally,
> only those types whose definition is contained within this maven project
> will be generated. The others will be presumed to be available via other
> projects. --> <!-- OPTIONAL --> <limitToProject>true</limitToProject>
> </configuration> </execution> </executions> </plugin> <plugin>
> <groupId>org.apache.uima</groupId>
> <artifactId>ruta-maven-plugin</artifactId> <version>2.3.1</version>
> <configuration> <scriptPaths> <scriptPath>src/main/ruta/</scriptPath>
> </scriptPaths> <!-- Descriptor paths of the generated analysis engine
> descriptor. --> <!-- default value: none --> <descriptorPaths>
> <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
> </descriptorPaths> <!-- Resource paths of the generated analysis engine
> descriptor. --> <!-- default value: none --> <resourcePaths>
> <resourcePath>${project.build.directory}/generated-sources/ruta/
> resources/</resourcePath> </resourcePaths>
> <analysisEngineSuffix>Engine</analysisEngineSuffix>
> <typeSystemSuffix>TypeSystem</typeSystemSuffix> <!-- Type of type system
> imports. false = import by location. --> <!-- default value: false -->
> <importByName>false</importByName> <!-- Option to resolve imports while
> building. --> <!-- default value: false -->
> <resolveImports>false</resolveImports> <!-- List of packages with language
> extensions --> <!-- default value: none --> <extensionPackages>
> <extensionPackage>org.apache.uima.ruta</extensionPackage>
> </extensionPackages> <!-- Add UIMA Ruta nature to .project --> <!-- default
> value: false --> <addRutaNature>true</addRutaNature> <!-- Buildpath of the
> UIMA Ruta Workbench (IDE) for this project --> <!-- default value: none -->
> <buildPaths> <buildPath>script:src/main/ruta/</buildPath>
> <buildPath>descriptor:target/generated-sources/ruta/descriptor/
> </buildPath> <buildPath>resources:src/main/resources/</buildPath>
> </buildPaths> </configuration> <executions> <execution> <id>default</id>
> <phase>process-classes</phase> <goals> <goal>generate</goal> </goals>
> </execution> </executions> </plugin> <plugin>
> <groupId>org.apache.uima</groupId>
> <artifactId>uimafit-maven-plugin</artifactId> <version>2.2.0</version> <!--
> change to latest version --> <configuration> <!-- OPTIONAL --> <!-- Path
> where the generated resources are written. --> <outputDirectory>
> ${project.build.directory}/generated-sources/uimafit </outputDirectory>
> <!-- OPTIONAL --> <!-- Skip generation of
> META-INF/org.apache.uima.fit/components.txt -->
> <skipComponentsManifest>false</skipComponentsManifest> <!-- OPTIONAL -->
> <!-- Source file encoding. -->
> <encoding>${project.build.sourceEncoding}</encoding> </configuration>
> <executions> <execution> <id>default</id> <phase>process-classes</phase>
> <goals> <goal>generate</goal> </goals> </execution> </executions> </plugin>
> </plugins> </build> <dependencies> <dependency>
> <groupId>org.apache.uima</groupId> <artifactId>uimafit-core</artifactId>
> <version>2.2.0</version> </dependency> <dependency>
> <groupId>org.apache.uima</groupId> <artifactId>uimaj-core</artifactId>
> <version>2.8.1</version> </dependency> <dependency>
> <groupId>org.apache.uima</groupId>
> <artifactId>ruta-maven-plugin</artifactId> <version>2.3.1</version>
> </dependency> <dependency> <groupId>org.apache.uima</groupId>
> <artifactId>uimaj-cpe</artifactId> <version>2.8.1</version> </dependency>
> <dependency> <groupId>org.apache.uima</groupId>
> <artifactId>uimaj-examples</artifactId> <version>2.8.1</version>
> </dependency> </dependencies> </project>
>