Posted to user@uima.apache.org by Diego Buoro <jk...@gmail.com> on 2015/05/26 19:49:57 UTC

Marking consecutive tokens with RUTA

Hello guys, how are you doing?

I would like to know, once I have called RUTA from a Java project, how can I
mark consecutive tokens as a "Problem" (the name of my annotation, in this
case)?

Thanks in advance!

Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Hi,

that should of course be possible :-)

How do you call the script from Java? Have you considered the type
priorities when the CAS is created?

Best,

Peter

PS: your examples are missing semicolons. I assume that there are some in your
script?


On 26.05.2015 at 20:30, Diego Buoro wrote:
> I think I wasn't clear enough, and I should be more specific.
>
> I have a type system in which all words have been annotated as Tokens. I am
> calling a RUTA script from a Java class, and that script has only one rule:
> Token Token {-> Problem}
>
> However, with this script, no Problems are created. When I try
> Token {-> Problem}
>
> I get one problem for each Token, which is what I expected. Why can't I
> create annotations using rules with more than one word?
>
> Thanks

Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.

... I forgot to mention that I found one problem that will appear if you use the
scripts with the upcoming Ruta release. The package declaration does not
match the package structure the scripts are located in. For Ruta 2.2.1,
this is not really problematic, but it will cause errors with Ruta 2.3.0.
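
A quick way to spot such a mismatch is to compare the script's PACKAGE declaration with its folder below the script root. The following is a minimal sketch in plain Java and not part of UIMA Ruta: the class RutaPackageCheck, its methods and the paths in the usage note are hypothetical. It only assumes that a script declares its package with a line such as "PACKAGE cogroo.ruta;".

```java
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Ruta API: checks a script's PACKAGE line
// against the folder the script lives in.
class RutaPackageCheck {
    // Matches a line like "PACKAGE some.pkg;" and captures the package name.
    private static final Pattern PACKAGE_DECL =
            Pattern.compile("^\\s*PACKAGE\\s+([A-Za-z_][\\w.]*)\\s*;", Pattern.MULTILINE);

    // Returns the declared package, or "" if the script has no PACKAGE line.
    static String declaredPackage(String script) {
        Matcher m = PACKAGE_DECL.matcher(script);
        return m.find() ? m.group(1) : "";
    }

    // Derives the package implied by the script's folder below the script root.
    static String expectedPackage(Path scriptRoot, Path scriptFile) {
        Path parent = scriptRoot.relativize(scriptFile).getParent();
        if (parent == null) {
            return ""; // script sits directly in the root folder
        }
        StringBuilder sb = new StringBuilder();
        for (Path part : parent) {
            if (sb.length() > 0) {
                sb.append('.');
            }
            sb.append(part.toString());
        }
        return sb.toString();
    }

    // True if the declaration and the folder structure agree.
    static boolean packageMatchesLocation(String script, Path scriptRoot, Path scriptFile) {
        return declaredPackage(script).equals(expectedPackage(scriptRoot, scriptFile));
    }
}
```

For instance, a script that declares "PACKAGE cogroo.ruta;" would be expected under cogroo/ruta/ below the script root; if it sits directly in the root folder, packageMatchesLocation returns false.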

Best,

Peter



Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.

Hi,

I looked at the code, but I haven't found anything that could cause the
problem. The type priorities should be fine when using the BasicEngine.xml
of the Maven dependency.

Normally, I would assume that it's caused by visibility, e.g., an
annotation starts with something invisible and thus is not matched. Or
the filtering settings were changed and the rule expects a token but
finds a SPACE. I tested Main.ruta and observed no problems.
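
To illustrate the filtering point, here is a toy model in plain Java. It is not Ruta's matching algorithm, only a sketch under the assumption that spans of a filtered kind are skipped between two rule elements while unfiltered ones break the sequence. The class, the Kind enum and the way the sample sentence is split into tokens are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model (hypothetical, not Ruta code) of a "Token Token" sequence rule.
class ConsecutiveTokenDemo {
    enum Kind { TOKEN, SPACE }

    // A typed piece of text, standing in for an annotation.
    static final class Span {
        final Kind kind;
        final String text;

        Span(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Counts positions where a TOKEN is followed by another TOKEN.
    // Spans whose kind is in 'filtered' are skipped between the two
    // elements; any other non-TOKEN span breaks the sequence.
    static int countTokenPairs(List<Span> spans, Set<Kind> filtered) {
        int matches = 0;
        for (int i = 0; i < spans.size(); i++) {
            if (spans.get(i).kind != Kind.TOKEN) {
                continue;
            }
            int j = i + 1;
            while (j < spans.size() && filtered.contains(spans.get(j).kind)) {
                j++;
            }
            if (j < spans.size() && spans.get(j).kind == Kind.TOKEN) {
                matches++;
            }
        }
        return matches;
    }

    // Assumed tokenization of a short sentence: seven tokens, four spaces.
    static List<Span> sample() {
        String[] parts = {"Refiro", "-", "me", " ", "à", " ", "trabalho", " ", "remunerado", "."};
        List<Span> spans = new ArrayList<>();
        for (String part : parts) {
            spans.add(new Span(part.equals(" ") ? Kind.SPACE : Kind.TOKEN, part));
        }
        return spans;
    }
}
```

In this model, countTokenPairs(sample(), EnumSet.of(Kind.SPACE)) finds six pairs, while an empty filter set leaves only the three pairs that have no space between them. A CAS in which the gaps between tokens are not covered by a filtered type could behave like the second case.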

Can you provide a minimal example where I can reproduce the problem?

Best,

Peter



Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Hi,

Sorry, I haven't found the time to take a closer look yet, but I will
at the weekend.

Best,

Peter

On 27.05.2015 at 19:22, Diego Buoro wrote:
> Hi Peter!
> We call the script with the following lines:
>
> URL url = Resources.getResource("Main.ruta");
> String text = Resources.toString(url, Charsets.UTF_8);
> AnalysisEngineDescription aeDes = Ruta.createAnalysisEngineDescription(text, tsd);
> this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>
> CAS cas = ae.newCAS();
> converter.populateCas(sentence.getTextSentence(), cas);
> ae.process(cas);
>
> The populateCas method is responsible for translating our annotations into
> RUTA annotations, but it doesn't set any type priority explicitly.
> We don't know much about type priorities; the RUTA references we found say
> very little about them. Are they necessary for doing what we need?
>
> The file that contains the above lines is available here:
> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
> The processCAS method is available here:
> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
> The script we are calling is available here:
> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>
> PS: Yes, we remembered the semicolons.
>
> Thanks for the help :)

Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

After the changes from the last mail, the jar contains everything when
- I change the phase of the ruta-maven-plugin to process-resources
- I set the outputDirectory for the type system and analysis engine
directly to target/classes

Best,

Peter


Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Some first observations...

You should:
- remove all paths configs in your pom for the ruta-maven-plugin:
scriptPaths, descriptorPaths, resourcePaths
- remove the TypeSystem.xml file in the script folder
- remove build-helper-maven-plugin

Now the generated descriptors should be correctly configured. However,
they are not part of the jar, but only of the sources jar. There is
something special about your maven build process which I haven't found
yet...

Best,

Peter


Re: Marking consecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hello Peter,

We would like to know how to configure the "import location" line below
in MainTypeSystem.xml (which is generated by the Ruta Maven plugin),
since at present it references a file two levels above the desired one.
We searched in pom.xml, but had no success.

<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
    <name>MainTypeSystem</name>
    <imports>
        <import location="../../../../../../descriptor/TypeSystem.xml"/>
    </imports>
...

No problem Peter, take your time. Thanks to you, we made good progress :)

All Best,

Diego


Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Hi,

I haven't checked all files in your project, but (without looking at it
right now) there are several TypeSystem.xml files.
If they are the same files (content), then I would recommend keeping
only one of these files, e.g., in the descriptor folder. If these files
are different type systems, then I would recommend giving them
different names.

You do not need to change anything if all works as expected.

I am a bit busy right now, so I can't give more detailed advice
before the end of this week.

Best,

Peter


Re: Marking consecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi again, we managed to fix our problem with the features. Our main worry
at the moment is that the problems you mentioned about having duplicated
type systems all over the place are still there. Can you recommend some
reference that you think would be appropriate for fixing this in our
project? If not, can you be a bit clearer on what you suggest we should do?

All Best,

Diego


Re: Marking consecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hello, Peter, thanks for the suggestion as always :D

Unfortunately, our problems weren't solved. We tried removing all paths
from the configuration and we tried setting those paths as in the example
project. In both cases we were eventually able to get the program running,
but we were unable to import features from Ruta annotations into Java. We
don't know if that happened because of some problem in the configuration
itself or in the code (which seemed to be working before these
configuration problems started). Do you have any ideas?

All Best,

Diego

Re: Marking consecutive tokens with RUTA

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.

Another thought about the ruta-maven-plugin configuration:

You may not want to use descriptor paths at all. Right now, these
resolve to absolute paths pointing to folders on your build machine,
meaning your descriptors contain these absolute paths. This can cause
problems in different environments if they are not reconfigured or if
the classpath-loading fallback mechanism does not work.
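
The difference between the two lookup strategies can be sketched in plain Java. This is not the UIMA or Ruta resolution code: DescriptorLocator and its lookup order (configured folders first, classpath second) are assumptions made purely for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not a UIMA API: resolves a descriptor by name.
class DescriptorLocator {
    // Tries each configured folder first, then falls back to the classpath.
    // Absolute folders only exist on the machine they were configured on,
    // whereas the classpath lookup travels with the packaged jar.
    static Optional<URL> locate(String relativeName, List<Path> descriptorDirs, ClassLoader loader) {
        for (Path dir : descriptorDirs) {
            Path candidate = dir.resolve(relativeName);
            if (Files.isRegularFile(candidate)) {
                try {
                    return Optional.of(candidate.toUri().toURL());
                } catch (MalformedURLException e) {
                    // Unusable path; keep looking in the remaining folders.
                }
            }
        }
        // Fallback: look the resource up on the classpath.
        return Optional.ofNullable(loader.getResource(relativeName));
    }
}
```

With an empty folder list, a descriptor is found only if it was copied to the classpath (for example to target/classes), which is the setup suggested in the next paragraph.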

I would suggest removing all "paths" from the configuration:
scriptPaths, descriptorPaths and resourcePaths. These folders (paths)
then have to become source folders of your Maven project so that their
content is on the classpath of the project (copied to target/classes).
Normally, you wouldn't even need the build-helper then. See
example-projects/ruta-maven-example for an example...

Best,

Peter




On 18.06.2015 at 20:24, Peter Klügl wrote:
> I forgot to mention the cause of the problem:
> The build process wasn't able to find the TypeSystem.xml type system
> in order to import it into the generated MainTypeSystem.xml type system.
>
> Best,
>
> Peter
>
> On 18.06.2015 at 20:10, Peter Klügl wrote:
>> Hi,
>>
>> the Ruta descriptors are built twice in your project. Once in the 
>> normal phase and once during testing (don't know why).
>> The classpath of the two builds is different. In the second one, the 
>> imported type system can be found in the classpath, but not in the 
>> first one. In a normal mvn clean install, the descriptors are not yet 
>> built when they should be added to the jar. That's the reason why it 
>> works in Eclipse (no clean?), but not for a mvn clean install.
>>
>> The simplest solution is to extend the descriptor paths so that the 
>> imported type system is found without the classpath:
>>
>> <descriptorPaths>
>>   <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
>>   <descriptorPath>${basedir}/descriptor</descriptorPath>
>> </descriptorPaths>
>>
>> btw, there are several duplicated files in the project, which 
>> potentially hide problems in the build process.
>>
>> Best,
>>
>> Peter
>>
>> On 18.06.2015 at 17:30, Diego Buoro wrote:
>>> Hi Peter, thanks for the support.
>>>
>>> We are now using Java 7, but we are still facing problems. In
>>> Eclipse, we've manually set the path where the descriptor files are
>>> located (target/generated-sources/ruta/descriptor), and therefore
>>> it's working. However, when we run mvn clean install, our
>>> descriptor files are generated in
>>> cogroo-ruta/target/generated-sources/ruta/descriptor/cogroo/ruta but
>>> they aren't being copied to the .jar file. The errors are in this log
>>> file. Do you have any idea why they are happening?
>>>
>>> Here is the link to the repository: 
>>> https://github.com/Fichberg/cogroo4/tree/labXP215_Will
>>>
>>> All Best,
>>>
>>> Diego
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2015-06-17 16:20 GMT-03:00 Peter Klügl <peter.kluegl@averbis.com>:
>>>
>>>     Hi,
>>>
>>>     UIMA Ruta 2.3.0 and also the maven plugin require Java 7. Thus,
>>>     the maven build process has to use the correct Java version. Just
>>>     wanted to mention it because I had this problem right away.
>>>
>>>     The descriptors are not built because the plugin does not find any
>>>     ruta files. The maven plugin is specified in one project while the
>>>     ruta files are located in a different project. The problem is that
>>>     the ruta maven plugin only collects ruta files within the basedir
>>>     of the project -> no files built...
>>>
>>>     In the next release, the maven plugin will get another parameter
>>>     for specifying the input files.
>>>
>>>     With UIMA Ruta 2.3.0, there are two options: Either you put the
>>>     ruta files in the project with the ruta maven plugin, or you add
>>>     the ruta maven plugin to the project pom with the ruta files.
>>>
>>>     Best,
>>>
>>>     Peter
>>>
>>>
>>>     On 17.06.2015 at 18:30, Diego Buoro wrote:
>>>
>>>         Hi, Peter! We are attempting to create the descriptors based on
>>>         Ruta 2.3, but we're out of luck. We've added the lines from the
>>>         link you gave us to the pom.xml file and corrected the directory
>>>         paths to suit our project. However, when we try to run Maven with
>>>         Ruta's "generate" goal, no files are generated in the folders we
>>>         set. Is the goal supposed to generate the files and leave them in
>>>         the folder or does it do something else?
>>>
>>>         Here is the link to our altered pom.xml. The plugin section is at
>>>         the end of the file:
>>> https://raw.githubusercontent.com/Fichberg/cogroo4/labXP215_Will/cogroo-gc/pom.xml 
>>>
>>>
>>>         Thanks for the help so far. :D
>>>
>>>         2015-06-14 9:40 GMT-03:00 Peter Klügl <peter.kluegl@averbis.com>:
>>>
>>>             Hi,
>>>
>>>             the descriptors are always created at compile time.
>>>
>>>             In Ruta 2.2.1, yes, you need to create the descriptors in the
>>>             UIMA Ruta Workbench and then copy them or make them available
>>>             in some other way. This is especially necessary if you declare
>>>             additional types (type system descriptor changes) or add some
>>>             subscript (analysis engine descriptor changes).
>>>
>>>             In Ruta 2.3.0, which was just released, there is a Maven plugin
>>>             for building the descriptors. Take a look at:
>>> http://uima.apache.org/d/ruta-current/tools.ruta.book.html#d5e3271
>>>             This means that you do not need the UIMA Ruta Workbench
>>>             projects anymore, but you can use its development support and
>>>             descriptor building in normal Maven projects.
>>>
>>>             Best,
>>>
>>>             Peter
>>>
>>>
>>>             On 12.06.2015 at 21:38, Diego Buoro wrote:
>>>
>>>                 Hello Peter
>>>
>>>                 We tried your suggestions and they worked like a charm,
>>>                 thanks :D
>>>                 However, we are facing another problem: it seems that our
>>>                 application isn't creating the mainTypesystem and
>>>                 mainEngine files when we launch it. We don't know whether
>>>                 or not that is the default behavior, but for now we have
>>>                 to create these files in a separate project and then copy
>>>                 them to the application whenever we change the script,
>>>                 which is a bad solution.
>>>                 Do you have any suggestions?
>>>
>>>                 All Best,
>>>
>>>                 Diego
>>>
>>>                 2015-06-12 9:19 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>
>>>                     Hi Peter, Armin
>>>
>>>                     Thanks for the observations made, I hope we can finally
>>>                     get it working here. We will try the changes in the
>>>                     next few days and then give you feedback.
>>>
>>>                     All Best,
>>>
>>>                     Diego
>>>
>>>
>>>
>>>                     2015-06-03 14:14 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>
>>>                         Hi Peter, the example we used is the small sentence
>>>                         inside a string at the end of UIMAChecker.java:
>>>                         "Refiro-me à trabalho remunerado.".
>>>                         Based on the Main.ruta we sent you, we expected the
>>>                         output to contain 7 "PROBLEM" annotations. This
>>>                         part is working.
>>>                         The problem is when we change the last line of
>>>                         Main.ruta from "cgToken{->PROBLEM};" to
>>>                         "cgToken cgToken{->PROBLEM};". In this case we
>>>                         expected 6 "PROBLEM" annotations: the same ones we
>>>                         had in the first example, except for the first
>>>                         one. That's what happens when you run the script
>>>                         in a simple Ruta project, but when we run it in
>>>                         the Java application we get 0 "PROBLEM"
>>>                         annotations.
>>>                         We think this difference is happening because in
>>>                         the Ruta project we don't use a simple text as
>>>                         input. Instead, we feed it a preprocessed xmi
>>>                         file. In the Java application, on the other hand,
>>>                         we do the processing ourselves via the processCas
>>>                         method. It's possible that the processCas method
>>>                         is creating tokens in a way that prevents us from
>>>                         detecting when one is next to the other in the
>>>                         Ruta script.
>>>                         We are sending you the xmi file to use as an
>>>                         example for a simple Ruta project. If there are
>>>                         any other examples you'd like us to send you, just
>>>                         say the word :D
>>>
>>>                         Best,
>>>
>>>                         Diego
>>>
>>>                         2015-06-01 11:15 GMT-03:00 Diego Buoro
>>>                         <jklports@gmail.com 
>>> <ma...@gmail.com>>:
>>>
>>>                           Sorry, please disregard my last answer. The
>>>                           idea wasn't to use the xmi;
>>>                           we are still working on a minimal example
>>>                           to provide to you.
>>>                           We will send it to you in the next few days.
>>>
>>>                             2015-06-01 10:37 GMT-03:00 Diego Buoro
>>>                             <jklports@gmail.com
>>> <ma...@gmail.com>>:
>>>
>>>                               Hi Peter, how are you doing?
>>>
>>>                                 We were trying to run using files
>>>                                 such as Crase01.xmi and
>>>                                 rule_xml_001.xmi.
>>>                                 Our goal is to run those two
>>>                                 simpler ones first, and then run
>>>                                 with Crase.xmi.
>>>
>>>                                 About the package declaration, I still
>>>                                 need to check which Ruta version
>>>                                 we are using.
>>>                                 I will check this soon.
>>>
>>>                                 All Best,
>>>
>>>                                 Diego
>>>
>>>
>>>
>>>
>>>
>>>                                 2015-05-30 0:45 GMT-03:00 Diego Buoro
>>>                                 <jklports@gmail.com
>>> <ma...@gmail.com>>:
>>>
>>>                                   Hi Peter!
>>>
>>>                                     No problem, I appreciate your support.
>>>
>>>                                     All Best,
>>>
>>>                                     Diego
>>>
>>>                                     2015-05-27 14:22 GMT-03:00 Diego
>>>                                     Buoro <jklports@gmail.com
>>> <ma...@gmail.com>>:
>>>
>>>                                       Hi Peter!
>>>
>>>                                         We call the script with the
>>>                                         following lines:
>>>
>>>                                         URL url = Resources.getResource("Main.ruta");
>>>                                         String text = Resources.toString(url, Charsets.UTF_8);
>>>                                         AnalysisEngineDescription aeDes =
>>>                                             Ruta.createAnalysisEngineDescription(text, tsd);
>>>                                         this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>>>
>>>                                         CAS cas = ae.newCAS();
>>>                                         converter.populateCas(sentence.getTextSentence(), cas);
>>>                                         ae.process(cas);
>>>
>>>                                         The populateCas method is
>>>                                         responsible for translating our
>>>                                         annotations into RUTA
>>>                                         annotations, but it doesn't set
>>>                                         any type priority explicitly.
>>>                                         We don't know much about type
>>>                                         priorities; the RUTA references
>>>                                         we found say very little about
>>>                                         them. Are they necessary for
>>>                                         what we need?
>>>
>>>                                         The file that contains the
>>>                                         above lines is available here:
>>>
>>>
>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java 
>>>
>>>                                         The processCas method is
>>>                                         available here:
>>>
>>>
>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java 
>>>
>>>                                         The script we are calling is
>>>                                         available here:
>>>
>>>
>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta 
>>>
>>>
>>>                                         PS: Yes, we remembered the
>>>                                         semicolons.
>>>
>>>                                         Thanks for the help :)
>>>
>>>
>>>
>>>                                         2015-05-26 15:30 GMT-03:00
>>>                                         Diego Buoro
>>>                                         <jklports@gmail.com
>>> <ma...@gmail.com>>:
>>>
>>>                                           I think I wasn't clear
>>>                                           enough, and I should be more
>>>                                           specific.
>>>
>>>                                             I have a type system in
>>>                                             which all words have been
>>>                                             annotated as Tokens. I am
>>>                                             calling a RUTA script from
>>>                                             a Java class, and that
>>>                                             script has only one rule:
>>>                                             Token Token {-> Problem}
>>>
>>>                                             However, with this script,
>>>                                             no Problems are created.
>>>                                             When I try
>>>                                             Token {-> Problem}
>>>
>>>                                             I get one Problem for each
>>>                                             Token, which is what I
>>>                                             expected. Why can't I
>>>                                             create annotations using
>>>                                             rules with more than one
>>>                                             word?
>>>
>>>                                             Thanks
>>>
>>>
>>>
>>>
>>>                                             2015-05-26 14:49 GMT-03:00
>>>                                             Diego Buoro
>>> <jklports@gmail.com
>>> <ma...@gmail.com>>:
>>>
>>>                                               Hello guys, how are you
>>>                                               doing?
>>>
>>>                                                 I would like to know,
>>>                                                 once I have called
>>>                                                 RUTA from a Java
>>>                                                 project, how I can
>>>                                                 mark consecutive tokens
>>>                                                 as a "Problem" (the name
>>>                                                 of my annotation, in
>>>                                                 this case).
>>>
>>>                                                 Thanks in advance!
>>>
>>>
>>>
>>>
>>
>>
>

Re: Marking cosnecutive tokens with RUTA

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.

I forgot to mention the cause of the problem:
The build process wasn't able to find the TypeSystem.xml type system in 
order to import it in the generated MainTypeSystem.xml type system.

Best,

Peter
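
A missing type system import of this kind can be checked from Java before the engine is created. The following is only a minimal sketch: the class ClasspathDescriptorCheck and its method are hypothetical names, not part of UIMA or UIMA Ruta, and the default resource name (TypeSystem.xml at the classpath root) is an assumption about the project layout.

```java
import java.net.URL;

// Hypothetical helper for illustration; not part of UIMA or UIMA Ruta.
final class ClasspathDescriptorCheck {

    // Returns true if the named resource can be located by the current
    // thread's context class loader (falling back to the system loader).
    static boolean isOnClasspath(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        URL location = loader.getResource(resourceName);
        return location != null;
    }

    public static void main(String[] args) {
        // Assumed default name; pass the real resource path as an argument.
        String name = args.length > 0 ? args[0] : "TypeSystem.xml";
        String state = isOnClasspath(name) ? "found" : "NOT found";
        System.out.println(name + " " + state + " on the classpath");
    }
}
```

Running this with the classpath of each build phase would show whether the two phases really see different resources.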

On 18.06.2015 at 20:10, Peter Klügl wrote:
> Hi,
>
> the Ruta descriptors are built twice in your project. Once in the 
> normal phase and once during testing (don't know why).
> The classpath of the two builds is different. In the second one, the 
> imported type system can be found in the classpath, but not in the 
> first one. In a normal mvn clean install, the descriptors are not yet 
> built when they should be added to the jar. That's the reason why it 
> works in Eclipse (no clean?), but not for a mvn clean install.
>
> The simplest solution is to extend the descriptor paths so that the 
> imported type system is found without the classpath:
>
> <descriptorPaths>
> <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath> 
>
> <descriptorPath>${basedir}/descriptor</descriptorPath>
> </descriptorPaths>
>
> btw, there are several duplicated files in the project, which 
> potentially hide problems in the build process.
>
> Best,
>
> Peter

Re: Marking cosnecutive tokens with RUTA

Posted by Peter Klügl <pk...@uni-wuerzburg.de>.

Hi,

the Ruta descriptors are built twice in your project: once in the normal 
phase and once during testing (I don't know why).
The classpath differs between the two builds. In the second one, the 
imported type system can be found on the classpath, but not in the first 
one. In a normal mvn clean install, the descriptors are not yet built 
when they should be added to the jar. That's the reason why it works in 
Eclipse (no clean?), but not for a mvn clean install.
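
Whether the descriptors actually ended up in the jar can be verified by listing its entries. This is a minimal sketch using only the JDK; the class name JarDescriptorCheck is hypothetical, and the jar path and entry prefix (for example cogroo/ruta/) are assumptions that depend on your build output.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; uses only the standard JDK jar API.
final class JarDescriptorCheck {

    // Collects the names of all .xml entries in the jar whose path starts
    // with the given prefix, e.g. "cogroo/ruta/".
    static List<String> xmlEntriesUnder(String jarPath, String prefix) throws IOException {
        List<String> found = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(prefix) && name.endsWith(".xml")) {
                    found.add(name);
                }
            }
        }
        return found;
    }
}
```

An empty result after mvn clean install, but a non-empty one after a second build without clean, would match the ordering problem described above.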

The simplest solution is to extend the descriptor paths so that the 
imported type system is found without the classpath:

<descriptorPaths>
  <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
  <descriptorPath>${basedir}/descriptor</descriptorPath>
</descriptorPaths>
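
The idea behind the extra descriptor path can be illustrated in plain Java: an import is searched in each configured directory in turn, so adding the directory that holds TypeSystem.xml makes it resolvable. This is only a sketch of that lookup order under my own assumptions, not the plugin's actual implementation; DescriptorPathCheck and findDescriptor are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not the Ruta Maven plugin's code.
final class DescriptorPathCheck {

    // Returns the first existing file named relativeName below one of the
    // given directories, checked in list order, or empty if none has it.
    static Optional<Path> findDescriptor(List<Path> descriptorPaths, String relativeName) {
        for (Path dir : descriptorPaths) {
            Path candidate = dir.resolve(relativeName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed layout, mirroring the two descriptorPath entries above.
        List<Path> paths = Arrays.asList(
                Paths.get("target/generated-sources/ruta/descriptor"),
                Paths.get("descriptor"));
        Optional<Path> hit = findDescriptor(paths, "TypeSystem.xml");
        System.out.println(hit.isPresent()
                ? "Found " + hit.get()
                : "TypeSystem.xml not found on the descriptor paths");
    }
}
```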

By the way, there are several duplicated files in the project, which 
could hide problems in the build process.

Best,

Peter

On 18.06.2015 at 17:30, Diego Buoro wrote:
> Hi Peter, thanks for the support.
>
> We are now using Java 7, but we are still facing problems. In Eclipse, 
> we've manually set the path where the descriptor files are located 
> (target/generated-sources/ruta/descriptor), and therefore it's 
> working. However, when we run mvn clean install, our descriptor 
> files are generated in 
> cogroo-ruta/target/generated-sources/ruta/descriptor/cogroo/ruta but 
> they aren't being copied to the .jar file. The errors are in this log 
> file; do you have any idea why they are happening?
>
> Here is the link to the repository: 
> https://github.com/Fichberg/cogroo4/tree/labXP215_Will
>
> All Best,
>
> Diego
>
>
>
>
>
>
> 2015-06-17 16:20 GMT-03:00 Peter Klügl <peter.kluegl@averbis.com 
> <ma...@averbis.com>>:
>
>     Hi,
>
>     UIMA Ruta 2.3.0 and also the maven plugin require Java 7. Thus,
>     the maven build process has to use the correct Java version. Just
>     wanted to mention it because I had this problem right away.
>
>     The descriptors are not built because the plugin does not find any
>     ruta files. The maven plugin is specified in one project while the
>     ruta files are located in a different project. The problem is that
>     the ruta maven plugin only collects ruta files within the basedir
>     of the project -> no files built...
>
>     In the next release, the maven plugin will get another parameter
>     for specifying the input files.
>
>     With UIMA Ruta 2.3.0, there are two options: Either you put the
>     ruta files in the project with the ruta maven plugin, or you add
>     the ruta maven plugin to the project pom with the ruta files.
>
>     Best,
>
>     Peter
>
>
>     On 17.06.2015 at 18:30, Diego Buoro wrote:
>
>         Hi, Peter! We are attempting to create the descriptors with
>         Ruta 2.3, but we're out of luck. We've added the lines from
>         the link you gave us to the pom.xml file and corrected the
>         directory paths to suit our project.
>         However, when we run Maven with Ruta's "generate" goal, no
>         files are generated in the folders we set. Is the goal
>         supposed to generate the files and leave them in the folder,
>         or does it do something else?
>
>         Here is the link to our altered pom.xml. The plugin section is
>         at the end
>         of the file:
>         https://raw.githubusercontent.com/Fichberg/cogroo4/labXP215_Will/cogroo-gc/pom.xml
>
>         Thanks for the help so far. :D
>
>         2015-06-14 9:40 GMT-03:00 Peter Klügl
>         <peter.kluegl@averbis.com <ma...@averbis.com>>:
>
>             Hi,
>
>             the descriptors are always created at compile time.
>
>             In Ruta 2.2.1, yes, you need to create the descriptors in
>             the UIMA Ruta
>             Workbench and then copy them or make them available in
>             some other way. This
>             is especially necessary if you declare additional types
>             (type system
>             descriptor changes) or add some subscript (analysis engine
>             descriptor
>             changes).
>
>             In Ruta 2.3.0 which was just released, there is a maven
>             plugin for
>             building the descriptors. Take a look at:
>             http://uima.apache.org/d/ruta-current/tools.ruta.book.html#d5e3271
>             This means that you do not need the UIMA Ruta Workbench
>             projects anymore,
>             but you can use its development support and descriptor
>             building in normal
>             maven projects.
>
>             Best,
>
>             Peter
>
>
>             On 12.06.2015 at 21:38, Diego Buoro wrote:
>
>                 Hello Peter
>
>                 We tried your suggestions and they worked like a
>                 charm, thanks :D
>                 However, we are facing another problem: it seems that
>                 our application isn't creating the mainTypesystem and
>                 mainEngine files when we launch it. We don't know
>                 whether that is the default behavior, but for now we
>                 have to create these files in a separate project and
>                 then copy them to the application whenever we change
>                 the script, which is a bad solution.
>                 Do you have any suggestions?
>
>                 All Best,
>
>                 Diego
>
>                 2015-06-12 9:19 GMT-03:00 Diego Buoro
>                 <jklports@gmail.com <ma...@gmail.com>>:
>
>                   Hi Peter, Armin
>
>                     Thanks for your observations; I hope we can
>                     finally get this working.
>                     We will try the changes in the next few days and
>                     then give you feedback.
>
>                     All Best,
>
>                     Diego
>
>
>
>                     2015-06-03 14:14 GMT-03:00 Diego Buoro
>                     <jklports@gmail.com <ma...@gmail.com>>:
>
>                       Hi Peter, the example we used is the short
>                       sentence in a string at the end of
>                       UIMAChecker.java: "Refiro-me à trabalho
>                       remunerado.".
>                       Based on the Main.ruta we sent you, we expected
>                       the output to contain 7 "PROBLEM" annotations.
>                       This part is working.
>                       The problem appears when we change the last line
>                       of Main.ruta from "cgToken{->PROBLEM};" to
>                       "cgToken cgToken{->PROBLEM};". In this case we
>                       expected 6 "PROBLEM" annotations: the same ones
>                       we had in the first example, except for the
>                       first one. That's what happens when you run the
>                       script in a simple Ruta project, but when we run
>                       it in the Java application we get 0 "PROBLEM"
>                       annotations.
>                       We think this difference arises because in the
>                       Ruta project we don't use plain text as input.
>                       Instead, we feed it a preprocessed xmi file. In
>                       the Java application, on the other hand, we do
>                       the processing ourselves via the processCas
>                       method. It's possible that the processCas method
>                       creates tokens in a way that prevents the Ruta
>                       script from detecting when one is next to the
>                       other.
>                       We are sending you the xmi file to use as an
>                       example for a simple Ruta project. If there are
>                       any other examples you'd like us to send you,
>                       just say the word :D
>
>                         Best,
>
>                         Diego
>
>                         2015-06-01 11:15 GMT-03:00 Diego Buoro
>                         <jklports@gmail.com <ma...@gmail.com>>:
>
>                           Sorry,please disregard my last answer. The
>                         idea wasn't to use the xmi,
>
>                             we are still thinking in a minimal example
>                             to provide to you.
>                             We will send you in the next few days.
>
>                             2015-06-01 10:37 GMT-03:00 Diego Buoro
>                             <jklports@gmail.com
>                             <ma...@gmail.com>>:
>
>                               Hi Peter,how are you doing?
>
>                                 We were trying to run using the files
>                                 such as Crase01.xmi and
>                                 rule_xml_001.xmi.
>                                 Our goal is trying to run those two
>                                 more simpler first,and then run
>                                 with Crase.xmi.
>
>                                 About the package declaration, i still
>                                 need to check what ruta version
>                                 is.
>                                 I will be checking this soon.
>
>                                 All Best,
>
>                                 Diego
>
>
>
>
>
>                                 2015-05-30 0:45 GMT-03:00 Diego Buoro
>                                 <jklports@gmail.com
>                                 <ma...@gmail.com>>:
>
>                                   Hi Peter!
>
>                                     No problem, I appreciate your support.
>
>                                     All Best,
>
>                                     Diego
>
>                                     2015-05-27 14:22 GMT-03:00 Diego
>                                     Buoro <jklports@gmail.com
>                                     <ma...@gmail.com>>:
>
>                                       Hi Peter!
>
>                                         We call the script with the
>                                         following lines:
>
>                                            URL url =
>                                         Resources.getResource("Main.ruta");
>                                         String text =
>                                         Resources.toString(url,
>                                         Charsets.UTF_8);
>                                            AnalysisEngineDescription
>                                         aeDes =
>                                         Ruta.createAnalysisEngineDescription(text,
>                                         tsd);
>                                         this.ae <http://this.ae> =
>                                         UIMAFramework.produceAnalysisEngine(aeDes);
>
>                                         CAS cas = ae.newCAS();
>                                         converter.populateCas(sentence.getTextSentence(),
>                                         cas);
>                                            ae.process(cas);
>
>                                         The populateCAS method is
>                                         responsible for translating our
>                                         annotations
>                                         into RUTA annotations, but it
>                                         doesn't set any type priority
>                                         explicitly.
>                                         We don't know much about type
>                                         priorities, the RUTA references we
>                                         found say very little about
>                                         that.Are they necessary for
>                                         doing what
>                                         we need?
>
>                                         The file that contains the
>                                         above lines is available here:
>
>
>                                         https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
>                                         The processCAS mehtod is
>                                         available here:
>
>
>                                         https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
>                                         The script we are calling is
>                                         available here:
>
>
>                                         https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>
>                                         PS:Yes, We remembered the
>                                         semicolons.
>
>                                         Thanks for the help :)
>
>
>
>                                         2015-05-26 15:30 GMT-03:00
>                                         Diego Buoro
>                                         <jklports@gmail.com
>                                         <ma...@gmail.com>>:
>
>                                           I think i wasn't clear
>                                         enough, and i should be more
>                                         specific.
>
>                                             I have a type system in
>                                             which all words have been
>                                             annotated as
>                                             Tokens. I am calling a
>                                             RUTA script from a java
>                                             class, and that
>                                             script has
>                                             only one rule:
>                                             Token Token {-> Problem}
>
>                                             However, with this script,
>                                             no Problems are created.
>                                             When I try
>                                             Token {-> Problem}
>
>                                             I get one problem for each
>                                             Token, which is what I
>                                             expected. Why
>                                             can't I create annotations
>                                             using rules with more than
>                                             one word?
>
>                                             Thanks
>
>
>
>
>                                             2015-05-26 14:49 GMT-03:00
>                                             Diego Buoro
>                                             <jklports@gmail.com
>                                             <ma...@gmail.com>>:
>
>                                               Hello guys,how are you
>                                             doing?
>
>                                                 I would like to know
>                                                 once i have called
>                                                 RUTA from a Java project,
>                                                 how can i mark
>                                                 consecutive tokens as
>                                                 a "Problem" (the name
>                                                 of my
>                                                 annotation, in this case)?
>
>                                                 Thanks in advice!
>
>
>
>

Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi Peter, thanks for the support.

We are now using Java 7, but we are still facing problems. In Eclipse,
we've manually set the path where the descriptor files are located
(target/generated-sources/ruta/descriptor), and therefore it's working.
However, when we run mvn clean install, our descriptor files are generated
in cogroo-ruta/target/generated-sources/ruta/descriptor/cogroo/ruta, but
they aren't being copied into the .jar file. The errors are in this log
file. Do you have any idea why they are happening?
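
A quick way to tell whether the descriptors made it into the jar is to ask
the class loader for them at runtime. The following is a minimal diagnostic
sketch using only the JDK. The resource names cogroo/ruta/MainEngine.xml and
cogroo/ruta/MainTypeSystem.xml are assumptions derived from the output folder
mentioned above, and DescriptorClasspathCheck is a hypothetical helper, not
part of UIMA Ruta.

```java
import java.net.URL;

/** Diagnostic sketch: reports whether a descriptor is visible on the classpath. */
class DescriptorClasspathCheck {

    /** Returns true if the given classpath-relative resource can be located. */
    static boolean isOnClasspath(String resourcePath) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader that loaded this class.
            loader = DescriptorClasspathCheck.class.getClassLoader();
        }
        URL url = loader.getResource(resourcePath);
        return url != null;
    }

    public static void main(String[] args) {
        // Hypothetical descriptor locations; adjust them to the names the build really produces.
        String[] candidates = {
            "cogroo/ruta/MainEngine.xml",
            "cogroo/ruta/MainTypeSystem.xml"
        };
        for (String path : candidates) {
            System.out.println(path + " -> " + (isOnClasspath(path) ? "found" : "MISSING"));
        }
    }
}
```

Running it with the built jar on the classpath prints one line per candidate.
A MISSING entry would suggest that the files were generated but not packaged.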

Here is the link to the repository:
https://github.com/Fichberg/cogroo4/tree/labXP215_Will

All Best,

Diego






2015-06-17 16:20 GMT-03:00 Peter Klügl <pe...@averbis.com>:

> Hi,
>
> UIMA Ruta 2.3.0 and also the maven plugin require Java 7. Thus, the maven
> build process has to use the correct Java version. Just wanted to mention
> it because I had this problem right away.
>
> The descriptors are not built because the plugin does not find any Ruta
> files. The Maven plugin is configured in one project while the Ruta files
> are located in a different project. The Ruta Maven plugin only collects
> Ruta files within the basedir of the project it is configured in, so no
> descriptors are built.
>
> In the next release, the maven plugin will get another parameter for
> specifying the input files.
>
> With UIMA Ruta 2.3.0, there are two options: either you move the Ruta
> files into the project that has the Ruta Maven plugin, or you add the Ruta
> Maven plugin to the pom of the project that contains the Ruta files.
>
> Best,
>
> Peter

Re: Marking cosnecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Hi,

UIMA Ruta 2.3.0 and also the maven plugin require Java 7. Thus, the 
maven build process has to use the correct Java version. Just wanted to 
mention it because I had this problem right away.

The descriptors are not built because the plugin does not find any Ruta
files. The Maven plugin is configured in one project while the Ruta files
are located in a different project. The Ruta Maven plugin only collects
Ruta files within the basedir of the project it is configured in, so no
descriptors are built.
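
The effect can be checked without Maven. The sketch below is not the
plugin's implementation; it only imitates the idea of collecting .ruta files
below a single base directory, using plain JDK calls (Java 8 or later because
of streams). RutaScriptFinder and its default argument are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch: lists the Ruta scripts that live below one module's base directory. */
class RutaScriptFinder {

    /** Returns all regular files ending in ".ruta" below baseDir, sorted for stable output. */
    static List<Path> findScripts(Path baseDir) throws IOException {
        try (Stream<Path> paths = Files.walk(baseDir)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".ruta"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the module directory as the first argument; defaults to the current directory.
        Path baseDir = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> scripts = findScripts(baseDir);
        if (scripts.isEmpty()) {
            System.out.println("No .ruta files below " + baseDir.toAbsolutePath());
        } else {
            scripts.forEach(System.out::println);
        }
    }
}
```

Pointing it at the module that declares the plugin and getting an empty list
would be consistent with no descriptors being generated there.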

In the next release, the maven plugin will get another parameter for 
specifying the input files.

With UIMA Ruta 2.3.0, there are two options: either you move the Ruta
files into the project that has the Ruta Maven plugin, or you add the Ruta
Maven plugin to the pom of the project that contains the Ruta files.

Best,

Peter

On 17.06.2015 at 18:30, Diego Buoro wrote:
> Hi, Peter! We are attempting to create the descriptors based on Ruta 2.3,
> but we're out of luck. We've added the lines from the link you gave us to
> the pom.xml file and corrected the directory paths to suit our project.
> However, when we try to run Maven with Ruta's "generate" goal, no files are
> generated in the folders we set. Is the goal supposed to generate the files
> and leave them in those folders, or does it do something else?
>
> Here is the link to our altered pom.xml. The plugin section is at the end
> of the file:
> https://raw.githubusercontent.com/Fichberg/cogroo4/labXP215_Will/cogroo-gc/pom.xml
>
> Thanks for the help so far. :D

Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi, Peter! We are attempting to create the descriptors based on Ruta 2.3,
but we're out of luck. We've added the lines from the link you gave us to
the pom.xml file and corrected the directory paths to suit our project.
However, when we try to run Maven with Ruta's "generate" goal, no files are
generated in the folders we set. Is the goal supposed to generate the files
and leave them in those folders, or does it do something else?

Here is the link to our altered pom.xml. The plugin section is at the end
of the file:
https://raw.githubusercontent.com/Fichberg/cogroo4/labXP215_Will/cogroo-gc/pom.xml

Thanks for the help so far. :D

2015-06-14 9:40 GMT-03:00 Peter Klügl <pe...@averbis.com>:

> Hi,
>
> the descriptors are always created at compile time.
>
> In Ruta 2.2.1, yes, you need to create the descriptors in the UIMA Ruta
> Workbench and then copy them or make them available in some other way. This
> is especially necessary if you declare additional types (type system
> descriptor changes) or add some subscript (analysis engine descriptor
> changes).
>
> In Ruta 2.3.0 which was just released, there is a maven plugin for
> building the descriptors. Take a look at:
> http://uima.apache.org/d/ruta-current/tools.ruta.book.html#d5e3271
> This means that you do not need the UIMA Ruta Workbench projects anymore,
> but you can use its development support and descriptor building in normal
> maven projects.
>
> Best,
>
> Peter
>
>
> Am 12.06.2015 um 21:38 schrieb Diego Buoro:
>
>> Hello Peter
>>
>> We tried your suggestions and it worked liked a charm,thanks :D
>> However, we are facing another problem: It seems that our application
>> isn't
>> creating the mainTypesystem and mainEngine files when we launch it. We
>> don't know whether or not that's is the default behavior, but for now we
>> are having to create these files in separate project and them copy them to
>> the application whenever we change the script, which is a bad solution.
>> Doy you have any suggestions?
>>
>> All Best,
>>
>> Diego
>>
>> 2015-06-12 9:19 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>
>>  Hi Peter, Armin
>>>
>>> Thanks for the observations made, i hope we can finally get working here.
>>> We will try the changes in the next few days and then give you a
>>> feedback.
>>>
>>> All Best,
>>>
>>> Diego
>>>
>>>
>>>
>>> 2015-06-03 14:14 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>>
>>>  Hi Peter, the example we used is the small sentence inside a string at
>>>> the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
>>>> Based on the Main.ruta we sent you, we expected the output to contain 7
>>>> "PROBLEM" annotations. This part is working.
>>>> The problem is when we change the last line of Main.ruta from
>>>> "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};"in this case we
>>>> expected 6 "PROBLEM" annotations: the same ones we had on the first
>>>> example, excpect for the first one.That's what happens when you run the
>>>> script on a simple Ruta project, but when we run it in the  Java
>>>> application we get 0 "PROBLEM" annotations.
>>>> We think this difference is happening because in the Ruta project we
>>>> don't use a simple text as input.Instead, we feed it a preprocessed xmi
>>>> file. On the other hand on the Java application, we do the processing
>>>> ourselves via the processCas method. It's possible that the processCas
>>>> method is creating tokens in a way that prevents us from detecting when
>>>> one
>>>> is next to the other on the Ruta script.
>>>> We are sending you the xmi file to use as an example for a simple Ruta
>>>> project. If there are any other examples you'd like us to send you, just
>>>> say the word :D
>>>>
>>>> Best,
>>>>
>>>> Diego
>>>>
>>>> 2015-06-01 11:15 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>>>
>>>>  Sorry,please disregard my last answer. The idea wasn't to use the xmi,
>>>>> we are still thinking in a minimal example to provide to you.
>>>>> We will send you in the next few days.
>>>>>
>>>>> 2015-06-01 10:37 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>>>>
>>>>>  Hi Peter,how are you doing?
>>>>>>
>>>>>> We were trying to run using the files such as Crase01.xmi and
>>>>>> rule_xml_001.xmi.
>>>>>> Our goal is trying to run those two more simpler first,and then run
>>>>>> with Crase.xmi.
>>>>>>
>>>>>> About the package declaration, i still need to check what ruta version
>>>>>> is.
>>>>>> I will be checking this soon.
>>>>>>
>>>>>> All Best,
>>>>>>
>>>>>> Diego
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-05-30 0:45 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>>>>>
>>>>>>  Hi Peter!
>>>>>>> No problem, I appreciate your support.
>>>>>>>
>>>>>>> All Best,
>>>>>>>
>>>>>>> Diego
>>>>>>>
>>>>>>> 2015-05-27 14:22 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>>>>>>
>>>>>>>> Hi Peter!
>>>>>>>> We call the script with the following lines:
>>>>>>>>
>>>>>>>> // Load the Ruta script from the classpath (Guava Resources).
>>>>>>>> URL url = Resources.getResource("Main.ruta");
>>>>>>>> String text = Resources.toString(url, Charsets.UTF_8);
>>>>>>>> // Build the analysis engine from the script text and our type system.
>>>>>>>> AnalysisEngineDescription aeDes =
>>>>>>>>     Ruta.createAnalysisEngineDescription(text, tsd);
>>>>>>>> this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>>>>>>>>
>>>>>>>> // Fill a fresh CAS with our annotations, then run the script on it.
>>>>>>>> CAS cas = ae.newCAS();
>>>>>>>> converter.populateCas(sentence.getTextSentence(), cas);
>>>>>>>> ae.process(cas);
>>>>>>>>
>>>>>>>> The populateCAS method is responsible for translating our
>>>>>>>> annotations into RUTA annotations, but it doesn't set any type
>>>>>>>> priorities explicitly.
>>>>>>>> We don't know much about type priorities; the RUTA references we
>>>>>>>> found say very little about them. Are they necessary for what
>>>>>>>> we need?
>>>>>>>>
>>>>>>>> The file that contains the above lines is available here:
>>>>>>>>
>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
>>>>>>>> The processCAS method is available here:
>>>>>>>>
>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
>>>>>>>> The script we are calling is available here:
>>>>>>>>
>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>>>>>>>>
>>>>>>>> PS: Yes, we remembered the semicolons.
>>>>>>>>
>>>>>>>> Thanks for the help :)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-05-26 15:30 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>>>>>>>
>>>>>>>>> I think I wasn't clear enough, and I should be more specific.
>>>>>>>>>
>>>>>>>>> I have a type system in which all words have been annotated as
>>>>>>>>> Tokens. I am calling a RUTA script from a Java class, and that
>>>>>>>>> script has only one rule:
>>>>>>>>> Token Token {-> Problem}
>>>>>>>>>
>>>>>>>>> However, with this script, no Problems are created. When I try
>>>>>>>>> Token {-> Problem}
>>>>>>>>>
>>>>>>>>> I get one problem for each Token, which is what I expected. Why
>>>>>>>>> can't I create annotations using rules with more than one word?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2015-05-26 14:49 GMT-03:00 Diego Buoro <jk...@gmail.com>:
>>>>>>>>>
>>>>>>>>>> Hello guys, how are you doing?
>>>>>>>>>>
>>>>>>>>>> I would like to know, once I have called RUTA from a Java project,
>>>>>>>>>> how I can mark consecutive tokens as a "Problem" (the name of my
>>>>>>>>>> annotation, in this case).
>>>>>>>>>>
>>>>>>>>>> Thanks in advance!
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>

Re: Marking cosnecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Hi,

the descriptors are always created at compile time.

In Ruta 2.2.1, yes, you need to create the descriptors in the UIMA Ruta 
Workbench and then copy them or make them available in some other way. 
This is especially necessary if you declare additional types (which 
changes the type system descriptor) or add a subscript (which changes 
the analysis engine descriptor).

In Ruta 2.3.0, which was just released, there is a Maven plugin for 
building the descriptors. Take a look at: 
http://uima.apache.org/d/ruta-current/tools.ruta.book.html#d5e3271
This means that you no longer need UIMA Ruta Workbench projects: 
you can use the Workbench's development support and descriptor building 
in normal Maven projects.

Best,

Peter


Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hello Peter

We tried your suggestions and they worked like a charm, thanks :D
However, we are facing another problem: it seems that our application isn't
creating the mainTypesystem and mainEngine files when we launch it. We
don't know whether or not that is the default behavior, but for now we
have to create these files in a separate project and then copy them to
the application whenever we change the script, which is a bad solution.
Do you have any suggestions?

All Best,

Diego


Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi Peter, Armin

Thanks for the observations. I hope we can finally get this working.
We will try the changes in the next few days and then give you feedback.

All Best,

Diego




Re: Marking cosnecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

I assumed that these zero-length annotations no longer cause problems...
I was wrong and I should do something about it. Either they
will really be ignored completely from now on, or I need to change the
sequential matching so that they are consumed somehow. If anyone is
interested, I can explain the problems and implications in more detail in
a new Jira issue.

Best,

Peter
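
To make the first option (ignoring zero-length annotations) concrete, here is a toy model in plain Java. It is only a sketch under assumptions and not Ruta's matching code: the Span class, the countConsecutivePairs helper, the offsets and the split of the example sentence into 7 tokens are all made up for illustration. It drops empty spans, sorts the rest by begin offset, and counts every token that directly follows another one, which is what a rule like "cgToken cgToken{->PROBLEM};" was expected to mark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ConsecutiveTokens {

    // Example sentence from the thread; \u00e0 is "a" with a grave accent.
    public static final String TEXT = "Refiro-me \u00e0 trabalho remunerado.";

    /** Hypothetical stand-in for an annotation: begin and end offsets only. */
    public static final class Span {
        public final int begin;
        public final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        public boolean isEmpty() {
            return begin == end;
        }
    }

    /** Assumed tokenization into 7 tokens, plus one zero-length span at offset 0. */
    public static List<Span> sampleSpans() {
        return Arrays.asList(
            new Span(0, 0),   // zero-length span, e.g. an empty document annotation
            new Span(0, 6),   // Refiro
            new Span(6, 7),   // -
            new Span(7, 9),   // me
            new Span(10, 11), // a with grave accent
            new Span(12, 20), // trabalho
            new Span(21, 31), // remunerado
            new Span(31, 32)  // .
        );
    }

    /** Counts tokens preceded by another token with only whitespace in between. */
    public static int countConsecutivePairs(String text, List<Span> spans) {
        List<Span> tokens = new ArrayList<>();
        for (Span s : spans) {
            if (!s.isEmpty()) { // ignore zero-length spans completely
                tokens.add(s);
            }
        }
        tokens.sort(Comparator.comparingInt((Span s) -> s.begin));

        int pairs = 0;
        for (int i = 1; i < tokens.size(); i++) {
            Span prev = tokens.get(i - 1);
            Span cur = tokens.get(i);
            if (cur.begin >= prev.end
                    && text.substring(prev.end, cur.begin).trim().isEmpty()) {
                pairs++;
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(countConsecutivePairs(TEXT, sampleSpans())); // prints 6
    }
}
```

With the assumed 7 tokens this gives 6 pairs, matching the count reported from the Workbench run earlier in the thread. How Ruta itself orders and consumes annotations may well differ.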


Re: Marking cosnecutive tokens with RUTA

Posted by Ar...@bka.bund.de.

Hi,

yeah, that hit me once, too. It has something to do with the internal sorting of annotations with the same start offset. I had annotated some metadata for the whole document in an annotation with start offset 0 and end offset 0. That's not good: the end offset must be the length of the document text. Then it's fine.

Cheers,
Armin
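
The offset rule above can be sketched as a small defensive helper. This is plain Java with hypothetical names (DocumentSpan, wholeDocument, isZeroLength), not part of the UIMA or Ruta API, and it assumes the document text is already available as a String when the document-level annotation is created.

```java
public class DocumentSpan {

    /** Returns {begin, end} for an annotation that covers the whole text. */
    public static int[] wholeDocument(String documentText) {
        if (documentText == null || documentText.isEmpty()) {
            // Without text the span would be 0..0, i.e. zero-length.
            throw new IllegalArgumentException(
                "document text must be set before annotating");
        }
        return new int[] {0, documentText.length()};
    }

    /** True if a span with these offsets covers no characters. */
    public static boolean isZeroLength(int begin, int end) {
        return begin == end;
    }

    public static void main(String[] args) {
        // \u00e0 is "a" with a grave accent.
        int[] span = wholeDocument("Refiro-me \u00e0 trabalho remunerado.");
        System.out.println(span[0] + ".." + span[1]); // prints 0..32
    }
}
```

A check like isZeroLength could be run on every span before it is added to the index, so that an accidental 0..0 annotation is caught early instead of silently breaking rules later.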

-----Original Message-----
From: Peter Klügl [mailto:peter.kluegl@averbis.com] 
Sent: Wednesday, 10 June 2015 21:28
To: user@uima.apache.org
Subject: Re: Marking cosnecutive tokens with RUTA

Hi,

here are the results of my investigations:

- the text of the document is not set directly. You should add something 
like cas.setDocumentText(sentence.getDocumentText()); before populating 
the CAS in your method. Otherwise there will be a DocumentAnnotation of 
length 0. Ruta does not like these... that's the source of the problem. 
If you add the line, or avoid zero-length annotations somehow, then the 
rules should work just fine.

- I'd rather use tcas.addFsToIndexes(sentenceAnn); instead of 
tcas.getIndexRepository().addFS(sentenceAnn); (but that shouldn't change 
anything)

- You access the problem type "cogroo.ruta.Base.PROBLEM", but the rules 
seem to use the type "Main.PROBLEM"

Best,

Peter


On 03.06.2015 at 19:14, Diego Buoro wrote:
> Hi Peter, the example we used is the small sentence inside a string at 
> the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
> Based on the Main.ruta we sent you, we expected the output to contain 
> 7 "PROBLEM" annotations. This part is working.
> The problem appears when we change the last line of Main.ruta from
> "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
> expected 6 "PROBLEM" annotations: the same ones we had in the first
> example, except for the first one. That's what happens when you run
> the script in a simple Ruta project, but when we run it in the Java
> application we get 0 "PROBLEM" annotations.
> We think this difference arises because in the Ruta project we
> don't use plain text as input. Instead, we feed it a preprocessed
> xmi file. In the Java application, on the other hand, we do the
> processing ourselves via the processCas method. It's possible that the
> processCas method is creating tokens in a way that prevents the Ruta
> script from detecting when one is next to the other.
> We are sending you the xmi file to use as an example for a simple Ruta
> project. If there are any other examples you'd like us to send you,
> just say the word :D
>
> Best,
>
> Diego
>
> 2015-06-01 11:15 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>
>     Sorry, please disregard my last answer. The idea wasn't to use the
>     xmi; we are still working on a minimal example to provide to you.
>     We will send it to you in the next few days.
>
>     2015-06-01 10:37 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>
>         Hi Peter, how are you doing?
>
>         We were trying to run with files such as Crase01.xmi and
>         rule_xml_001.xmi.
>         Our goal is to run those two simpler ones first, and
>         then run with Crase.xmi.
>
>         About the package declaration, I still need to check which Ruta
>         version we are using.
>         I will check this soon.
>
>         All Best,
>
>         Diego
>
>
>
>
>
>         2015-05-30 0:45 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>
>             Hi Peter!
>             No problem, I appreciate your support.
>
>             All Best,
>
>             Diego
>
>             2015-05-27 14:22 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>
>                 Hi Peter!
>                 We call the script with the following lines:
>
>                 // Load the Ruta script from the classpath (Guava Resources).
>                 URL url = Resources.getResource("Main.ruta");
>                 String text = Resources.toString(url, Charsets.UTF_8);
>                 // Build the analysis engine from the script text and our type system.
>                 AnalysisEngineDescription aeDes =
>                     Ruta.createAnalysisEngineDescription(text, tsd);
>                 this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>
>                 // Fill a fresh CAS with our annotations, then run the script on it.
>                 CAS cas = ae.newCAS();
>                 converter.populateCas(sentence.getTextSentence(), cas);
>                 ae.process(cas);
>
>                 The populateCAS method is responsible for translating
>                 our annotations into RUTA annotations, but it doesn't
>                 set any type priorities explicitly.
>                 We don't know much about type priorities; the RUTA
>                 references we found say very little about them. Are
>                 they necessary for what we need?
>
>                 The file that contains the above lines is available here:
>                 https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
>                 The processCAS method is available here:
>                 https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
>                 The script we are calling is available here:
>                 https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>
>                 PS: Yes, we remembered the semicolons.
>
>                 Thanks for the help :)
>
>
>
>                 2015-05-26 15:30 GMT-03:00 Diego Buoro
>                 <jklports@gmail.com <ma...@gmail.com>>:
>
>                     I think i wasn't clear enough, and i should be
>                     more specific.
>
>                     I have a type system in which all words have been
>                     annotated as Tokens. I am calling a RUTA script
>                     from a java class, and that script has only one rule:
>                     Token Token {-> Problem}
>
>                     However, with this script, no Problems are
>                     created. When I try
>                     Token {-> Problem}
>
>                     I get one problem for each Token, which is what I
>                     expected. Why can't I create annotations using
>                     rules with more than one word?
>
>                     Thanks
>
>
>
>
>                     2015-05-26 14:49 GMT-03:00 Diego Buoro
>                     <jklports@gmail.com <ma...@gmail.com>>:
>
>                         Hello guys,how are you doing?
>
>                         I would like to know once i have called RUTA
>                         from a Java project, how can i mark
>                         consecutive tokens as a "Problem" (the name of
>                         my annotation, in this case)?
>
>                         Thanks in advice!
>
>
>
>
>
>
>

Re: Marking cosnecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Hi,

here are the results of my investigations:

- the text of the document is not set directly. You should add something 
like cas.setDocumentText(sentence.getDocumentText()); before populating 
the CAS in your method. Otherwise there will be a DocumentAnnotation of 
length 0. Ruta does not like these... that's the source of the problem. 
If you add the line, or avoid zero-length annotations somehow, then the 
rules should work just fine.

- I'd rather use tcas.addFsToIndexes(sentenceAnn); instead of 
tcas.getIndexRepository().addFS(sentenceAnn); (but that shouldn't change 
anything)

- You access the problem type "cogroo.ruta.Base.PROBLEM", but the rules 
seem to use the type "Main.PROBLEM"
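
As a rough illustration of the first point, a pre-flight check along these 
lines could be run before ae.process(cas). It is only a sketch: the class 
SpanCheck, its method name and the token offsets are hypothetical and not 
part of UIMA or Ruta. It uses plain Java, so the document text and the 
annotation offsets have to be passed in by the caller. The actual fix stays 
the one above (calling cas.setDocumentText(...) before populating the CAS).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical pre-flight check; not part of UIMA or Ruta. */
public class SpanCheck {

    /**
     * Returns the [begin, end) spans that do not fit inside the document text.
     * A null text is treated like an empty one (length 0), which is the
     * situation when the document text was never set.
     */
    public static List<int[]> spansOutsideText(String documentText, List<int[]> spans) {
        int length = (documentText == null) ? 0 : documentText.length();
        List<int[]> offending = new ArrayList<>();
        for (int[] span : spans) {
            int begin = span[0];
            int end = span[1];
            if (begin < 0 || end < begin || end > length) {
                offending.add(span);
            }
        }
        return offending;
    }

    public static void main(String[] args) {
        String text = "Refiro-me à trabalho remunerado.";
        // Token offsets assumed for illustration only.
        List<int[]> tokens = Arrays.asList(
                new int[] {0, 6}, new int[] {6, 7}, new int[] {7, 9});
        // With the text set, no span is reported.
        System.out.println(spansOutsideText(text, tokens).size());
        // Without text, every token lies outside the zero-length document.
        System.out.println(spansOutsideText(null, tokens).size());
    }
}
```

If the check reports any span, the annotations were most likely added to a 
CAS whose document text is missing or shorter than expected.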

Best,

Peter


Am 03.06.2015 um 19:14 schrieb Diego Buoro:
> Hi Peter, the example we used is the small sentence inside a string at 
> the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
> Based on the Main.ruta we sent you, we expected the output to contain 
> 7 "PROBLEM" annotations. This part is working.
> The problem is when we change the last line of Main.ruta from
> "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
> expected 6 "PROBLEM" annotations: the same ones we had in the first
> example, except for the first one. That's what happens when you run
> the script in a simple Ruta project, but when we run it in the Java
> application we get 0 "PROBLEM" annotations.
> We think this difference is happening because in the Ruta project we
> don't use plain text as input. Instead, we feed it a preprocessed
> xmi file. In the Java application, on the other hand, we do the
> processing ourselves via the processCas method. It's possible that the 
> processCas method is creating tokens in a way that prevents us from 
> detecting when one is next to the other on the Ruta script.
> We are sending you the xmi file to use as an example for a simple Ruta 
> project. If there are any other examples you'd like us to send you, 
> just say the word :D
>
> Best,
>
> Diego

Re: Marking cosnecutive tokens with RUTA

Posted by Peter Klügl <pe...@averbis.com>.

Sorry, I haven't found the time yet to look into it, but I will soon...

Best,

Peter


Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi Peter, the example we used is the small sentence inside a string at the
end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
Based on the Main.ruta we sent you, we expected the output to contain 7
"PROBLEM" annotations. This part is working.
The problem is when we change the last line of Main.ruta from
"cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
expected 6 "PROBLEM" annotations: the same ones we had in the first
example, except for the first one. That's what happens when you run the
script in a simple Ruta project, but when we run it in the Java
application we get 0 "PROBLEM" annotations.
We think this difference is happening because in the Ruta project we don't
use plain text as input. Instead, we feed it a preprocessed xmi file. In
the Java application, on the other hand, we do the processing ourselves via
the processCas method. It's possible that the processCas method is creating
tokens in a way that prevents us from detecting when one is next to the
other on the Ruta script.
We are sending you the xmi file to use as an example for a simple Ruta
project. If there are any other examples you'd like us to send you, just
say the word :D
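
To make the expected counts concrete, here is a small plain-Java sketch of
the arithmetic. It is a toy model only: it does not use or imitate Ruta's
matcher, the class PairCount and its methods are hypothetical, and the
tokenization of the example sentence into 7 tokens is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the two rules; it does not use or imitate Ruta's matcher. */
public class PairCount {

    /** Models "cgToken{->PROBLEM};": one annotation per token. */
    public static List<String> markEach(List<String> tokens) {
        return new ArrayList<>(tokens);
    }

    /**
     * Models "cgToken cgToken{->PROBLEM};": the action sits on the second
     * rule element, so every token that has a predecessor is marked.
     */
    public static List<String> markSecondOfPair(List<String> tokens) {
        List<String> marked = new ArrayList<>();
        for (int i = 1; i < tokens.size(); i++) {
            marked.add(tokens.get(i));
        }
        return marked;
    }

    public static void main(String[] args) {
        // Assumed tokenization of "Refiro-me à trabalho remunerado." (7 tokens).
        List<String> tokens = Arrays.asList(
                "Refiro", "-", "me", "à", "trabalho", "remunerado", ".");
        System.out.println(markEach(tokens).size());
        System.out.println(markSecondOfPair(tokens).size());
    }
}
```

With 7 tokens the first rule gives 7 annotations and the second gives 6,
because the first token has no predecessor.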

Best,

Diego

2015-06-01 11:15 GMT-03:00 Diego Buoro <jk...@gmail.com>:

> Sorry, please disregard my last answer. The idea wasn't to use the xmi; we
> are still thinking of a minimal example to provide to you.
> We will send it to you in the next few days.

Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Sorry, please disregard my last answer. The idea wasn't to use the xmi; we
are still thinking of a minimal example to provide to you.
We will send it to you in the next few days.

2015-06-01 10:37 GMT-03:00 Diego Buoro <jk...@gmail.com>:

> Hi Peter, how are you doing?
>
> We were trying to run using files such as Crase01.xmi and
> rule_xml_001.xmi.
> Our goal is to run those two simpler ones first, and then run with
> Crase.xmi.
>
> About the package declaration, I still need to check which Ruta version
> we are using. I will check this soon.
>
> All Best,
>
> Diego

Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi Peter, how are you doing?

We were trying to run using files such as Crase01.xmi and
rule_xml_001.xmi.
Our goal is to run those two simpler ones first, and then run with
Crase.xmi.

About the package declaration, I still need to check which Ruta version
we are using. I will check this soon.

All Best,

Diego





2015-05-30 0:45 GMT-03:00 Diego Buoro <jk...@gmail.com>:

> Hi Peter!
> No problem, I appreciate your support.
>
> All Best,
>
> Diego

Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi Peter!
No problem, I appreciate your support.

All Best,

Diego

2015-05-27 14:22 GMT-03:00 Diego Buoro <jk...@gmail.com>:

> Hi Peter!
> We call the script with the following lines:
>
> URL url = Resources.getResource("Main.ruta");
> String text = Resources.toString(url, Charsets.UTF_8);
> AnalysisEngineDescription aeDes =
>     Ruta.createAnalysisEngineDescription(text, tsd);
> this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>
> CAS cas = ae.newCAS();
> converter.populateCas(sentence.getTextSentence(), cas);
> ae.process(cas);
>
> The populateCas method is responsible for translating our annotations into
> RUTA annotations, but it doesn't set any type priority explicitly.
> We don't know much about type priorities; the RUTA references we found say
> very little about them. Are they necessary for what we need to do?
>
> The file that contains the above lines is available here:
>
> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
> The processCAS method is available here:
>
> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
> The script we are calling is available here:
>
> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>
> PS: Yes, we remembered the semicolons.
>
> Thanks for the help :)

Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

Hi Peter!
We call the script with the following lines:

URL url = Resources.getResource("Main.ruta");
String text = Resources.toString(url, Charsets.UTF_8);
AnalysisEngineDescription aeDes =
    Ruta.createAnalysisEngineDescription(text, tsd);
this.ae = UIMAFramework.produceAnalysisEngine(aeDes);

CAS cas = ae.newCAS();
converter.populateCas(sentence.getTextSentence(), cas);
ae.process(cas);

The populateCas method is responsible for translating our annotations into
RUTA annotations, but it doesn't set any type priority explicitly.
We don't know much about type priorities; the RUTA references we found say
very little about them. Are they necessary for what we need to do?

The file that contains the above lines is available here:
https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
The processCAS method is available here:
https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
The script we are calling is available here:
https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta

PS: Yes, we remembered the semicolons.

Thanks for the help :)



2015-05-26 15:30 GMT-03:00 Diego Buoro <jk...@gmail.com>:

> I think I wasn't clear enough, and I should be more specific.
>
> I have a type system in which all words have been annotated as Tokens. I
> am calling a RUTA script from a Java class, and that script has only one
> rule:
> Token Token {-> Problem}
>
> However, with this script, no Problems are created. When I try
> Token {-> Problem}
>
> I get one problem for each Token, which is what I expected. Why can't I
> create annotations using rules with more than one word?
>
> Thanks

Re: Marking cosnecutive tokens with RUTA

Posted by Diego Buoro <jk...@gmail.com>.

I think I wasn't clear enough, and I should be more specific.

I have a type system in which all words have been annotated as Tokens. I am
calling a RUTA script from a Java class, and that script has only one rule:
Token Token {-> Problem}

However, with this script, no Problems are created. When I try
Token {-> Problem}

I get one problem for each Token, which is what I expected. Why can't I
create annotations using rules with more than one word?

Thanks




2015-05-26 14:49 GMT-03:00 Diego Buoro <jk...@gmail.com>:

> Hello guys, how are you doing?
>
> I would like to know, once I have called RUTA from a Java project, how can
> I mark consecutive tokens as a "Problem" (the name of my annotation, in
> this case)?
>
> Thanks in advance!
>