Posted to user@uima.apache.org by digital paula <cy...@hotmail.com> on 2013/11/06 19:40:46 UTC

Using Regex Annotator: Adding a default value for a type system feature

Hi Again UIMA Community  (specifically Marshall and Richard ;-)
 
I've been working with the regex annotator  and adding new types to search on in text.  By the way, the documentation for the regex annotator has been really helpful in explaining how to use this add-on.
 
Okay, I created a new type called 'Computing' that annotates text based on this regular expression:
  regEx="(comp[a-z0-9]+)"
I updated the concepts.xml file and the regex descriptor. Everything works as I expect, but what I'd like to do now is add a default value of 'computing' to the feature I added, 'setTextCapture', so that all variations found in the text are mapped to one value.
 
For example, let's say the text stated: the computting system works as expected and the compituing center is set up correctly.
 
The two misspelled forms of computing are annotated per the regular expression, but for each annotation I also want to be able to add a feature that has a specified default value; in this case it would be setTextCapture="computing". Is there a way to do this?
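
A minimal sketch of the desired outcome, outside UIMA: plain java.util.regex applied to the example sentence, pairing each matched variant with the fixed value "computing". The class and method names (ComputingRegexDemo, findAll) are made up for illustration and are not part of the RegexAnnotator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it does not use UIMA or the RegexAnnotator.
class ComputingRegexDemo {

    // Same expression as the regEx attribute in the concept rule.
    static final Pattern COMP = Pattern.compile("(comp[a-z0-9]+)");

    // Returns one {capture, category} pair per match.
    // The capture is the matched text; the category is always the literal "computing".
    static List<String[]> findAll(String text) {
        List<String[]> result = new ArrayList<>();
        Matcher m = COMP.matcher(text);
        while (m.find()) {
            result.add(new String[] { m.group(0).trim(), "computing" });
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "the computting system works as expected "
                + "and the compituing center is set up correctly.";
        for (String[] pair : findAll(text)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Run against the example sentence, this pairs both computting and compituing with computing.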
 
Here's what I added to the concepts.xml file (the setTextCapture line is not complete), but I don't know what to put for setTextCapture to give it a default of "computing". It appears I can't just add "computing"; it won't work, since the value seems to have to be a regular expression reference.
 
  <concept name="Computing_Detection">
    <rules>
      <rule
        regEx="(comp[a-z0-9]+)"
        matchStrategy="matchAll"
        matchType="uima.tcas.DocumentAnnotation" />
    </rules>
    <createAnnotations>
      <annotation id="compute"
        type="org.apache.uima.Computing">
        <begin group="0" />
        <end group="0" />
        <setFeature name="getTextCapture" type="String" normalization="Trim">$0</setFeature>
        <setFeature name="setTextCapture" type="String" don't know what to put here to make it have a value of "computing"</setFeature>
      </annotation>
    </createAnnotations>
  </concept>
 
I've also added an attachment so you can see what I mean as illustrated using the CVD tool.
 
Hope you guys can help.  
 
Thanks.
 
Regards,
Paula
 		 	   		  

RE: Using Regex Annotator: Adding a default value for a type system feature -RESOLVED

Posted by digital paula <cy...@hotmail.com>.
I didn't actually attach anything to my last email; I just pasted the XML into the body. I guess there's no point in highlighting any lines anyway, since the user list site is plain text. While we're resolving the issue so that attachments can be added, can we also transition to a rich-text forum that preserves the formatting set by the user's email? :-)
 
> From: cybersation@hotmail.com
> To: user@uima.apache.org
> Subject: RE: Using Regex Annotator: Adding a default value for a type system feature -RESOLVED
> Date: Wed, 6 Nov 2013 20:37:17 -0500
> 
> Richard,
>  
> I had no idea that the attachments weren't going through.....why not?   They do for other apache user forums. 
>  
> Thank you so much for your prompt response and help,  all I had to do was add the text 'computing' without any hyphens or quotes.   Sorry for wasting your time on such a trivial case, I tried with the quotes around it "computing" and didn't think to try it without.    I've attached and highlighted the line for other users to see what was done to resolve the issue.   
>  
> <concept name="Computing_Detection">
>   <rules>
>     <rule
>       regEx="(comp[a-z0-9]+)"
>       matchStrategy="matchAll"
>       matchType="uima.tcas.DocumentAnnotation" />
>   </rules>
>   <createAnnotations>
>     <annotation id="compute"
>       type="org.apache.uima.Computing">
>       <begin group="0" />
>       <end group="0" />
>       <setFeature name="getTextCapture" type="String" normalization="Trim">$0</setFeature>
>       <setFeature name="setTextCapture" type="String">computing</setFeature>
>     </annotation>
>   </createAnnotations>
> </concept>
>  
> Regards,
> Paula
> 
>  
 		 	   		  

Re: Using Regex Annotator: Adding a default value for a type system feature -RESOLVED

Posted by Marshall Schor <ms...@schor.com>.
On 11/7/2013 1:51 PM, digital paula wrote:
> Hi Richard,
>  
> What I had meant to state is that I've seen attachments display on the forum list for other user lists.  For example,  trying to get the regex annotator working I had spent a lot of time viewing the archives for XML Beans.   That forum permits attachments.   I didn't post there but you can see the attachments added to this particular user's post just scroll to the bottom.  
> http://mail-archives.apache.org/mod_mbox/xmlbeans-user/201311.mbox/%3cCAAXRxE2C0ODEPwzRfu9j9G=K6Vg5UJxwtRtcRORYfYJRpeM3dw@mail.gmail.com%3e
>  
> I thought that my posts with UIMA would've done the same thing and add the attachments to the bottom of the post  like in the above link for the XML Beans user forum.    Not sure if this is something that the user has to do to permit attachments to display or if there's some configuration that needs to be done to the user list on the server end for UIMA to permit attachments. 

The normal configuration for most Apache projects' mail servers is to prohibit
attachments.  I've updated the website page for mailing lists, and included the
wording commonly used in other Apache projects to recommend other approaches.

UIMA's mail service doesn't support attachments.

-Marshall


RE: Using Regex Annotator: Adding a default value for a type system feature -RESOLVED

Posted by digital paula <cy...@hotmail.com>.





Hi Richard,
 
What I had meant to state is that I've seen attachments displayed in the archives of other user lists. For example, while trying to get the regex annotator working, I spent a lot of time viewing the archives for XML Beans. That list permits attachments. I didn't post there, but you can see the attachments added to this particular user's post; just scroll to the bottom.
http://mail-archives.apache.org/mod_mbox/xmlbeans-user/201311.mbox/%3cCAAXRxE2C0ODEPwzRfu9j9G=K6Vg5UJxwtRtcRORYfYJRpeM3dw@mail.gmail.com%3e
 
I thought that my posts to the UIMA list would do the same and show the attachments at the bottom of the post, as in the above link for the XML Beans user list. I'm not sure whether this is something the user has to do to make attachments display, or whether some configuration is needed on the server end for the UIMA user list to permit attachments.
 
As for your comment on the concepts.xml file update, thanks SO much for the feedback. I really haven't gotten much into accessing the CAS; I just know that the CAS (Common Analysis System, I think that's what CAS stands for, though I should look it up, or maybe it stands for Common Access System) gets updated as it moves through the pipeline. Oh, I also know that JCas is the Java interface for accessing the CAS. At any rate, I shouldn't be using method setter/getter names for feature names. I'll change it to the following:

<setFeature name="Capture" type="String" normalization="Trim">$0</setFeature>
<setFeature name="Category" type="String">computing</setFeature>

Again, thank you so much for your additional thoughts here. As mentioned before, I started working with UIMA almost a month ago (still learning) and this forum is really helpful. You guys are great!

Regards,
Paula
 
> Subject: Re: Using Regex Annotator: Adding a default value for a type system feature -RESOLVED
> From: rec@apache.org
> Date: Thu, 7 Nov 2013 08:48:51 +0100
> To: user@uima.apache.org
> 
> On 07.11.2013, at 02:37, digital paula <cy...@hotmail.com> wrote:
> 
> > Richard,
> > 
> > I had no idea that the attachments weren't going through.....why not?   They do for other apache user forums. 
> 
> Well, I do not know, but every time you said there was an attachment (twice now if I remember correctly), I didn't get any.
> 
> > Thank you so much for your prompt response and help,  all I had to do was add the text 'computing' without any hyphens or quotes.   Sorry for wasting your time on such a trivial case, I tried with the quotes around it "computing" and didn't think to try it without.    I've attached and highlighted the line for other users to see what was done to resolve the issue.   
> > 
> > <setFeature name="getTextCapture" type="String" normalization="Trim">$0</setFeature>
> > <setFeature name="setTextCapture" type="String">computing</setFeature>
> 
> 
> There seems to be something odd here. Normally, a feature would be called "textCapture" and the JCas class for the annotation that contains the feature would be generated with two methods "getTextCapture" and "setTextCapture". 
> 
> It appears that you have two features, one called "getTextCapture" and another "setTextCapture". When you generate JCas classes for these in order to access the annotations from Java source code, you would end up with "getGetTextCapture"/"setGetTextCapture" and "getSetTextCapture"/"setSetTextCapture". If you really intend to have two features, you should consider choosing different names, e.g. "capture" and "category" ("type" is a "reserved" feature name).
> 
> I didn't check the documentation of the regexannotator in detail to see if it does something smart here that I don't know, but I doubt it. 
> 
> -- Richard
> 


 		 	   		  

Re: Using Regex Annotator: Adding a default value for a type system feature -RESOLVED

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 07.11.2013, at 02:37, digital paula <cy...@hotmail.com> wrote:

> Richard,
> 
> I had no idea that the attachments weren't going through.....why not?   They do for other apache user forums. 

Well, I do not know, but every time you said there was an attachment (twice now if I remember correctly), I didn't get any.

> Thank you so much for your prompt response and help,  all I had to do was add the text 'computing' without any hyphens or quotes.   Sorry for wasting your time on such a trivial case, I tried with the quotes around it "computing" and didn't think to try it without.    I've attached and highlighted the line for other users to see what was done to resolve the issue.   
> 
> <setFeature name="getTextCapture" type="String" normalization="Trim">$0</setFeature>
> <setFeature name="setTextCapture" type="String">computing</setFeature>


There seems to be something odd here. Normally, a feature would be called "textCapture" and the JCas class for the annotation that contains the feature would be generated with two methods "getTextCapture" and "setTextCapture". 

It appears that you have two features, one called "getTextCapture" and another "setTextCapture". When you generate JCas classes for these in order to access the annotations from Java source code, you would end up with "getGetTextCapture"/"setGetTextCapture" and "getSetTextCapture"/"setSetTextCapture". If you really intend to have two features, you should consider choosing different names, e.g. "capture" and "category" ("type" is a "reserved" feature name).
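
The naming effect described above can be sketched in plain Java. This only mimics the convention mentioned here (prefix "get" or "set" plus the feature name with its first letter capitalized); it is not JCasGen's actual code, and AccessorNames and accessor are hypothetical names.

```java
// Hypothetical helper that imitates how accessor names follow from feature names.
class AccessorNames {

    // Builds prefix + feature name with its first character upper-cased.
    static String accessor(String prefix, String feature) {
        return prefix + Character.toUpperCase(feature.charAt(0)) + feature.substring(1);
    }

    public static void main(String[] args) {
        // Feature names taken from the thread: the two problematic ones and the suggested ones.
        String[] features = { "getTextCapture", "setTextCapture", "capture", "category" };
        for (String feature : features) {
            System.out.println(feature + ": "
                    + accessor("get", feature) + " / " + accessor("set", feature));
        }
    }
}
```

With the feature named "getTextCapture" the getter comes out as getGetTextCapture, whereas "capture" gives getCapture and setCapture.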

I didn't check the documentation of the regexannotator in detail to see if it does something smart here that I don't know, but I doubt it. 

-- Richard


RE: Using Regex Annotator: Adding a default value for a type system feature -RESOLVED

Posted by digital paula <cy...@hotmail.com>.
Richard,
 
I had no idea that the attachments weren't going through... why not? They do for other Apache user forums.
 
Thank you so much for your prompt response and help; all I had to do was add the text 'computing' without any hyphens or quotes. Sorry for wasting your time on such a trivial case. I tried it with quotes around it ("computing") and didn't think to try it without. I've attached and highlighted the line for other users to see what was done to resolve the issue.
 
<concept name="Computing_Detection">
  <rules>
    <rule
      regEx="(comp[a-z0-9]+)"
      matchStrategy="matchAll"
      matchType="uima.tcas.DocumentAnnotation" />
  </rules>
  <createAnnotations>
    <annotation id="compute"
      type="org.apache.uima.Computing">
      <begin group="0" />
      <end group="0" />
      <setFeature name="getTextCapture" type="String" normalization="Trim">$0</setFeature>
      <setFeature name="setTextCapture" type="String">computing</setFeature>
    </annotation>
  </createAnnotations>
</concept>
 
Regards,
Paula

 
> Subject: Re: Using Regex Annotator: Adding a default value for a type system feature
> From: rec@apache.org
> Date: Wed, 6 Nov 2013 22:33:32 +0100
> To: user@uima.apache.org
> 
> Hi,
> 
> first, I think that the list strips attachments - at least I never got one in any of your past mails including this one.
> 
> Although, the documentation of the regex annotator doesn't seem to state it, setting feature values works for me:
> 
>     <createAnnotations>
>       <annotation id="substQuot" type="de.tudarmstadt.ukp.dkpro.ugd.applychangesannotator.SofaChangeAnnotation">
>         <begin group="0"/>
>         <end group="0"/>
>         <setFeature name="operation" type="String">replace</setFeature>
>         <setFeature name="value" type="String">&quot;</setFeature>
>         <setFeature name="reason" type="String">substQuot</setFeature>
>       </annotation>
>     </createAnnotations>
> 
> So the text in setFeature doesn't have to refer to a capturing group as far as I can tell (mind, it has been quite a while
> since I last tried that).
> 
> -- Richard
> 
 		 	   		  

Re: Using Regex Annotator: Adding a default value for a type system feature

Posted by Richard Eckart de Castilho <re...@apache.org>.
Hi,

first, I think that the list strips attachments - at least I never got one in any of your past mails including this one.

Although the documentation of the regex annotator doesn't seem to state it, setting literal feature values works for me:

    <createAnnotations>
      <annotation id="substQuot" type="de.tudarmstadt.ukp.dkpro.ugd.applychangesannotator.SofaChangeAnnotation">
        <begin group="0"/>
        <end group="0"/>
        <setFeature name="operation" type="String">replace</setFeature>
        <setFeature name="value" type="String">&quot;</setFeature>
        <setFeature name="reason" type="String">substQuot</setFeature>
      </annotation>
    </createAnnotations>

So the text in setFeature doesn't have to refer to a capturing group as far as I can tell (mind, it has been quite a while
since I last tried that).
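
A rough sketch of the two kinds of setFeature values discussed here: a group reference such as $0, and a plain literal. How the RegexAnnotator resolves these internally is not shown in this thread, so the resolve helper below is an assumption for illustration only, written with plain java.util.regex; FeatureValueDemo is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; this is NOT the RegexAnnotator's implementation.
class FeatureValueDemo {

    // Matches group references of the form $0, $1, ...
    private static final Pattern GROUP_REF = Pattern.compile("\\$(\\d+)");

    // Replaces each $n in the template with group n of the given (already matched) matcher.
    // A template without any $n is returned unchanged, i.e. it acts as a literal value.
    static String resolve(String template, Matcher match) {
        Matcher ref = GROUP_REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (ref.find()) {
            int group = Integer.parseInt(ref.group(1));
            ref.appendReplacement(out, Matcher.quoteReplacement(match.group(group)));
        }
        ref.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("(comp[a-z0-9]+)").matcher("the computting system");
        if (m.find()) {
            System.out.println(resolve("$0", m));        // value taken from the match
            System.out.println(resolve("computing", m)); // literal value, no group reference
        }
    }
}
```

In this sketch a value with no $n reference simply passes through, which is the behavior the literal "replace" and "substQuot" values above rely on.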

-- Richard
