Posted to user@uima.apache.org by Mansi verma <ve...@gmail.com> on 2013/05/23 09:36:39 UTC

Adding features to TokenAnnotation and DictTerm in Concept Mapper

Hi

I am using UIMA to annotate documents with Concept Mapper. I have built the
dictionaries and configured it to our requirements. However, I would like to
add features to TokenAnnotation and DictTerm.

For example, the existing TokenAnnotation type supports the following
features: text, tokenType, tokenClass and uima.tt.tokenAnnotation. I would now
like to add more features, such as POS or group.

Is there a neat way of doing this without touching the conceptMapper.jar and
changing our type system to extend the two types?


Thanks
Manisha

Re: Adding features to TokenAnnotation and DictTerm in Concept Mapper

Posted by Mansi verma <ve...@gmail.com>.
Maintaining a single dictionary might be cumbersome: I have over a million
entries across all the dictionaries, and keeping them in separate files helps
me maintain them. Could we modify Concept Mapper to accept multiple
dictionaries and output a different annotation type per dictionary?

I could parametrize Concept Mapper further to accept multiple dictionaries
and their output annotation types, and load them into a single dictionary in
memory, with the output annotation type attached as a feature in that
dictionary.
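
A minimal sketch of that idea in plain Java, independent of ConceptMapper itself: several term lists are merged into one in-memory map, and each term remembers which output annotation type(s) it came with. The class name MergedDictionary, its methods, and the type names ChemicalAnnotation and ToolAnnotation are hypothetical, chosen only for illustration; none of this is ConceptMapper API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergedDictionary {
    // term -> output annotation type names, in the order the dictionaries were added
    private final Map<String, List<String>> entries = new LinkedHashMap<>();

    /** Adds every term of one dictionary, tagging it with that dictionary's output type. */
    void addDictionary(String outputType, List<String> terms) {
        for (String term : terms) {
            List<String> types = entries.computeIfAbsent(term, k -> new ArrayList<>());
            if (!types.contains(outputType)) {
                types.add(outputType);
            }
        }
    }

    /** Returns the output types recorded for a term, or an empty list if the term is unknown. */
    List<String> outputTypesFor(String term) {
        return entries.getOrDefault(term, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        MergedDictionary dict = new MergedDictionary();
        // Hypothetical dictionaries; in practice each list would be read from its own file.
        dict.addDictionary("ChemicalAnnotation", Arrays.asList("iron", "zinc"));
        dict.addDictionary("ToolAnnotation", Arrays.asList("iron", "hammer"));

        // A term present in two dictionaries carries both output types.
        System.out.println(dict.outputTypesFor("iron"));
        System.out.println(dict.outputTypesFor("hammer"));
    }
}
```

The files stay separate on disk, and only the in-memory structure is shared, so a term that occurs in several dictionaries can still yield one annotation per source.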



On Fri, May 24, 2013 at 9:16 PM, Michael Tanenblatt <slothrop@park-slope.net> wrote:

> I am preparing an update to ConceptMapper that may be of some help with
> your first problem: the dictionary must still be in one file, but entries
> can create different resulting annotations, based on a particular
> feature value of an entry. For example, this will allow you to have a POS
> feature associated with each entry and then specify the output to be a
> NounAnnotation for POS=noun, a VerbAnnotation for POS=verb, etc.
> Additionally, if a term appears multiple times in the dictionary, it can
> produce multiple resulting annotations.
>
> These changes are being tested locally, and if all goes well, I will start
> the process of releasing them.
>
> Michael

Re: Adding features to TokenAnnotation and DictTerm in Concept Mapper

Posted by Michael Tanenblatt <sl...@park-slope.net>.
I am preparing an update to ConceptMapper that may be of some help with your first problem: the dictionary must still be in one file, but entries can create different resulting annotations, based on a particular feature value of an entry. For example, this will allow you to have a POS feature associated with each entry and then specify the output to be a NounAnnotation for POS=noun, a VerbAnnotation for POS=verb, etc. Additionally, if a term appears multiple times in the dictionary, it can produce multiple resulting annotations.

These changes are being tested locally, and if all goes well, I will start the process of releasing them.

Michael



On May 24, 2013, at 8:53 AM, Mansi verma <ve...@gmail.com> wrote:

> Hi Renaud
> 
> I did. Thanks, it will solve the problem. I have another question.
>
> At the moment I have some 10-15 lookup files, each with its own descriptor.
> Is there a way to drive the lookup from a single descriptor? It could take
> the resulting annotation type and the lookup file as arguments and use the
> lookup files one by one to create the resulting annotations.
>
> Another question: JCasUtil supports selecting annotations of one class at a
> time. What if I want to pass a set or list of classes whose annotations
> should be selected? Is there a function that supports selecting multiple
> annotation types in one call?
> 
> Thanks
> Manisha


Re: Adding features to TokenAnnotation and DictTerm in Concept Mapper

Posted by Richard Eckart de Castilho <ri...@gmail.com>.
On 24.05.2013 at 14:53, Mansi verma <ve...@gmail.com> wrote:

> Another question: JCasUtil supports selecting annotations of one class at a
> time. What if I want to pass a set or list of classes whose annotations
> should be selected? Is there a function that supports selecting multiple
> annotation types in one call?

If all these annotation types have a common super-type, you can select that
super-type. Otherwise, no. If you want such a method, please consider opening
an issue against uimaFIT asking for it and explaining your story.

Cheers,

-- Richard
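
For the case without a common super-type, one possible workaround is to select a broad type once and filter by class afterwards. The sketch below shows only the filtering step in plain Java, with stand-in classes (Annotation, Noun, Verb, Punct) instead of real UIMA JCas types so that it runs on its own. The helper selectAny is hypothetical and is not part of uimaFIT; in a real pipeline its input would be something like the collection returned by JCasUtil.select(jcas, Annotation.class).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class MultiTypeSelect {
    /** Keeps the items that are instances of at least one of the given classes, preserving input order. */
    @SafeVarargs
    static <T> List<T> selectAny(Collection<? extends T> items, Class<? extends T>... types) {
        List<T> selected = new ArrayList<>();
        for (T item : items) {
            for (Class<? extends T> type : types) {
                if (type.isInstance(item)) {
                    selected.add(item);
                    break; // add each item once, even if several classes match
                }
            }
        }
        return selected;
    }

    // Stand-in classes for illustration only; real code would use generated JCas types.
    static class Annotation {
        final String text;
        Annotation(String text) { this.text = text; }
    }
    static class Noun extends Annotation { Noun(String text) { super(text); } }
    static class Verb extends Annotation { Verb(String text) { super(text); } }
    static class Punct extends Annotation { Punct(String text) { super(text); } }

    public static void main(String[] args) {
        List<Annotation> all = Arrays.asList(
                new Noun("dog"), new Punct("."), new Verb("runs"), new Noun("park"));

        // Select two unrelated types in one call; Punct instances are dropped.
        List<Annotation> picked = selectAny(all, Noun.class, Verb.class);
        for (Annotation a : picked) {
            System.out.println(a.getClass().getSimpleName() + ": " + a.text);
        }
    }
}
```

Filtering after a broad select visits every annotation once, which may be slower than per-type selects on large documents, but it returns the results of all requested types in their original order.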

Re: Adding features to TokenAnnotation and DictTerm in Concept Mapper

Posted by Mansi verma <ve...@gmail.com>.
Hi Renaud

I did. Thanks, it will solve the problem. I have another question.

At the moment I have some 10-15 lookup files, each with its own descriptor.
Is there a way to drive the lookup from a single descriptor? It could take
the resulting annotation type and the lookup file as arguments and use the
lookup files one by one to create the resulting annotations.

Another question: JCasUtil supports selecting annotations of one class at a
time. What if I want to pass a set or list of classes whose annotations should
be selected? Is there a function that supports selecting multiple annotation
types in one call?

Thanks
Manisha


On Thu, May 23, 2013 at 2:18 PM, Renaud Richardet <re...@epfl.ch> wrote:

> Hi Manisha,
>
> Did you try the configuration params "AttributeList" and "FeatureList"?
>
> (see the docs
>
> http://uima.apache.org/downloads/sandbox/ConceptMapperAnnotatorUserGuide/ConceptMapperAnnotatorUserGuide.html#configParams
> )
>
> -- Renaud

Re: Adding features to TokenAnnotation and DictTerm in Concept Mapper

Posted by Renaud Richardet <re...@epfl.ch>.
Hi Manisha,

Did you try the configuration params "AttributeList" and "FeatureList"?

(see the docs
http://uima.apache.org/downloads/sandbox/ConceptMapperAnnotatorUserGuide/ConceptMapperAnnotatorUserGuide.html#configParams
)

-- Renaud
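
As far as I understand the guide, the two parameters work as a pair: AttributeList names attributes on a dictionary entry, and FeatureList names the features on the resulting annotation that receive their values. The plain-Java sketch below only illustrates that pairing, under the assumption that the lists correspond by position. It is not ConceptMapper code, and the attribute and feature names ("canonical", "POS", "DictCanon", "PartOfSpeech") are placeholders.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AttributeFeaturePairing {
    /** Pairs attribute names with feature names by position and copies the matching values. */
    static Map<String, String> toFeatureValues(List<String> attributeList,
                                               List<String> featureList,
                                               Map<String, String> entryAttributes) {
        if (attributeList.size() != featureList.size()) {
            throw new IllegalArgumentException(
                    "AttributeList and FeatureList must have the same length");
        }
        Map<String, String> features = new LinkedHashMap<>();
        for (int i = 0; i < attributeList.size(); i++) {
            String value = entryAttributes.get(attributeList.get(i));
            if (value != null) {
                // The i-th attribute value lands in the i-th feature.
                features.put(featureList.get(i), value);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        // Attributes of one hypothetical dictionary entry.
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("canonical", "run");
        entry.put("POS", "verb");

        Map<String, String> features = toFeatureValues(
                Arrays.asList("canonical", "POS"),
                Arrays.asList("DictCanon", "PartOfSpeech"),
                entry);
        System.out.println(features); // prints {DictCanon=run, PartOfSpeech=verb}
    }
}
```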


On Thu, May 23, 2013 at 9:36 AM, Mansi verma <ve...@gmail.com> wrote:

> Hi
>
> I am using UIMA to annotate documents with Concept Mapper. I have built the
> dictionaries and configured it to our requirements. However, I would like
> to add features to TokenAnnotation and DictTerm.
>
> For example, the existing TokenAnnotation type supports the following
> features: text, tokenType, tokenClass and uima.tt.tokenAnnotation. I would
> now like to add more features, such as POS or group.
>
> Is there a neat way of doing this without touching the conceptMapper.jar and
> changing our type system to extend the two types?
>
>
> Thanks
> Manisha
>



-- 
Renaud Richardet
Blue Brain Project  PhD candidate
EPFL  Station 15
CH-1015 Lausanne
phone: +41-78-675-9501
http://people.epfl.ch/renaud.richardet