Posted to user@uima.apache.org by Ahmed Abdeen Hamed <ah...@gmail.com> on 2008/07/06 06:40:18 UTC

ConceptMapper: Dictionary application to certain parts of a document

Hello,
I have a quick question about the ConceptMapper project. How can I apply
dictionary terms to a certain part of a document? For example, if you have
documents that have titles and abstracts and you need only to find terms
that appear in the abstract, not the title, how do you do that? Also, if you
would like to apply a filter, such as detecting a certain POS (e.g., names vs.
verbs), how would you approach this problem? Are there examples that I can
take a look at? Please let me know if you have an answer for me.
Thanks in advance!
Ahmed

Re: ConceptMapper: Dictionary application to certain parts of a document

Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Thank you Dr. Bill. That sounds very much like what I wanted. I will let you
know if I get stuck with implementing it.
Best wishes,
Ahmed

On Mon, Jul 7, 2008 at 2:45 PM, J. William Murdock <bi...@murdocks.org>
wrote:

> Here is a solution that may be a bit inefficient, but fits well into the
> framework.  Produce an aggregate with three elements:
>
> 1) An annotator that takes the text you want to analyze (e.g.,
> abstractAnnotation.getCoveredText()) and copies it into a new sofa.
>
> 2) ConceptMapper; your main aggregate should specify a sofa mapping to make
> it run on the sofa that the first annotator created.
>
> 3) An annotator that copies all of the annotations from the sofa that the
> first annotator created to the default sofa.  When copying the annotations,
> it adds the begin offset of the text that was analyzed (e.g.,
> abstractAnnotation.getBegin()) to each annotation's begin and end offsets.
>
> --
> Bill Murdock, PhD
> UIMA User (but not a UIMA developer or official spokesperson)
> IBM Watson Research Center
> 19 Skyline Dr., Hawthorne, NY  10532  USA
> http://bill.murdocks.org

Re: ConceptMapper: Dictionary application to certain parts of a document

Posted by "J. William Murdock" <bi...@murdocks.org>.
Here is a solution that may be a bit inefficient, but fits well into the 
framework.  Produce an aggregate with three elements:

1) An annotator that takes the text you want to analyze (e.g., 
abstractAnnotation.getCoveredText()) and copies it into a new sofa.

2) ConceptMapper; your main aggregate should specify a sofa mapping to 
make it run on the sofa that the first annotator created.

3) An annotator that copies all of the annotations from the sofa that 
the first annotator created to the default sofa.  When copying the 
annotations, it adds the begin offset of the text that was analyzed 
(e.g., abstractAnnotation.getBegin()) to each annotation's begin and 
end offsets.
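
The offset arithmetic in step 3 can be sketched in plain Java. This is only an illustration under stated assumptions, not UIMA code: the class SofaOffsetSketch, the Match type, and the naive lookup() method are hypothetical stand-ins for the new sofa, the ConceptMapper annotations, and ConceptMapper itself. In a real aggregate the region text would live in a separate sofa and the copying would be done by an annotator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java stand-in for the three-step aggregate; no UIMA classes are used. */
final class SofaOffsetSketch {

    /** Hypothetical match record: character offsets plus the matched term. */
    static final class Match {
        final int begin;
        final int end;
        final String term;

        Match(int begin, int end, String term) {
            this.begin = begin;
            this.end = end;
            this.term = term;
        }
    }

    /** Step 2 stand-in: naive exact lookup; offsets are relative to the text passed in. */
    static List<Match> lookup(String text, List<String> dictionary) {
        List<Match> found = new ArrayList<>();
        for (String term : dictionary) {
            if (term.isEmpty()) {
                continue; // an empty term would match everywhere
            }
            int from = 0;
            int at;
            while ((at = text.indexOf(term, from)) >= 0) {
                found.add(new Match(at, at + term.length(), term));
                from = at + term.length();
            }
        }
        return found;
    }

    /** Steps 1 and 3: cut out the region, match on it, shift offsets back to the document. */
    static List<Match> annotateRegion(String document, int regionBegin, int regionEnd,
                                      List<String> dictionary) {
        // Step 1: the region text plays the role of the new sofa's content.
        String regionText = document.substring(regionBegin, regionEnd);
        List<Match> shifted = new ArrayList<>();
        for (Match m : lookup(regionText, dictionary)) {
            // Step 3: add the region's begin offset to both begin and end.
            shifted.add(new Match(m.begin + regionBegin, m.end + regionBegin, m.term));
        }
        return shifted;
    }

    public static void main(String[] args) {
        String doc = "Gene therapy review. We survey gene therapy trials.";
        int abstractBegin = doc.indexOf("We"); // assumed start of the abstract
        List<String> dictionary = Arrays.asList("gene therapy");
        for (Match m : annotateRegion(doc, abstractBegin, doc.length(), dictionary)) {
            System.out.println(m.term + " -> " + doc.substring(m.begin, m.end));
        }
    }
}
```

Shifting both begin and end by the same amount keeps each span's length unchanged, so the covered text in the original document is the same as the covered text in the extracted region.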

-- 
Bill Murdock, PhD
UIMA User (but not a UIMA developer or official spokesperson)
IBM Watson Research Center
19 Skyline Dr., Hawthorne, NY  10532  USA
http://bill.murdocks.org




Re: ConceptMapper: Dictionary application to certain parts of a document

Posted by Ahmed Abdeen Hamed <ah...@gmail.com>.
Hi David,
Thank you for your response. I actually wrote annotators that find useful
things. Is there a way you can get access to those annotators from your
aggregate analysis engine that gets produced by UIMAFramework? I could use a
workaround and only pass the text that I am interested in parsing. However,
my solution is required to be within the UIMA framework.
Thanks again!
Ahmed

On Mon, Jul 7, 2008 at 2:11 PM, David Buttler <bu...@llnl.gov> wrote:

> This seems very straightforward to me.  My approach may not be the most
> efficient, but I would
> 1) write a wrapper around the ConceptMapper code so that you only pass it
> spans of text that you would find useful.
> 2) write a post-processing filter that throws away any tag that occurs in
> a region of the text that you think is inappropriate (e.g. if you do not
> want to tag a verb)
>
> All of this would most easily be put into a single processing component so
> you don't have unwanted annotations in your CAS.
>
> Dave

Re: ConceptMapper: Dictionary application to certain parts of a document

Posted by David Buttler <bu...@llnl.gov>.
This seems very straightforward to me.  My approach may not be the most 
efficient, but I would
1) write a wrapper around the ConceptMapper code so that you only pass 
it spans of text that you would find useful. 
2) write a post-processing filter that throws away any tag that occurs 
in a region of the text that you think is inappropriate (e.g. if you do 
not want to tag a verb)

All of this would most easily be put into a single processing component 
so you don't have unwanted annotations in your CAS.
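
The post-processing filter in 2) might look roughly like the following plain-Java sketch. It assumes each tag and each excluded region (for example, a span that a POS tagger marked as a verb) is a pair of character offsets; RegionFilterSketch, Span, and dropOverlapping are hypothetical names for illustration and are not part of ConceptMapper or UIMA.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of a post-processing filter over offset-based tags. */
final class RegionFilterSketch {

    /** Hypothetical half-open span [begin, end) with a label. */
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }
    }

    /** True when the two half-open spans share at least one character. */
    static boolean overlaps(Span a, Span b) {
        return a.begin < b.end && b.begin < a.end;
    }

    /** Keeps only the tags that do not overlap any excluded region. */
    static List<Span> dropOverlapping(List<Span> tags, List<Span> excluded) {
        List<Span> kept = new ArrayList<>();
        for (Span tag : tags) {
            boolean rejected = false;
            for (Span region : excluded) {
                if (overlaps(tag, region)) {
                    rejected = true;
                    break;
                }
            }
            if (!rejected) {
                kept.add(tag);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Span> tags = new ArrayList<>();
        tags.add(new Span(0, 3, "run"));    // dictionary hit on a word tagged as a verb
        tags.add(new Span(11, 15, "park")); // dictionary hit elsewhere in the text
        List<Span> verbs = new ArrayList<>();
        verbs.add(new Span(0, 3, "VERB"));
        for (Span s : dropOverlapping(tags, verbs)) {
            System.out.println(s.label);
        }
    }
}
```

The sketch rejects a tag on any overlap with an excluded region. Requiring full containment instead is an equally reasonable choice and only changes the overlaps() test.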

Dave

